Extract columns where header matches a given string


 
# 8  
Old 03-17-2011
Here's the output I get. It looks fine to me, unless I'm missing something in your explanation:
Code:
I Name 104069 104069 109706 109706 113889 113889
M 10080_CO B B B B B B
M 10068_CO A A A A A A
M 12187_ND B B B B B B
M GGA_0061 A B B B A A
M GGA_0013 A B A B B B
M GGA_0024 A A A B A A
M GGA_0025 B B B B B B

# 9  
Old 03-18-2011
Interesting that it worked for you both, but not for me. I've tried running your examples as shell scripts and with the awk -F option, and I get really weird results, if any at all. However, when I ran the standalone awk code that sk1418 gave as a second example, it worked fine on the small test example. I'll try it on the larger file and look into why the scripts aren't working when I save them as .awk and .sh files.
Thank you very much for your help!

---------- Post updated 03-18-11 at 03:30 PM ---------- Previous update was 03-17-11 at 06:17 PM ----------

Hi,

I've been mucking around with your solutions to my problem today and have noticed that in both cases the output is ordered from greatest to smallest. Is there any way to keep the original order from file1.txt? It's not exactly clear to me where the sorting of IDs occurs.

Thanks again.
# 10  
Old 03-18-2011
Hi, check my output in the previous post: it did keep the file1 order, didn't it? What did you mean by "the output is ordered from greatest to smallest"?
# 11  
Old 03-18-2011
Could this help you?
Code:
 awk 'NR==FNR{a[$1]++;next} {if(FNR==1){for(i=1;i<=NF;i++){if($i in a){printf "%s ",$i;b[i]=1}}}else{printf "\n";for(j=1;j<=NF;j++){if(j in b){printf "%s ",$j}}}}END{printf "\n"}' file1 file2


Last edited by pravin27; 03-18-2011 at 06:57 PM..
# 12  
Old 03-18-2011
Hi,
Given the following input files:

file1.txt

Code:
41109297 
41109706 
43162207
41109808
41109377
41110441
41111192
43163011
43162367

file2.txt

Code:
I Name    41109297 41109297 41109706 41109706 41110441 41110441 41111192 41111192 41112086 41112086 41113889 41113889 41114003 41114003 41114656 41114656 41115162 41115162 41115561 41115561 41115979 41115979 41116248 41116248 41130607 41130607 41130611 41130611 41131240 41131240 41132167 41132167 41133800 41133800 41134462 41134462 41134623 41134623 42135335 42135335 42137664 42137664 42143490 42143490 42144170 42144170 42144339 42144339 42144650 42144650 42145389 42145389 42146088 42146088 42146090 42146090 42146879 42146879 42148154 42148154 43161219 43161219 43162207 43162207 43163011 43163011 43163878 43163878 43164830 43164830 43165768 43165768 43166228 43166228 43166330 43166330 43167557 43167557 43180900 43180900 43181675 43181675 43182287 43182287 43184255 43184255 43184401 43184401 
M 1080_COI    B B B B B B B B B B B B B B B B B B B B B B B B 0 0 B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B 
M 10668_CO    B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B 
M 1218_ND    B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B 
M 1546_CY    B B B B B B B B B B B B B B B B B B B B B B B B 0 0 B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B 
M 1626_ND    B B B B B B B B B B B B B B B B B B B B B B B B 0 0 B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B 
M 1637_ND    B B B B B B B B B B B B B B B B B B B B B B B B A A B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B 
M 5831_ND2    A A A A A A A A A A A A A A A A A A A A A A A A 0 0 A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A A 
M 8472_CO2    B B B B B B B B B B B B B B B B B B B B B B B B 0 0 B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B B 
M GGal006    A A A A A B A B A B A A A A A B A B A A A B A B A A A B A B A A A B B B B B A B A A B B A B A A A B A B A A A A B B A B A B A B A A B B A B A B B B A B A B A A A B A B A B A A

I get the following output:

@sk1418 & vgersh99

Code:
41109297 41109297 41109706 41109706 41110441 41110441 41111192 41111192 43162207 43162207 43163011 43163011 
B B B B B B B B B B B B 
B B B B B B B B B B B B 
B B B B B B B B B B B B 
B B B B B B B B B B B B 
B B B B B B B B B B B B 
B B B B B B B B B B B B 
A A A A A A A A A A A A 
B B B B B B B B B B B B 
A A A A A B A B A B A A

@pravin27
Code:
41110441 41110441 41111192 41111192 43162207 43162207 43163011 43163011 
B B B B B B B B 
B B B B B B B B 
B B B B B B B B 
B B B B B B B B 
B B B B B B B B 
B B B B B B B B 
A A A A A A A A 
B B B B B B B B 
A B A B A B A A

In the first case it seems like it's using file2 to order the output? Any thoughts on how I can keep the output in the order of file1?

Last edited by vgersh99; 03-18-2011 at 06:49 PM.. Reason: fixed the code tagging.
# 13  
Old 03-18-2011
Quote:
Originally Posted by flotsam
Hi,

I've been mucking around with your solutions to my problem today and have noticed that in both cases the output is ordered from greatest to smallest. Is there any way to keep the original order from file1.txt? It's not exactly clear to me where the sorting of IDs occurs.

Thanks again.
file1 defines the list of columns to be matched.
file2's header line defines the column names.
The script simply matches the list from file1 against the header names in file2, in WHATEVER order the columns are defined in file2.

There's no "sorting" being done. The 'valid/matched' columns from file2 are output in the SAME order they appear in file2.
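
To make that behaviour concrete, here is a minimal sketch of a file2-order selection. It is not the script posted earlier in the thread: the function name select_cols is made up for illustration, and it assumes whitespace-separated fields with the column names on the first line of file2.

```shell
# select_cols FILE1 FILE2  (hypothetical helper name, for illustration only)
# Prints the columns of FILE2 whose header appears in FILE1,
# in the order the columns occur in FILE2.
select_cols() {
    awk '
    NR == FNR { want[$1]; next }      # FILE1: build the set of wanted header names
    FNR == 1 {                        # FILE2 header: record matching column numbers
        for (i = 1; i <= NF; i++)
            if ($i in want) keep[++n] = i
    }
    {                                 # every FILE2 line: print kept columns, left to right
        line = ""
        for (k = 1; k <= n; k++)
            line = line (k > 1 ? " " : "") $(keep[k])
        print line
    }' "$1" "$2"
}
```

Because keep[] is filled while walking the header from left to right, the order of the lines in file1 never influences the output. It would be called as select_cols file1.txt file2.txt.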
# 14  
Old 03-18-2011
Thanks vgersh99,

I think I realised that and pointed it out in the last line of my post. I hadn't anticipated the problem when I originally posted, but I need the output to match the order of file1.
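
One possible way to get the output in file1 order is sketched below. This is only a sketch under some assumptions, not a solution taken from the thread: the function name extract_cols is hypothetical, fields are assumed to be whitespace-separated, file1 is assumed to be non-empty, and the two leading label columns ("I Name", "M ...") are dropped, as in the outputs shown above. Duplicate header names in file2 are kept together, in file2 order, under the matching id.

```shell
# extract_cols FILE1 FILE2  (hypothetical helper name, for illustration only)
# Prints the columns of FILE2 whose header appears in FILE1,
# ordered by the position of the id in FILE1.
extract_cols() {
    awk '
    NR == FNR {                       # FILE1: remember each id once, in input order
        if (NF && !($1 in seen)) { seen[$1] = 1; want[++n] = $1 }
        next
    }
    FNR == 1 {                        # FILE2 header
        # Map each header name to the list of column numbers carrying it.
        for (i = 1; i <= NF; i++)
            cols[$i] = ($i in cols) ? (cols[$i] " " i) : i
        # Walk the ids in FILE1 order and collect their column numbers.
        for (k = 1; k <= n; k++)
            if (want[k] in cols) {
                m = split(cols[want[k]], idx, " ")
                for (j = 1; j <= m; j++) order[++nout] = idx[j]
            }
    }
    {                                 # every FILE2 line: print columns in the collected order
        line = ""
        for (k = 1; k <= nout; k++)
            line = line (k > 1 ? " " : "") $(order[k])
        print line
    }' "$1" "$2"
}
```

It would be called as extract_cols file1.txt file2.txt. The difference from a file2-order selection is that the column list is built by looping over the ids from file1 rather than over the header fields, so ids that appear in file1 but not in the header are simply skipped.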