Find and sort by first column value


 
# 1  
Old 01-22-2015

Hi,
I have two text files.
File 1 has N lines:

Code:
AAAAA	2.092290E-12
BBBBB	1.727740E-07
CCCCC	9.608710E-17
DDDDD	0.000000E+00
EEEEE	0.000000E+00
FFFFF	0.000000E+00
GGGGG	0.000000E+00
HHHHH	0.000000E+00
IIIII	3.300320E-04
...

The text in the first column is unique for each row and alphabetically sorted A->Z.

File 2 has M lines (M>=N):

Code:
AAAAA	text1	5.07822E-02
DDDDD	text2	8.45965E-03
CCCCC	text3	4.33704E-03
BBBBB	text4	0.00000E+00
EEEEE	text3	5.05173E+00
GGGGG	text4	2.83088E-03
...

The text in the first column is unique for each row.

What I would like to obtain is file 3, containing only the rows of file 2 whose first-column entry also appears in file 1, in the order in which they appear in file 1.
If an entry from file 1 is not present in file 2, I would like a warning line instead (as below).

File 3 has N lines:

Code:
AAAAA	text1	5.07822E-02
BBBBB	text4	0.00000E+00
CCCCC	text3	4.33704E-03
DDDDD	text2	8.45965E-03
EEEEE	text3	5.05173E+00
FFFFF	NOT	FOUND
...

Do you have any suggestions?

Many thanks,
# 2  
Old 01-22-2015
Hello f_o_555,

Could you please try the following and let me know if it helps?
Code:
 awk 'FNR==NR{X[$1]=$0;next} ($1 in X){print X[$1]} !($1 in X){print $1 OFS "NOT" OFS "FOUND."}' OFS="\t" file2 file1

Output will be as follows.
Code:
AAAAA   text1   5.07822E-02
BBBBB   text4   0.00000E+00
CCCCC   text3   4.33704E-03
DDDDD   text2   8.45965E-03
EEEEE   text3   5.05173E+00
FFFFF   NOT     FOUND.
GGGGG   text4   2.83088E-03
HHHHH   NOT     FOUND.
IIIII   NOT     FOUND.
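
For anyone who wants to reproduce this, here is a minimal sketch of the same idea written out as a small script. It assumes tab-separated input, recreates the sample rows from post #1 under the names file1 and file2 (so it overwrites any files with those names in the current directory), and writes the result to file3. The array name `row` is only illustrative, and the warning text is written without the trailing period, as in the requested output.

```shell
# Recreate the sample data from post #1 (tab-separated).
printf '%s\t%s\n' \
    AAAAA 2.092290E-12 BBBBB 1.727740E-07 CCCCC 9.608710E-17 \
    DDDDD 0.000000E+00 EEEEE 0.000000E+00 FFFFF 0.000000E+00 \
    GGGGG 0.000000E+00 HHHHH 0.000000E+00 IIIII 3.300320E-04 > file1
printf '%s\t%s\t%s\n' \
    AAAAA text1 5.07822E-02 DDDDD text2 8.45965E-03 \
    CCCCC text3 4.33704E-03 BBBBB text4 0.00000E+00 \
    EEEEE text3 5.05173E+00 GGGGG text4 2.83088E-03 > file2

awk '
    BEGIN { FS = OFS = "\t" }
    # While reading the first argument (file2): store each row under its key.
    FNR == NR { row[$1] = $0; next }
    # While reading the second argument (file1): print the stored row in
    # file1 order, or a warning line if the key was never seen in file2.
    {
        if ($1 in row) {
            print row[$1]
        } else {
            print $1, "NOT", "FOUND"
        }
    }
' file2 file1 > file3
```

The order of the two file arguments matters: file2 is loaded into memory first and file1 then drives the output order. Note that the `FNR == NR` test assumes file2 is not empty.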

Thanks,
R. Singh
# 3  
Old 01-22-2015
Thank you, Ravinder. It works, but not for all files.
Sometimes I get only the first column:

Code:
AAAAA
BBBBB
CCCCC
DDDDD
EEEEE

It may be a formatting issue, which I am currently investigating, although there is no evident difference between the files...
# 4  
Old 01-22-2015
Hello f_o_555,

I am not sure how you ran the command on the other files, but please check the arguments: the file passed first should have 3 fields and the file passed second should have 2 fields to give you the requested output, as in the following example.
Code:
awk 'FNR==NR{X[$1]=$0;next} ($1 in X){print X[$1]} !($1 in X){print $1 OFS "NOT" OFS "FOUND."}' OFS="\t" file2 file1

Here file2 has 3 fields and file1 has 2 fields. If you still have problems, please post the error together with the complete input and I will try to fix it.
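
The field counts can also be checked mechanically before running the command. The following is only a sketch under two assumptions: the files are tab-separated, and stray carriage returns (for example from a file saved on Windows) are worth flagging. Nothing in this thread confirms that either was the cause here. The function name `check_fields` is hypothetical.

```shell
# check_fields FILE EXPECTED
# Reports every line of FILE whose tab-separated field count differs from
# EXPECTED, and counts lines ending in a carriage return.
# Returns non-zero if anything suspicious was found.
check_fields() {
    awk -F'\t' -v want="$2" -v name="$1" '
        NF != want {
            printf "%s:%d: %d field(s), expected %d\n", name, FNR, NF, want
            bad++
        }
        /\r$/ { cr++ }
        END {
            if (cr) printf "%s: %d line(s) end in a carriage return\n", name, cr
            exit (bad + cr > 0)
        }
    ' "$1"
}
```

For the files in this thread the calls would be `check_fields file2 3` and `check_fields file1 2`. No output and a zero exit status mean the file matches the expected layout.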

Thanks,
R. Singh
# 5  
Old 01-22-2015
It seems to work now... I'll keep an eye on it and see if the error appears again. Thanks!