File merging using first column as the ref


 
# 1  
Old 02-02-2012

I have two files, 1.txt and 2.txt. I want a third file (output), 3.txt, like the one below, containing the lines of 2.txt whose first column also appears in the first column of 1.txt.

1.txt

Code:
 
11
12
13
14
15
16
17
18
19
20
21

2.txt

Code:
 
14      b1
15      b2
14      d1
16      b3
17      b4
18      b5
15      d2
19      b6
20      b7
19      d3
22      b8

3.txt

Code:
 
14      b1
14      d1
15      b2
15      d2
16      b3
17      b4
18      b5
19      b6
19      d3
20      b7

I used to produce it with:
Code:
 fgrep -f 1.txt 2.txt > 3.txt

Note: I think 'nawk' will not work, because each element in the first column of the first file can match multiple entries in the second file.
Recently, however, I started getting the error below:

"fgrep: could not allocate memory for wordlist"

Can anyone suggest a script that can perform this task?

Last edited by vbe; 02-02-2012 at 09:33 AM.. Reason: code tags for code...
# 2  
Old 02-02-2012
Try...
Code:
$ awk 'FNR==NR{a[$1]=1;next}$1 in a' 1.txt 2.txt|sort -n
14      b1
14      d1
15      b2
15      d2
16      b3
17      b4
18      b5
19      b6
19      d3
20      b7
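
For readers unfamiliar with the idiom, here is a commented sketch of the same one-liner. The scratch directory, the sample data (single-space separated) and the array name `seen` are assumptions made for illustration. The first block runs only while awk reads 1.txt and stores each key; the bare pattern `$1 in seen` then prints every line of 2.txt whose first field was stored, so repeated keys in 2.txt are not a problem.

```shell
# Work in a scratch directory so the sample files do not clobber real data.
cd "$(mktemp -d)" || exit 1

# Small stand-ins for the files in post #1.
printf '%s\n' 11 12 13 14 15 16 17 18 19 20 21 > 1.txt
printf '%s\n' '14 b1' '15 b2' '14 d1' '16 b3' '17 b4' '18 b5' \
              '15 d2' '19 b6' '20 b7' '19 d3' '22 b8' > 2.txt

awk '
    # First file (NR equals FNR only while reading it): remember each key.
    FNR == NR { seen[$1] = 1; next }
    # Second file: print the line if its first field was remembered.
    $1 in seen
' 1.txt 2.txt | sort -n > 3.txt

cat 3.txt
```

Only the keys of 1.txt are held in memory; 2.txt is streamed line by line. Unlike `fgrep -f`, which matches a pattern anywhere in the line, this compares the first field exactly.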

# 3  
Old 02-04-2012

Thank you, it's working fine, but it takes a long time.

Can't we do this faster using Perl or some other script?
# 4  
Old 02-04-2012
So your file 1.txt is sorted and big, but 2.txt is not sorted and may be big:
Code:
join 1.txt <(sort 2.txt)

Use a temp file (or a named pipe) to hold the sorted 2.txt if your shell doesn't support process substitution. Or
Code:
sort 2.txt |join 1.txt -
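
One caveat worth sketching: `join` expects both inputs to be ordered on the join field under the same collation, and a file that looks sorted numerically is not necessarily sorted that way. The variant below is a hedged sketch, not the poster's method. It assumes the sample data from post #1 (single-space separated), a scratch directory, and the hypothetical file names `1.sorted` and `2.sorted`. It forces the C locale and sorts both files on column 1 before joining.

```shell
# Work in a scratch directory so the sample files do not clobber real data.
cd "$(mktemp -d)" || exit 1

printf '%s\n' 11 12 13 14 15 16 17 18 19 20 21 > 1.txt
printf '%s\n' '14 b1' '15 b2' '14 d1' '16 b3' '17 b4' '18 b5' \
              '15 d2' '19 b6' '20 b7' '19 d3' '22 b8' > 2.txt

# Use one collation (plain byte order) for both sort and join.
export LC_ALL=C

# Sort each file on its first field only.
sort -k1,1 1.txt > 1.sorted
sort -k1,1 2.txt > 2.sorted

# Keep the lines of 2.txt whose first field also occurs in 1.txt.
join 1.sorted 2.sorted > 3.txt

cat 3.txt
```

The result is ordered by the byte order of the keys, which matches numeric order here only because every key has two digits; pipe through `sort -n` if numeric order matters.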


Last edited by binlib; 02-04-2012 at 10:02 PM..
# 5  
Old 02-06-2012

Hi,

The join command is not working, as both files contain more than 10M lines.

Please suggest some other script.
# 6  
Old 02-06-2012
Maybe you can have a look at this thread (see post #4) and adapt the code to your case.
Note that your files must be pre-sorted.
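
Whether a file is already sorted on its first column can be checked with `sort -c`, which reports through its exit status instead of producing sorted output. The following is a minimal sketch under assumptions: the helper name `check_sorted`, the scratch directory and the sample data are illustrative and do not come from the linked thread.

```shell
# Work in a scratch directory so the sample files do not clobber real data.
cd "$(mktemp -d)" || exit 1

printf '%s\n' 11 12 13 14 15 16 17 18 19 20 21 > 1.txt
printf '%s\n' '14 b1' '15 b2' '14 d1' '16 b3' '17 b4' '18 b5' \
              '15 d2' '19 b6' '20 b7' '19 d3' '22 b8' > 2.txt

# Check under the same collation that the later sort/join step will use.
export LC_ALL=C

# Hypothetical helper: say whether a file is ordered on column 1.
check_sorted() {
    if sort -c -k1,1 "$1" 2>/dev/null; then
        echo "$1: sorted on column 1"
    else
        echo "$1: NOT sorted on column 1"
    fi
}

check_sorted 1.txt
check_sorted 2.txt
```

With the sample data, 1.txt passes the check and 2.txt does not, because key 14 reappears after key 15.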
# 7  
Old 02-06-2012
"The join command is not working, as both files contain more than 10M lines."
Care to support your claim with evidence?