merge files along with file names (awk)?


 
# 1  
Old 11-04-2011

Dear programmers,

I have a question about conditionally merging multiple files and having their file names in the first column.

Input files:
Code:
file.1.extension file.2.extension file.3.extension file.4.extension ... file.1000.extension

where each file looks like this (with multiple lines):
Code:
CHR         SNP         BP    NMISS       BETA       
  22   rs1006771          2      370    0.09654

The desired output is:

Code:
filename           CHR         SNP         BP    NMISS       BETA       
file.1.extension  22   rs1006771          2      370    0.09654   
file.2.extension  22   rs1490345          2      NA      0.04256
file.2.extension  22   rs1345544          2      NA      0.03245

I have tried using:

Code:
for i in `fileID`
do
awk '{if($5 < 0.01) printf $0"\n"}' file.${i}.extension >> betaout
done

which works perfectly, but does not give me the first column of my desired output.

I also tried putting ${i} inside the awk command, but that still does not work (syntax error).

I guess, this is the point where I call for help desperately... So, please, HELP.

Wei

Last edited by radoulov; 11-04-2011 at 05:25 PM.. Reason: Code tags, please!
# 2  
Old 11-04-2011
awk provides the builtin variable FILENAME for that:

Code:
print FILENAME, $0 ...
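
A fuller sketch of that idea, under stated assumptions: the file names demo.1.extension and demo.2.extension, the SNP IDs, and the BETA values below are invented for illustration and are not from the original post. It skips each file's header line with FNR and applies the $5 < 0.01 filter from the question.

```shell
# Work in a scratch directory so the demo does not touch real data.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Two tiny sample files in the same layout as the question (values invented).
printf 'CHR SNP BP NMISS BETA\n22 rs0000001 2 370 0.005\n22 rs0000002 2 370 0.5\n' > demo.1.extension
printf 'CHR SNP BP NMISS BETA\n22 rs0000003 2 NA 0.002\n' > demo.2.extension

# FILENAME holds the name of the file awk is currently reading.
# FNR > 1 skips the header line of each file; $5 < 0.01 is the filter from the question.
out=$(awk 'FNR > 1 && $5 < 0.01 { print FILENAME, $0 }' demo.1.extension demo.2.extension)
printf '%s\n' "$out"
```

Because all files are passed to a single awk invocation, no shell loop is needed for a small number of files.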

# 3  
Old 11-04-2011
What is this 'fileID'? Is it a file or a command? I'm assuming a file.

Shell variables are not expanded inside a single-quoted awk program. awk has a built-in FILENAME variable you can use instead.
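
If a shell value really is needed inside the awk program, one common route is awk's -v option, which assigns an awk variable before the program runs. A minimal sketch, assuming a hypothetical loop value i=7 and an awk variable called name (both made up for this example, as is the sample input line):

```shell
i=7   # hypothetical loop value, standing in for one line of fileID

# Inside single quotes the shell never sees ${i}, so pass it in with -v instead.
out=$(printf '22 rs0000001 2 370 0.005\n' |
      awk -v name="file.${i}.extension" '$5 < 0.01 { print name, $0 }')
printf '%s\n' "$out"
```

For this thread's problem FILENAME is simpler, since awk already knows which file it is reading; -v is mainly useful for values awk cannot work out on its own.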

You don't have to reopen betaout 5,000 times to process 5,000 files. You can do it once.

You don't have to store the entire list of files in a shell variable. That can actually be dangerous: if the file is big enough, bits may be lopped off the end.

Code:
while read -r i
do
        awk '$5 < 0.01 { print FILENAME, $0 }' "file.${i}.extension"
done < fileID >> betaout

# 4  
Old 11-04-2011
simple as that! Thank you radoulov!!!

I guess a while loop is better in this case, as I need to process several thousand files, so thank you Corona688!
# 5  
Old 11-04-2011
If you have a few thousand files, there's an even faster way:

Code:
<fileID awk '{ printf("file.%s.extension\n", $1); }' | xargs awk '{if($5 < 0.01) print FILENAME, $0 }' >> betaout

Once you're processing more than a few lines, awk is faster than a shell language, so we let awk translate the names for us and feed them into xargs. xargs then runs 'awk '{...}' file1 file2 file3 file4 file5 ...' with as many files as it can safely cram into one command line, over and over, until it has used up all the filenames. This can reduce the number of times awk needs to be run manyfold, making it more efficient.
# 6  
Old 11-04-2011
If the total bytes of the filenames don't exceed your system's ARG_MAX limit,
you could even avoid the explicit shell loop:

Code:
awk 'your_awk_code_here'  file1 file2 ... filen
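
As a rough sketch of what 'your_awk_code_here' could look like for the output requested in post #1 (one header line with a leading "filename" column, then the filtered rows), assuming every input file starts with the same header line. The demo file names and values are invented for illustration:

```shell
# Scratch directory with two made-up sample files.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
printf 'CHR SNP BP NMISS BETA\n22 rs0000001 2 370 0.005\n' > demo.1.extension
printf 'CHR SNP BP NMISS BETA\n22 rs0000002 2 NA 0.002\n22 rs0000003 2 NA 0.3\n' > demo.2.extension

# NR == 1   : very first line overall, so print the header once with "filename" in front.
# FNR == 1  : first line of each file, i.e. a header; skip it.
# $5 < 0.01 : the filter from the question, prefixed with the current file name.
out=$(awk '
    NR == 1    { print "filename", $0 }
    FNR == 1   { next }
    $5 < 0.01  { print FILENAME, $0 }
' demo.*.extension)
printf '%s\n' "$out"
```

Note that a glob such as demo.*.extension expands in lexical order, so with real data file.10.extension would sort before file.2.extension.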
