#1
Match ids and print original file
Hello,
I have two files. The original file (5000 entries):
Chr Position
chr1 879108
chr1 881918
chr1 896874
...
and a file with allele frequencies (2000 entries):
Chr Position MAF
chr1 881918 0.007
chr1 979748 0.007
chr1 1120377 0.007
chr1 1178925 0.036
I would like to match the original file against the allele frequency file and print an output file with all 5000 entries:
Chr Position MAF
chr1 879108 NULL
chr1 881918 0.007
chr1 896874 NULL
...
Any help is appreciated. Thank you.
Last edited by nans; 03-08-2013 at 03:59 AM.
#2
What is the matching point between the two files? Your post doesn't make the requirement clear. Can you please state exactly what is needed?
#3
The column common to both files is "Position", which is the second column.
#4
If you are looking for matches between the two files based on the 2nd column:
Code:
awk 'FNR==NR && NR>2 { a[$2]=$2; next } { if( $2 in a) { print } }' original allelefreq
Output:
chr1 881918 0.007
#5
Thank you, but that only prints the positions that match the original file.
The desired output is all 5000 entries from the original file, whether or not an entry has a 3rd value. E.g.:
chr1 12345 0.07
chr1 6789 NULL
chr1 13456 0.78
.....
chr22 465546 0.12
chr22 6757657 NULL
#6
Reverse the filename order, then:
Code:
awk 'FNR==NR && NR>2 { a[$2]=$2; next } { if( $2 in a) { print } }' allelefreq original
and let me know if this is what you wanted.
#7
Well, this gives me only the entries common to the original and allele freq files, without the MAF values:
chr1 979748
chr1 1120377
chr1 1178925
chr1 1222958
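The MAF column goes missing because the rows being printed come from the original file, which has only two columns, and the array stores the position rather than the MAF. One possible approach is to read the allele freq file first, store the MAF itself, and then print every row of the original file with either the stored value or NULL. What follows is only a sketch under some assumptions: both files are whitespace-separated with a single header line, the file names `original` and `allelefreq` are the ones used in the commands above, and the lookup is keyed on chromosome plus position (not position alone, as stated in the thread) because positions may repeat across chromosomes. Also note that the `NR>2` guard in the earlier commands appears to skip the first data row along with the header.

```shell
#!/bin/sh
# Sample data copied from the first post; the file names are assumptions.
workdir=$(mktemp -d)
cat > "$workdir/original" <<'EOF'
Chr Position
chr1 879108
chr1 881918
chr1 896874
EOF
cat > "$workdir/allelefreq" <<'EOF'
Chr Position MAF
chr1 881918 0.007
chr1 979748 0.007
chr1 1120377 0.007
chr1 1178925 0.036
EOF

# Pass 1 (allelefreq): store MAF keyed by chromosome and position, skipping the header.
# Pass 2 (original): print the header, then every row with its MAF or NULL.
result=$(awk '
    FNR == NR { if (FNR > 1) maf[$1, $2] = $3; next }
    FNR == 1  { print $1, $2, "MAF"; next }
    { print $1, $2, ((($1, $2) in maf) ? maf[$1, $2] : "NULL") }
' "$workdir/allelefreq" "$workdir/original")

printf '%s\n' "$result"
rm -rf "$workdir"
```

On the sample data this prints one line per row of the original file, so a 5000-entry input should give 5000 data rows plus the header. If the position really is unique across chromosomes, keying on `$2` alone would follow the thread more literally.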