08-19-2011
merging files and adding special columns
Hi everyone,
I have a problem with merging files and hope one of you has an idea how to approach it. I tried it with awk, but didn't get far. This is what I have:
I have 40 files like the ones below. Each has three columns, but the number of rows differs (20,000 to 50,000).
e.g. file1
chromosome position_on_chromosome file1
chr1 62138 x
chr1 631246 x
chr1 1238847 x
chr1 1238854 x
....
e.g. file2
chromosome position_on_chromosome file2
chr1 238398 x
chr1 533005 x
chr1 631246 x
chr1 657484 x
chr1 1281185 x
chr1 1448761 x
....
I now need to merge them by genome coordinate (i.e. 'chromosome' and 'position_on_chromosome'; the two together give the coordinate). Every coordinate (columns 1 and 2) should be listed, whether it is present in only one file or in all of them (= complete list). The third columns of the original files should be placed side by side, one column per file.
This is how it should look:
chromosome position_on_chromosome file1 file2 (and all other files 'file3' 'file4' etc.)
chr1 62138 x e
chr1 238398 e x
chr1 533005 e x
chr1 631246 x x
chr1 657484 e x
chr1 1238847 x e
chr1 1238854 x e
chr1 1281185 e x
chr1 1448761 e x
.....
A bit complicated to explain, but I hope you get what I mean.
Any help would be greatly appreciated!
Edit note: I just noticed that the post doesn't keep the blank space in the output table where an 'x' is missing (empty cell). I replaced each empty cell with an 'e' for clarification.
Last edited by TuAd; 08-19-2011 at 05:08 PM.
Reason: had some pasting issues, so corrected them
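One possible way to approach the merge described above, sketched in Python rather than awk. This is only a sketch under a few assumptions: the files are whitespace-separated, the first line of each file is the header, and chromosomes should sort by their trailing number (chr2 before chr10). The function names (`chrom_key`, `read_marks`, `merge_presence`) are made up for this example, and the 'e' placeholder for empty cells follows the table in the post.

```python
import re
from typing import Dict, Iterable, List, Tuple

Coord = Tuple[str, int]  # (chromosome, position_on_chromosome)


def chrom_key(chrom: str) -> Tuple[int, int, str]:
    """Sort key: chromosomes ending in a number first (numerically), others after."""
    m = re.search(r"(\d+)$", chrom)
    return (0, int(m.group(1)), "") if m else (1, 0, chrom)


def read_marks(lines: Iterable[str]) -> Dict[Coord, str]:
    """Map each coordinate in one file to the value of its third column."""
    marks: Dict[Coord, str] = {}
    for line in lines:
        fields = line.split()
        # Skip the header, blank lines, and anything without a numeric position.
        if len(fields) < 3 or not fields[1].isdigit():
            continue
        marks[(fields[0], int(fields[1]))] = fields[2]
    return marks


def merge_presence(tables: Dict[str, Dict[Coord, str]], missing: str = "e") -> List[str]:
    """Build one tab-separated row per coordinate seen in any of the tables."""
    names = list(tables)  # column order follows the order the tables were given in
    coords = set()
    for marks in tables.values():
        coords.update(marks)
    rows = ["\t".join(["chromosome", "position_on_chromosome"] + names)]
    for chrom, pos in sorted(coords, key=lambda c: (chrom_key(c[0]), c[1])):
        cells = [tables[name].get((chrom, pos), missing) for name in names]
        rows.append("\t".join([chrom, str(pos)] + cells))
    return rows


# Sample data taken from the two example files in the post.
file1 = ["chromosome position_on_chromosome file1",
         "chr1 62138 x", "chr1 631246 x", "chr1 1238847 x", "chr1 1238854 x"]
file2 = ["chromosome position_on_chromosome file2",
         "chr1 238398 x", "chr1 533005 x", "chr1 631246 x",
         "chr1 657484 x", "chr1 1281185 x", "chr1 1448761 x"]

merged = merge_presence({"file1": read_marks(file1), "file2": read_marks(file2)})
print("\n".join(merged))
```

For real files, each entry could be built with something like `read_marks(open(path))`, using the file name as the column name. With 40 files of up to 50,000 rows each, holding everything in memory should not be a problem. Passing `missing=""` would leave the cells empty instead of writing 'e'.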