To merge different sizes txt files


 
# 1  
Old 05-22-2014

Hi,

I have two .txt files that look like this:

Code:
"baseMean" "log2FoldChange" "lfcSE" "stat" "pvalue" "padj"
"c104215_g2_i4" 202.057864855455 5.74047973414006 1.14052672909697 5.03318299141063 4.8240223910525e-07 0.00234905721174879
"c91544_g1_i1" 373.123487095726 5.62496675850204 1.15060014539303 4.88872418539511 1.01491573830736e-06 0.00234905721174879
"c104937_g1_i2" 127.674619831286 5.06648438344161 1.16615265871181 4.34461504297243 1.39520111651921e-05 0.0183546569529002
"c105753_g1_i3" 134.024403708584 4.97002237479055 1.17052222688412 4.24598718472911 2.17633069924804e-05 0.0207469473745466
"c108287_g1_i4" 116.154777394681 4.94489963165783 1.17057311887466 4.22434066862194 2.39641321103628e-05 0.0207469473745466
"c103430_g2_i1" 113.778003847288 4.90828138271733 1.17197935474572 4.18802717201639 2.81389833386134e-05 0.0216545109559152
"c83301_g1_i1" 103.09657725435 4.73959088799424 1.17819090507568 4.0227698818383 5.75176873742156e-05 0.0355851897968018
"c99520_g2_i1" 79.96763602061 4.35095490449958 1.19150491788504 3.65164661864985 0.000260564268983195 0.0751945052907338
"c69876_g1_i1" 552.790165445229 -3.97960824220711 1.11639782163909 -3.56468649890793 0.000364291348238595 0.0901100670678753

and

Code:
Hit	Name	signature_desc	Ontology_term
c48374_g1_i2	PF02874,PF00006,PF00306	ATP synthase alpha/beta family, beta-barrel domain,ATP synthase alpha/beta family, nucleotide-binding domain,ATP synthase alpha/beta chain, C terminal domain	GO:0005524,GO:0015992,GO:0046034,GO:0015991,GO:0016820,GO:0033178	
c99520_g2_i1	PF10168,PF00487,PF00173	Nuclear pore component,Fatty acid desaturase,Cytochrome b5-like Heme/Steroid binding domain	GO:0020037,GO:0006629	
c105882_g1_i3	PF03638	Tesmin/TSO1-like CXC domain, cysteine-rich domain		
c83301_g1_i1	PF01694	Rhomboid family	GO:0004252,GO:0016021	
c94400_g1_i1	PF01419	Jacalin-like lectin domain		
c55961_g1_i1	PF00030	Beta/Gamma crystallin		
c104646_g2_i1	PF00217	ATP:guanido phosphotransferase, C-terminal catalytic domain	GO:0016301,GO:0016772	
c103430_g2_i1	PF02991	Autophagy protein Atg8 ubiquitin like		
c104937_g1_i2	PF13499,PF04377	Arginine-tRNA-protein transferase, C terminus,EF-hand domain pair	GO:0004057,GO:0016598,GO:0005509

They are different sizes (the first one is longer).

What I need to do is add the information from the second file to the first file, keeping only the rows whose ID appears in both. The files share the IDs in their first columns. I also need to keep the rows sorted as in the first file.

I guess that probably I can do it with awk but I honestly don't know how to do it.

Can anyone help me?
Thank you for your time.

Alicia
# 2  
Old 05-22-2014
Try this:

Code:
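awk -F'\t' '
# First line of each input file: header handling.
FNR==1  {
   if(++file==2) {
       # Header of file1: switch to space-separated fields and append the file2 column names.
       OFS=FS=" "
       print $0,heading
   } else {
      # Header of file2: quote the three annotation column names.
      qt="\""
      heading = qt $2 qt " " qt $3 qt " " qt $4 qt
   }
   next
}
# file2 rows: store each annotation column under the quoted ID.
file==1 {k=qt $1 qt; name[k]=qt $2 qt;sig[k]=qt $3 qt;term[k]=qt $4 qt}
# file1 rows: print only IDs that also occur in file2, keeping file1 order.
file==2 && ($1 in name) { print $0,name[$1],sig[$1],term[$1] }' file2 file1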

# 3  
Old 05-22-2014
Hi, try something like:
Code:
awk 'NR==FNR{A["\"" $1 "\""]; next} $1 in A' file2 file1

Output:
Code:
"c104937_g1_i2" 127.674619831286 5.06648438344161 1.16615265871181 4.34461504297243 1.39520111651921e-05 0.0183546569529002
"c103430_g2_i1" 113.778003847288 4.90828138271733 1.17197935474572 4.18802717201639 2.81389833386134e-05 0.0216545109559152
"c83301_g1_i1" 103.09657725435 4.73959088799424 1.17819090507568 4.0227698818383 5.75176873742156e-05 0.0355851897968018
"c99520_g2_i1" 79.96763602061 4.35095490449958 1.19150491788504 3.65164661864985 0.000260564268983195 0.0751945052907338
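
The one-liner above relies on the `NR==FNR` test, which is true only while awk reads the first file named on the command line. A minimal sketch of how the two counters behave, using made-up file names (`first`, `second`) and contents that are not part of the thread:

```shell
#!/bin/sh
# Sketch: NR counts records across all input files, FNR restarts at 1 for
# each file, so NR == FNR holds only for the first file.
# The file names and contents below are invented for illustration.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/first"
printf 'x\ny\nz\n' > "$tmp/second"

trace=$(awk '{
  which = (NR == FNR) ? "first" : "second"   # which file this record came from
  print which, NR, FNR
}' "$tmp/first" "$tmp/second")

printf '%s\n' "$trace"
rm -rf "$tmp"
```

This is why the lookup file (file2) is listed first: its IDs are stored while `NR==FNR` is true, and file1 is then filtered in its own order.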



---
Or, to add the extra information at the end, try:
Code:
awk 'NR==FNR{i="\"" $1 "\""; $1=x; A[i]=$0; next} $1 in A{print $0, A[$1]}' FS='\t' file2 FS=" " file1
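
To try the merge without touching real data, here is a self-contained sketch of the same idea. It assumes abbreviated sample rows (shortened numbers, fewer columns) rather than the full files from the first post. It also drops the header line, as the one-liner does, and joins the annotation with a tab instead of a space.

```shell
#!/bin/sh
# Sketch: inner-join two small sample files on their ID column with awk,
# keeping the row order of file1. Sample rows are abbreviated from the thread.
tmp=$(mktemp -d)

# file1: space-separated, IDs in double quotes, header has no ID column.
cat > "$tmp/file1" <<'EOF'
"baseMean" "padj"
"c104937_g1_i2" 127.67 0.018
"c83301_g1_i1" 103.10 0.036
"c69876_g1_i1" 552.79 0.090
EOF

# file2: tab-separated, unquoted IDs, rows in a different order.
printf 'Hit\tName\tsignature_desc\n' > "$tmp/file2"
printf 'c83301_g1_i1\tPF01694\tRhomboid family\n' >> "$tmp/file2"
printf 'c94400_g1_i1\tPF01419\tJacalin-like lectin domain\n' >> "$tmp/file2"
printf 'c104937_g1_i2\tPF13499\tEF-hand domain pair\n' >> "$tmp/file2"

merged=$(awk '
  # First file on the command line (file2): store the annotation under the quoted ID.
  NR == FNR {
    key = "\"" $1 "\""
    sub(/^[^\t]*\t/, "")      # drop the ID column, keep the rest of the line
    extra[key] = $0
    next
  }
  # Second file (file1): print only rows whose ID was seen, in file1 order.
  $1 in extra { print $0 "\t" extra[$1] }
' FS='\t' "$tmp/file2" FS=' ' "$tmp/file1")

printf '%s\n' "$merged"
rm -rf "$tmp"
```

The lookup uses `$1` of the current file1 row as the array key, so each row picks up its own annotation, and the output follows file1's row order even though file2 lists the IDs differently.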


Last edited by Scrutinizer; 05-22-2014 at 09:17 PM..