Post 302630929 by newpro on Thursday 26th of April 2012 01:39:15 PM
join two files based on one column

Hi All,

I am trying to join two files based on one common column.

Cat File1
Code:
ID	HID
Ab_1	23
Cd	45
df     22
Vv	33

Cat File2
Code:
ID	pval
Ab_1	0.3
Cd	10
Vv	0.0444

(File1 has 18,000 rows and File2 has between 4,000 and 8,000 rows.)
Desired output:
Cat Fileout
Code:
HID	pval
23	0.3
45	10
33	0.0444

By searching the forum, I came up with this script:
Code:
awk ' FNR == NR { ab[$1] = $2 } FNR != NR { cd[$1] = $2 } END { for (a in ab)  if (a in cd) print ab[a],cd[a] }' FS='\t' OFS='\t' File1 File2 >Fileout

It seemed to work fine at first, but when I used it on a large file with 15,000 rows it gave errors (e.g. some rows were missing).

I am a beginner in scripting. I am not sure whether there is an error in the above script or whether there is a better way to do this. Any suggestions would be very helpful.

Thank you for your time,

NP
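
For reference, here is a minimal sketch of a one-pass variant of the same idea. The post does not say what caused the missing rows, so the following are only guesses: the two-array script keeps a single value per ID (repeated IDs overwrite each other), `for (a in ab)` visits keys in an unspecified order (so rows can look misplaced or missing when compared by eye), and a trailing carriage return from DOS-style line endings would change the fields being compared. The sketch strips a trailing carriage return and prints matches in File2 order. The temporary directory and the `workdir` variable are assumptions made for the demo; the sample data is copied from the post.

```shell
# Demo setup (assumption): recreate the sample files from the post in a temp directory.
workdir=$(mktemp -d)
printf 'ID\tHID\nAb_1\t23\nCd\t45\ndf\t22\nVv\t33\n' > "$workdir/File1"
printf 'ID\tpval\nAb_1\t0.3\nCd\t10\nVv\t0.0444\n'   > "$workdir/File2"

awk -F'\t' -v OFS='\t' '
    { sub(/\r$/, "") }                # drop a trailing carriage return, if any (guess at one possible cause)
    FNR == NR { hid[$1] = $2; next }  # first file: remember the HID for each ID
    $1 in hid { print hid[$1], $2 }   # second file: print HID and pval, in File2 order
' "$workdir/File1" "$workdir/File2" > "$workdir/Fileout"

cat "$workdir/Fileout"
```

Because the header line `ID` appears in both files, the `HID`/`pval` header is carried through like any other matching row. If an ID repeats in File1, only its last HID is kept; if it repeats in File2, each occurrence prints its own row.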
 
