common entries between files based on 1st column


 
# 1  
Old 09-27-2012

Hi,

I am trying to get the common entries from two files based on the 1st field. However, when I try this in Perl I get blank output. How can I do this in awk?

Code:
open(BUFF1, "my_genes");
open(BUFF3, "rawcounts");
#open(WRBUFF,">result_rawcounts");
while($line =<BUFF1>)
{
        chomp($line);
        $line =~ s/ //g;
        @array = split(/\t/,$line);
        chomp($array[0]);
        $hash{$array[0]} = 1;
        $hash1{$array[0]} = $line;
}
while($line =<BUFF3>)
{
        chomp($line);
        $line =~ s/ //g;
        @array = split(/\t/,$line);
        chomp($array[0]);
        #if(exists($hash1{uc($array[0])}))
        if($hash{$array[0]} == 1)
        {
              print WRBUFF $line."\t".$hash1{$array[0]}."\n";
        }

}

input file1
Code:
chr4:134468319-134473843        311
chr7:13796245-13837410  312
chr7:13733505-13779637  313
chr7:13909676-13989588  314
chr7:13623966-13670807  315
chr7:14222402-14254870  316
chr10:33857721-33879475 317
chr15:44671855-44805829 318

input file2
Code:
chr1:3005029-3005778    CUFF.1  -       -       CUFF.1  -       -       -       -       0.611392        0.258405        0.964379        OK
chr1:3023637-3024298    CUFF.2  -       -       CUFF.2  -       -       -       -       0.616067        0.212809        1.01933 OK
chr1:3066843-3068300    CUFF.3  -       -       CUFF.3  -       -       -       -       0.312212        0.140913        0.483512        OK

Now I want to pick only the rows from file 2 whose key (1st column) also appears in the 1st column of file 1.

Thanks,
# 2  
Old 09-27-2012
You're getting what you asked for. There are no lines in file1 starting with "chr1:".
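
For the awk part of the question, a minimal sketch of the usual two-file idiom follows. It assumes both files are tab-separated, which the thread does not confirm. The file names `my_genes` and `rawcounts` come from the Perl script; the sample rows are shortened, and one `my_genes` row is made up so that exactly one key overlaps.

```shell
# Illustrative sample data in a temporary directory (not the poster's real files).
tmp=$(mktemp -d)
printf 'chr7:13796245-13837410\t312\nchr1:3005029-3005778\t311\n' > "$tmp/my_genes"
printf 'chr1:3005029-3005778\tCUFF.1\tOK\nchr1:3023637-3024298\tCUFF.2\tOK\n' > "$tmp/rawcounts"

# While reading the first file (NR == FNR), remember each column-1 key.
# While reading the second file, print only rows whose column 1 was remembered.
awk -F'\t' 'NR == FNR { seen[$1]; next } $1 in seen' \
    "$tmp/my_genes" "$tmp/rawcounts" > "$tmp/common"

cat "$tmp/common"
```

With this sample only the `chr1:3005029-3005778` row of `rawcounts` is printed. Run against the files exactly as posted, the same command prints nothing, because no key occurs in both.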
# 3  
Old 09-27-2012
Hi,

I only pasted a sample of each file. The full files do have common IDs in column 1. I tried to debug the Perl code and I see that my split call is not working as expected.

Thanks,
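
One possible cause, stated as an assumption because the thread never shows the real delimiters: the columns are separated by spaces rather than tabs. The script first deletes every space with `s/ //g` and then splits on `\t`, so a space-separated line collapses into a single field and the key becomes the whole line. A rough way to check is to count tab-separated fields per line. The two sample lines below are hypothetical.

```shell
tmp=$(mktemp -d)
# Hypothetical line with spaces between the two columns.
printf 'chr7:13796245-13837410  312\n' > "$tmp/spaces"
# The same line with a real tab between the columns.
printf 'chr7:13796245-13837410\t312\n' > "$tmp/tabs"

# Count tab-separated fields; a count of 1 means the line contains no tab.
nf_spaces=$(awk -F'\t' '{ print NF }' "$tmp/spaces")
nf_tabs=$(awk -F'\t' '{ print NF }' "$tmp/tabs")
echo "spaces file: $nf_spaces field(s); tabs file: $nf_tabs field(s)"
```

Separately, the posted script prints to `WRBUFF`, but the `open(WRBUFF, ...)` line is commented out, so nothing is written anywhere even when a key does match.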
# 4  
Old 09-27-2012
Post representative data, then.