Grab unique record from different files on a condition
Post 302596804 by jacobs.smith on Wednesday 8th of February 2012 10:09:35 AM
Quote:
Originally Posted by birei
Hi jacobs.smith,

Here is another solution, using Perl. I think it should work with any number of input files. Give it a try. It could be more efficient, but I struggled a little to get it working, so if it works, I will be happy with that:
Code:
$ cat 1.txt 
chr1 100 200
chr1 300 400
chr1 350 467
chr1 450 700
chr2 500 600
chr2 345 765
chr3 101 300
chr3 132 456
$ cat 2.txt
chr1 156 199
chr1 165 230
chr1 201 299
chr1 525 600
chr2 800 1000
chr2 534 676
chr2 200 400
chr2 100 200
chr3 200 400
chr3 500 600
chr3 400 700
$ cat script.pl
use warnings;
use strict;

die qq[Usage: perl $0 <input-file-1> <input-file-2> ...\n] unless @ARGV > 0;

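# Collect every three-field line as [ file name, chromosome, start, end ].
my (@data);

while ( <> ) {
        my @f = split;
        next unless @f == 3;
        push @data, [ $ARGV, @f ];
}

# Print a record only if no record from a different file on the same
# chromosome overlaps it.
for my $d ( @data ) {
        if ( grep {
                        $d->[0] ne $_->[0] &&      # different file
                        $d->[1] eq $_->[1] &&      # same chromosome
                        ($d->[2] < $_->[2] &&      # $d spans the other start
                        $d->[3] > $_->[2]
                                        ||
                        $d->[2] < $_->[3] &&       # $d spans the other end
                        $d->[3] > $_->[3]
                                        ||
                        $d->[2] > $_->[2] &&       # $d lies inside the other
                        $d->[3] < $_->[3])

                } @data
        ) {
                next;
        }

        printf qq[%s\n], join qq[ ], @$d[1..3], $d->[0];
}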
$ perl script.pl 1.txt 2.txt
chr1 300 400 1.txt
chr1 350 467 1.txt
chr1 201 299 2.txt
chr2 800 1000 2.txt
chr2 100 200 2.txt
chr3 500 600 2.txt

Regards,
Birei

Thanks a lot, Birei. The script works for the two files I mentioned earlier, and I also tried it with 3 files. The 3 files and their output are given below, just for your confirmation and my satisfaction :)

Thanks once again

cat 1.txt
Quote:
chr1 100 200
chr1 300 400
chr1 350 467
chr1 450 700
chr2 500 600
chr2 345 765
chr3 101 300
chr3 132 456
cat 2.txt
Quote:
chr1 156 199
chr1 165 230
chr1 201 299
chr1 525 600
chr2 800 1000
chr2 534 676
chr2 200 400
chr2 100 200
chr3 200 400
chr3 500 600
chr3 400 700
cat 3.txt
Quote:
chr1 330 420
chr1 50 60
chr1 20 30
chr1 15 20
chr1 220 299
chr2 199 300
chr3 900 1000
chr3 100 200
chr3 110 200
perl newscript.pl 1.txt 2.txt 3.txt
Quote:
chr2 800 1000 2.txt
chr3 500 600 2.txt
chr1 50 60 3.txt
chr1 20 30 3.txt
chr1 15 20 3.txt
chr3 900 1000 3.txt
Everything looks correct to me. Let me know if you spot any problem.

Thanks
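
For anyone who wants to double-check the three-file result independently, here is a minimal sketch in Python rather than Perl. It is not birei's script: the names `Record`, `parse`, `overlaps` and `unique_records` are made up for this illustration, the three files are embedded as strings, and it uses the common strict overlap test (`a.start < b.end and b.start < a.end`) in place of the three-way condition in the Perl code. The two tests give the same answer on this data, but they can differ when two intervals share a start or end coordinate.

```python
from collections import namedtuple

# Hypothetical record type for this sketch: source file plus the three columns.
Record = namedtuple("Record", "source chrom start end")

# The three input files from the post, embedded so the sketch is self-contained.
FILES = {
    "1.txt": """
chr1 100 200
chr1 300 400
chr1 350 467
chr1 450 700
chr2 500 600
chr2 345 765
chr3 101 300
chr3 132 456
""",
    "2.txt": """
chr1 156 199
chr1 165 230
chr1 201 299
chr1 525 600
chr2 800 1000
chr2 534 676
chr2 200 400
chr2 100 200
chr3 200 400
chr3 500 600
chr3 400 700
""",
    "3.txt": """
chr1 330 420
chr1 50 60
chr1 20 30
chr1 15 20
chr1 220 299
chr2 199 300
chr3 900 1000
chr3 100 200
chr3 110 200
""",
}


def parse(files):
    """Turn {file name: text} into a list of Records, skipping malformed lines."""
    records = []
    for name, text in files.items():
        for line in text.splitlines():
            fields = line.split()
            if len(fields) != 3:
                continue
            chrom, start, end = fields
            records.append(Record(name, chrom, int(start), int(end)))
    return records


def overlaps(a, b):
    """Strict overlap: intervals that only touch at an endpoint do not count."""
    return a.start < b.end and b.start < a.end


def unique_records(records):
    """Keep records that overlap no same-chromosome record from another file."""
    return [
        r for r in records
        if not any(
            o.source != r.source and o.chrom == r.chrom and overlaps(r, o)
            for o in records
        )
    ]


# Format the survivors the same way the Perl script prints them.
result = [
    f"{r.chrom} {r.start} {r.end} {r.source}"
    for r in unique_records(parse(FILES))
]
print("\n".join(result))
```

On the data above this prints the same six lines as the three-file run shown in the post. Like the Perl script, it compares every record against every other record, so it is fine for small files but would need sorting or an interval index for large inputs.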
 
