Help required on Length based lookup Post: 302929135

Top Forums Shell Programming and Scripting Help required on Length based lookup Post 302929135 by rramkrishnas on Thursday 18th of December 2014 06:32:57 AM

12-18-2014

Help required on Length based lookup

Hi,

I have two files: abc.txt has approximately 28k records and bcd.txt has approximately 112k records. The record lengths vary within each file.

I am trying to look up the records of abc.txt in bcd.txt based on their length. I can successfully find the records that match, but I am unable to fetch the records that do not match.

abc.txt

Code:

bcd.txt

Code:

I want the matches and mismatches for each file, as below:

Code:

abc.txt matches with bcd.txt
120
1201
1203
122
1224
123
12345
abc.txt not matches with bcd.txt
191
1890
1245
 
bcd.txt not matches with abc.txt
199

Below is the script I tried for matching the records. It takes almost 5 hours to run, and I am still unable to find the mismatched records for either file.

Code:

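awk -F"," 'BEGIN { OFS = "," }
NR == FNR {                      # first file (abc.txt): store every record
    a[FNR] = $0; max = FNR
    next                         # awk keywords are case-sensitive: "next", not "Next"
}
{                                # second file (bcd.txt)
    if (FNR == 1) print $0       # always print the first line of bcd.txt
    for (i = 1; i <= max; i++) {
        tmp = a[i]
        len = length(tmp)
        # print the bcd.txt record when field 1 starts with this abc.txt record
        if (substr($1, 1, len) == tmp) print $0
    }
}' abc.txt bcd.txt > abc_matches_bcd.txt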

Please help me with this; it will save a lot of manual work at my end.

Regards,
Ram
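
One possible way to speed this up and also get the unmatched records, offered as a sketch rather than a tested fix for the real files: keep the abc.txt records as array keys, remember which record lengths occur, and for each bcd.txt line test only one prefix per distinct length instead of looping over all 28k records. The function name length_lookup and the BOTH / BCD_ONLY / ABC_ONLY tags are made up for illustration. The sketch assumes that neither file has a header line, that abc.txt is not empty, and that a bcd.txt line counts as a match when its first comma-separated field starts with any abc.txt record, which is what the posted script tests.

```shell
#!/bin/sh
# length_lookup ABC_FILE BCD_FILE   (hypothetical helper name)
# Tags every output line with BOTH, BCD_ONLY or ABC_ONLY, then a tab, then the record.
length_lookup() {
    awk -F',' '
    NR == FNR {                       # first file: the lookup keys (abc.txt)
        if ($0 == "") next            # ignore blank lines
        if (!($0 in seen)) {
            seen[$0] = 0              # 0 = not yet matched by any bcd.txt line
            order[++n] = $0           # remember input order for the END report
            lens[length($0)] = 1      # set of distinct key lengths
        }
        next
    }
    {                                 # second file: the records to classify (bcd.txt)
        hit = 0
        for (len in lens) {           # one hash probe per distinct key length
            p = substr($1, 1, len + 0)
            if (p in seen) { seen[p] = 1; hit = 1 }
        }
        tag = hit ? "BOTH" : "BCD_ONLY"
        print tag "\t" $0
    }
    END {                             # keys that no bcd.txt line started with
        for (i = 1; i <= n; i++)
            if (!seen[order[i]]) print "ABC_ONLY\t" order[i]
    }' "$1" "$2"
}
```

Called as `length_lookup abc.txt bcd.txt > tagged.txt`, the three groups can then be split on the first column, for example with `grep '^ABC_ONLY' tagged.txt`. Unlike the original loop, each bcd.txt line is printed once even if several abc.txt records are prefixes of it.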

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Need find a file based length

Can some please help me? Want to find files over 35 characters in length? I am running HPUX. Would it be possible with find? Thanks in advance

2. UNIX for Advanced & Expert Users

Clueless about how to lookup and reverse lookup IP addresses under a file!!.pls help

Write a quick shell snippet to find all of the IPV4 IP addresses in any and all of the files under /var/lib/output/*, ignoring whatever else may be in those files. Perform a reverse lookup on each, and format the output neatly, like "IP=192.168.0.1, ...

3. Shell Programming and Scripting

SED based on file lookup

Newb here trying to figure this one out. :confused: I am trying to create a SED (or some other idea) line that will replace the data field if the original text is seen in a separate text file. The lookup file would be line delimted. For example: sed 's/<if in file>/YES/' File structure:...

4. Shell Programming and Scripting

Split strings based on length

Hi All I am very much in need of help splitting strings based on length in Perl. e.g., Input text is : International NOUN Corp. NOUN 's POS Tulsa NOUN Output I want is : International I In Int Inte l al nal onal NOUN Corp. C Co Cor Corp . p. rp. orp. NOUN...

5. UNIX for Dummies Questions & Answers

Sorting words based on length

i need to write a bash script that recive a list of varuables kaka pele ronaldo beckham zidane messi rivaldo gerrard platini i need the program to print the longest word of the list. word in the output appears on a separate line and word order in the output is in the order Llachsicografi costs....

6. UNIX for Dummies Questions & Answers

Length of a segment based on coordinates

Hi, I would like to have the length of a segment based on coordinates of its parts. Example input file: chr11 genes_good3.gtf aggregate_gene 1 100 gene1 chr11 genes_good3.gtf exonic_part 1 60 chr11 genes_good3.gtf exonic_part 70 100 chr11 genes_good3.gtf aggregate_gene 200 1000 gene2...

7. Shell Programming and Scripting

Append spaces the rows to make it into a required fixed length file

I want to make a script to read row by row and find its length. If the length is less than my required length then i hav to append spaces to that paritucular row. Each row contains special characters, spaces, etc. For example my file contains , 12345 abcdef 234 abcde 89012 abcdefgh ...

8. Shell Programming and Scripting

Filtering duplicates based on lookup table and rules

please help solving the following. I have access to redhat linux cluster having 32gigs of ram. I have duplicate ids for variable names, in the file 1,2 are duplicates;3,4 and 5 are duplicates;6 and 7 are duplicates. My objective is to use only the first occurrence of these duplicates. Lookup...

9. Shell Programming and Scripting

Append 0's based on length

I'm having data like this, "8955719","186497034","0001","M","3" "8955719","186497034","0002","M","10" "8955719","186497034","0003","M","10" "8955719","186497034","0004","M","3" "8955723","186499034","0001","M","3" "8955723","186499034","0002","M","10" "8955723","186499034","0003","M","10"...

10. Shell Programming and Scripting

Outputting sequences based on length with sed

I have this file: >ID1 AA >ID2 TTTTTT >ID-3 AAAAAAAAA >ID4 TTTTTTGGAGATCAGTAGCAGATGACAG-GGGGG-TGCACCCC Add I am trying to use this script to output sequences longer than 15 characters: sed -r '/^>/N;{/^.{,15}$/d}' The desire output would be this: >ID4...
