Sponsored Content
Top Forums Shell Programming and Scripting Help required on Length based lookup Post 302929135 by rramkrishnas on Thursday 18th of December 2014 06:32:57 AM
Old 12-18-2014
Help required on Length based lookup

Hi,

I have two files one (abc.txt) is having approx 28k records and another (bcd.txt) on is having 112k records, the length of each files are varried.

I am trying to look up abc.txt file with bcd.txt based on length, where ever abc.txt records are matching with bcd.txt I am successful match the records with bcd, but I am unable to fetch the records which are not matching with bcd.txt.

abc.txt
Code:
191
120
122
123
1234
1245
123456
1890

bcd.txt

Code:
120
1201
1203
121
122
1224
12345
123
199

I want the mismatch in each file are as below:

Code:
abc.txt matches with bcd.txt
120
1201
1203
122
1224
123
12345
abc.txt not matches with bcd.txt
191
1890
1245
 
bcd.txt not matches with abc.txt
199

below is my script which I tried for matching of the records, but its is taking almost 5 hours, and next I am unable to find the mismatch records for both the files.

Code:
awk -F"," 'BEGIN{OFS=","}
{
if(NR==FNR){
a[FNR]=$0;max=FNR;Next}
if(NR!=FNR)
 {
if (FNR==1) print $0;
 for ( i=1;i<=max;i++)
 {  
 tmp = a[i];
 len = length(tmp);
 if(substr($1,1,len) ==tmp)
 {print $0;}
 } #End For
 } #End if
}' abc.txt bcd.txt > abc_matches_bcd.txt;

Please help me on this, this will save a lot of manual work at my end.

Regards,
Ram
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Need find a file based length

Can some please help me? Want to find files over 35 characters in length? I am running HPUX. Would it be possible with find? Thanks in advance (8 Replies)
Discussion started by: J_ang
8 Replies

2. UNIX for Advanced & Expert Users

Clueless about how to lookup and reverse lookup IP addresses under a file!!.pls help

Write a quick shell snippet to find all of the IPV4 IP addresses in any and all of the files under /var/lib/output/*, ignoring whatever else may be in those files. Perform a reverse lookup on each, and format the output neatly, like "IP=192.168.0.1, ... (0 Replies)
Discussion started by: choco4202002
0 Replies

3. Shell Programming and Scripting

SED based on file lookup

Newb here trying to figure this one out. :confused: I am trying to create a SED (or some other idea) line that will replace the data field if the original text is seen in a separate text file. The lookup file would be line delimted. For example: sed 's/<if in file>/YES/' File structure:... (3 Replies)
Discussion started by: sdlennon
3 Replies

4. Shell Programming and Scripting

Split strings based on length

Hi All I am very much in need of help splitting strings based on length in Perl. e.g., Input text is : International NOUN Corp. NOUN 's POS Tulsa NOUN Output I want is : International I In Int Inte l al nal onal NOUN Corp. C Co Cor Corp . p. rp. orp. NOUN... (2 Replies)
Discussion started by: my_Perl
2 Replies

5. UNIX for Dummies Questions & Answers

Sorting words based on length

i need to write a bash script that recive a list of varuables kaka pele ronaldo beckham zidane messi rivaldo gerrard platini i need the program to print the longest word of the list. word in the output appears on a separate line and word order in the output is in the order Llachsicografi costs.... (1 Reply)
Discussion started by: yairpg
1 Replies

6. UNIX for Dummies Questions & Answers

Length of a segment based on coordinates

Hi, I would like to have the length of a segment based on coordinates of its parts. Example input file: chr11 genes_good3.gtf aggregate_gene 1 100 gene1 chr11 genes_good3.gtf exonic_part 1 60 chr11 genes_good3.gtf exonic_part 70 100 chr11 genes_good3.gtf aggregate_gene 200 1000 gene2... (2 Replies)
Discussion started by: fadista
2 Replies

7. Shell Programming and Scripting

Append spaces the rows to make it into a required fixed length file

I want to make a script to read row by row and find its length. If the length is less than my required length then i hav to append spaces to that paritucular row. Each row contains special characters, spaces, etc. For example my file contains , 12345 abcdef 234 abcde 89012 abcdefgh ... (10 Replies)
Discussion started by: Amrutha24
10 Replies

8. Shell Programming and Scripting

Filtering duplicates based on lookup table and rules

please help solving the following. I have access to redhat linux cluster having 32gigs of ram. I have duplicate ids for variable names, in the file 1,2 are duplicates;3,4 and 5 are duplicates;6 and 7 are duplicates. My objective is to use only the first occurrence of these duplicates. Lookup... (4 Replies)
Discussion started by: ritakadm
4 Replies

9. Shell Programming and Scripting

Append 0's based on length

I'm having data like this, "8955719","186497034","0001","M","3" "8955719","186497034","0002","M","10" "8955719","186497034","0003","M","10" "8955719","186497034","0004","M","3" "8955723","186499034","0001","M","3" "8955723","186499034","0002","M","10" "8955723","186499034","0003","M","10"... (3 Replies)
Discussion started by: Artlk
3 Replies

10. Shell Programming and Scripting

Outputting sequences based on length with sed

I have this file: >ID1 AA >ID2 TTTTTT >ID-3 AAAAAAAAA >ID4 TTTTTTGGAGATCAGTAGCAGATGACAG-GGGGG-TGCACCCC Add I am trying to use this script to output sequences longer than 15 characters: sed -r '/^>/N;{/^.{,15}$/d}' The desire output would be this: >ID4... (8 Replies)
Discussion started by: Xterra
8 Replies
All times are GMT -4. The time now is 01:49 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy