matching string in two files of different length


 
# 1  
Old 05-19-2009

Dear all,
I have the following problem (it comes from bioinformatics, but it is a general problem).

I have two single-column files of different lengths: a.txt and b.txt.
a.txt contains 300 rows of alphanumeric strings (around 30 characters each).
b.txt contains 16 rows of alphanumeric strings (around 1000 characters each).

For every row of a.txt, I want to check whether its string is contained in any of the 16 strings in b.txt and, if so, print the corresponding (long) string from b.txt.

I have tried a "for" loop with the following gawk line:
Code:
for string_b in [... the 16 strings separated by a space  ...]; do 
        awk "{if (match(${string_b},/'\$1'/)){print '${string_b}' else {print 'nulla'}}" < string_a.txt > result
done

but it does not work (and in any case it is neither efficient nor elegant).

Many thanks! Any help or suggestion is welcome!
# 2  
Old 05-19-2009
Code:
nawk 'FNR==NR {a[$0]; next} {for (i in a) if ($0 ~ i) {print; break}}' a.txt b.txt

# 3  
Old 05-19-2009
Code:
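awk '
# First file (a.txt): remember each short string as an array key.
NR==FNR { key[$0]++ ; next }
# Second file (b.txt): print the line if it contains any stored key.
{
   for (k in key) {
      if (index($0, k)) {
         print $0
         break
      }
   }
}
' a.txt b.txt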

Jean-Pierre.
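
If a per-row report is wanted instead (one result for every string in a.txt, with 'nulla' when nothing matches, as in the original attempt), the file order can be swapped so that b.txt is loaded first. The following is only a sketch under that assumption: the sample strings, the temporary directory, the array name `big` and the tab-separated output are made up for illustration and are not from the thread.

```shell
# Hypothetical sample data standing in for the real a.txt and b.txt.
tmp=$(mktemp -d)
printf '%s\n' 'ACGT' 'TTTT' 'GGCA' > "$tmp/a.txt"
printf '%s\n' 'AAACGTAAA' 'CCGGCACC' > "$tmp/b.txt"

# b.txt is read first (few long strings), then a.txt is scanned row by row.
result=$(awk '
NR==FNR { big[++n] = $0; next }        # first file: store long strings in order
{
    found = 0
    for (j = 1; j <= n; j++) {
        if (index(big[j], $0)) {       # literal substring test, no regex
            print $0 "\t" big[j]
            found = 1
        }
    }
    if (!found) print $0 "\tnulla"     # no long string contains this row
}
' "$tmp/b.txt" "$tmp/a.txt")

printf '%s\n' "$result"
rm -rf "$tmp"
```

With the sample data this prints one line per row of a.txt: the short string, a tab, and either the long string that contains it or 'nulla'. A row contained in several long strings is reported once per match. index() compares literally, so characters that are special in regular expressions need no escaping.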