Match entries


 
# 1  
Old 08-16-2012
Match entries

Hi all,

I have a large file with data in two columns, like this:

Code:
benzene   blood disorders
 nitrogen    air disease


I have a second file with a different layout, an ID followed by a comma-separated list:

Code:
AXY  benzene, nitrogen,  oxygen, not available, sulfur
VTY  nitrogen, notavailable, sulfur

For every entry from the second column onwards in the second file, I want to look it up in the first column of the first file. If it matches, whatever is written next to it in the first file should be added right after the entry.

Expected output is

Code:
AXY  benzene (blood disorders), nitrogen (air disease),  oxygen, not available, sulfur
VTY  nitrogen (air disease), notavailable, sulfur

# 2  
Old 08-17-2012
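It doesn’t help that your second file mixes commas and spaces as separators.

The sub and gsub calls below deal with the extra spaces: sub turns the first space on the line into a comma so that the ID becomes a field of its own, and gsub strips the leading spaces from each entry before it is looked up: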

Code:
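awk 'NR==FNR{lookup[$1]=$2;for(i=3;i<=NF;i++)lookup[$1]=lookup[$1]" "$i;next}  # fileA: key -> description
{ sub(/ /,",");                 # first space becomes a comma, so the ID is field 1
 printf "%s ", $1;              # explicit format, safe if the ID contains a % sign
 for(i=2;i<=NF;i++) {
    key=$i;
    gsub(/^  */,"",key);        # strip leading spaces before the lookup
    if(i>2) printf ",";
    printf "%s", $i;
    if (key in lookup) printf " (%s)", lookup[key]
 }
 printf "\n"}' fileA FS="," fileB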

# 3  
Old 09-04-2012
Request to check

Hi

Thanks for the reply. But the output in this case is just the same as the input, i.e. the second file (file B) given to the program.

I checked the following:

Code:
bash-3.2$ awk 'NR==FNR{lookup[$1]=$2;for(i=3;i<=NF;i++)lookup[$1]=lookup[$1]" "$i;next}
{ sub(/ /,",");
 printf $1" ";
 for(i=2;i<=NF;i++) {
    key=$i;
    gsub(/^  */,"",key);
    if(i>2) printf ",";
    printf "%s", $i;
    if (key in lookup) printf " (%s)", lookup[key]
 }
 printf "\n"}' drugbankdrugs FS="," drugbanktest >drugabnktest1
bash-3.2$

Kindly guide
# 4  
Old 09-04-2012
Code:
 awk 'NR==FNR{x=$1;print x;$1=y;a[x]=$0;next}{x=$1;printf x" ";$1=y;n=split($0,b,",");for(i=1;i<=n;i++){sub(" ","",b[i]);if(a[b[i]]){printf "%s (%s),",b[i],a[b[i]]}else{printf "%s ,",b[i]}};printf "\n"}' file1 file2

# 5  
Old 09-04-2012
Request to check

Hi Raj

The output is now a combination of both files: the output file contains file1 on top and file2 at the bottom, but no matching has been done.

Thanks
Mani

---------- Post updated at 06:11 AM ---------- Previous update was at 06:09 AM ----------

Here I have attached both files
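A note on the one-liner in post #4: its first block contains `print x`, which prints every key of file1 before file2 is processed, and that alone would put file1 at the top of the output. It also stores each description with a leading space and leaves a trailing comma on every line. The following is only a sketch of the same approach without those side effects, tested against the sample data from the first post (the attached files are not shown in the thread, so whether it handles them is an assumption). The file names `file1`, `file2` and `annotated.txt` are placeholders.

```shell
# Sample data from the first post; the file names are placeholders.
cat > file1 <<'EOF'
benzene   blood disorders
 nitrogen    air disease
EOF
cat > file2 <<'EOF'
AXY  benzene, nitrogen,  oxygen, not available, sulfur
VTY  nitrogen, notavailable, sulfur
EOF

awk '
NR == FNR {                        # first file: build the lookup table
    key = $1
    $1 = ""                        # drop the key; $0 is rebuilt with single spaces
    sub(/^ +/, "")                 # trim the leading space left behind
    desc[key] = $0
    next                           # nothing is printed for the first file
}
{                                  # second file: annotate each list entry
    out = $1                       # the ID (AXY, VTY, ...)
    $1 = ""
    sub(/^ +/, "")
    n = split($0, item, / *, */)   # split on commas, swallowing nearby spaces
    for (i = 1; i <= n; i++) {
        out = out (i > 1 ? ", " : "  ") item[i]
        if (item[i] in desc) out = out " (" desc[item[i]] ")"
    }
    print out
}' file1 file2 > annotated.txt

cat annotated.txt
```

Entries are matched as whole strings after trimming, so `not available` and `notavailable` stay as they are. Spacing inside the list is normalised to a single space after each comma, which differs slightly from the input.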
# 6  
Old 09-04-2012
I use this command to convert your first file into a sed script file:
Code:
awk {printf "s/" $1 "/& (" ; for (i=2;i<=NF-1;i++) { printf $i " "};printf $NF")/g\n"} drugs.txt > file_sed

And the new file is:
Code:
s/benzene/& (blood disorders)/g
s/nitrogen/& (air disease)/g

Then apply it with a simple sed:
Code:
sed -f file_sed drugbankdrugs.txt > file_output

# 7  
Old 09-04-2012
Hi

I am not able to create the sed file, as it shows the following error. Kindly check:
Code:
bash-3.2$ awk {printf "s/" $1 "/& (" ; for (i=2;i<=NF-1;i++) { printf $i " "};printf $NF")/g\n"} drugs.txt > file_sed
bash: syntax error near unexpected token `('
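The error above comes from bash, not from awk: the awk program is not wrapped in single quotes, so the shell tries to interpret the `(` itself. A minimal sketch of the same command with quotes follows. It also passes the data through `"%s"` formats instead of using it as the printf format string, which is an addition not in the original command. The sample contents of `drugs.txt` and `drugbankdrugs.txt` are taken from the first post and are an assumption about what the real files look like.

```shell
# Sample data from the first post, stored under the file names used in post #6.
cat > drugs.txt <<'EOF'
benzene   blood disorders
 nitrogen    air disease
EOF
cat > drugbankdrugs.txt <<'EOF'
AXY  benzene, nitrogen,  oxygen, not available, sulfur
VTY  nitrogen, notavailable, sulfur
EOF

# The awk program is now inside single quotes, so bash passes it to awk untouched.
awk '{
    printf "s/%s/& (", $1                      # pattern: the key in column 1
    for (i = 2; i < NF; i++) printf "%s ", $i  # description words except the last
    printf "%s)/g\n", $NF                      # last word closes the replacement
}' drugs.txt > file_sed

cat file_sed

# Apply the generated script to the second file.
sed -f file_sed drugbankdrugs.txt > file_output
cat file_output
```

Keep in mind that the generated `s/…/…/g` commands match substrings, not whole entries, and that a `/` or `&` inside the data would need escaping before this is safe on real files.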


Last edited by radoulov; 09-04-2012 at 09:57 AM..