Match the word or words and fetch the entries


 
# 1  
Old 08-07-2012
Match the word or words and fetch the entries

Hi all,

I have 7 words:

Quote:
CAD
CD
HT
RA
T1D
T2D
BD
I also have one file that contains a large number of rows and columns.

The 6th column contains one or more of the 7 words above:


Quote:
CHRM1 P11229 Pirenzepine DAP000492 Peptic ulcer disease Approved T2D
CHRM1 P11229 Glycopyrrolate DAP001116 Anesthetic Approved T2D
CHRM1 P11229 Clidinium DAP001117 Abdominal/stomach pain Approved T2D
CHRM1 P11229 Dicyclomine DAP001118 Irritable bowel syndrome Approved T2D
CHRM1 P11229 Ethopropazine DAP001119 Parkinson's disease Approved T2D
CHRM1 P11229 Cycrimine DAP001120 Parkinson's disease Approved T2D
CHRM1 P11229 Benztropine DAP001121 Parkinson's disease Approved T2D,HT
CHRM1 P11229 Trihexyphenidyl DAP001122 Parkinson's disease Approved T2D,CD
CHRM1 P11229 Propantheline DAP001123 Excessive sweating (hyperhidrosis) Approved T2D,T1D
CHRM1 P11229 Oxyphenonium DAP001124 Spasm Approved T2D

I want a script that searches for the 7 words above in the 6th column.

If it finds one, it should copy the whole row into a separate file, so that
I get 7 files in the output, one per word, each holding the entries for that word.

So the output for the T2D file would be:

Quote:
CHRM1 P11229 Pirenzepine DAP000492 Peptic ulcer disease Approved
CHRM1 P11229 Glycopyrrolate DAP001116 Anesthetic Approved
CHRM1 P11229 Clidinium DAP001117 Abdominal/stomach pain Approved
CHRM1 P11229 Dicyclomine DAP001118 Irritable bowel syndrome Approved
CHRM1 P11229 Ethopropazine DAP001119 Parkinson's disease Approved
CHRM1 P11229 Cycrimine DAP001120 Parkinson's disease Approved
CHRM1 P11229 Benztropine DAP001121 Parkinson's disease Approved
CHRM1 P11229 Trihexyphenidyl DAP001122 Parkinson's disease Approved
CHRM1 P11229 Propantheline DAP001123 Excessive sweating (hyperhidrosis) Approved
CHRM1 P11229 Oxyphenonium DAP001124 Spasm Approved
Output for the T1D file:

Quote:
CHRM1 P11229 Propantheline DAP001123 Excessive sweating (hyperhidrosis) Approved
In the same way for the HT file and the others.

Kindly let me know how to script this.
# 2  
Old 08-07-2012
Code:
awk '{n=split($NF,a,",");$NF=x;for(i=1;i<=n;i++){print > a[i]}}'  inputfile

Here I have assumed that at least one of the 7 words appears in the said column.
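Note that this one-liner prints nothing to standard output: each row goes to a file named after the word itself (T2D, HT, ...) in the current directory, so redirecting the command with `>` gives an empty file. A minimal sketch of that behaviour, assuming whitespace-separated input with the word list as the last field; the file name `sample.txt` and the scratch directory are made up for illustration:

```shell
# Work in a scratch directory so the per-word files are easy to spot.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Three rows modelled on the data in post #1 (sample.txt is a hypothetical name).
cat > sample.txt <<'EOF'
CHRM1 P11229 Oxyphenonium DAP001124 Spasm Approved T2D
CHRM1 P11229 Benztropine DAP001121 Parkinson's disease Approved T2D,HT
CHRM1 P11229 Propantheline DAP001123 Excessive sweating (hyperhidrosis) Approved T2D,T1D
EOF

# Same logic as the one-liner above; nothing is printed to the terminal.
awk '{
    n = split($NF, a, ",")      # words found in the last field
    $NF = ""                    # drop the word list from the row
    for (i = 1; i <= n; i++)
        print > a[i]            # write the row to the file named after the word
}' sample.txt

ls          # HT, T1D, T2D and sample.txt
wc -l T2D   # 3 rows mention T2D
```

So the place to look for results is the files T2D, HT, T1D and so on, not the redirected output file.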
# 3  
Old 08-07-2012
Thanks for the reply.

The output is a blank file:

Quote:
bash-3.2$ awk '{n=split($NF,a,",");$NF=x;for(i=1;i<=n;i++){print > a[i]}}' sarattdnewdruggene4.txt > sarattdnewdruggene5.txt
-bash-3.2$ cat sarattdnewdruggene5.txt
-bash-3.2$
I am also wondering how the script will recognise the 7 words.

As you can see, the 6th column contains at least one of the 7 words, no doubt.
# 4  
Old 08-07-2012
Code:
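awk -F'\t' 'FNR==NR{a[$0]=1;next} {        # first file: remember each lookup word
gsub(/Approved */,"",$6)                   # strip the status, leaving only the words
n=split($6,b,",")                          # one array element per word
$6=""                                      # drop the word list from the row
for(i=1;i<=n;i++)
 if(b[i] in a)                             # only words listed in lookupfile
  print $0, "Approved" > ("file_" b[i] ".txt")   # one output file per word
}' OFS='\t' lookupfile mainfile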

# 5  
Old 08-07-2012
In this case as well, the output is the same: it's blank! Please check it. I have put my input and output file names in place of lookupfile and mainfile.

Code:
bash-3.2$ awk -F'\t' 'FNR==NR{a[$0]=1;next} {
> gsub(/Approved */,"",$6)
> n=split($6,b,",")
> $6=""
> for(i=1;i<=n;i++)
>  if(b[i] in a)
>   print $0, "Approved" > "file_" b[i] ".txt"
> }' OFS='\t' lookupfile mainfile
awk: cmd. line:8: fatal: cannot open file `lookupfile' for reading (No such file or directory)
-bash-3.2$ awk -F'\t' 'FNR==NR{a[$0]=1;next} {
gsub(/Approved */,"",$6)
n=split($6,b,",")
$6=""
for(i=1;i<=n;i++)
 if(b[i] in a)
  print $0, "Approved" > "file_" b[i] ".txt"
}' OFS='\t'  sarattdnewdruggene4.txt >sarattdnewdruggene5.txt
-bash-3.2$

# 6  
Old 08-07-2012
Put those 7 words in "lookupfile" and replace "mainfile" with your input file...
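A minimal sketch of that setup, assuming the main file is tab-separated and its 6th field looks like `Approved T2D,HT`, which is what the script in post #4 expects. The two sample rows are placeholder data. Both files have to be passed: with only one file, `FNR==NR` is true on every line and the second block never runs. The rows land in `file_<word>.txt`, not on standard output, so there is nothing to redirect:

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# One word per line, no trailing spaces.
printf '%s\n' CAD CD HT RA T1D T2D BD > lookupfile

# Two tab-separated sample rows (placeholder data modelled on post #1).
printf 'CHRM1\tP11229\tOxyphenonium\tDAP001124\tSpasm\tApproved T2D\n' > mainfile
printf 'CHRM1\tP11229\tBenztropine\tDAP001121\tParkinson disease\tApproved T2D,HT\n' >> mainfile

# Script from post #4; the parentheses around the file name are added here
# so the concatenation is unambiguous in awk implementations other than gawk.
awk -F'\t' 'FNR==NR{a[$0]=1;next} {
gsub(/Approved */,"",$6)
n=split($6,b,",")
$6=""
for(i=1;i<=n;i++)
 if(b[i] in a)
  print $0, "Approved" > ("file_" b[i] ".txt")
}' OFS='\t' lookupfile mainfile

ls file_*.txt    # file_HT.txt and file_T2D.txt
```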
# 7  
Old 08-07-2012
Thanks. I did this. But the output file is still blank.

Code:
-bash-3.2$ awk -F'\t' 'FNR==NR{a[$0]=1;next} {
gsub(/Approved */,"",$6)
n=split($6,b,",")
$6=""
for(i=1;i<=n;i++)
 if(b[i] in a)
  print $0, "Approved" > "file_" b[i] ".txt"
}' OFS='\t' lookupfie  sarattdnewdruggene4.txt >sarattdnewdruggene5.txt
-bash-3.2$ awk -F'\t' 'FNR==NR{a[$0]=1;next} {
gsub(/Approved */,"",$6)
n=split($6,b,",")
$6=""
for(i=1;i<=n;i++)
 if(b[i] in a)
  print $0, "Approved" > "file_" b[i] ".txt"
}' OFS='\t' lookupfile  sarattdnewdruggene4.txt >sarattdnewdruggene6.txt
-bash-3.2$


Last edited by Franklin52; 08-07-2012 at 06:46 AM.. Reason: Please use code tags for data and code samples
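
Two things are worth checking when the redirected file comes out empty. First, the script from post #4 never prints to standard output, so a file such as sarattdnewdruggene6.txt is expected to be empty; any rows go to `file_<word>.txt` in the current directory. Second, if the input is not actually tab-separated, `-F'\t'` sees a single field and `$6` is empty, so no file is written at all. A minimal diagnostic sketch (the sample file `check.txt` is hypothetical; substitute the real input file):

```shell
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical sample: the first row is tab-separated, the second uses spaces.
printf 'CHRM1\tP11229\tOxyphenonium\tDAP001124\tSpasm\tApproved T2D\n' > check.txt
printf 'CHRM1 P11229 Oxyphenonium DAP001124 Spasm Approved T2D\n' >> check.txt

# Report the number of tab-separated fields and the content of field 6 per row.
awk -F'\t' '{ printf "%d fields, $6=[%s]\n", NF, $6 }' check.txt > report.txt
cat report.txt
# 6 fields, $6=[Approved T2D]
# 1 fields, $6=[]
```

A row that reports 1 field is not split on tabs, and the script from post #4 will skip it silently.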