match string exactly with awk/sed


 
# 1  
Old 05-12-2011

Hi all,

I have a list that I would like to parse with awk/sed. The list contains entries such as:
Code:
JournalTitle: Biochemistry
JournalTitle: Biochemistry and cell biology = Biochimie et biologie cellulaire
JournalTitle: Biochemistry and experimental biology
JournalTitle: Biochemistry and molecular biology education : a bimonthly publication of the International Union of Biochemistry and Molecular Biology
JournalTitle: Biochemistry and molecular biology international
JournalTitle: Biochemistry. Biokhimiia
JournalTitle: Biochemistry international
JournalTitle: Biochemistry research international
JournalTitle: Comparative biochemistry and physiology. Biochemistry and molecular biology
JournalTitle: Comparative biochemistry and physiology. Part B, Biochemistry & molecular biology
JournalTitle: Doklady. Biochemistry and biophysics
JournalTitle: Doklady biochemistry : proceedings of the Academy of Sciences of the USSR, Biochemistry section / translated from Russian
JournalTitle: Life sciences. Pt. 2: Biochemistry, general and molecular biology
JournalTitle: The Journal of experimental zoology. Supplement : published under auspices of the American Society of Zoologists and the Division of Comparative Physiology and Biochemistry / the Wistar Institute of Anatomy and Biology

If I want to search for "Biochemistry", I would like it to return this entry only and not any other combinations:
Code:
JournalTitle: Biochemistry

At present what I have is:
Code:
awk '/JournalTitle:/&&/Biochemistry/' J_Medline.txt | awk -F ":" '{print $0}'

but that does not give the desired result (due to my ignorance of awk syntax). Suggestions much appreciated!
# 2  
Old 05-12-2011
Something like this?
Code:
awk '/JournalTitle: Biochemistry/ && NF==2' J_Medline.txt
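
This works because awk splits each line on whitespace by default, so "JournalTitle: Biochemistry" is the only matching line with exactly two fields. An anchored regular expression should give the same result without counting fields. A minimal sketch, with a few lines from the list inlined in a hypothetical `sample` variable instead of being read from J_Medline.txt:

```shell
# A few lines from the list above, inlined for illustration
sample='JournalTitle: Biochemistry
JournalTitle: Biochemistry and cell biology = Biochimie et biologie cellulaire
JournalTitle: Biochemistry. Biokhimiia
JournalTitle: Biochemistry international'

# Field-count filter: the line must match and have exactly two fields
by_nf=$(printf '%s\n' "$sample" | awk '/JournalTitle: Biochemistry/ && NF==2')

# Anchored regex: ^ and $ pin the pattern to the whole line
by_anchor=$(printf '%s\n' "$sample" | awk '/^JournalTitle: Biochemistry$/')

printf '%s\n' "$by_nf" "$by_anchor"
```

Both filters only help when the title is known in advance; neither handles a title supplied at run time.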

# 3  
Old 05-12-2011
Thanks for replying. Yes, that works, but my issue is a bit deeper: the regular expression in awk will be supplied from a variable within a script, so it may be "Biochemistry" or "Biochemistry and cell biology". I need awk to return an exact match each time, and with your one-liner I have no way of knowing in advance how many fields the expression will contain. (Thinking about this, maybe the count could be calculated using echo/wc and then passed into the awk expression?) Any ideas/thoughts would be gratefully received!
# 4  
Old 05-12-2011
You can do something like this:
Code:
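regex="Biochemistry and cell biology"
# $0 is the whole line, so the "JournalTitle: " prefix has to be part of the comparison
awk -v var="JournalTitle: $regex" '$0==var' J_Medline.txt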
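
A minimal sketch of the string-comparison approach on inlined sample lines. It assumes every line carries the "JournalTitle: " prefix shown in the list, so the prefix is prepended before comparing; `exact_title` and `sample` are hypothetical names used only for illustration:

```shell
# A few lines from the list above, inlined for illustration
sample='JournalTitle: Biochemistry
JournalTitle: Biochemistry and cell biology = Biochimie et biologie cellulaire
JournalTitle: Biochemistry and experimental biology
JournalTitle: Biochemistry. Biokhimiia'

# Print lines whose full text equals "JournalTitle: <title>".
# This is a literal string comparison, so no regex metacharacters are involved.
exact_title() {
    awk -v var="JournalTitle: $1" '$0 == var'
}

one=$(printf '%s\n' "$sample" | exact_title "Biochemistry")
two=$(printf '%s\n' "$sample" | exact_title "Biochemistry and experimental biology")
printf '%s\n' "$one" "$two"
```

Because the comparison is on the whole line, the number of words in the title does not matter. One caveat: awk -v processes backslash escapes in the assigned value, so a title containing backslashes would probably need another route, such as passing it through the environment and reading it from ENVIRON.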

# 5  
Old 05-12-2011
Quote:
Originally Posted by euval
I need awk to return exact match each time, and I have no way of knowing what number of fields are going to be in the regular expression for matching using your one-liner (thinking about this - maybe this can be calculated using echo/wc and then passed into awk expression?) Any ideas/thoughts would be gratefully received!
Do you want to output exactly the substring matched by your regexp? That is, should the whole input line be printed only if it fully matches your regexp, with no characters outside the match? Or may an input line simply contain your regexp along with other characters, in which case you wish to print only the matched part?
# 6  
Old 05-12-2011
sidorenko - "Do you want to output exactly the substring matched by your regexp?"

Yes - that is exactly what I need. Any ideas?
# 7  
Old 05-12-2011
Code:
>gawk -v re="hi baby" 'match($0,re){print substr($0,RSTART,RLENGTH)}'
ahasdf
ahsshi babyshsd
hi baby

Replace "hi baby" with the variable you want to pass.
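
A non-interactive sketch of the same idea, feeding the sample input through a pipe. It uses plain awk here, on the assumption that match(), RSTART and RLENGTH are enough and nothing gawk-specific is needed. Note that the variable is treated as a regular expression, so a "." in it matches any character; the second function is a hypothetical literal variant based on index(). The names `print_match`, `print_literal` and `input` are illustrative only:

```shell
# Print only the part of each line matched by the regex passed as $1
print_match() {
    awk -v re="$1" 'match($0, re) { print substr($0, RSTART, RLENGTH) }'
}

# Literal variant: index() looks for the string itself, with no regex interpretation
print_literal() {
    awk -v s="$1" 'index($0, s) { print s }'
}

input='ahasdf
ahsshi babyshsd'

matched=$(printf '%s\n' "$input" | print_match "hi baby")
literal=$(printf '%s\n' "$input" | print_literal "hi baby")
printf '%s\n' "$matched" "$literal"
```

Lines that do not contain the pattern produce no output, which is why only the second input line contributes a result.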