Search for a pattern, extract value(s) from next line, extract lines having those extracted value(s)
Post 302674069 by AshwaniSharma09 on Thursday 19th of July 2012 10:27:57 AM
Thank you very much Corona688, you made my day :)
The script works fine when I put it in a shell script like this:
Code:
cat temp.sh 
awk -F":" '(!RGX) && /CYTOCHROME C/ && (NF==2) {
        getline
        gsub(/[;, ]*/, "");
        RGX="[" $2 "]"
        FS=" ";
} RGX && ($5 ~ RGX) && /^ATOM/' 1A3R.pdb

But I don't know how to run it directly on the command line, or how to save it as an AWK script such as temp.awk, although I have used AWK a little. Once again, thank you very much for the help :)
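
A minimal sketch of both options, not a definitive answer: the one-liner inside temp.sh is already a complete command, so it can be pasted at the prompt as-is. To keep it as an AWK script, put only the program text (the part between the single quotes) into temp.awk and pass it with -f, leaving -F":" on the command line. The file sample.pdb and its contents are made up purely for illustration.

```shell
# Hypothetical miniature input, only for trying the script out.
cat > sample.pdb <<'EOF'
COMPND   2 MOLECULE: CYTOCHROME C;
COMPND   3 CHAIN: A, B;
ATOM      1  N   GLY A   1      11.1  22.2  33.3
ATOM      2  N   GLY C   1      11.1  22.2  33.3
HETATM    3  O   HOH B   2      11.1  22.2  33.3
EOF

# temp.awk holds only the awk program, without the surrounding shell quotes.
cat > temp.awk <<'EOF'
(!RGX) && /CYTOCHROME C/ && (NF==2) {
        getline
        gsub(/[;, ]*/, "")
        RGX="[" $2 "]"
        FS=" "
}
RGX && ($5 ~ RGX) && /^ATOM/
EOF

# -f names the script file; -F":" still sets the initial field separator.
awk -F":" -f temp.awk sample.pdb > out.txt
cat out.txt
```

With this sample, only the ATOM line whose fifth field is A or B should be printed; the chain C line and the HETATM line are filtered out.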

Quote:
Originally Posted by Corona688
Whenever you have sed | awk | grep | kitchen | sink, it can probably all be done in one awk. It's a lot more than a glorified 'cut'.

1) Search for a line containing CYTOCHROME C where there are two fields (as delimited by : )
2) Get the next line, clean it up with gsub (strip out " ", ";" and ","), and turn the second field into a regex like [AB]
3) Set the field separator to space.
4) For every line thereafter, if the line contains ATOM and the fifth field matches the regex, print the line.

Code:
awk -F":" '(!RGX) && /CYTOCHROME C/ && (NF==2) {
        getline
        gsub(/[;, ]*/, "");
        RGX="[" $2 "]"
        FS=" ";
} RGX && ($5 ~ RGX) && /ATOM/' inputfile

---------- Post updated at 11:24 AM ---------- Previous update was at 11:21 AM ----------

Thanks Vryali for your reply :)

---------- Post updated at 07:57 PM ---------- Previous update was at 11:24 AM ----------

When I run the script, some files produce an error. Here are the first few lines of two such files and their respective errors:

Code:
cat 132L.pdb

HEADER    HYDROLASE(O-GLYCOSYL)                   02-JUN-93   132L
TITLE     STRUCTURAL CONSEQUENCES OF REDUCTIVE METHYLATION OF LYSINE
TITLE    2 RESIDUES IN HEN EGG WHITE LYSOZYME: AN X-RAY ANALYSIS AT
TITLE    3 1.8 ANGSTROMS RESOLUTION
COMPND    MOL_ID: 1;
COMPND   2 MOLECULE: HEN EGG WHITE LYSOZYME;
COMPND   3 CHAIN: A;
COMPND   4 EC: 3.2.1.17;
COMPND   5 ENGINEERED: YES
SOURCE    MOL_ID: 1;
SOURCE   2 ORGANISM_SCIENTIFIC: GALLUS GALLUS;
SOURCE   3 ORGANISM_COMMON: CHICKEN;
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

cat 1G7H.pdb

HEADER    HYDROLASE INHIBITOR/HYDROLASE           10-NOV-00   1G7H
TITLE     CRYSTAL STRUCTURE OF HEN EGG WHITE LYSOZYME (HEL) COMPLEXED
TITLE    2 WITH THE MUTANT ANTI-HEL MONOCLONAL ANTIBODY D1.3(VLW92A)
COMPND    MOL_ID: 1;
COMPND   2 MOLECULE: ANTI-HEN EGG WHITE LYSOZYME MONOCLONAL ANTIBODY
COMPND   3 D1.3;
COMPND   4 CHAIN: A;
COMPND   5 FRAGMENT: LIGHT CHAIN;
COMPND   6 ENGINEERED: YES;
COMPND   7 MUTATION: YES;
COMPND   8 MOL_ID: 2;
COMPND   9 MOLECULE: ANTI-HEN EGG WHITE LYSOZYME MONOCLONAL ANTIBODY
COMPND  10 D1.3;
COMPND  11 CHAIN: B;
COMPND  12 FRAGMENT: HEAVY CHAIN;
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

errors:

awk: cmd. line:6: (FILENAME=132L.pdb FNR=4) fatal: Unmatched [ or [^: /[]/

awk: cmd. line:6: (FILENAME=1G7H.pdb FNR=6) fatal: Unmatched [ or [^: /[]/

Is the problem with the gsub function or with the expression that follows it? Thanks & Regards
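
One possible reading of these errors, offered as a sketch rather than a confirmed diagnosis: the message shows the regex was built as `[]`, which means $2 was empty when RGX was assigned. That would happen whenever the line read by getline contains no ":". In 132L.pdb, line 4 (TITLE 3 ...) has none, which fits FNR=4 if the search pattern matched line 3. The example below assumes the pattern had been changed from CYTOCHROME C to LYSOZYME (neither excerpt contains CYTOCHROME C), and uses a made-up file, guard_sample.pdb. It only builds RGX when the next line really is a CHAIN line with a non-empty value.

```shell
# Hypothetical sample modelled on the first lines of 132L.pdb.
cat > guard_sample.pdb <<'EOF'
TITLE     STRUCTURAL CONSEQUENCES OF REDUCTIVE METHYLATION OF LYSINE
TITLE    2 RESIDUES IN HEN EGG WHITE LYSOZYME: AN X-RAY ANALYSIS AT
TITLE    3 1.8 ANGSTROMS RESOLUTION
COMPND   2 MOLECULE: HEN EGG WHITE LYSOZYME;
COMPND   3 CHAIN: A;
ATOM      1  N   LYS A   1       3.2  10.1  10.3
ATOM      2  N   LYS B   1       3.2  10.1  10.3
EOF

# Assumption: LYSOZYME stands in for whatever pattern was actually used.
awk -F":" '(!RGX) && /LYSOZYME/ && (NF==2) {
        # Accept the next line only if its first ":"-field ends in CHAIN.
        if ((getline) > 0 && $1 ~ /CHAIN$/) {
                gsub(/[;, ]+/, "", $2)   # strip separators from the chain list
                # Never build an empty bracket expression such as "[]".
                if ($2 != "") { RGX = "[" $2 "]"; FS = " " }
        }
}
RGX && ($5 ~ RGX) && /^ATOM/' guard_sample.pdb > guard_out.txt
cat guard_out.txt
```

Here the TITLE line that happens to contain LYSOZYME and a colon is skipped, because the line after it is not a CHAIN line, and the search continues until the MOLECULE/CHAIN pair is found.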
 
