Extract lines from text files
Post 302885386 by Don Cragun on Saturday 25th of January 2014 03:37:21 PM
Quote:
Originally Posted by edweena
I have some files containing the following data

Code:
 #  RESIDUE AA STRUCTURE BP1 BP2  ACC     N-H-->O    O-->H-N    N-H-->O    O-->H-N    TCO  KAPPA ALPHA  PHI   PSI    X-CA   Y-CA   Z-CA 
    1  196 A M              0   0  230      0, 0.0     2,-0.2     0, 0.0     0, 0.0   0.000 360.0 360.0 360.0  76.4   21.7   -6.8   11.3
    2  197 A D        +     0   0  175      1,-0.1     2,-0.1     0, 0.0     0, 0.0  -0.193 360.0 151.5 -46.2  99.1   23.2   -9.3   13.8
    3  198 A E        -     0   0  170     -2,-0.2    -1,-0.1     0, 0.0     0, 0.0  -0.622  29.3-158.9-134.6  66.9   26.9   -9.0   13.0
    4  199 A K        -     0   0  161      1,-0.1     0, 0.0    -2,-0.1     0, 0.0   0.037  18.7-134.6 -43.9 157.1   28.8   -9.8   16.3
    5  200 A R        +     0   0  174      3,-0.0     2,-1.6     2,-0.0    -1,-0.1   0.294  60.4 134.1 -97.8   0.9   32.4   -8.5   16.6
    6  201 A R        +     0   0  178      1,-0.1    -2,-0.1     2,-0.1     0, 0.0  -0.429  24.5 143.8 -54.0  86.9   33.5  -11.9   17.9
    7  202 A A        +     0   0   76     -2,-1.6    -1,-0.1     2,-0.1    -2,-0.0  -0.471  24.7 108.8-134.5  48.7   36.5  -11.8   15.5
    8  203 A Q  S    S+     0   0  149      3,-0.0     2,-0.1     4,-0.0    -2,-0.1  -0.694  77.8  88.8-115.4  54.1   39.3  -13.4   17.4
    9  204 A H  S >> S-     0   0  121      4,-0.0     3,-2.1     0, 0.0     4,-0.7  -0.341  88.3  -9.7-128.0-146.1   38.5  -16.0   14.8
   10  205 A N  H 3> S+     0   0  145      1,-0.3     4,-0.8     2,-0.2     5,-0.2   0.673 125.2  50.8 -27.9 -50.8   39.4  -17.0   11.2
   11  206 A E  H 34 S+     0   0  159      1,-0.2     4,-0.3     2,-0.1    -1,-0.3   0.843 106.1  59.4 -64.2 -34.5   41.5  -13.9   10.2
   12  207 A V  H X4 S+     0   0   60     -3,-2.1     3,-0.5     2,-0.1     4,-0.4   0.982 107.8  32.9 -62.8 -61.2   43.7  -14.0   13.3
   13  208 A E  H >X S+     0   0   78     -4,-0.7     3,-4.0     1,-0.2     4,-0.9   0.950 109.6  53.5 -70.0 -62.3   45.4  -17.4   13.2

I want to extract lines only if the values in the PHI and PSI columns satisfy -67<=phi<=-99 and 100<=psi<=165.
I would like to save the outputs into another folder, keeping the input file names. I highly appreciate your valuable suggestions.

Thanks a lot.
There are several problems here. First, and most importantly, your specification requires a PHI value that is at least -67 and, simultaneously, at most -99 (-67<=phi<= -99). No number satisfies both conditions, so the specification always yields the empty set.

If we assume that you meant -99 <= PHI <= -67, your sample data still produces no output (except for the heading) because only the fifth line of your input file (the line for residue number 4) has a PSI value between 100 and 165 (157.1), and the PHI value on that line is -43.9, which is out of range.

When Akshay provided his suggested code, he apparently didn't notice that the data under the heading "STRUCTURE" looks like 0, 1, 2, or 3 fields to awk (when using the default field delimiter). Yoda compensated for that problem, but apparently didn't notice that sometimes there are no field delimiters between values under the headings KAPPA, ALPHA, PHI, and PSI. For example, the line for residue number 3 contains 29.3-158.9-134.6 and the line for residue number 9 contains -9.7-128.0-146.1. So, rather than using field delimiters, any code processing these lines will have to be based on column positions in the input file, not field counts.
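
As a rough sketch of what position-based processing could look like, the shell function below pulls PHI and PSI out with awk's substr() instead of relying on $N fields. It is only an illustration, not a finished answer: the function name phi_psi_filter is made up, the column positions (PHI in columns 104-109, PSI in columns 110-115) are counted from the sample lines quoted above and assume the spacing shown there is exact, the default ranges assume that -99 <= PHI <= -67 is what was meant, and it assumes the heading is the first line of the input.

```shell
# phi_psi_filter [PHI_LO PHI_HI PSI_LO PSI_HI]
# Reads DSSP-style lines on stdin and prints the heading plus every line
# whose PHI and PSI values fall inside the given (inclusive) ranges.
# ASSUMPTIONS: heading is line 1; PHI is in columns 104-109 and PSI in
# columns 110-115 (counted from the sample data, not from a specification).
phi_psi_filter() {
    awk -v phi_lo="${1:--99}" -v phi_hi="${2:--67}" \
        -v psi_lo="${3:-100}" -v psi_hi="${4:-165}" '
    NR == 1           { print; next }   # always keep the heading line
    length($0) < 115  { next }          # too short to hold PHI and PSI
    {
        # Fixed column positions, so values that run together
        # (such as 29.3-158.9-134.6) are still separated correctly.
        phi = substr($0, 104, 6) + 0
        psi = substr($0, 110, 6) + 0
        if (phi >= phi_lo && phi <= phi_hi && psi >= psi_lo && psi <= psi_hi)
            print
    }'
}
```

Something like phi_psi_filter < input_file would then print the heading and any matching lines. If the real files carry other text before the heading, or use different column positions, the NR == 1 rule and the substr() offsets would need to be adjusted.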

Are there ever any <tab> characters in your input files? Or, are all of the spaces between fields just sequences of <space> characters?
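
One way to answer the question about <tab> characters yourself is to count the lines that contain one. The function below is a minimal sketch; the name count_tab_lines is just a label chosen for this example.

```shell
# count_tab_lines [FILE...]
# Prints the number of input lines that contain at least one <tab>.
# Reads the named files, or standard input when no file is given.
count_tab_lines() {
    awk 'index($0, "\t") { n++ } END { print n + 0 }' "$@"
}
```

A result of 0 for every input file would mean the columns are padded with <space> characters only, so character positions can be used directly. If tabs do show up, converting them to spaces first (for example with expand) might be an option, but only if the tab stops used when the file was written are known.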

Please provide us with a specification that doesn't always produce an empty set, and provide us some sample input that includes some lines that will be selected as well as some lines that will be rejected. And, show us the sample output you expect to be produced for that sample input.

And, please tell us how the name of the directory to contain the new files will be passed to your script.
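
To show why this matters, here is one possible convention, purely as an assumption until you tell us otherwise: the output directory is the first operand, a filter command is the second, and the input files follow. The function name save_filtered and this argument order are invented for the example.

```shell
# save_filtered OUTDIR FILTER FILE...
# Runs FILTER (any command that reads stdin and writes stdout) on each FILE
# and stores the result in OUTDIR under the same name as the input file.
# ASSUMPTION: the directory name is passed as the first operand.
save_filtered() {
    outdir=$1
    filter=$2
    shift 2
    mkdir -p "$outdir" || return 1      # create the directory if needed
    for f in "$@"; do
        "$filter" < "$f" > "$outdir/$(basename "$f")"
    done
}
```

For instance, save_filtered results cat *.txt would copy each file unchanged into results, with cat standing in for a real filter. OUTDIR must not be the directory that holds the input files, because the output redirection would then truncate each input file before it is read.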

Last edited by Don Cragun; 01-25-2014 at 04:39 PM.. Reason: fix typo
