awk multiple line record retrieves only one?


 
# 1  
Old 09-03-2014

I have a file with a structure like this:

Code:
Database sequence: some data
Database position: number
Query: identifier
Location: number
E-value: number

       0         .     :     .     :     .
           STRINGSTRINGSTRINGSTRING
           ||||||||||||||||||||||||||||
           STRINGSTRINGSTRINGSTRING

@

Database sequence: some data
Database position: number
Query: identifier
Location: number
E-value: number

       0     .     :     .     :     .
           STRINGSTRINGSTRINGSTRING
           ||||||||||||||||||||||||||||
           STRINGSTRINGSTRINGSTRING

@

etc.
repeated for many entries, each being unique.

I am trying to capture the first line of each entry, and the first of two matching strings (it actually doesn't matter if it's the first or second, but for precision I want the first).

I found that I can separate the records using "@" and the fields using a newline, and then print the fields I want using awk (I would post a link but the rules won't let me yet. Thanks, GNU awk User's Manual).

So I copied the example from there as example.awk:

Code:
BEGIN { RS = "@" ; FS = "\n" }

{
    print $1
    print $8
}

and I run: awk -f example.awk test.txt > output.txt.
In output.txt, I only get the first record. It looks like

Code:
Database sequence: some data
           STRINGSTRINGSTRINGSTRING

which is what I want, but of course I want them all.

Sorry for the dumb noob question. Thanks in advance.

Last edited by Corona688; 09-03-2014 at 07:12 PM..
# 2  
Old 09-03-2014
You are printing fields 1 and 8, and getting fields 1 and 8.

To get all fields, just 'print'.
# 3  
Old 09-03-2014
It looks like your record separator is an at sign (@) on a line by itself (perhaps also followed by one or more blank lines?)

I'd suggest trying RS of @\n*:

Code:
BEGIN { RS = "@\n*" ; FS = "\n" }
{
    print $1
    print $8
}
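
For anyone who wants to try this without the original data, here is a minimal sketch that builds a two-record sample in the same layout and runs the suggested separator over it. The sample values (seq one, AAAA, and so on) are placeholders, and the `$1 != ""` guard is an addition not in the post above, there only to skip a possible empty trailing record. Note that a multi-character RS is treated as a regular expression by gawk and mawk, but POSIX leaves that behaviour unspecified, so other awks may differ.

```shell
# Build a small sample in the same layout as the thread's input
# (placeholder values, assumed for illustration only).
sample=$(mktemp)
cat > "$sample" <<'EOF'
Database sequence: seq one
Database position: 1
Query: q1
Location: 10
E-value: 0.1

       0         .     :     .     :     .
           AAAA
           ||||
           AAAA

@

Database sequence: seq two
Database position: 2
Query: q2
Location: 20
E-value: 0.2

       0         .     :     .     :     .
           CCCC
           ||||
           CCCC

@
EOF

# RS = "@\n*" consumes the "@" plus any newlines right after it, so every
# record starts at "Database sequence" and $8 is the first sequence line.
# The $1 != "" guard is an extra safety net (an assumption, not from the thread).
out=$(awk 'BEGIN { RS = "@\n*"; FS = "\n" } $1 != "" { print $1; print $8 }' "$sample")
printf '%s\n' "$out"

rm -f "$sample"
```

With this sample, each record should contribute two lines: its "Database sequence" line and its first sequence string.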

# 4  
Old 09-04-2014
Thanks. Corona688, I only want fields 1 and 8, but I want them for ALL records in the file. I'm only getting 1 & 8 for the FIRST record in the file.

Chubler_XL, you are correct, there are blank lines around the "@" symbol, but unfortunately with RS = "@\n*" I still only get the first record.
# 5  
Old 09-04-2014
I see what you mean. The extra newlines shift which field your data ends up in.
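
A quick way to see that shift is to print the record number next to $1 for both separators. This is only a sketch on made-up two-line records (first/x and second/y are hypothetical stand-ins for the real entries):

```shell
# Two tiny records laid out like the thread's file:
# data, blank line, "@", blank line, data, ...
input=$(printf 'first\nx\n\n@\n\nsecond\ny\n\n@\n')

# RS = "@": the newlines after "@" stay at the start of the next record,
# so from the second record on, $1 is empty and the data moves to later fields.
plain=$(printf '%s\n' "$input" |
  awk 'BEGIN { RS = "@"; FS = "\n" } { printf "%d:[%s]\n", NR, $1 }')

# RS = "@\n*": the separator also consumes the newlines after "@",
# so $1 holds the first data line in every record.
regex=$(printf '%s\n' "$input" |
  awk 'BEGIN { RS = "@\n*"; FS = "\n" } { printf "%d:[%s]\n", NR, $1 }')

printf 'RS="@":\n%s\n\nRS="@\\n*":\n%s\n' "$plain" "$regex"
```

With the plain separator the second record shows an empty $1, which is why the original script appeared to print only the first record: the later records were printed as blank or wrong lines rather than skipped.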

RS="@\n*" works fine for me. Please post input data it doesn't work for. [edit] Maybe there are carriage returns in there too? If so, try RS="@[\r\n]*"

Also, if you're on Solaris, try nawk.
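
If carriage returns are the suspect, something like the following can confirm it and work around it. This is a sketch on a made-up CRLF file; the gsub() call that strips the remaining carriage returns from each record is an addition not mentioned in the thread, included because FS = "\n" alone would leave a trailing \r on every field.

```shell
# Simulate a file saved with CRLF line endings (hypothetical content).
crlf=$(mktemp)
printf 'first\r\nx\r\n\r\n@\r\n\r\nsecond\r\ny\r\n\r\n@\r\n' > "$crlf"

# Count lines containing a carriage return; anything above 0 suggests CRLF endings.
cr_lines=$(grep -c "$(printf '\r')" "$crlf")
echo "lines with CR: $cr_lines"

# RS = "@[\r\n]*" swallows "@" and any mix of CR/LF after it.
# gsub() then removes the CRs left inside the record; changing $0
# makes awk re-split the fields on FS.
firsts=$(awk 'BEGIN { RS = "@[\r\n]*"; FS = "\n" }
              { gsub(/\r/, ""); if ($1 != "") print $1 }' "$crlf")
printf '%s\n' "$firsts"

rm -f "$crlf"
```

An alternative with the same effect is to strip the carriage returns up front, for example with tr -d '\r', and keep RS = "@\n*" unchanged.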
# 6  
Old 09-04-2014
I don't know whether I mis-typed something earlier, but the RS = "@\n*" is working for me. Thanks so much!