AWK Multi-Line Records Numbering Problem


 
# 1  
Old 10-24-2007

I have a set of files containing multi-line records, with the records separated by a blank line. I needed to prefix each line with its record number followed by a colon, and did the following:
Code:
awk 'BEGIN {FS = "\n"; RS = ""} {for (i = 1; i <= NF; i++) print NR ": " $i}' ~/Desktop/data98-1-25.txt > ~/Desktop/numbered-data98-1-25.txt

so I would get something like:
1: XXX:CCCC:XYXYX
1: XTZ:CACC:XYXYX
1: XZZ:DDDD:XYXYX
2: XTZ:CACC:XYXYX
2: XZZ:DMMD:XYXYX
3: XZZ:DMMD:XYXYX
4: XZZ:DMMD:XYXYX
4: XVZ:DMHD:XYXYX
4: XVV:DLMD:XYXYX
4: XTZ:DCDD:XYXYX

The problem is that my numbers are not coming out right. When I do a count like:
awk '{RS=""; print NR}' ~/Desktop/data98-1-25.txt > ~/Desktop/Count98-1-25.txt

I get the number I am expecting for the last record in the file, 4959, but when I run the numbering code above, the last record is numbered 4958. Is this one of those off-by-one problems where NR starts at zero and I started i at 1 (or vice versa), or is my code wrong for what I am trying to do?

Another question: when I start processing the next file, how do I get its record numbering to start at 4960?
# 2  
Old 10-31-2007
Could you paste some sample lines from the input file as well ...
# 3  
Old 11-01-2007
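In your count program, the record separator must be defined outside the action code. An assignment inside the action runs only after the first record has already been read with the default RS (a newline), so the first line of the file is counted as a record of its own.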

Code:
awk '{print NR}' RS="" ~/Desktop/data98-1-25.txt 
awk -v RS="" '{print NR}' ~/Desktop/data98-1-25.txt 
awk 'BEGIN {RS=""} {print NR}' ~/Desktop/data98-1-25.txt

Jean-Pierre.
# 4  
Old 11-01-2007
Thanks. I did discover that the missing BEGIN statement in my count program makes all the difference in arriving at a correct count, which I needed to validate that my numbering program was working correctly.

Given an input file with the following records:
Code:
XXX:CCCC:XYXYX
XTZ:CACC:XYXYX
XZZ:DDDD:XYXYX

XTZ:CACC:XYXYX
XZZ:DMMD:XYXYX

XZZ:DMMD:XYXYX

XZZ:DMMD:XYXYX
XVZ:DMHD:XYXYX
XVV:DLMD:XYXYX
XTZ:DCDD:XYXYX

Using my bad count program, awk '{RS=""; print NR}' ~/Desktop/data_in.txt, it returns:
1
2
3
4
5

Using your version: awk 'BEGIN {RS=""} {print NR}' ~/Desktop/data_in.txt it correctly returns:
1
2
3
4

This newbie learned a valuable lesson, the hard way.
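
For anyone who wants to see where the extra record comes from, here is a minimal sketch that prints how many lines each record contains under both versions. The temporary file and the variable names (tmp, late, early, parts) are only for illustration; the data is the sample above.

```shell
# Build the four-record sample from this thread in a temporary file.
tmp=$(mktemp)
printf '%s\n' \
  'XXX:CCCC:XYXYX' 'XTZ:CACC:XYXYX' 'XZZ:DDDD:XYXYX' '' \
  'XTZ:CACC:XYXYX' 'XZZ:DMMD:XYXYX' '' \
  'XZZ:DMMD:XYXYX' '' \
  'XZZ:DMMD:XYXYX' 'XVZ:DMHD:XYXYX' 'XVV:DLMD:XYXYX' 'XTZ:DCDD:XYXYX' > "$tmp"

# RS assigned inside the action: the first record is read with the
# default RS (newline) before the assignment runs, so it is one line long.
late=$(awk '{RS = ""; n = split($0, parts, "\n"); print NR ": " n}' "$tmp")

# RS assigned in BEGIN: paragraph mode is active before any input is read.
early=$(awk 'BEGIN {RS = ""} {n = split($0, parts, "\n"); print NR ": " n}' "$tmp")

echo "RS set in the action (record: lines):"; echo "$late"
echo "RS set in BEGIN (record: lines):";      echo "$early"

rm -f "$tmp"
```

With RS set in the action, the first three-line record is split into a one-line record and a two-line record, which is why that count comes out one higher.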

As an aside, for others who may stumble across this thread: I solved the problem of how to get the count to start on 4960 at the beginning of the next file by doing this:
Code:
awk 'BEGIN {FS = "\n"; RS = ""} {for (i = 1; i <= NF; i++) print NR+4959 ": " $i}' ~/Desktop/data98-26-50.txt

I'm sure there were probably much better ways to do it, but it accomplished what I needed done to the records in the next file to be processed at the time.
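
Two possible alternatives to hard-coding the number, sketched here on small made-up files (f1, f2, and the offset variable are hypothetical names, not the real data files). The first relies on NR continuing to count across all files given to a single awk invocation; the second counts the records of the previous file and passes that count in with -v.

```shell
# Two tiny stand-in files with two records each (made-up data).
f1=$(mktemp); f2=$(mktemp)
printf 'a1\na2\n\nb1\n' > "$f1"
printf 'c1\n\nd1\nd2\n' > "$f2"

# Option 1: one invocation over both files. NR is not reset between
# files, so the second file's records continue the numbering.
both=$(awk 'BEGIN {FS = "\n"; RS = ""}
            {for (i = 1; i <= NF; i++) print NR ": " $i}' "$f1" "$f2")

# Option 2: count the records in the first file, then pass that count
# to a separate run over the second file as an offset.
offset=$(awk 'BEGIN {RS = ""} END {print NR}' "$f1")
second=$(awk -v offset="$offset" 'BEGIN {FS = "\n"; RS = ""}
            {for (i = 1; i <= NF; i++) print NR + offset ": " $i}' "$f2")

echo "$both"
echo "$second"

rm -f "$f1" "$f2"
```

Option 1 writes everything to one output stream; option 2 fits better when each input file needs its own numbered output file.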

Thanks again to all of you who have helped me along my way in using Awk to get some jobs done.