Counting number of records with string row delimiter


 
# 8  
Old 10-12-2011
A copy paste definitely isn't going to show us if it's full of \r's, so I really don't care what you copy-pasted it from.

What the data was originally edited in is important, though. Did you edit it in Windows, or did it originate on a Windows machine?
# 9  
Old 10-12-2011
If your data is in a file called "myfile.txt", then run the following command:

Code:
od -bc myfile.txt

and paste its output over here.
The command prints the octal dump of your file contents and will display "\r" characters if they exist in there.
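For reference, here is a minimal sketch of what a carriage return looks like in such a dump. The file name sample.txt and its contents are made up for illustration; they are not the poster's data.

```shell
# Hypothetical sample: one record ending in "~~" followed by a Windows-style CRLF.
printf 'f1|_f2~~\r\n' > sample.txt

# od -c prints each byte as a character; a carriage return shows up as \r.
dump=$(od -c sample.txt)
printf '%s\n' "$dump"

# Count the carriage-return bytes in the file (0 for a clean Unix file).
cr_count=$(tr -cd '\r' < sample.txt | wc -c | tr -d ' ')
echo "carriage returns: $cr_count"
```

If the count is non-zero, the record terminator in the file is really "~~\r\n" and not "~~\n".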

tyler_durden
# 10  
Old 10-13-2011
Quote:
Originally Posted by Corona688
awk definitely supports multiple characters as record separators. I tested with your script and your data, and it even works with a crummy busybox awk version.

I think your data's not what you think it is. Did you edit this text file in windows?
POSIX-compliant AWK implementations are not required to support multi-character record separators.

Quote:
Originally Posted by IEEE Std 1003.1-2008

RS

The first character of the string value of RS shall be the input record separator; a <newline> by default. If RS contains more than one character, the results are unspecified. If RS is null, then records are separated by sequences consisting of a <newline> plus one or more blank lines, leading or trailing blank lines shall not result in empty records at the beginning or end of the input, and a <newline> shall always be a field separator, no matter what the value of FS is.
In the Linux world, you can usually count on multi-character RS being treated as a regular expression. Busybox, gawk, and mawk behave this way and that mostly covers the AWK implementations you're likely to find on a Linux system.
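As a sketch of what that looks like in practice, assuming an AWK that treats a multi-character RS as a regular expression (gawk, mawk, busybox), the sample data from this thread can be counted as follows. The file t.txt is recreated here from the posted sample, and the pipe in FS is put inside a bracket expression so that it is taken literally.

```shell
# Recreate the sample posted in this thread: three records, each ending in "~~" + newline.
printf 'f1|_f2|_f3\n|_f4~~\nf1|_f2|_f3\n|_f4~~\nf1|_f2|_f3\n|_f4~~\n' > t.txt

# RS is treated as a regex here, which is NOT required by POSIX (see the quote above).
# NR in the END block is the number of records read.
records=$(awk 'BEGIN { FS = "[|]_"; RS = "~~\n" } END { print NR }' t.txt)
echo "$records"
```

On gawk this prints 3. On an AWK that only honors the first character of RS, the count will be different.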

nawk (aka New AWK aka BWK AWK aka One True AWK), however, does not support that behavior [1]. When RS is a multi-character string, nawk only uses the first character and it is always used literally (it is never a regular expression).

nawk is quite popular outside of the Linux world. It is used by OS X, FreeBSD, NetBSD, and OpenBSD. nawk is also present on Solaris and I wouldn't be surprised if it's present on other proprietary UNIX systems such as HP-UX and AIX.
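If you are not sure which behavior your AWK has, a quick probe along these lines may help. It is only a sketch: the input string is made up, and since the result of a multi-character RS is unspecified by POSIX, an implementation could in principle report something else entirely.

```shell
# The input has one "~~" and one lone "~".
# If the whole string "~~" is the separator, the records are "a" and "b~c": NR is 2.
# If only the first character "~" is used, the records are "a", "", "b", "c": NR is 4.
nr=$(printf 'a~~b~c\n' | awk 'BEGIN { RS = "~~" } END { print NR }')

case "$nr" in
  2) echo "multi-character RS is honored" ;;
  4) echo "only the first character of RS is used" ;;
  *) echo "unexpected record count: $nr" ;;
esac
```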


Footnote:
1. Although its man page will not admit it, there is one nawk mutant loose in the wild which does treat a multicharacter RS as a regular expression. For details, see the readrec() portion of the following diff: http://cvsweb.netbsd.org/bsdweb.cgi/...?r1=1.1&r2=1.2


Quote:
Originally Posted by aksforum
okay, i confused it.. here is the text file
Code:
f1|_f2|_f3
|_f4~~
f1|_f2|_f3
|_f4~~
f1|_f2|_f3
|_f4~~

you can see that field f3 has a newline character in it, but I want ~~\n as the row delimiter, and so it should count to 3.
Code:
awk 'BEGIN{FS="|_" ; RS="~~\n"} {print NF, n++}END{print n} ' t.txt
5 0
0 1
6 2
0 3
6 4
0 5
2 6
7


Somehow awk doesn't take multiple characters as field or row delimiters. How do I do that?

thx
Your AWK implementation appears to be using RS=~. If it does not support multi-character RS (whether as a regular expression or a literal string) you cannot do it (at least not easily).
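If all you need is the record count, one workaround that does not depend on RS at all is to count the lines that end in "~~". This is a sketch under two assumptions: the terminator only ever appears at the end of a line, and the file has Unix line endings (a trailing \r would stop the pattern from matching). The file t.txt is recreated from the sample you posted.

```shell
# Recreate the posted sample: three records, each ending in "~~" + newline.
printf 'f1|_f2|_f3\n|_f4~~\nf1|_f2|_f3\n|_f4~~\nf1|_f2|_f3\n|_f4~~\n' > t.txt

# Default RS (newline) is used, so this works in any POSIX awk.
# n + 0 forces a numeric 0 when no line matches.
count=$(awk '/~~$/ { n++ } END { print n + 0 }' t.txt)
echo "$count"   # prints 3
```

This only counts records. Splitting them into fields would still need the lines of each record to be joined first.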

Further, note that even when considering the unintended RS, the field count is wrong. I suspect this is because your field separator, FS="|_", is a regular expression which yields undefined behavior. AWK implementations use the extended regular expression flavor. In that grammar, the pipe is a metacharacter whose meaning is undefined if it is the first character in the expression (among other contexts). You should backslash-escape the pipe.
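A minimal sketch of a well-defined field separator, using a single made-up input line. A bracket expression is used here because it sidesteps the question of how many backslashes survive inside an AWK string; FS = "\\|_" is the backslash form of the same thing.

```shell
# [|] matches a literal pipe, so the separator is exactly the two characters "|_".
nf=$(printf 'f1|_f2|_f3\n' | awk 'BEGIN { FS = "[|]_" } { print NF }')
echo "$nf"   # prints 3
```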

Which AWK implementation are you using?

Regards,
Alister