Performance issue to read line by line


 
# 8  
Old 05-05-2016
Would this come close to what you need?
Code:
awk '
                # The record type is the first four characters of field 1
                {RT = substr ($1, 1, 4)}

# Each rule below updates the counters for one record type
RT == 1111      {a1 = 100; a2 = a3 = a4 = a5 = a6 = a7 = a8 = a9 = 0}
RT == 1112      {a2++   ; a3 += 2}
RT == 1113      {a5 += 3; a7++}
RT == 1114      {a4 += 3; a6 += 4}
RT == 1115      {a7++   ; a9 += 3}
RT == 1116      {a5++   ; a8 ++}
RT == 2221      {a6 = a7 = a8 = a9 = 0}
RT == 2222      {a3++   ; a7 += 3}
RT == 3333      {a8++   ; a9 += 5}
RT == 5555      {a1++   ; a2 += 3; a3++; a4++}

                # Print the line number, the counters and the original line
                {print NR  a1 a2 a3 a4 a5 a6 a7 a8 a9 $0}
' /tmp/test2.txt

# 9  
Old 05-05-2016
Quote:
Originally Posted by RudiC
Would this come close to what you need?

That is close to what I came up with, but you'll need to use printf with a format string producing fixed-width, leading-zero-filled formats for the NR and a1 through a9 fields instead of just using print. (awk doesn't have the ksh typeset flags to set output formats for values assigned to variables.)
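To make the difference between print and printf concrete, here is a minimal sketch. The values and the field widths (5 and 4) are made up for illustration and are not the widths from post #1; the function name print_vs_printf is also just a placeholder.

```shell
# Illustration only: hypothetical values and widths, not those from post #1.
print_vs_printf() {
    awk 'BEGIN {
        n = 7; a1 = 100
        print n a1                    # plain concatenation: 7100
        printf "%05d%04d\n", n, a1    # fixed width, zero filled: 000070100
    }'
}
print_vs_printf
```

With print the width of each value depends on its number of digits, so the columns shift as the counters grow; with printf every field keeps its column position.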
# 10  
Old 05-06-2016
Thanks! As you can easily see, I'm not a ksher as I didn't have a clue what the typeset -Z could possibly mean...
Still I was wondering if the increasing field size for e.g. NR was actually desired...

Well, use
Code:
                {printf "%010d%010d%07d%03d%06d%02d%07d%09d%02d%04d%s\n", NR, a1, a2, a3, a4, a5, a6, a7, a8, a9, $0}

This assumes that the last of the two entries for a8 in post #1 is the one that determines the field size.
# 11  
Old 05-06-2016
Hi RudiC,
Yes, the 2nd typeset for a8 overrides the 1st typeset for a8. I assume that the field widths were chosen such that there could never be a field overflow. (If there ever is an overflow, the string of decimal digits in the resulting output can't be deciphered since there are no field separators in the output; all of the fields are defined by the column positions they occupy in a line.)
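A small sketch of what such an overflow looks like. The two-field layout (widths 2 and 3) and the values are invented for the demonstration, and overflow_demo is a hypothetical name; the point is only that %0Nd sets a minimum width, not a maximum.

```shell
# Illustration only: a made-up two-field layout with widths 2 and 3.
overflow_demo() {
    awk 'BEGIN {
        printf "%02d%03d\n", 7, 42      # 07042: field 1 in columns 1-2, field 2 in 3-5
        printf "%02d%03d\n", 123, 42    # 123042: field 1 grew to 3 digits, field 2 shifted right
    }'
}
overflow_demo
```

A reader that cuts the second line by column position would see 12 and 304 instead of 123 and 42, and nothing in the output flags the problem.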

The way I did it was similar to the way you did it, but using an if-else chain instead of separate condition-action statements. It takes more space, but runs slightly faster: the patterns used are all mutually exclusive, so the remaining tests can be skipped once a match is found, whereas separate condition-action statements are all evaluated for every input line.

Hoping that the OP won't subvert this into a 1-liner, here is the way I did it:
Code:
#!/bin/ksh
awk '
{	rec_type = substr($0, 1, 4)
	if(rec_type == 1111) {
		a1 = 100
		a2 = a3 = a4 = a5 = a6 = a7 = a8 = a9 = 0
	} else if(rec_type == 1112) {
		a2++
		a3 += 2
	} else if(rec_type == 1113) {
		a7++
		a5 += 3
	} else if(rec_type == 1114) {
		a4 += 3
		a6 += 4
	} else if(rec_type == 1115) {
		a7++
		a9 += 3
	} else if(rec_type == 1116) {
		a8++
		a5++
	} else if(rec_type == 2221) {
		a6 = a7 = a8 = a9 = 0
	} else if(rec_type == 2222) {
		a3++
		a7 += 3
	} else if(rec_type == 3333) {
		a8++
		a9 += 5
	} else if(rec_type == 5555) {
		a1++
		a2 += 3
		a3++
		a4++
	}
	printf("%010d%010d%07d%03d%06d%02d%07d%09d%02d%04d%s\n",
		NR, a1, a2, a3, a4, a5, a6, a7, a8, a9, $0)
}' test2.txt > test1_all_data.log

# 12  
Old 05-11-2016
Quote:
Originally Posted by Scrutinizer
try:
Code:
a2=$(( a2 + 1 )); a3=$(( a3 + 2))

Just for the record: it is possible (in ksh) to create an "integer environment" using double parentheses. Your line

Code:
a2=$(( a2 + 1 ))

could also be written this (C-like) way:

Code:
(( a2 += 1 ))

This, of course, changes nothing about the correctness of your observations.
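A short sketch showing both forms side by side. The starting value 5 is arbitrary, and the snippet assumes a shell that supports the (( )) arithmetic command, such as ksh93 or bash; a plain POSIX sh may not.

```shell
# Illustration only: the starting value is arbitrary.
a2=5
a2=$(( a2 + 1 ))    # arithmetic expansion, result assigned back to a2
(( a2 += 1 ))       # arithmetic command, C-like compound assignment
echo "$a2"
```

One difference to keep in mind: (( ... )) is a command, and its exit status is non-zero when the expression evaluates to 0, which matters if the script runs under set -e or tests the status.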

I hope this helps.

bakunin