Operating Systems AIX Print text between two delimiters Post 302317126 by funksen on Monday 18th of May 2009 06:13:52 AM
Quote:
Originally Posted by shockneck
I found your AWK line's time result impressive. The time returned is so much shorter than what I got when using AWK during my tests. To put this difference into figures, I reproduced your examples on a 64-bit AIX TL6 system. As my list has approx. 1.2 million lines, the runtimes are not easily comparable to your test, but it may be interesting to see how the results relate to each other (on the AIX system).
Code:
> time sed -e 's/.*_(\(.*[YH]\).*/\1/g' cutlist > /dev/null

real    0m28.00s
user    0m27.26s
sys     0m0.70s

> time cut -f2 -d"(" cutlist | cut -f1 -d")" > /dev/null

real    0m4.81s
user    0m6.00s
sys     0m0.28s

> time awk -F '[()]' '{print $2}' cutlist > /dev/null

real    0m14.22s
user    0m13.95s
sys     0m0.19s

> time awk -F"(" '{gsub(/\).*/,"",$2);print $2}' cutlist > /dev/null

real    0m13.14s
user    0m12.89s
sys     0m0.19s

The price for using AIX AWK and/or SED is definitely higher than using the Linux equivalents. Surprisingly, AIX cut seems to be faster than Linux cut, though.

unfortunately, gawk is faster on Linux x86 systems but not on PPC AIX. I ran the tests on a p550 with AIX 6.1; the LPAR has 1 virtual CPU, uncapped:

the script:
Code:
#!/usr/bin/ksh
set -x
time sed -e 's/.*_(\(.*[YH]\).*/\1/g' file >/dev/null
time cut -f2 -d"(" file | cut -f1 -d")" > /dev/null
time awk -F '[()]' '{print $2}' file  > /dev/null
time gawk -F '[()]' '{print $2}' file  > /dev/null
time awk -F"(" '{gsub(/\).*/,"",$2);print $2}' file > /dev/null
time gawk -F"(" '{gsub(/\).*/,"",$2);print $2}' file > /dev/null
time perl -pe 's/.*_\((.*[YH]).*/\1/g' file >/dev/null
time sed -e 's/).*//' -e 's/^.*(//' file >/dev/null

output:

Code:
-->./runscript              
+ sed -e s/.*_(\(.*[YH]\).*/\1/g file       
+ 1> /dev/null                              

real    0m8.07s
user    0m7.33s
sys     0m0.17s
+ cut -f1 -d)
+ cut -f2 -d( file
+ 1> /dev/null

real    0m1.78s
user    0m0.46s
sys     0m0.02s
+ awk -F [()] {print $2} file
+ 1> /dev/null

real    0m4.31s
user    0m5.46s
sys     0m0.09s
+ gawk -F [()] {print $2} file
+ 1> /dev/null

real    0m6.10s
user    0m5.23s
sys     0m0.41s
+ awk -F( {gsub(/\).*/,"",$2);print $2} file
+ 1> /dev/null

real    0m3.84s
user    0m3.47s
sys     0m0.04s
+ gawk -F( {gsub(/\).*/,"",$2);print $2} file
+ 1> /dev/null

real    0m7.33s
user    0m6.13s
sys     0m0.40s
+ perl -pe s/.*_\((.*[YH]).*/\1/g file
+ 1> /dev/null

real    0m10.34s
user    0m9.27s
sys     0m0.06s
+ sed -e s/).*// -e s/^.*(// file
+ 1> /dev/null

real    0m4.66s
user    0m3.98s
sys     0m0.18s

the same test on a high-performance SAP LPAR, idle at that time, 3 virtual CPUs, uncapped, POWER6 4.7 GHz, AIX 5.3 ML09

Code:
./runscript 
+ sed -e s/.*_(\(.*[YH]\).*/\1/g file
+ 1> /dev/null                       

real    0m4.77s
user    0m4.61s
sys     0m0.09s
+ cut -f1 -d)
+ cut -f2 -d( file
+ 1> /dev/null

real    0m1.66s
user    0m2.09s
sys     0m0.06s
+ awk -F [()] {print $2} file
+ 1> /dev/null

real    0m2.51s
user    0m2.45s
sys     0m0.03s
+ gawk -F [()] {print $2} file
+ 1> /dev/null

real    0m3.79s
user    0m3.46s
sys     0m0.31s
+ awk -F( {gsub(/\).*/,"",$2);print $2} file
+ 1> /dev/null

real    0m2.38s
user    0m2.33s
sys     0m0.03s
+ gawk -F( {gsub(/\).*/,"",$2);print $2} file
+ 1> /dev/null

real    0m4.41s
user    0m3.97s
sys     0m0.33s
+ perl -pe s/.*_\((.*[YH]).*/\1/g file
+ 1> /dev/null

real    0m4.49s
user    0m4.41s
sys     0m0.04s
+ sed -e s/).*// -e s/^.*(// file
+ 1> /dev/null

real    0m2.50s
user    0m2.39s
sys     0m0.09s


your cut command was very fast on all systems, as we expected
gawk was slow compared to awk; I think the reason is that gawk was compiled with gcc and not with xlc, if that's possible

I was told a long time ago that a gcc-compiled binary can't use multiple cores; I don't know if that's still the case
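
Timings like these only compare fairly if every command extracts the same text. A minimal sketch of such a check follows. The two sample lines are hypothetical, since the real input file is not shown in this thread, and gawk and perl are left out so the check runs with only the standard tools:

```shell
#!/bin/sh
# Hypothetical sample lines; the real test file from the thread is not shown.
sample=$(mktemp)
printf '%s\n' 'abc_(FOO_Y) tail' 'def_(BAR_H) more text' > "$sample"

# Same extraction commands as in the timing script above.
out_sed=$(sed -e 's/.*_(\(.*[YH]\).*/\1/g' "$sample")
out_cut=$(cut -f2 -d"(" "$sample" | cut -f1 -d")")
out_awk=$(awk -F '[()]' '{print $2}' "$sample")
out_sed2=$(sed -e 's/).*//' -e 's/^.*(//' "$sample")

# Compare every result against the first sed variant.
same=yes
for out in "$out_cut" "$out_awk" "$out_sed2"; do
    [ "$out" = "$out_sed" ] || same=no
done
echo "outputs identical: $same"
rm -f "$sample"
```

Note that the first sed pattern is greedy up to the last Y or H on the line, so on other data it may not agree with the cut and awk variants, which stop at the first closing parenthesis.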

Edit:

here is the mpstat output

Code:
mpstat 10 1 > log &
awk -F"(" '{gsub(/\).*/,"",$2);print $2}' file > /dev/null

cat log

System configuration: lcpu=6 ent=0.3 mode=Uncapped

cpu  min  maj  mpc  int   cs  ics   rq  mig lpa sysc us sy wa id   pc  %ec  lcs
  0  753    0    0  801  509  223    0    3 100 4668 91  8  0  1 0.27 88.7  564
  1    0    0    0   21    0    0    0    0   -    0  0 12  0 88 0.00  0.3   21
  6  415    0    0   88    3    2    0    4 100  989 44 46  0 10 0.01  2.0   93
  7    0    0    0   20    0    0    0    0   -    0  0 31  0 69 0.00  0.1   20
 10    0    0    0   20    0    0    0    0   -    0  0 36  0 64 0.00  0.1   20
 11    0    0    0   20    0    0    0    0   -    0  0 36  0 64 0.00  0.1   20
  U    -    -    -    -    -    -    -    -   -    -  -  -  0  9 0.03  8.6    -
ALL 1168    0    0  970  512  225    0    7 100 5657 82  8  0 10 0.27 91.4  738


same command with gawk:

Code:
cat log

System configuration: lcpu=6 ent=0.3 mode=Uncapped

cpu  min  maj  mpc  int   cs  ics   rq  mig lpa   sysc us sy wa id   pc   %ec  lcs
  0 2620    0    0  669  501  221    0   23 100  65960 79 20  0  1 0.18  36.1  469
  1 1035    0    0   76  117   50    0   51 100   9124 69 28  0  3 0.03   5.5  112
  6    0    0    0  244    0    0    0    1 100 136396 92  8  0  0 0.29  58.4  248
  7   24    0    0   19    0    0    0    4 100     44  8 26  0 66 0.00   0.2   21
 10    0    0    0   19    0    0    0    0   -      0  0 36  0 64 0.00   0.1   19
 11    0    0    0   19    0    0    0    0   -      0  0 36  0 64 0.00   0.1   19
ALL 3679    0    0 1046  618  271    0   79 100 211524 86 13  0  1 0.50 166.3  888

gawk produces a lot more system calls on AIX: the ALL row shows sysc=211524 for gawk against sysc=5657 for awk over the same 10-second sample
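
As a rough sketch of how to put a number on that difference, the sysc column can be pulled out of the ALL row of each log. The two rows below are copied from the logs in this post. Treating sysc as the 11th whitespace-separated field is an assumption that holds for this column layout only:

```shell
#!/bin/sh
# ALL rows copied from the two mpstat logs in this post.
awk_all='ALL 1168    0    0  970  512  225    0    7 100 5657 82  8  0 10 0.27 91.4  738'
gawk_all='ALL 3679    0    0 1046  618  271    0   79 100 211524 86 13  0  1 0.50 166.3  888'

# Assumption: sysc is field 11 in this mpstat layout.
awk_sysc=$(printf '%s\n' "$awk_all" | awk '$1 == "ALL" {print $11}')
gawk_sysc=$(printf '%s\n' "$gawk_all" | awk '$1 == "ALL" {print $11}')

# Ratio of gawk system calls to awk system calls, one decimal place.
ratio=$(awk -v a="$awk_sysc" -v g="$gawk_sysc" 'BEGIN {printf "%.1f", g / a}')
echo "awk sysc=$awk_sysc gawk sysc=$gawk_sysc ratio=$ratio"
```

The sample covers the whole LPAR for 10 seconds, so other activity on the system is counted as well; the ratio is only an indication.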

Last edited by funksen; 05-18-2009 at 07:38 AM..
 
