awk cut column based on string

02-14-2012

Registered User

2,157, 51

Join Date: Feb 2007

Last Activity: 6 September 2017, 5:43 AM EDT

Location: Innsbruck, Austria

Posts: 2,157

Thanks Given: 12

Thanked 51 Times in 48 Posts

The standard awk is fairly weak. If you don't have access to GNU awk, install it. All the above solutions rely on GNU awk or nawk or at least Sun's xpg awk (which is an old version of nawk).

Code:

# GNU awk only: IGNORECASE and the third (array) argument to match() are gawk extensions
awk -v IGNORECASE=1 'match($0, /-Tag:([^[:space:]]*)/, found) { print found[1] }'

With nawk you might do something similar, but using sub(), because nawk's match() does not accept the third (array) argument that GNU awk uses to capture the parenthesized part of the match.
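A minimal sketch of a portable route, assuming only POSIX awk features: match() setting RSTART and RLENGTH, plus tolower() and substr(). It uses match() rather than sub(), and the sample line and the `out` variable are made up for illustration:

```shell
# Hypothetical sample line; in practice awk would read yourfile instead.
line='-Tag:messages -P:/var/log/messages -K:Error'

out=$(printf '%s\n' "$line" | awk '{
    lower = tolower($0)                  # case-insensitive match without IGNORECASE
    if (match(lower, /-tag:[^ \t]*/))    # sets RSTART and RLENGTH on success
        print substr($0, RSTART + 5, RLENGTH - 5)   # skip the 5 characters of "-tag:"
}')
printf '%s\n' "$out"
```

Matching against a lowercased copy while cutting from the original line keeps the case of the value intact.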

otheus

02-14-2012

Registered User

2,977, 644

Join Date: Oct 2010

Last Activity: 14 September 2019, 1:15 PM EDT

Location: France

Posts: 2,977

Thanks Given: 88

Thanked 644 Times in 613 Posts

Code:

awk -F"[ :-]" 'tolower($2)~/tag/{print "-"$2":"$3}' yourfile

Code:

awk '{split($1,a,":")}tolower(a[1])~/-tag/{print $1}' yourfile

Code:

awk '{NF=1;split($1,a,":")}tolower(a[1])~/-tag/' yourfile

Code:

$ cat tst
-tag:messages -P:/var/log/messages -P:/var/log/maillog -K:Error -K:Warning -K:critical
-TAG:messages -P:/var/log/messages -P:/var/log/maillog -K:Error -K:Warning -K:critical
-Tag:messages -P:/var/log/messages -P:/var/log/maillog -K:Error -K:Warning -K:critical
-tAG:messages -P:/var/log/messages -P:/var/log/maillog -K:Error -K:Warning -K:critical
$ awk -F"[ :-]" 'tolower($2)~/tag/{print "-"$2":"$3}' tst
-tag:messages
-TAG:messages
-Tag:messages
-tAG:messages
$ awk '{split($1,a,":")}tolower(a[1])~/-tag/{print $1}' tst
-tag:messages
-TAG:messages
-Tag:messages
-tAG:messages
$ awk '{NF=1;split($1,a,":")}tolower(a[1])~/-tag/' tst
-tag:messages
-TAG:messages
-Tag:messages
-tAG:messages

Last edited by ctsgnb; 02-14-2012 at 02:42 PM..

ctsgnb

02-14-2012

Moderator

12,296, 3,792

Join Date: Nov 2008

Last Activity: 1 January 2021, 1:47 AM EST

Location: Amsterdam

Posts: 12,296

Thanks Given: 679

Thanked 3,792 Times in 3,282 Posts

Quote:

Originally Posted by otheus

[..]All the above solutions rely on GNU awk or nawk or at least Sun's xpg awk (which is an old version of nawk).

Are you certain about that, otheus? I was under the impression that /usr/xpg4/bin/awk was introduced to Solaris later and comes closer to the POSIX standard than nawk on Solaris does. nawk stands for "new awk", but that is only relative to the ancient original awk...

Scrutinizer

02-14-2012

Not 100% sure, but I know Kernighan was maintaining nawk at least through 2007, and the OpenBSD project has been maintaining it since. As for Solaris, well, I think they brought awk over from System V back in the 90s, or maybe even before then with SunOS 4.x.

---------- Post updated at 03:54 PM ---------- Previous update was at 03:16 PM ----------

Quote:

Originally Posted by Scrutinizer

Follow-up:
From the FIXES file in awk.zip downloaded from Kernighan's web page:

Code:

Jun 1, 2003:
	subtle change to split: if source is empty, number of elems
	is always 0 and the array is not set.

From Solaris 10 (2005) xpg-awk:

Code:

$ /usr/xpg4/bin/awk 'BEGIN { print split(null,out,FS) }' </dev/null
0

So it would seem Solaris DID keep nawk up-to-date w.r.t Kernighan's version.

Then again....

Code:

Jan 1, 2002:
	length(arrayname) returns number of elements; thanks to 
	arnold robbins for suggestion

And on Sun's implementation:

Code:

$ /usr/xpg4/bin/awk 'BEGIN { split("test",out,/es/); print out[1]; print length(out)}' </dev/null
t
0

Last edited by otheus; 02-14-2012 at 11:01 AM.. Reason: oops! LOL copied wrong output.

otheus

02-14-2012

@otheus, interesting. I think, though, that you should be comparing these against Solaris nawk, not /usr/xpg4/bin/awk, which should not be following Kernighan's changes but rather striving to be POSIX compliant, no? What is the output of the same commands with nawk?
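One way to run that kind of comparison is to feed the same one-liner to every awk found on the machine. A rough sketch: the list of interpreter names is an assumption, any that are not installed are skipped, and the `results` variable is just for illustration:

```shell
results=$(
    for a in awk nawk /usr/xpg4/bin/awk gawk mawk; do
        # Skip interpreters that are not present on this system.
        command -v "$a" >/dev/null 2>&1 || continue
        printf '%s: ' "$a"
        # Same probe as the split() check: element count for an empty source.
        "$a" 'BEGIN { print split("", out, FS) }' </dev/null
    done
)
printf '%s\n' "$results"
```

Swapping in the length(out) probe works the same way, though an interpreter that rejects it will print its error on stderr instead of a value.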

Scrutinizer

02-14-2012

First, I think you should split this thread off, into the Underground forum for instance, and link to it.

Second, Kernighan *is* the author of nawk. What Solaris did to what they call nawk is anyone's guess.

Third, Solaris lists the nawk man page and xpg4/awk man page as the same entity (yet oddly, the files differ vastly in size).

Fourth, nawk explicitly errors with length(arrayname):

Code:

$ nawk 'BEGIN { split("test",out,/es/); print out[1]; print length(out)}' </dev/null
t
nawk: can't read value of out; it's an array name.
 source line number 1

otheus

02-14-2012

I think you are right, let's do that if you think it is interesting (I do), but what shall we call the thread? "/usr/xpg4/bin/awk vs. nawk on Solaris"? I thought in post #8 you meant that on Solaris /usr/xpg4/bin/awk is an old version of nawk, i.e. of the nawk currently shipped on Solaris. My point was, and is, that nawk on Solaris is not as compliant as /usr/xpg4/bin/awk, and therefore the latter is preferable to nawk on Solaris.

But on rereading, you seem to be referring to a recent version of nawk on different systems. On many other systems, though, nawk either does not exist or is a link to gawk or mawk, and on yet others awk is nawk (or bwk).

Yes, Kernighan is the author of nawk, but length() operating on an array is an added feature and is not part of the POSIX specification (and unnecessary).
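For interpreters where length(arrayname) is unavailable, the element count can be had without it. A small sketch using only the return value of split() and a for-in loop; the variable names are arbitrary:

```shell
out=$(awk 'BEGIN {
    n = split("test", parts, /es/)    # split() returns the number of elements
    count = 0
    for (k in parts) count++          # portable count that works for any array
    print parts[1], n, count
}' </dev/null)
printf '%s\n' "$out"
```

Both counts agree here because "test" split on "es" yields the two elements "t" and "t".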

Scrutinizer
