Awk Multiple Field Separators | Unix Linux Forums | Shell Programming and Scripting

    #1  
04-07-2004
Tonka52 (Registered User)
Awk Multiple Field Separators

Hi Guys,

I'm trying to split a line similar to this:

YO6-2000-30.htm: (3 properties found).

into separate columns, so effectively I need to split on a hyphen (-), period (.), colon (:), tab and space.

Any help would be appreciated

Thanks!
    #2  
04-07-2004
google (Forum Advisor)
Does your data look like this?
YO6-2000-30.htm:
Or like this?
YO6-2000-30.htm: (3 properties found).

In the first case, set RS = ":" to delimit each record, and then you can parse each field within the record using regexps. In the latter case, play the same game by setting RS = ")." instead.
    #3  
04-07-2004
Tonka52 (Registered User)
Google,

The line is exactly as printed.

So:

YO6-2000-30.htm: (3 properties found).

needs to become

$1   $2    $3  $4   $5
YO6  2000  30  htm  (3 properties found)

I understand that $5 could effectively become:

$5  $6          $7
(3  properties  found)

...but that's manageable!
    #4  
04-07-2004
google (Forum Advisor)
Awk reads its input and splits it into "records" at each occurrence of RS, the record separator. It then splits each record into fields at each occurrence of FS, the field separator. You can then slice and dice each field at your whim. In addition, if you don't have an easy way to split records into fields, gawk lets you define the fields by column width using the FIELDWIDTHS variable. Example:
BEGIN { FIELDWIDTHS = "9 6 10 6 7 7 35" }
This defines a fixed-width record, including the whitespace between columns: $1 is a field of 9 characters, $2 is a field of 6 characters, and so on.

This is a pretty good tutorial on Awk: GNU Awk Tutorial
    #5  
04-07-2004
Tonka52 (Registered User)
Google,

I can't rely on fixed-width fields, as the length changes. I was hoping for a command something like:

awk ' {FS = "[:, -]+" } { print $1 $2 $3 $4 $5}'

Of course, this syntax is wrong... but you know what I'm getting at, yeah?
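
The attempt above is close to working. Two details probably get in the way: in POSIX awk and gawk, an FS assigned inside a main rule only takes effect from the next record, and print $1 $2 with no commas concatenates the fields with nothing between them. A minimal sketch, assuming a POSIX awk and the sample line from the first post (the shell variable names are only illustrative):

```shell
# Sample line from the first post.
line='YO6-2000-30.htm: (3 properties found).'

# -F sets FS before the first record is read. Inside the brackets the
# "-" comes first so it is taken literally, and "+" collapses runs of
# separators such as ": " into a single split.
fields=$(printf '%s\n' "$line" |
    awk -F'[-.:\t ]+' '{ print $1, $2, $3, $4, $5, $6, $7 }')

echo "$fields"
# prints: YO6 2000 30 htm (3 properties found)
```

The commas in print insert OFS (a single space by default) between the fields. The final "." of the line also counts as a separator, so it leaves an empty $8 behind.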
    #6  
04-07-2004
google (Forum Advisor)
You can massage the data a bit by changing all of the "(" and ")" and "." to a ":" before you parse the data. Once you have that then all of your data looks the same. Set FS = ":" to define your fields, set OFS to some output delimiter you need and print your data. If you need the parens in the output, add them back in your print statement. Remember, Awk does not change the original record so you can make these changes for the purposes of your program without mucking anything up!
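
A rough sketch of that "massage first, then split" idea, using the POSIX gsub and sub functions rather than gensub. It is not a literal transcription of the advice: here the "-" and "." are the characters rewritten to ":", the parentheses are left in place, and "|" is an arbitrary output delimiter. It also assumes the text inside the parentheses contains no "-" or ".". The edits apply only to awk's in-memory copy of the record, so the input file is untouched.

```shell
line='YO6-2000-30.htm: (3 properties found).'

normalised=$(printf '%s\n' "$line" | awk '
BEGIN { FS = ":"; OFS = "|" }   # "|" is an arbitrary output delimiter
{
    gsub(/[-.]/, ":")   # every "-" and "." becomes ":"
    sub(/:$/, "")       # drop the ":" left behind by the final "."
    sub(/: +/, ":")     # drop the blank after the original colon
    $1 = $1             # rebuild $0 so that OFS is applied
    print
}')

echo "$normalised"
# prints: YO6|2000|30|htm|(3 properties found)
```

Each change to $0 makes awk re-split the record on the current FS, which is why the five fields are available by the time print runs.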


Code:
gensub(regexp, replacement, how [, target])

gensub is a general substitution function. Like sub and gsub, it searches the target string target for matches of the regular expression regexp. Unlike sub and gsub, the modified string is returned as the result of the function and the original target string is not changed. If how is a string beginning with g or G, then it replaces all matches of regexp with replacement. Otherwise, how is treated as a number that indicates which match of regexp to replace. If no target is supplied, $0 is used.

gensub provides an additional feature that is not available in sub or gsub: the ability to specify components of a regexp in the replacement text. This is done by using parentheses in the regexp to mark the components and then specifying \N in the replacement text, where N is a digit from 1 to 9. For example:
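
The excerpt stops before its example. A minimal sketch of the \N feature, applied to the line from this thread, might look like the following. It assumes gawk is installed (gensub is a gawk extension, so the sketch skips itself otherwise), and the regexp is written for this one line layout only.

```shell
line='YO6-2000-30.htm: (3 properties found).'
columns=''

# gensub exists only in gawk, so do nothing where gawk is missing.
if command -v gawk >/dev/null 2>&1; then
    # Five parenthesised groups capture the five columns; \\1 to \\5 in
    # the replacement refer back to them. The third argument, 1, means
    # "replace the first match", and the target defaults to $0.
    columns=$(printf '%s\n' "$line" | gawk '{
        print gensub(/^([^-]+)-([^-]+)-([^.]+)\.([^:]+): *(.*)\.$/, "\\1 \\2 \\3 \\4 \\5", 1)
    }')
fi

echo "$columns"
# with gawk available, prints: YO6 2000 30 htm (3 properties found)
```

Because gensub returns the new string instead of editing its target, $0 still holds the original line after the call.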


Last edited by google; 04-07-2004 at 07:24 AM.
    #7  
04-07-2004
Tonka52 (Registered User)
Yeah, thought of that but my curiosity got me wondering if there's a single expression I can use.

Thanks anyway!
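
For what it's worth, a single FS expression does seem to be enough here, because FS can be any extended regular expression, including one with alternation. A hedged sketch, assuming a POSIX awk and that the text inside the parentheses never contains "-" or "." (the "|" output delimiter is an arbitrary choice):

```shell
line='YO6-2000-30.htm: (3 properties found).'

# Split on a single "-" or ".", or on a colon followed by blanks.
# Spaces on their own are not separators, so "(3 properties found)"
# stays together as $5.
five=$(printf '%s\n' "$line" |
    awk -F'[-.]|: +' -v OFS='|' '{ print $1, $2, $3, $4, $5 }')

echo "$five"
# prints: YO6|2000|30|htm|(3 properties found)
```

The trailing "." still produces an empty $6, which this sketch simply does not print.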