Shell Programming and Scripting: Post questions about KSH, CSH, SH, BASH, PERL, PHP, SED, AWK and OTHER shell scripts here.
#1
I saw a couple of posts here referencing how to handle more than one input field separator in awk. I figured I would share how I (just!) figured out how to turn this line in a logfile:
90000000000000000000010001 name D0.90000000000103787900010001QF840840916070000007085814Y216254@D1111111111111111=1107xxxxxxxxxxxxxxx x919MENCHIES

into this format:

90000000000000000000010001,name,840840916070000007085814Y216254,1111111111111111,1107,919MENCHIES

I have an entire script, since this is just one step in a process of turning logs into useful information, but here's the relevant portion.

Code:
#Author: kinksville
#Date: April 24, 2008
#Revised: April 24, 2008
#Revision: Revision 1.00
#Other files: cclookup.s, cclookup.rep
#Changelog:
#April 24, 2008: Initial creation of the script.
#
#End changelog.
BEGIN {
FS="[ \. QF \@D = x]+"
OFS = ","
}
#First iteration of the @D search, stripping out the . character and inserting an OFS.
/\@D/ { #Search for any line containing the string @D
report2="cclookup.rep2"; #Define report2 variable.
report="cclookup.rep"; #Define report variable.
num_cclookup++; #Get number of auth requests.
print $1, $2, $5, $6, $7, $8 > report;
print $0 > report2;
} #End of the @D search.

The key is the fact that awk will accept a regular expression as the field separator. This regexp

FS="[ \. QF \@D = x]+"

matches spaces, the ., the string QF, the string @D, the =, and the character x. The + after the closing bracket is the key, since that allows for 1 or more instances of any of the characters matched by the regexp. That means that x and xxxxxx are both treated as a single field separator.

I still need to work on the output, since now I need to trim the name off the end of the last field. Unfortunately the number in the last field can range anywhere from 9999999 to 1, and that is the part that I want to preserve. Maybe a [^0-9]+ expression?
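To see which field numbers that separator produces, here is a minimal sketch that splits the sample line and prints every field with its index. The helper name show_fields is made up for illustration, and the bracket expression is written as [ .QF@D=x]+, which is the same set of characters as in the post with the redundant backslashes and repeated spaces left out.

```shell
# Sample line from the post (the thread's anonymized data).
line='90000000000000000000010001 name D0.90000000000103787900010001QF840840916070000007085814Y216254@D1111111111111111=1107xxxxxxxxxxxxxxx x919MENCHIES'

# show_fields is a hypothetical helper: split one line on runs of the
# characters space . Q F @ D = x and print each field with its number.
show_fields() {
    printf '%s\n' "$1" | awk -F '[ .QF@D=x]+' '{
        for (i = 1; i <= NF; i++) print i ": " $i
    }'
}

show_fields "$line"
```

The line splits into 8 fields. Fields 3 and 4 are the two pieces of D0.9000..., which is why the script prints $1, $2 and then skips ahead to $5 through $8.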
#2
Are you sure that your FS definition is valid for your requirement?
You don't define "@D" and "QF" as separators; the characters @, D, Q and F are each defined as separators. The valid syntax is:
Code:
FS = "([[:space:]]|\\.|QF|=|x)+";
Code:
last_field=$NF sub(/^[0-9]*/, "", last_field);
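The difference between a bracket expression and an alternation can be checked on a tiny input. This is only a sketch: count_fields is a hypothetical helper and the test string aQbFcQFd is invented for the illustration, not taken from the log.

```shell
# count_fields is a hypothetical helper: split the string in $2 using the
# regular expression in $1 as the field separator and print NF.
count_fields() {
    printf '%s\n' "$2" | awk -F "$1" '{ print NF }'
}

# Bracket expression: Q and F each act as a separator on their own.
count_fields '[QF]+' 'aQbFcQFd'

# Alternation group: only the two-character string QF is a separator.
count_fields '(QF)+' 'aQbFcQFd'
```

The first call reports 4 fields (a, b, c, d) and the second reports 2 (aQbFc and d), so the two forms only behave the same when Q and F never show up separately in the data.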
#3
I was a little confused by the fact that QF and @D were working too. I think it's because [QF]+ matches QQ, QQQ, QF, QQFF, etc.
It's not as clean as I might like, but those characters are always at that particular place in the logged message, so it does what I want it to. I'll sub in your expression and see what happens too.
#4
No such luck.
Neither of those snippets worked correctly for me. The FS syntax that you used probably changed the number of fields, so they didn't all get printed out. The second snippet just seemed to append a 1 to the last field, i.e. (,619MENCHIES1). I'll play with it some more and see what happens.
#5
Code:
#This script scans the appropriate log file and copies lines containing authorization requests to the output.
#All output is comma separated.
#Author: kinksville
#Date: April 24, 2008
#Revised: April 25, 2008
#Revision: Revision 1.01
#Other files: cclookup.s, cclookup.rep
#Changelog:
#April 24, 2008: Initial creation of the script.
#April 25, 2008: Updated the regex for the input FS to match multiple characters.
#
#End changelog.
BEGIN {
#Input field separator matches a run of any of the following characters: blank space, ., Q, F, @, D, =, x.
#The + on the outside of the brackets makes it match 1 or more of those characters in any combination.
#% Any comments with the % sign are temporarily there for testing purposes.
FS="[ \. QF \@D = x]+"
#Output field separator is defined as a comma.
OFS = ","
}
#@D search, stripping out the field separator characters and inserting an OFS.
/\@D/ { #Search for any line containing the string @D
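last_field=$8 ;
sub(/[^0-9].*/,"",last_field ); #Drop everything from the first non-digit on, keeping only the leading number.
dollar_val=last_field/100 ; #Convert the number to a dollar value.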
report="cclookup.rep"; #Define report variable.
num_cclookup++; #Get number of auth requests.
field1=$1 ;
field2=$2 ;
field3=$5 ;
field4=$6 ;
field5=$7 ;
printf ("%s,%s,%s,%s,%s,$%-.2f\n",field1,field2,field3,field4,field5,dollar_val) > report
#print $1, $2, $5, $6, $7, $8 > report; #Print fields 1-2 with the OFS between them to report.
} #End of the @D search.
Last edited by kinksville; 04-25-2008 at 02:13 PM. Reason: Removed full name from the comments.
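For the last field, a minimal sketch of pulling out the leading number and formatting it as dollars is below. It assumes the field always starts with digits and that the number is an amount in cents, which is what the division by 100 in the script suggests; to_dollars and coerce_only are hypothetical helper names.

```shell
# to_dollars is a hypothetical helper. Assumption: the field starts with an
# amount in cents, followed by a name that begins with a non-digit.
to_dollars() {
    printf '%s\n' "$1" | awk '{
        amount = $0
        sub(/[^0-9].*/, "", amount)   # drop everything from the first non-digit on
        printf "%s -> $%.2f\n", amount, amount / 100
    }'
}

# coerce_only shows that awk arithmetic on a string uses only its leading
# numeric prefix, so the division works even without any trimming.
coerce_only() {
    awk -v field="$1" 'BEGIN { printf "%.2f\n", field / 100 }'
}

to_dollars '919MENCHIES'
to_dollars '1STORE'
coerce_only '919MENCHIES'
```

One thing to watch: a pattern such as [^0-9]* can match the empty string at the very start of a field that begins with a digit, so sub() with that pattern leaves the field unchanged. The dollar value still comes out right in that case only because of the numeric conversion shown in coerce_only.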