04-25-2008
Multiple input field Separators in awk.
I saw a couple of posts here referencing how to handle more than one input field separator in awk. I figured I would share how I (just!) figured out how to turn this line in a logfile:
90000000000000000000010001 name D0.90000000000103787900010001QF840840916070000007085814Y216254@D1111111111111111=1107xxxxxxxxxxxxxxx x919MENCHIES
into this format:
90000000000000000000010001,name,840840916070000007085814Y216654,1111111111111111,1107,919MENCHIES
I have an entire script since this is just one step in a process of turning logs into useful information, but heres the relevant portion.
#Author: kinksville
#Date: April 24, 2008
#Revised: April 24, 2008
#Revision: Revision 1.00
#Other files: cclookup.s, cclookup.rep
#Changelog:
#April 24, 2008: Initial creation of the script.
#
#End changelog.
BEGIN {
FS="[ \. QF \@D = x]+"
OFS = ","
}
#First iteration of the @D search, stripping out the . character and inserting a OFS.
/\@D/ { #Search for any line containing the string @D
report2="cclookup.rep2"; #Define report2 variable.
report="cclookup.rep"; #Define report variable.
num_cclookup++; #Get number of auth requests.
print $1, $2, $5, $6, $7, $8 > report;
print $0 > report2;
} #End of the @D search.
The key is the fact that awk will accept a regular expression as file separator. This regexp FS="[ \. QF \@D = x]+" matches spaces, the . the string QF, the string @D, the =, and the character x. The + after the trailing bracket is the key, since that allows for 1 or more instances of any of the characters matched by the regexp.
That means that x and xxxxxx are both treated as a single field separator.
I still need to work on the output, since now I need to trim the name off the end of the last field. Unfortunately the number in the last field can range anywhere from 9999999 to 1 and that is the part that I want to preserve. Maybe a [^0-9]+ expression?
10 More Discussions You Might Find Interesting
1. Shell Programming and Scripting
Hi Guys,
I'm tying to split a line similar to this:YO6-2000-30.htm: (3 properties found).......into separate columns, so effectively I need to check for a -, ., :, a tab and a space in the statement.
Any help would be appreciated
Thanks! (7 Replies)
Discussion started by: Tonka52
7 Replies
2. UNIX for Dummies Questions & Answers
How do I deal with extracting a portion of a record when multiple field separators are involved.
Let's say I have:
Mike Harrington;(555) 555-5555:250:100:175
Christian Dobbins;(555) 555-2358:155:90:201
Susan Dalsass;(555) 555-6279:250:60:50
Archie McNichol;(555) 555-1348:250:100:175
Jody... (3 Replies)
Discussion started by: doubleminus
3 Replies
3. Shell Programming and Scripting
I need to print the second field of a file, taking spaces, tab and = as field separators.
; for 16-bit app support
MAPI=1
CMC=1
CMCDLLNAME32=mapi32.dll
CMCDLLNAME=mapi.dll
MAPIX=1
MAPIXVER=1.0.0.1
OLEMessaging=1
asf=MPEGVideo
asx=MPEGVideo
ivf=MPEGVideo
m3u=MPEGVideo (2 Replies)
Discussion started by: PamPam
2 Replies
4. UNIX Desktop Questions & Answers
Hi Guys,
I have small dilemma which I could do with a little help solving . I currently have text HDD S.M.A.R.T report which I have pasted below:
smartctl 5.39 2008-10-24 22:33 (openSUSE RPM)
Copyright (C) 2002-8 by Bruce Allen, http://smartmontools.sourceforge.net
Device: COMPAQ... (2 Replies)
Discussion started by: bikerben
2 Replies
5. UNIX for Dummies Questions & Answers
I have files such as
n02-z30-dsr65-terr0.25-dc0.008-16x12drw-run1.cmd
I am wondering if it is possible to define two field separators "-" and "."
for these strings so that $7 is run1. (5 Replies)
Discussion started by: kristinu
5 Replies
6. Shell Programming and Scripting
Hello,
For the input file, I am trying to split those records which have multiple values seperated by '|' in the last input field, into multiple records and each record corresponds to the common input fields + one of the value from the last field.
I was trying with an example on this forum... (4 Replies)
Discussion started by: imtiaz99
4 Replies
7. Shell Programming and Scripting
How do I use multiple field separators in awk?
I know that if I use awk -F"", both a and b will be field separators. But what if I need two field separators that both are longer than one letter?
If I want the field separators to be "ab" and "cd", I will not be able to use awk -F"". The ... (2 Replies)
Discussion started by: locoroco
2 Replies
8. Shell Programming and Scripting
Can you please help me with this ....
Input File
share "FTPTransfer" "/v31_fs01/root/FTP-Transfer" umask=022 maxusr=4294967295 netbios=NJ09FIL530
share "Test" "/v31_fs01/root/Test" umask=022 maxusr=4294967295 netbios=NJ09FIL530
share "ENR California" "/v31_fs01/root/ENR California"... (14 Replies)
Discussion started by: greycells
14 Replies
9. Shell Programming and Scripting
There is an usual ifconfig output
vlan30 Link encap:Ethernet HWaddr
inet addr:192.168.0.1 Bcast:192.168.0.255 Mask:255.255.255.0
inet6 addr: 2407:4c00:0:1:aaff::1/64 Scope:Global
inet6 addr: fe80::224:e8ff:fe6b:cc4f/64 Scope:Link
UP BROADCAST... (1 Reply)
Discussion started by: urello
1 Replies
10. Shell Programming and Scripting
I have a large file that I need to print certain sections out of.
file.txt
/alpha/beta/delta/gamma/425/590/USC00015420.blah.lt.0.01.str:USC00015420Y2017M10BLALT.01 12 13 14 -9 1 -9 -9 -9 -9 -9 1 2 3 4 5 -9 -9
I need to print the "USC00015420" and... (5 Replies)
Discussion started by: ncwxpanther
5 Replies
JOIN(1) General Commands Manual JOIN(1)
NAME
join - relational database operator
SYNOPSIS
join [-an] [-e s] [-o list] [-tc] file1 file2
DESCRIPTION
Join forms, on the standard output, a join of the two relations specified by the lines of file1 and file2. If file1 is `-', the standard
input is used.
File1 and file2 must be sorted in increasing ASCII collating sequence on the fields on which they are to be joined, normally the first in
each line.
There is one line in the output for each pair of lines in file1 and file2 that have identical join fields. The output line normally con-
sists of the common field, then the rest of the line from file1, then the rest of the line from file2.
Fields are normally separated by blank, tab or newline. In this case, multiple separators count as one, and leading separators are dis-
carded.
These options are recognized:
-an In addition to the normal output, produce a line for each unpairable line in file n, where n is 1 or 2.
-e s Replace empty output fields by string s.
-o list
Each output line comprises the fields specified in list, each element of which has the form n.m, where n is a file number and m is a
field number.
-tc Use character c as a separator (tab character). Every appearance of c in a line is significant.
SEE ALSO
sort(1), comm(1), awk(1).
BUGS
With default field separation, the collating sequence is that of sort -b; with -t, the sequence is that of a plain sort.
The conventions of join, sort, comm, uniq, look and awk(1) are wildly incongruous.
7th Edition April 29, 1985 JOIN(1)