Parsing with multiple delimiters


 
# 1  
Old 03-19-2002
Parsing with multiple delimiters

I have data that looks like this

aaa!bbb!ccc/ddd/eee

It is not in a fixed format. I need to parse ddd into a variable in order to decide whether I want to process that row. If I do, I need to put ccc and bbb into variables to process it. I need to do this inside a while loop, one record at a time. Any suggestions? Thanks.
# 2  
Old 03-19-2002
This can be done fairly easily in Perl:

Code:
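use strict;
use warnings;

# Open the input file, or stop with the system error message.
open(my $input, '<', 'inputfile.txt') or die "Cannot open inputfile.txt: $!";

while (my $inputLine = <$input>) {
  # Capture five fields: three separated by "!", then two more after "/".
  if ($inputLine =~ /(\w+)!(\w+)!(\w+)\/(\w+)\/(\w+)/) {
    print "\taaa is $1\n";
    print "\tbbb is $2\n";
    print "\tccc is $3\n";
    print "\tddd is $4\n";
    print "\teee is $5\n";
  }
}

close($input);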

In the above example the first portion "aaa" is $1, "bbb" is $2, et cetera. You can perform checks against $4 ("ddd") to determine if you wish to process the rest of the line.

Hope that helps!
# 3  
Old 03-19-2002
I can't seem to get that to work

I tried to execute that code but it doesn't seem to work. Here's what I've tried so far and the actual data:

$stri="369!pcust-ap03!f0gaudf5/Deal30/test/ma26_3_1arms_showout.xls!232448!Mar_4_13:04";


if ($stri =~ /(\w+)!(\w+)!(\w+)\/(\w+)\/(\S+)/) {
print (STDERR "\taaa is $1\n");
print "\tbbb is $2\n";
print "\tccc is $3\n";
print "\tddd is $4\n";
print "\teee is $5\n";
};

The three lines of data are really one long string. Thanks, Gill
# 4  
Old 03-19-2002
RegExp are for job security!

This is by no means pretty, but I can't remember the regexp metachar right at this moment and I am about to go study for my Linear Algebra test this evening... But this seems to work:

Code:
#!/usr/bin/perl

#       1 !    2     !    3   /   4  / 5  /            6           !  7   !      8
$stri="369!pcust-ap03!f0gaudf5/Deal30/test/ma26_3_1arms_showout.xls!232448!Mar_4_13:04";

if ($stri =~ /(\w+)!([a-zA-Z0-9\-.]+)!(\w+)\/(\w+)\/(\w+)\/([a-zA-Z0-9\-._]+)!(\w+)!([a-zA-Z0-9\-._:]+)/) {
#                     ^^^^^^^^^^^^^ Yeah... It's pretty cheezy!
#                                   but I can't remember the proper metacharacter
#                                   right at this second...

  print "\$1 is $1\n";
  print "\$2 is $2\n";
  print "\$3 is $3\n";
  print "\$4 is $4\n";
  print "\$5 is $5\n";
  print "\$6 is $6\n";
  print "\$7 is $7\n";
  print "\$8 is $8\n";
};

Example output:

Code:
OpenBSD:/home/joeuser/sample $ ./sample2.pl
$1 is 369
$2 is pcust-ap03
$3 is f0gaudf5
$4 is Deal30
$5 is test
$6 is ma26_3_1arms_showout.xls
$7 is 232448
$8 is Mar_4_13:04
OpenBSD:/home/joeuser/sample $

The proper regexp metachar will make this code much easier to read and maintain...
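
The post does not say which metacharacter was meant. One common way to shorten patterns like this is a negated bracket expression such as [^!/], which matches any character that is not one of the delimiters, so the field contents no longer have to be listed. A minimal sketch of that idea, run through sed so it can be tried from the shell (the same bracket expression is valid inside a Perl regex); the variable names are only illustrative:

```shell
# Sketch only: sample record taken from the post above.
stri='369!pcust-ap03!f0gaudf5/Deal30/test/ma26_3_1arms_showout.xls!232448!Mar_4_13:04'

# [^!/]* means "any run of characters that is neither ! nor /",
# so each group stops at the next delimiter, whatever the field contains.
pattern='^\([^!/]*\)!\([^!/]*\)!\([^!/]*\)/\([^!/]*\)/.*$'

# Replace the whole record with one captured group at a time.
word2=$(printf '%s\n' "$stri" | sed "s|$pattern|\2|")
word3=$(printf '%s\n' "$stri" | sed "s|$pattern|\3|")
word4=$(printf '%s\n' "$stri" | sed "s|$pattern|\4|")

echo "bbb=$word2 ccc=$word3 ddd=$word4"
```

Because the groups are defined by what they exclude, hyphens, dots and underscores inside a field need no special handling.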
# 5  
Old 03-19-2002
Thanks

Beauty, works like a champ.
# 6  
Old 03-19-2002
Here is an alternate solution, not using Perl:

Code:
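# Read one record per line; anything after the first blank goes to junk.
while read -r string junk
do
  # Split on "/": words123 gets "aaa!bbb!ccc", word4 gets "ddd".
  # A here-document is used because piping into read only sets the
  # variables in shells (such as ksh) that run the last pipeline stage in-process.
  IFS=/ read -r words123 word4 rem <<EOF
$string
EOF

  if [ "$word4" = Deal30 ] ; then
     word2=`expr "$words123" : '.*!\(.*\)!'`
     word3=`expr "$words123" : '.*!.*!\(.*\)'`
  fi
done < gillbates.txt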

Jimbo
# 7  
Old 03-20-2002
simple solution

I know this won't please the hard core among you, but a co-worker of mine came up with a simple solution: use sed to change one of the delimiters to the other, then cut out the fields. Done!
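
The post gives no code, so the following is only a sketch of how that approach might look, assuming "/" is rewritten to "!" and the sample record from earlier in the thread is used. The variable names are illustrative, not the co-worker's:

```shell
# Sketch only: sample record taken from earlier in this thread.
record='369!pcust-ap03!f0gaudf5/Deal30/test/ma26_3_1arms_showout.xls!232448!Mar_4_13:04'

# Turn every "/" into "!" so the record has a single delimiter.
flat=$(printf '%s\n' "$record" | sed 's|/|!|g')

# Fields are now: 1=369 2=pcust-ap03 3=f0gaudf5 4=Deal30 5=test ...
word4=$(printf '%s\n' "$flat" | cut -d'!' -f4)

# Only pull out the other fields when the fourth one selects the row.
if [ "$word4" = "Deal30" ]; then
    word2=$(printf '%s\n' "$flat" | cut -d'!' -f2)
    word3=$(printf '%s\n' "$flat" | cut -d'!' -f3)
    echo "bbb=$word2 ccc=$word3 ddd=$word4"
fi
```

Inside a while read loop, the same three steps would run once per record. The approach relies on neither delimiter appearing inside a field before the one being cut.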
 