Need help parsing data with sed and/or nawk


 
# 1  
Old 03-05-2012

Good day all.

I have a file in which each record is spread over several lines, one parameter per line. I need each record written on a single line, with several of the parameters in a different order.

Current format:
Code:
Treatment
      ,parmeter1=value
      ,parmeter2=value
      ,parmeter3=value
      ,parmeter4=value
      ,parmeter5=value
      ,parmeter6=value
      ,parmeter7=value
      ,parmeter8=value
      ,parmeter9=value
      ,parmeter10=value
      ,parmeter11=value
      ,parmeter12=value
      ,parmeter13=value
      ,parmeter14=value
      ,parmeter15=value
;;
Treatment
      ,parmeter1=value
      ,parmeter2=value
      ,parmeter3=value
      ,parmeter4=value
      ,parmeter5=value
      ,parmeter6=value
      ,parmeter7=value
      ,parmeter8=value
      ,parmeter9=value
      ,parmeter10=value
      ,parmeter11=value
      ,parmeter12=value
      ,parmeter13=value
      ,parmeter14=value
      ,parmeter15=value
;;

The above should look like this:
Code:
Treatment ,parmeter1=value,parmeter2=value,parmeter5=value,parmeter3=value,parmeter4=value,parmeter6=value,parmeter7=value,parmeter10=value,parmeter9=value,parmeter11=value,,parmeter12=value,parmeter15=value,parmeter14=value,parmeter13=value;;
Treatment ,parmeter1=value,parmeter2=value,parmeter5=value,parmeter3=value,parmeter4=value,parmeter6=value,parmeter7=value,parmeter10=value,parmeter9=value,parmeter11=value,,parmeter12=value,parmeter15=value,parmeter14=value,parmeter13=value;;

Thanks in advance,

Bjoern
# 2  
Old 03-05-2012
Code:
$ cat treatment.awk

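BEGIN { FS="," }

# First line of a record: start the output line with the title.
!T { T=$1" "; next }
# ",name=value" line: append the text after the comma.
T && /^[ \t]*,/ { T=T "," $2 }
# End of record: print the joined line and reset.
/;;/ { print T; T="" }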

$ awk -f treatment.awk data

Treatment ,parmeter1=value,parmeter2=value,parmeter3=value,parmeter4=value,parmeter5=value,parmeter6=value,parmeter7=value,parmeter8=value,parmeter9=value,parmeter10=value,parmeter11=value,parmeter12=value,parmeter13=value,parmeter14=value,parmeter15=value
Treatment ,parmeter1=value,parmeter2=value,parmeter3=value,parmeter4=value,parmeter5=value,parmeter6=value,parmeter7=value,parmeter8=value,parmeter9=value,parmeter10=value,parmeter11=value,parmeter12=value,parmeter13=value,parmeter14=value,parmeter15=value

$

# 3  
Old 03-05-2012
What about something like this?

Code:
$ cat sample1.txt
Treatment
      ,parmeter1=value
      ,parmeter2=value
      ,parmeter3=value
      ,parmeter4=value
      ,parmeter5=value
      ,parmeter6=value
      ,parmeter7=value
      ,parmeter8=value
      ,parmeter9=value
      ,parmeter10=value
      ,parmeter11=value
      ,parmeter12=value
      ,parmeter13=value
      ,parmeter14=value
      ,parmeter15=value
;;
Treatment
      ,parmeter1=value
      ,parmeter2=value
      ,parmeter3=value
      ,parmeter4=value
      ,parmeter5=value
      ,parmeter6=value
      ,parmeter7=value
      ,parmeter8=value
      ,parmeter9=value
      ,parmeter10=value
      ,parmeter11=value
      ,parmeter12=value
      ,parmeter13=value
      ,parmeter14=value
      ,parmeter15=value
;;

$ sed 's/;;/;;~/g' sample1.txt | tr -d "\n" | tr "~" "\n" | tr -s " "
Treatment ,parmeter1=value ,parmeter2=value ,parmeter3=value ,parmeter4=value ,parmeter5=value ,parmeter6=value ,parmeter7=value ,parmeter8=value ,parmeter9=value ,parmeter10=value ,parmeter11=value ,parmeter12=value ,parmeter13=value ,parmeter14=value ,parmeter15=value;;
Treatment ,parmeter1=value ,parmeter2=value ,parmeter3=value ,parmeter4=value ,parmeter5=value ,parmeter6=value ,parmeter7=value ,parmeter8=value ,parmeter9=value ,parmeter10=value ,parmeter11=value ,parmeter12=value ,parmeter13=value ,parmeter14=value ,parmeter15=value;;

I append a ~ character after each ;; so that, once all the newlines are removed, I can turn each ~ back into a newline. I also squeeze runs of spaces down to a single space. This assumes ~ never occurs in the data.
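The same joining can probably be done in a single awk pass, without a sentinel character, so nothing depends on ~ being absent from the data. This is only a sketch: the function name `join_records` is made up for illustration, and it assumes every record ends with a line that starts with ;; as in the sample.

```shell
# join_records: hypothetical helper name; reads the multi-line records on stdin.
join_records() {
    awk '
        NF == 0 { next }                          # skip blank lines
        /^;;/   { print rec ";;"; rec = ""; next }  # end of record: flush it
        {
            sub(/^[ \t]+/, "")                    # drop leading indentation
            # the title line gets one trailing space, later lines are appended as-is
            rec = (rec == "") ? ($0 " ") : (rec $0)
        }
    '
}
```

It would be called as `join_records < sample1.txt`. Unlike the tr pipeline, it leaves no space before each comma, because the indentation is removed rather than squeezed.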
# 4  
Old 03-05-2012
Thanks for the replies, Joey, Corona. However, you missed the last part of the challenge: the order of the parameters has to change as well.

Regards,
Bjoern
# 5  
Old 03-05-2012
So, is this the order?
1,2,5,3,4,6,7,10,9,11,?,12,15,14,13
where ? is the suppressed 8
# 6  
Old 03-05-2012
I guess I forgot the 8, since this was only an example. I'm also not sure of the final order yet, but if I can get an idea of how to stream the parameters to a single line in a given order, I'd be very grateful.

Thanks,
Bjoern
# 7  
Old 03-06-2012
How to stream the order really depends on what defines the order. Is parameter13 given last because it's called parameter13, or because it happened to be the 13th in the order and you always want the 13th parameter there?
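If it is the names that define the order, one possible approach is to index each record's values by parameter name and then print them in a fixed list of names. The sketch below is an assumption-laden illustration, not a tested solution for the real data: `reorder_by_name` is a made-up function name, the name list is passed as its first argument, and an empty entry or a name missing from the record simply produces an empty field.

```shell
# reorder_by_name: hypothetical helper; $1 is a comma-separated list of
# parameter names in the wanted output order (an assumption for illustration).
reorder_by_name() {
    awk -v names="$1" '
        BEGIN { n = split(names, want, ",") }
        /^;;/ {                                   # end of record: emit and reset
            line = title " "
            for (i = 1; i <= n; i++)
                line = line "," ((want[i] in val) ? (want[i] "=" val[want[i]]) : "")
            print line ";;"
            split("", val); title = ""            # clear the per-record state
            next
        }
        /^[ \t]*,/ {                              # ",name=value" line
            sub(/^[ \t]*,/, "")
            eq = index($0, "=")
            if (eq > 0) val[substr($0, 1, eq - 1)] = substr($0, eq + 1)
            next
        }
        NF { title = $1 }                         # any other non-blank line is the title
    '
}
```

For example, `reorder_by_name 'parmeter1,parmeter2,parmeter5' < data` would print only those three parameters, in that order, for each record.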

---------- Post updated at 10:36 AM ---------- Previous update was at 10:22 AM ----------

If it's their position in the input and not their names:
Code:
$ cat treatment2.awk

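BEGIN { FS=","; OFS=","
        # ORDER[X] is the output field that the X-th input parameter is moved to.
        # 10 is listed twice and 8 never, so output field 8 stays empty.
        split("1,2,5,3,4,6,7,10,9,11,10,12,15,14,13", ORDER);
}

# First line of a record: remember the title and start counting parameters.
!T { TITLE=$1 ; T=1 ; next }

# ",name=value" line: store the text after the comma in arrival order.
/^[ \t]*,/ { C[T++]=$2 }

# End of record: clear $0, then rebuild it field by field.
/;;/ {  $0="";  T=""
        for(X in ORDER) $(ORDER[X])=" "C[X];
        $1=TITLE $1
        print
}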

$ awk -f treatment2.awk data

Treatment parmeter1=value, parmeter2=value, parmeter4=value, parmeter5=value, parmeter3=value, parmeter6=value, parmeter7=value,, parmeter9=value, parmeter11=value, parmeter10=value, parmeter12=value, parmeter15=value, parmeter14=value, parmeter13=value
Treatment parmeter1=value, parmeter2=value, parmeter4=value, parmeter5=value, parmeter3=value, parmeter6=value, parmeter7=value,, parmeter9=value, parmeter11=value, parmeter10=value, parmeter12=value, parmeter15=value, parmeter14=value, parmeter13=value

$
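The script above moves the X-th input parameter into output field ORDER[X]. The layout requested in the first post seems to read the list the other way round: output slot i takes input parameter number ORDER[i], with an empty slot where 8 was left out. A sketch of that reading follows. The function name `reorder_by_position` and the convention that an empty list entry yields an empty field are assumptions made for illustration.

```shell
# reorder_by_position: hypothetical helper; $1 is a comma-separated list of
# input positions, one per output slot. An empty entry leaves the slot empty.
reorder_by_position() {
    awk -v order="$1" '
        BEGIN { n = split(order, pick, ",") }
        /^;;/ {                                   # end of record: emit and reset
            line = title " "
            for (i = 1; i <= n; i++)
                line = line "," ((pick[i] in param) ? param[pick[i]] : "")
            print line ";;"
            split("", param); count = 0; title = ""
            next
        }
        /^[ \t]*,/ {                              # ",name=value" line: store by arrival order
            sub(/^[ \t]*,/, "")
            param[++count] = $0
            next
        }
        NF { title = $1 }                         # any other non-blank line is the title
    '
}
```

Called as `reorder_by_position '1,2,5,3,4,6,7,10,9,11,,12,15,14,13' < data`, it should keep the title, a space, the reordered parameters, and the trailing ;; on one line per record.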
