Complex transpose awk script


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Complex transpose awk script
# 1  
Old 01-16-2013
Complex transpose awk script

Hello to all in forum,

Maybe an awk expert could help me with this complex task for me.

I have the input shown below and I would like to get the output as follow:

- I would like the output separated by commas.
- The header is fixed and will be the same always.
- For the lines containing OCTDP, TCTDP and GPRSTDP I would like to join their values separated by "|", so they joined in the output will represent a single field (I've highlighted for best understanding)
- Each value below " CAMEL SUBSCRIPTION OPTIONS" would go in different fields in the output.
- When any value is not found, leave blank.

Output desired:
Code:
MSISDN,OCTDP,TCTDP,GPRSTDP,GCSO,MCSO,SSLO,GC2SO,MC2SO,TIF,GPRSSO,OSMSSO,GC3SO,MC3SO,GC4SO,MC4SO,MMSO
20385503286,2|15|2038759581|0|2||0,18|15|2038759581|1|2|Y|,,0,5,0,1,2,0,1,0,1,0,1,1,1
20386671719,2|15|2038759581|0|2||7,18|15|2038759594|1|2|Y|,18|15|20387592403|0|3||,0,5,0,1,2,0,1,0,1,2,1,2,0
20387323200,,18|15|2038759581|1|2|Y|,18|15|20387592403|0|3||,1,3,0,1,2,0,1,1,0,2,2,2,2

The Input is:
Code:
<HGCMP:MSISDN=20385503286,OPT;
HLR CAMEL SUBSCRIPTION DATA

MSISDN           CSP
20385503286

TDPTYPE  TDP  SK         GSA              DEH   CCH I  DIALNUM
OCTDP      2          15 2038759581       0     2
TCTDP     18          15 2038759581       1     2   Y

CAMEL SUBSCRIPTION OPTIONS

GCSO  MCSO  SSLO  GC2SO  MC2SO  TIF  GPRSSO  OSMSSO
0     5     0     1      2      0    1       0

GC3SO  MC3SO  GC4SO  MC4SO
1      0      1      1

MMSO
1

END

<HGCMP:MSISDN=20386671719,OPT;
HLR CAMEL SUBSCRIPTION DATA

MSISDN           CSP
20386671719

TDPTYPE  TDP  SK         GSA              DEH   CCH I  DIALNUM
OCTDP      2          15 2038759581       0     2      7
TCTDP     18          15 2038759594       1     2   Y
GPRSTDP   18          15 20387592403      0     3

CAMEL SUBSCRIPTION OPTIONS

GCSO  MCSO  SSLO  GC2SO  MC2SO  TIF  GPRSSO  OSMSSO
0     5     0     1      2      0    1       0

GC3SO  MC3SO  GC4SO  MC4SO
1      2      1      2

MMSO
0

END

<HGCMP:MSISDN=20387323200,OPT;
HLR CAMEL SUBSCRIPTION DATA

MSISDN           CSP
20387323200

TDPTYPE  TDP  SK         GSA              DEH   CCH I  DIALNUM
TCTDP     18          15 2038759581       1     2   Y
GPRSTDP   18          15 20387592403      0     3

CAMEL SUBSCRIPTION OPTIONS

GCSO  MCSO  SSLO  GC2SO  MC2SO  TIF  GPRSSO  OSMSSO
1     3     0     1      2      0    1       1

GC3SO  MC3SO  GC4SO  MC4SO
0      2      2      2

MMSO
2

END

Thanks in advance for any help.
# 2  
Old 01-16-2013
It seems more difficult than it actually is:
Code:
awk 'BEGIN {split("OCTDP TCTDP GPRSTDP",TDPAr);
            print "MSISDN,OCTDP,TCTDP,GPRSTDP,GCSO,MCSO,SSLO,GC2SO,MC2SO,TIF,GPRSSO,OSMSSO,GC3SO,MC3SO,GC4SO,MC4SO,MMSO"}
     /^MSISDN/   {getline; rec=$0}
     /^TDPTYPE/  {while ($0 !~ /^$/)
                     {getline; OFS="|"; tmp=$1; $1=""; sub("\|",""); tmpAr[tmp]=$0
                     }
                     for (i=1;i<=3;i++) rec=rec","tmpAr[TDPAr[i]]; delete tmpAr
                 }
     /^(GCSO|GC3SO|MMSO)/ {
                      getline; OFS=","; $1=$1; rec=rec","$0}
     /END/       {print rec; rec=""}
    ' file
MSISDN,OCTDP,TCTDP,GPRSTDP,GCSO,MCSO,SSLO,GC2SO,MC2SO,TIF,GPRSSO,OSMSSO,GC3SO,MC3SO,GC4SO,MC4SO,MMSO
20385503286,2|15|2038759581|0|2,18|15|2038759581|1|2|Y,,0,5,0,1,2,0,1,0,1,0,1,1,1
20386671719,2|15|2038759581|0|2|7,18|15|2038759594|1|2|Y,18|15|20387592403|0|3,0,5,0,1,2,0,1,0,1,2,1,2,0
20387323200,,18|15|2038759581|1|2|Y,18|15|20387592403|0|3,1,3,0,1,2,0,1,1,0,2,2,2,2

What I did not cover is the missing I and DIALNUM fields - they are not identifyable by separators separating empty fields. Well, it's up to you to introduce a counter and fill those gaps ...

Last edited by RudiC; 01-16-2013 at 06:39 AM..
# 3  
Old 01-17-2013
Hello RudiC,

Many thanks for your help, it works!.With getline you read the row below the string searched? Only to learn, is there other way to read line below matched string?

Besides this, if a number doesn't have values in the input, I would like to print the number in field 1 and "No_values" in the 2nd field. example:

Format when a number doesn't have values:
Code:
<HGCMP:MSISDN=20385503289,OPT;
HLR CAMEL SUBSCRIPTION DATA

MSISDN           CSP
20385503289

TDPTYPE  TDP  SK         GSA              DEH   CCH I  DIALNUM

NONE

END

Output desired for this case:
Code:
MSISDN,OCTDP,TCTDP,GPRSTDP,GCSO,MCSO,SSLO,GC2SO,MC2SO,TIF,GPRSSO,OSMSSO,GC3SO,MC3SO,GC4SO,MC4SO,MMSO
20385503286,2|15|2038759581|0|2,18|15|2038759581|1|2|Y,,0,5,0,1,2,0,1,0,1,0,1,1,1
20386671719,2|15|2038759581|0|2|7,18|15|2038759594|1|2|Y,18|15|20387592403|0|3,0,5,0,1,2,0,1,0,1,2,1,2,0
20387323200,,18|15|2038759581|1|2|Y,18|15|20387592403|0|3,1,3,0,1,2,0,1,1,0,2,2,2,2
20385503289,No_values

What code could be added to your script to get this part combined in the output?

Thanks in advance
# 4  
Old 01-17-2013
You should provide a sample on how the output looks like when 20385503289 have no value.
Then it could be made an IF test to print No_values if an field is missing.
Regarding I and DAILNUM, they can be found by looking at their position from the left, if its always the same location.
# 5  
Old 01-17-2013
Try
Code:
     /END/       {if (length(rec)<=11) rec=rec"no value"; print rec; rec=""}

@Jotne: May be difficult to identify I and DIALNUM as sometimes one or the other is present (NF==7, which is which?) , sometimes both (easy! NF==8), and sometimes none (easy! NF==6) of them.
# 6  
Old 01-17-2013
Hello RudiC,

I've tried:
Code:
awk 'BEGIN {split("OCTDP TCTDP GPRSTDP",TDPAr);
            print "MSISDN,OCTDP,TCTDP,GPRSTDP,GCSO,MCSO,SSLO,GC2SO,MC2SO,TIF,GPRSSO,OSMSSO,GC3SO,MC3SO,GC4SO,MC4SO,MMSO"}
     /^MSISDN/   {getline; rec=$0}
     /^TDPTYPE/  {while ($0 !~ /^$/)
                     {getline; OFS="|"; tmp=$1; $1=""; sub("|",""); tmpAr[tmp]=$0
                     }
                     for (i=1;i<=3;i++) rec=rec","tmpAr[TDPAr[i]]; delete tmpAr
                 }
     /^(GCSO|GC3SO|MMSO)/ {
                      getline; OFS=","; $1=$1; rec=rec","$0}
    /END/       {if (length(rec)<=11) rec=rec"no value"; print rec; rec=""}' input

but the output is:
Code:
MSISDN,OCTDP,TCTDP,GPRSTDP,GCSO,MCSO,SSLO,GC2SO,MC2SO,TIF,GPRSSO,OSMSSO,GC3SO,MC3SO,GC4SO,MC4SO,MMSO
20385503286,|2|15|2038759581|0|2,|18|15|2038759581|1|2|Y,,0,5,0,1,2,0,1,0,1,0,1,1,1
20386671719,|2|15|2038759581|0|2|7,|18|15|2038759594|1|2|Y,|18|15|20387592403|0|3,0,5,0,1,2,0,1,0,1,2,1,2,0
20387323200,,|18|15|2038759581|1|2|Y,|18|15|20387592403|0|3,1,3,0,1,2,0,1,1,0,2,2,2,2
20385503289,,,

So, prints the number, but with commas and "no value" is not printed.

But the "if" to print "no value" could be without using length()? because the length of the numbers could be less or greater than 11 digits.

So, it is possible to do that "if" to print "no value" checking if below TDPTYPE is blank or if the string NONE is found?

Thanks in advance again for your kind help!
# 7  
Old 01-17-2013
I is at position 53 and DIALNUM at 56 if I count correctly, so it should be not a big problem to take it out. NF will not work.
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Transpose table with awk

I am trying to format the table below to the output input: cand week sub1 sub2 sub3 sub4 joe 1 94.19 70.99 43.93 60.14 joe 2 94.07 51.02 41.07 38.92 joe 3 26.24 30.95 44.56 67.67 joe 4 72.36 60.92 40.78 83.25 joe 5 51 70.01 44.66 82.22... (7 Replies)
Discussion started by: aydj
7 Replies

2. Shell Programming and Scripting

Transpose rows to columns complex

Input: IN,A,1 IN,B,3 IN,B,2 IN,C,7 BR,A,1 BR,A,5 BR,C,9 AR,C,9 Output: CNTRY,A,B,C IN,1,5,7 BR,6,0,9 AR,0,0,9 (7 Replies)
Discussion started by: unme
7 Replies

3. Shell Programming and Scripting

Transpose using awk

Hi Friends, Very urgent requirement please do needful ASAP.. Input: |1||1|1||1|3||3|2||2|4||4|2||2|3||3|NA||0|5||5|NA||0|4||4|3||3 output: |1||1 |1||1 |3||3 |2||2 |4||4 |2||2 |3||3 |NA||0 |5||5 (4 Replies)
Discussion started by: bharat1211
4 Replies

4. Shell Programming and Scripting

Transpose data as rows using awk

Hi I have below requirement, need help One file contains the meta data information and other file would have the data, match the column from file1 and with file2 and extract corresponding column value and display in another file File1: CUSTTYPECD COSTCENTER FNAME LNAME SERVICELVL ... (1 Reply)
Discussion started by: ravlapo
1 Replies

5. Shell Programming and Scripting

awk to transpose every 7 rows into columns

input: a1 a2 a3 a4 a5 a6 a7 b1 b2 b3 .. b7 .. z1 .. z7 (12 Replies)
Discussion started by: ux4me
12 Replies

6. Shell Programming and Scripting

Complex match of numbers between 2 files awk script

Hello to all, I hope some awk guru could help me. I have 2 input files: File1: Is the complete database File2: Contains some numbers which I want to compare File1: "NUMBERKEY","SERVICENAME","PARAMETERNAME","PARAMETERVALUE","ALTERNATENUMBERKEY"... (9 Replies)
Discussion started by: Ophiuchus
9 Replies

7. Shell Programming and Scripting

awk script (complex)

picked this up from another thread. echo 1st_file.csv; nawk -F, 'NR==FNR{a++;next} a{b++} END{for(i in b){if(b-1&&a!=b){print i";\t\t"b}else{print "NEW:"i";\t\t"b} } }' OFS=, 1st_file.csv *.csv | sort -r i need to use the above but with a slight modification.. 1.compare against 3 month... (25 Replies)
Discussion started by: slashbash
25 Replies

8. Shell Programming and Scripting

How can i transpose this rerult by using awk?

From>>> ATOM 1 ca 2 o 3 h 4 h 5 o dE/dx 0.2057422D-01 0.2463722D-01-0.1068047D-01-0.1495280D-01-0.3725362D-02 dE/dy -0.7179106D-02-0.1554542D-01 0.1016889D-01 0.3268502D-02-0.4888578D-01 dE/dz -0.5600872D-02 0.3110649D-01-0.4088230D-02-0.2295107D-01-0.2832048D-01 ATOM 6 h 7 h 8 o 9 h 10 h... (1 Reply)
Discussion started by: wanchem
1 Replies

9. Shell Programming and Scripting

Transpose using awk

I have a requirement to transpose the below xml which is in a text file on unix: <?xml version="1.0" ?> <REQUEST> <ID>XXX</ID> <TIMESTAMP>20090720062610</TIMESTAMP> <FLAG>Y</FLAG> <TO_FLAG>Y</TO_FLAG> </REQUEST> to <?xml version="1.0"... (13 Replies)
Discussion started by: new_ds_man
13 Replies

10. Shell Programming and Scripting

How to transpose data elements in awk

Hi, I have an input data file :- Test4599,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,2,2,Rain Test90,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,1,0,Not Rain etc.... I wanted to transpose these data to:-... (2 Replies)
Discussion started by: ahjiefreak
2 Replies
Login or Register to Ask a Question