Here is RudiC's script with some slight modifications. I:
added comments,
added tracing to show which input lines are being processed and what data is being captured from those lines (to make it easier for you to follow what the code is doing),
made it capture data from multiple <Profile> tags,
made it look for tags that do not appear at the start of a line (needed since you didn't originally use CODE tags when you posted your sample input), and
slightly reformatted the trailer to match your expected output.
Note that neither of our scripts reformats the data found in the <ns5:LastSignOn> tags from DD/MM/YY to YYYYMMDD format, and neither changes n/a to an empty string. If that is important to you, try changing the code to do that on your own. If you can't get it to work, show us what you tried and the output it produced (in CODE tags) and we'll try to help fix it.
Code:
# Use awk to run the following script with the variable trace set to 0.
awk -v trace=0 '
# Before reading lines from the input file, print the header.
BEGIN { print "20151027 GLOBAL USER GROUP"
}
# Look for lines containing "<Operator_info" or "</Operator_info".
/<.?Operator_info/ {
    # If the 2nd character of the 1st field is "O", set ON to 1; otherwise
    # (i.e., if the 2nd character is "/") set ON to 0.
    ON = (substr ($1, 2, 1) == "O")
}
# If ON is 0 (or has not yet been set), skip to the next input line and ignore
# the following sections of this script for the current line.
!ON {   next
}
# Look for lines containing:
#   "<ns5:" followed by "EnableStat" or by "SignOn"
#   "<Identif"
#   or "<Profile"
/<ns5:.*(EnableStat|SignOn)/ || /<(Identif|Profile)/ {
    # Set IX to the uppercase version of the 6th character in the 1st field:
    #   i.e., E for <ns5:"E"nableStatus
    #         I for <Prof"i"le
    #         L for <ns5:"L"astSignOn
    #      or T for <Iden"t"ifier>
    IX = toupper (substr ($1, 6, 1))
    # If trace is set to a non-0, non-empty-string value, print the current
    # line number and contents.
    if(trace) printf("line %d:%s\n", NR, $0)
    # If IX is "I" or "T", replace the current input line with the next
    # input line and continue processing.
    if (IX ~ /[IT]/) {
        getline
        # And, if trace is set to a non-0, non-empty-string value, print
        # the current line number and contents.
        if(trace) printf("Line %d:%s\n", NR, $0)
    }
    # Throw away everything from the start of the current input line up to
    # and including the 1st ">" and everything from the last "<" to the
    # end of the line (i.e., the opening and closing tags).
    gsub (/^[^>]*>|<[^<]*$/, "")
    # If we are processing a <Profile> tag, increment the number of <Profile>
    # tags we have seen and save the remaining data from the current line
    # in the array T with the subscript being the number of <Profile> tags
    # we have seen; otherwise, save the remaining data from the current line
    # in the array T with the subscript being the current value saved in IX.
    if(IX == "I")
        T[++pcnt] = $0
    else    T[IX] = $0
    # If trace is set to a non-0, non-empty-string value, print the array
    # element we just initialized.
    if(trace) printf("T[%s]=%s\n", (IX=="I") ? pcnt : IX, $0)
    # If we are processing an <ns5:LastSignOn> tag, print the results we
    # have accumulated for this <Identifier> tag.
    if(IX == "L") {
        # Print one line for each <Profile> tag we have seen.
        for(i = 1; i <= pcnt; i++) {
            # Note that if the data saved for the <ns5:EnableStatus>
            # tag was "DISABLED" (or, actually, started with "D"),
            # print "D" for that field; otherwise, print a <space>
            # for that field.
            printf("ACR|%s|%s|%s|%s|%s\n",
                T["T"], T[i], (T["E"]~/^D/)?"D":" ", T["L"], T["T"])
            # Increment the number of detail records we have
            # printed.
            CNT++
        }
        # Clear the number of <Profile> tags we have seen.
        pcnt = 0
    }
}
# When we hit end-of-file on the last input file, print the trailer line.
END {   print "NUMBER OF DETAIL RECORDS:" CNT
}
# End the script to be run by awk and list the input files to be processed.
' file
If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
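If the same command line has to run on both Solaris and other systems, a small wrapper can make that choice for you. This is only a sketch: the variable name AWK is my own, and it assumes that a nawk found on PATH is acceptable when /usr/xpg4/bin/awk is not present.

```shell
# Pick an awk to use: prefer the XPG4 awk (Solaris), then nawk, then
# whatever "awk" is on PATH.  AWK is a hypothetical variable name.
if [ -x /usr/xpg4/bin/awk ]; then
    AWK=/usr/xpg4/bin/awk
elif command -v nawk >/dev/null 2>&1; then
    AWK=nawk
else
    AWK=awk
fi

# Quick check that the chosen awk accepts -v and has toupper().
"$AWK" -v trace=0 'BEGIN { print toupper("awk ok"), trace }'
```

The script itself would then be started with "$AWK" -v trace=0 in place of awk -v trace=0.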
When the script is run as shown above (with tracing turned off) and the sample data you supplied is in a file named file, it produces the output:
Code:
20151027 GLOBAL USER GROUP
ACR|HELLO123|PROFILE1| |18/10/15|HELLO123
ACR|HELLO12|PROFILE2|D|n/a|HELLO12
ACR|HELLO12|PROFILE3|D|n/a|HELLO12
NUMBER OF DETAIL RECORDS:3
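Note that the sign-on field in this output is still 18/10/15 or n/a. If you get stuck converting it, here is one possible starting point rather than a finished fix. The awk function name fmtdate and the shell wrapper fmtdates are hypothetical (they are not in either script), and the sketch assumes every two-digit year belongs to the 2000s.

```shell
# Hypothetical wrapper: read one date per line and print the converted form.
fmtdates() {
    awk '
    # Convert DD/MM/YY to YYYYMMDD.  Anything that does not split into
    # exactly three "/"-separated parts (such as "n/a") becomes an empty
    # string.  ASSUMPTION: all two-digit years are 20YY.
    function fmtdate(d,    p) {
        if (split(d, p, "/") != 3)
            return ""
        return sprintf("20%02d%02d%02d", p[3], p[2], p[1])
    }
    { print fmtdate($0) }
    '
}

# Try it on the two values that appear in the sample output.
printf '%s\n' '18/10/15' 'n/a' | fmtdates
```

In the script above, the same function could be defined outside the pattern-action blocks and applied to T["L"] in the printf call.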
If you change the line:
Code:
awk -v trace=0 '
to:
Code:
awk -v trace=1 '
to enable tracing, it produces the output:
Code:
20151027 GLOBAL USER GROUP
line 7: <Identifier>
Line 8: <Name>HELLO123</Name>
T[T]=HELLO123
line 13: <Profile>
Line 14: <ns3:Name>PROFILE1</ns3:Name>
T[1]=PROFILE1
line 20: <ns5:EnableStatus>ENABLED</ns5:EnableStatus>
T[E]=ENABLED
line 24: <ns5:LastSignOn>18/10/15</ns5:LastSignOn>
T[L]=18/10/15
ACR|HELLO123|PROFILE1| |18/10/15|HELLO123
line 29: <Identifier>
Line 30: <Name>HELLO12</Name>
T[T]=HELLO12
line 35: <Profile>
Line 36: <ns3:Name>PROFILE2</ns3:Name>
T[1]=PROFILE2
line 38: <Profile>
Line 39: <ns3:Name>PROFILE3</ns3:Name>
T[2]=PROFILE3
line 45: <ns5:EnableStatus>DISABLED</ns5:EnableStatus>
T[E]=DISABLED
line 49: <ns5:LastSignOn>n/a</ns5:LastSignOn>
T[L]=n/a
ACR|HELLO12|PROFILE2|D|n/a|HELLO12
ACR|HELLO12|PROFILE3|D|n/a|HELLO12
NUMBER OF DETAIL RECORDS:3
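Each T[...] value in the trace is what remains of the input line after the gsub() call has removed the surrounding tags. If that step is unclear, it can be tried in isolation. This is a minimal sketch; the wrapper name strip_tags is hypothetical, and the test lines are copied from the trace output above.

```shell
# Hypothetical wrapper around the same gsub() call used in the script:
# remove everything up to and including the 1st ">" and everything from
# the last "<" to the end of the line, then print what is left.
strip_tags() {
    awk '{ gsub(/^[^>]*>|<[^<]*$/, ""); print }'
}

# Feed it two lines taken from the trace output.
printf '%s\n' ' <Name>HELLO123</Name>' \
              ' <ns5:LastSignOn>18/10/15</ns5:LastSignOn>' | strip_tags
```

Note that the leading whitespace before the opening tag is removed along with the tag, since the first half of the pattern starts matching at the beginning of the line.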