Extract a value from an xml file


 
# 1  
Old 12-23-2016

I have this XML file format and all in one line:

Code:
Fri Dec 23 00:14:52 2016 Logged Message:689|<?xml version="1.0" encoding="UTF-8"?><PORT_RESPONSE><HEADER><ORIGINATOR>XMG</ORIGINATOR><DESTINATION>ENSEMBLE</DESTINATION><MESSAGE_ID>NXT107349698</MESSAGE_ID><MSGTYPE>PRI</MSGTYPE><TIMESTAMP>12232016061452</TIMESTAMP></HEADER><ADMIN><WICIS_REL_NO>5.0.0</WICIS_REL_NO><NNSP>9664</NNSP><OLSP>6529</OLSP><ONSP>6529</ONSP><REQ_NO>6664016358514349</REQ_NO><VER_ID_REQ>00</VER_ID_REQ><VER_ID_RESP>00</VER_ID_RESP><RT>C</RT><RESP_NO>652901635838480144</RESP_NO><CD_TSENT>122220160614</CD_TSENT><REP>Port Center</REP><TEL_NO_REP>000-207-8009</TEL_NO_REP><CHC></CHC><DD_T>122320160909</DD_T><NPQTY>00001</NPQTY></ADMIN><LINE_DATA><PORTED_NUM>990-799-1234</PORTED_NUM></LINE_DATA></PORT_RESPONSE>

How can I write a script that extracts only the value 990-799-1234, which is the value of PORTED_NUM, and stores it in a field?

I am using a Solaris 10 Unix box.



Moderator's Comments:
Mod Comment Please use correct CODE tags as required by forum rules!

Last edited by RudiC; 12-23-2016 at 01:17 PM.. Reason: Changed ICODE to CODE tags.
# 2  
Old 12-23-2016
WHERE to store that number? Try
Code:
awk  'match ($0, /<PORTED_NUM>[0-9-]*<\/PORTED_NUM>/) {print substr ($0, RSTART+12, RLENGTH-25)}' file
990-799-1234
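
To store the number rather than just print it, one option is to capture the awk output in a shell variable. The following is only a sketch under a few assumptions: the variable names `AWK`, `sample` and `ported_num` are made up for illustration, the scratch file stands in for the real log, and the record is a shortened copy of the one in post #1. Backticks are used because the Solaris 10 /bin/sh does not understand `$(...)`.

```shell
# Sketch only: all names below are illustrative, not taken from the thread.
AWK=${AWK:-awk}                 # on Solaris 10, set AWK=nawk or AWK=/usr/xpg4/bin/awk
sample=/tmp/port_response.$$    # hypothetical scratch file standing in for the real log

# Shortened stand-in for the one-line log record from post #1.
printf '%s\n' 'Fri Dec 23 00:14:52 2016 Logged Message:689|<?xml version="1.0" encoding="UTF-8"?><PORT_RESPONSE><LINE_DATA><PORTED_NUM>990-799-1234</PORTED_NUM></LINE_DATA></PORT_RESPONSE>' > "$sample"

# "<PORTED_NUM>" is 12 characters and "</PORTED_NUM>" is 13, so skipping 12
# and trimming 25 from the match length leaves only the value between the tags.
ported_num=`"$AWK" 'match($0, /<PORTED_NUM>[0-9-]*<\/PORTED_NUM>/) {
    print substr($0, RSTART + 12, RLENGTH - 25)
}' "$sample"`

rm -f "$sample"
echo "PORTED_NUM is: $ported_num"
```

From here the variable can be passed to whatever needs the number, for example another command or a line written to a file.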

# 3  
Old 12-23-2016
Quote:
Originally Posted by RudiC
WHERE to store that number? Try
Code:
awk  'match ($0, /<PORTED_NUM>[0-9-]*<\/PORTED_NUM>/) {print substr ($0, RSTART+12, RLENGTH-25)}' file
990-799-1234

I am getting the following errors:

awk: syntax error near line 1
awk: bailing out near line 1
# 4  
Old 12-23-2016
Please try the following:

Code:
perl -nle '/<PORTED_NUM>([\d-]*)</ and print $1' mrn6430.xml

This User Gave Thanks to Aia For This Post:
# 5  
Old 12-23-2016
Quote:
Originally Posted by mrn6430
Getting the following errors:
awk: syntax error near line 1
awk: bailing out near line 1
Hello mrn6430,

On Solaris, the default /usr/bin/awk is the legacy awk, which does not support match(). Kindly change awk to nawk or /usr/xpg4/bin/awk and then it should work.

Thanks,
R. Singh
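
If the same script has to run on Solaris and on other systems, the choice of awk can be made at run time. This is a rough sketch, not a tested Solaris recipe: the two Solaris paths are the ones named above, while the variable names `AWK` and `result` and the fallback to plain `awk` elsewhere are assumptions.

```shell
# Prefer the awk implementations named in this post; fall back to plain awk.
if [ -x /usr/xpg4/bin/awk ]; then
    AWK=/usr/xpg4/bin/awk
elif [ -x /usr/bin/nawk ]; then
    AWK=/usr/bin/nawk
else
    AWK=awk    # assumption: outside Solaris, plain awk already supports match()
fi

# Quick probe with a tiny made-up record to confirm that match() works.
result=`echo 'x<PORTED_NUM>990-799-1234</PORTED_NUM>y' | "$AWK" 'match($0, /<PORTED_NUM>[0-9-]*<\/PORTED_NUM>/) {
    print substr($0, RSTART + 12, RLENGTH - 25)
}'`

echo "using $AWK, probe returned: $result"
```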
# 6  
Old 12-23-2016
Hi,

I have a package on my Debian box called xml-twig-tools, which contains a program called xml_grep (a Perl script). I'm sure there are plenty of other XML tools out there. On Solaris, I think you would need to download some Perl modules from CPAN and get the script from here: xml_grep - search.cpan.org

With this tool I can do the following:

Code:
 xml_grep //PORTED_NUM --text_only your-file.xml

Output:
Code:
990-799-1234

or with xmlstarlet (a compiled binary that is a lot faster than the Perl script; there's a package available for Solaris):
Code:
xmlstarlet sel -t -v //PORTED_NUM your-file.xml

If this is only a one-shot thing, you'll be better off with the awk or perl tips above, since you do not have to fiddle with any installation. If you have to deal with XML more often, those XML tools will be a lot easier to handle.

Last edited by stomp; 12-23-2016 at 07:04 PM.. Reason: shortened xmlstarlet/xml_grep call with XPATH global search Pattern //
# 7  
Old 12-23-2016
Hi,

IMO, if it is an XML file with proper XML syntax, then xmllint is the better choice.

Read man xmllint for detailed information.

Code:
echo "cat //*[local-name()='PORT_RESPONSE']/*[local-name()='LINE_DATA']/*[local-name()='PORTED_NUM']/text()" | xmllint --shell input.xml | sed -e '/^\//d'

This gives the desired output:
Quote:
990-799-1234
I removed the prefix
Quote:
Fri Dec 23 00:14:52 2016 Logged Message:689|
because it causes an XML parser error:
Quote:
parser error : Start tag expected, '<' not found
Fri Dec 23 00:14:52 2016 Logged Message:689|<?xml version="1.0" encoding="UTF-8"
^
I tested with the XML content below on a single line (the content from post #1, unlike this version, threw errors when checked for XML syntax with xmllint).

Simulated file content:
Code:
cat input.xml
<?xml version="1.0" encoding="UTF-8"?><PORT_RESPONSE><HEADER><ORIGINATOR>XMG</ORIGINATOR><DESTINATION>ENSEMBLE</DESTINATION><MESSAGE_ID>NXT107349698</MESSAGE_ID><MSGTYPE>PRI</MSGTYPE><TIMESTAMP>12232016061452</TIMESTAMP></HEADER><ADMIN><WICIS_REL_NO>5.0.0</WICIS_REL_NO><NNSP>9664</NNSP><OLSP>6529</OLSP><ONSP>6529</ONSP><REQ_NO>6664016358514349</REQ_NO><VER_ID_REQ>00</VER_ID_REQ><VER_ID_RESP>00</VER_ID_RESP><RT>C</RT><RESP_NO>652901635838480144</RESP_NO><CD_TSENT>122220160614</CD_TSENT><REP>Port Center</REP><TEL_NO_REP>000-207-8009</TEL_NO_REP><CHC></CHC><DD_T>122320160909</DD_T><NPQTY>00001</NPQTY></ADMIN><LINE_DATA><PORTED_NUM>990-799-1234</PORTED_NUM></LINE_DATA></PORT_RESPONSE>
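
Removing the log prefix by hand can also be scripted. The sketch below assumes the prefix never contains a `<` character, which holds for the sample record but may not hold for every log line; the file names and the variable `first_chars` are made up for illustration, and the record is shortened.

```shell
# Sketch only: file names are illustrative stand-ins.
log=/tmp/port_log.$$          # hypothetical raw log file
xml=/tmp/port_input.$$.xml    # hypothetical cleaned file for an XML parser

# Shortened stand-in for the one-line log record from post #1.
printf '%s\n' 'Fri Dec 23 00:14:52 2016 Logged Message:689|<?xml version="1.0" encoding="UTF-8"?><PORT_RESPONSE><LINE_DATA><PORTED_NUM>990-799-1234</PORTED_NUM></LINE_DATA></PORT_RESPONSE>' > "$log"

# Delete everything before the first "<" on each line, so that the
# line starts with the XML declaration.
sed 's/^[^<]*//' "$log" > "$xml"

# Show the first five characters of the cleaned file as a sanity check.
first_chars=`cut -c1-5 "$xml"`
echo "cleaned file starts with: $first_chars"

rm -f "$log" "$xml"
```

In real use the cleaned file would be kept and handed to xmllint or another parser instead of being deleted.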

These 2 Users Gave Thanks to greet_sed For This Post: