To extract a string between two words in XML file


 
# 1  
Old 06-01-2013

I need to extract the string between two tags.
The input file is:
Code:
<PersonInfoShipTo AddressID="446311709" AddressLine1="" AddressLine2="" AddressLine3="" AddressLine4="" AddressLine5="" AddressLine6="" AlternateEmailID="" Beeper="" City="" Company="" Country="" DayFaxNo="" DayPhone="" Department="" EMailID="" EveningFaxNo="" EveningPhone="" FirstName="la" IsAddressVerified="Y" JobTitle="" LastName="la" MiddleName="" MobilePhone="" OtherPhone="" PersonID="" PersonInfoKey="201204240041014009667499" State="" Suffix="" Title="" ZipCode=""/>

I need to extract the text between <PersonInfoShipTo and /> and put it in another file.
I tried the following code:
Code:
awk '/PersonInfoshipTo /, ///' input2.xml | sed '$d'
awk '/PersonInfoshipTo /{s=x}{s=s$0"\n"}/Line13/{p=1}/Canceled/ && p{print s;exit}' file
sed -e 's/PersonInfoshipTo \(.*\)>/\1/'

Please help with your ideas.
Thanks

Last edited by Scrutinizer; 06-01-2013 at 07:33 AM.. Reason: code tags also for data samples
# 2  
Old 06-01-2013
What should your output look like?
# 3  
Old 06-01-2013
This is the output I need:

Code:
AddressID="446311709" AddressLine1="" AddressLine2="" AddressLine3="" AddressLine4="" AddressLine5="" AddressLine6="" AlternateEmailID="" Beeper="" City="" Company="" Country="" DayFaxNo="" DayPhone="" Department="" EMailID="" EveningFaxNo="" EveningPhone="" FirstName="la" IsAddressVerified="Y" JobTitle="" LastName="la" MiddleName="" MobilePhone="" OtherPhone="" PersonID="" PersonInfoKey="201204240041014009667499" State="" Suffix="" Title="" ZipCode=""

# 4  
Old 06-01-2013
Hi, see if this works:
Code:
awk 'sub("^" s FS,x) && sub(/\/>\n/,x)' s=PersonInfoShipTo RS=\< file > newfile

If it is always on one line, you could try:
Code:
sed -n 's|^<PersonInfoShipTo \(.*\)/>|\1|p' file > newfile

Otherwise you could try using an XML parser...
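
As a quick check of the two commands above, here is a minimal sketch that runs both against a small sample. The file names (sample.xml, from_sed.txt, from_awk.txt), the shortened attribute list and the surrounding <Order> element are assumptions made up for illustration; only the sed and awk commands come from this post.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Hypothetical sample: the tag sits on a single line, as in the thread.
printf '%s\n' \
    '<Order>' \
    '<PersonInfoShipTo AddressID="446311709" FirstName="la" LastName="la"/>' \
    '</Order>' > sample.xml

# sed: match the whole tag line and print only the captured attribute list.
sed -n 's|^<PersonInfoShipTo \(.*\)/>|\1|p' sample.xml > from_sed.txt

# awk: split records on "<", then strip the tag name and the closing "/>".
awk 'sub("^" s FS,x) && sub(/\/>\n/,x)' s=PersonInfoShipTo RS=\< sample.xml > from_awk.txt

cat from_sed.txt
cmp -s from_sed.txt from_awk.txt && echo "both approaches agree"
```

The sed version only matches when the opening <PersonInfoShipTo and the closing /> are on the same line. The awk version reads up to the next "<", so it should also cope with a tag that is wrapped over several lines, although the line breaks inside the tag are then kept in the output.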
# 5  
Old 06-01-2013
Hi, it works perfectly. Thanks a lot.
I have one more requirement. I have modified the extracted data in a file, and I need to insert it back exactly where I took it from,
i.e. between <PersonInfoShipTo and />
Thanks for your help.
# 6  
Old 06-02-2013
This replaces the matching line(s) with the contents of newfile:
Code:
sed -e '/^<PersonInfoShipTo .*\/>/ {r newfile' -e 'd;}' file
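
Note that the command above swaps the whole matching line for the contents of newfile. If newfile holds only the attribute list (as produced by the extraction in post #4), the <PersonInfoShipTo and /> wrapper would need to be added back. One possible way is sketched below, assuming the edited attributes are on a single line; the file names input.xml, edited and output.xml, the shortened sample data and the FirstName change are hypothetical.

```shell
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)"

# Hypothetical input with the tag on one line.
printf '%s\n' \
    '<Order>' \
    '<PersonInfoShipTo AddressID="446311709" FirstName="la" LastName="la"/>' \
    '</Order>' > input.xml

# Step 1: extract the attribute list (command from post #4).
sed -n 's|^<PersonInfoShipTo \(.*\)/>|\1|p' input.xml > newfile

# Step 2: stand-in for the manual edit of the extracted data.
sed 's/FirstName="la"/FirstName="lb"/' newfile > edited

# Step 3: put the edited attributes back between the tag name and "/>".
awk '
    NR == FNR { attrs = $0; next }        # first file: remember the edited attributes
    /^<PersonInfoShipTo .*\/>/ {          # second file: rebuild the matching line
        print "<PersonInfoShipTo " attrs "/>"
        next
    }
    { print }                             # every other line passes through unchanged
' edited input.xml > output.xml

cat output.xml
```

The original file is left untouched and the result goes to output.xml, so the two can be compared with diff before anything is overwritten.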
