How can I extract XML block around matching search string?


 
# 1  
Old 02-12-2016

I want to extract the XML block that surrounds a search string.
Example: print the "<application> .. </application>" block that contains the string "myapp1-ear".
Input XML:
Code:
<?xml version="1.0" encoding="UTF-8"?>
<deployment-request>
  <requestor>
    <first-name>kchinnam</first-name>
    <last-name>Group</last-name>
    <email-address>kchinnam@some.com</email-address>
  </requestor>
  <notify-list>
    <email-address>kchinnam@some.com</email-address>
  </notify-list>
  <application>
    <application-name>myapp1-ear</application-name>
    <ear-file-name>myapp1-ear.ear</ear-file-name>
    <edition/>
    <shared-library-name/>
  </application>
  <application>
    <application-name>myapp2-ear</application-name>
    <ear-file-name>myapp2-ear.ear</ear-file-name>
    <edition/>
    <shared-library-name/>
    <CookieSettings>
      <path>/</path>
    </CookieSettings>
    <options/>
  </application>
</deployment-request>

Expected Output XML:
Code:
  <application>
    <application-name>myapp1-ear</application-name>
    <ear-file-name>myapp1-ear.ear</ear-file-name>
    <edition/>
    <shared-library-name/>
  </application>

Can I do something like this?
strear=myapp1-ear; sed -n '/$strear/ /<application>/, /<\/application>/' <xmlfile.xml>

---------- Post updated at 03:25 PM ---------- Previous update was at 02:05 PM ----------

I tried pcregrep (grep with Perl-compatible regular expressions), but it is not working.

Code:
 
pcregrep -M '\{(<application>.*myapp1-ear.*<\/application>)\}' xmlfile.xml


Last edited by kchinnam; 02-13-2016 at 11:49 PM.. Reason: corrected input
# 2  
Old 02-12-2016
You haven't said what operating system or shell you're using, but for things like this I usually use awk. This seems to do what you want:
Code:
#!/bin/ksh
strear='myapp1-ear'

awk -v app_name="$strear" '
/<application>/	{
	cnt = copy = 0
}
$0 ~ "<application-name>" app_name "</application-name>" {
	copy = 1
}
{	line[++cnt] = $0
}
/<\/application>/ {
	if(copy) {
		copy = 0
		for(i = 1; i <= cnt; i++)
			print line[i]
	}
}' xmlfile.xml

If you're running this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
This User Gave Thanks to Don Cragun For This Post:
# 3  
Old 02-12-2016
Don,
Your solution works. Thanks a lot for your effort.
Here is my bash and OS version, so I've got better ammo here :-).
Code:
GNU bash, version 3.2.51(1)-release (x86_64-suse-linux-gnu)

I am wondering whether the solution can be simplified by breaking the text into groups with this delimiter:
Code:
<application>..</application>

Then simply select the group that contains $strear.
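
The grouping idea above can be sketched with awk's record separator. This is only a sketch under some assumptions: a multi-character RS is an extension (gawk, mawk and busybox awk accept it, strict POSIX awk does not), <application> blocks are not nested, and the function name extract_app_block is made up for illustration.

```shell
# Hypothetical helper: print the <application> block whose
# <application-name> equals $1, reading the XML file named by $2.
# Assumes an awk that accepts a multi-character RS (gawk, mawk).
extract_app_block() {
    awk -v app_name="$1" -v RS='</application>' '
    # Each record is the text up to (not including) one closing tag.
    index($0, "<application-name>" app_name "</application-name>") &&
    match($0, /[ \t]*<application>/) {
        # Print from the opening tag onward and put the closing tag back.
        print substr($0, RSTART) RS
    }' "$2"
}
```

It would be called as extract_app_block "$strear" xmlfile.xml. Because index() does a plain string comparison, characters such as "." in the application name are not treated as regex metacharacters.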
# 4  
Old 02-12-2016
If the awk script I suggested was too complicated for you, you could try this simple ed script:
Code:
#!/bin/ksh
strear='myapp1-ear'

ed -s xmlfile.xml <<EOF
g/<application-name>$strear<\/application-name>/?<application>?,/<\/application>/p
EOF

This will work with any shell that recognizes basic Bourne shell syntax (so you can use bash instead of ksh if you want to).

If I knew more details about your XML file tags, the BREs in the above script could probably be significantly simplified. With the limited information provided, these verbose BREs should accurately perform the requested operation as long as the opening <application> and closing </application> tags are on lines by themselves as shown in your sample data.

If you still don't like this, feel free to use your better ammo.
# 5  
Old 02-12-2016
Please try this Perl version. The search parameter (myapp1-ear in the first command) is the part you might want to change. kchinnam.xml is the modified file I tested against.

Code:
cat kchinnam.xml

Code:
<application>
        <application-name>myapp1-ear</application-name>
        <ear-file-name>myapp1-ear.ear</ear-file-name>
        <edition></edition>
        <shared-library-name></shared-library-name>
</application>
<nothinghere>
    <test>test-in-case-of-other-blocks-inserted</test>
</nothinghere>
<application>
        <application-name>myapp2-ear</application-name>
        <ear-file-name>myapp2-ear.ear</ear-file-name>
        <edition></edition>
        <shared-library-name></shared-library-name>
                <CookieSettings>
                           <path>/</path>
                  </CookieSettings>
        </options>
</application>

Code:
perl -ne 'BEGIN{$/="</application>\n";} print m|(<application>.*myapp1-ear.*$/)|ms' kchinnam.xml

Code:
<application>
        <application-name>myapp1-ear</application-name>
        <ear-file-name>myapp1-ear.ear</ear-file-name>
        <edition></edition>
        <shared-library-name></shared-library-name>
</application>

Code:
perl -ne 'BEGIN{$/="</application>\n";} print m|(<application>.*myapp2-ear.*$/)|ms' kchinnam.xml

Code:
<application>
        <application-name>myapp2-ear</application-name>
        <ear-file-name>myapp2-ear.ear</ear-file-name>
        <edition></edition>
        <shared-library-name></shared-library-name>
                <CookieSettings>
                           <path>/</path>
                  </CookieSettings>
        </options>
</application>

# 6  
Old 02-12-2016
Don, the ed solution worked great. I have never used ed, so I need to understand how it works. The syntax looks very close to sed. I wish I could use a single-line command like sed for this.
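
For the sed wish, one possible approach is to collect each block in the hold space and print it only if it contains the wanted name. This is a rough sketch, not something posted in the thread: the function name extract_app_sed is invented, and it assumes the opening and closing <application> tags sit on their own lines (as in the sample data) and that the name holds no characters that are special in a regular expression or in a sed address.

```shell
# Hypothetical helper: $1 is the application name, $2 is the XML file.
extract_app_sed() {
    sed -n "
    /<application>/,/<\/application>/{
        # Append every line of the block to the hold space.
        H
        # On the opening tag, restart the hold space with just that line.
        /<application>/h
        /<\/application>/{
            # Block complete: fetch it and print it only if the name matches.
            x
            /<application-name>$1<\/application-name>/p
        }
    }" "$2"
}
```

It would be called as extract_app_sed "$strear" xmlfile.xml. The script is in double quotes so that the shell expands $1 before sed sees it.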

Aia, your solution worked once I removed the leading spaces before the <application> tag. I tried the command below to allow spaces, but it is not working:
Code:
perl -ne 'BEGIN{$/="\s.*</application>\n";} print m|(\s.*<application>.*myapp1-ear.*/)|ms' xmlfile.xml

Can we tell it to ignore spaces before and after the </application> tag?
# 7  
Old 02-12-2016
Quote:
Originally Posted by kchinnam
Aia, your solution worked once I removed the leading spaces before the <application> tag. I tried the command below to allow spaces, but it is not working:
[...]
Can we tell it to ignore spaces before and after the </application> tag?
Is your posted data not a true representation of the real file?