Extracting content from xml file


 
# 1  
Old 05-02-2013

Hello All,

Hope you are doing well!

I have a small snippet in the following format in an XML file:
Code:
<UML:ModelElement.taggedValue>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0001X&#xA;HLD_DOORS_002X"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0231X&#xA;HLD_DOORS_003X;HLD_DOORS_0021"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0232X&#xA;HLD_DOORS_003X;HLD_DOORS_ijkl"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0345X&#xA;HLD_DOORS_05762X;HLD_DOORS_aasja"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0001X&#xA;HLD_DOORS_002X"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0001X&#xA;HLD_DOORS_002X"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0001X&#xA;HLD_DOORS_002X"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0001X&#xA;HLD_DOORS_002X"/>
				<UML:TaggedValue tag="version" value="1.0"/>
				<UML:TaggedValue tag="author" value="suvendu.rath"/>
				<UML:TaggedValue tag="created_date" value="2013-05-02 10:40:30"/>
				<UML:TaggedValue tag="modified_date" value="2013-05-02 16:59:06"/>
				<UML:TaggedValue tag="package" value="EAPK_6C9E48AC_4D1E_4953_A547_C222079BD1DD"/>
				<UML:TaggedValue tag="type" value="Sequence"/>
				<UML:TaggedValue tag="swimlanes" value="locked=false;orientation=0;width=0;inbar=false;names=false;color=0;bold=false;fcol=0;;cls=0;"/>
				<UML:TaggedValue tag="matrixitems" value="locked=false;matrixactive=false;swimlanesactive=true;width=1;"/>
				<UML:TaggedValue tag="ea_localid" value="63"/>
				<UML:TaggedValue tag="EAStyle" value="ShowPrivate=1;ShowProtected=1;ShowPublic=1;HideRelationships=0;Locked=0;Border=1;HighlightForeign=1;PackageContents=1;SequenceNotes=0;ScalePrintImage=0;PPgs.cx=2;PPgs.cy=1;DocSize.cx=850;DocSize.cy=1098;ShowDetails=0;Orientation=P;Zoom=100;ShowTags=0;OpParams=1;VisibleAttributeDetail=0;ShowOpRetType=1;ShowIcons=1;CollabNums=0;HideProps=0;ShowReqs=0;ShowCons=0;PaperSize=1;HideParents=0;UseAlias=0;HideAtts=0;HideOps=0;HideStereo=0;HideElemStereo=0;ShowTests=0;ShowMaint=0;ConnectorNotation=UML 2.1;ExplicitNavigability=0;AdvancedElementProps=1;AdvancedFeatureProps=1;AdvancedConnectorProps=1;ShowNotes=0;SuppressBrackets=0;SuppConnectorLabels=0;PrintPageHeadFoot=0;ShowAsList=0;"/>
				<UML:TaggedValue tag="styleex" value="ExcludeRTF=0;DocAll=0;HideQuals=0;AttPkg=1;ShowTests=0;ShowMaint=0;SuppressFOC=0;INT_ARGS=;INT_RET=;INT_ATT=;SeqTopMargin=50;MatrixActive=0;SwimlanesActive=1;MatrixLineWidth=1;MatrixLocked=0;TConnectorNotation=UML 2.1;TExplicitNavigability=0;AdvancedElementProps=1;AdvancedFeatureProps=1;AdvancedConnectorProps=1;ProfileData=;MDGDgm=;STBLDgm=;ShowNotes=0;VisibleAttributeDetail=0;ShowOpRetType=1;SuppressBrackets=0;SuppConnectorLabels=0;PrintPageHeadFoot=0;ShowAsList=0;"/>
</UML:ModelElement.taggedValue>

I want to export the tags that start with HLD_EA and HLD_DOORS.
These tags appear only in these lines:

Code:
<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0001X&#xA;HLD_DOORS_002X"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0231X&#xA;HLD_DOORS_003X;HLD_DOORS_0021"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0232X&#xA;HLD_DOORS_003X;HLD_DOORS_ijkl"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0345X&#xA;HLD_DOORS_05762X;HLD_DOORS_aasja"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0001X&#xA;HLD_DOORS_002X"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0001X&#xA;HLD_DOORS_002X"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0001X&#xA;HLD_DOORS_002X"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0001X&#xA;HLD_DOORS_002X"/>

Now I want to write these tags to an Excel sheet (or any other file type).

My file should look like this:

HLD_EA_0001X HLD_DOORS_002X
HLD_EA_0231X HLD_DOORS_003X HLD_DOORS_0021
HLD_EA_0232X HLD_DOORS_003X HLD_DOORS_ijkl


Can you please help me write this script?

Thanks,
Suvendu
# 2  
Old 05-02-2013
Please show us your attempts at the solution.
# 3  
Old 05-02-2013
I am not an expert in shell scripting,
but this is my approach:

Code:
sed -n '/<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;"[HLD]/,/<\/Variable>/{
s/.*=\("[^"]*"\).*/\1/
t prnt
b
:prnt
p
}' file
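
The attempt above prints nothing for the sample: the start pattern of the range expects a literal `"` followed by one of the letters H, L or D right after `&#xA;&#xA;`, which never occurs in the file, and there is no `</Variable>` line to end the range. A minimal sketch of the same sed idea that does match, assuming each documentation tag sits on a single line (`doc_values` is a hypothetical helper name, not something from the thread):

```shell
# Hypothetical helper: print the value="..." text of every documentation tag.
# Reads the named files, or stdin when no file is given.
doc_values() {
    sed -n '/tag="documentation"/s/.*value="\([^"]*\)".*/\1/p' "$@"
}

# Demo on an inline sample rather than a real file.
printf '%s\n' \
    '<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0001X&#xA;HLD_DOORS_002X"/>' \
    '<UML:TaggedValue tag="version" value="1.0"/>' |
    doc_values
# prints: This sequence&#xA;&#xA;HLD_EA_0001X&#xA;HLD_DOORS_002X
```

This prints the whole attribute value, entities included; pulling out the individual HLD_ IDs still needs a second step.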

# 4  
Old 05-02-2013
Code:
grep "documentation" file | grep -o -E "HLD_[0-9a-zA-Z_]+"
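
The pipeline above prints one ID per line (and `-o` is a grep extension that not every grep has). To get one row per documentation tag, as in the requested output, a sketch in plain awk could look like this. It assumes each tag sits on one line and reuses the same ID pattern; `hld_rows` is a hypothetical helper name:

```shell
# Hypothetical helper: print all HLD_ IDs of each documentation tag on one row.
# Reads the named files, or stdin when no file is given.
hld_rows() {
    awk '/tag="documentation"/ {
        row = ""
        # Peel off one ID at a time, then continue with the rest of the line.
        while (match($0, /HLD_[0-9A-Za-z_]+/)) {
            row = row (row == "" ? "" : " ") substr($0, RSTART, RLENGTH)
            $0 = substr($0, RSTART + RLENGTH)
        }
        if (row != "") print row
    }' "$@"
}

# Demo on an inline sample rather than a real file.
printf '%s\n' \
    '<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0231X&#xA;HLD_DOORS_003X;HLD_DOORS_0021"/>' \
    '<UML:TaggedValue tag="version" value="1.0"/>' |
    hld_rows
# prints: HLD_EA_0231X HLD_DOORS_003X HLD_DOORS_0021
```

On a real file this would be called as `hld_rows file > hld_ids.txt`, where the output name is only an example.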

# 5  
Old 05-03-2013
Thanks a lot for your reply.

The script works fine.

Code:
<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0001X&#xA;HLD_DOORS_002X"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0231X&#xA;HLD_DOORS_003X;HLD_DOORS_0021"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0232X&#xA;HLD_DOORS_003X;HLD_DOORS_ijkl"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0345X&#xA;HLD_DOORS_05762X;HLD_DOORS_aasja"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0001X&#xA;HLD_DOORS_002X"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0001X&#xA;HLD_DOORS_002X"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0001X&#xA;HLD_DOORS_002X"/>
				<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0001X&#xA;HLD_DOORS_002X"/>

From the code above, I want to extract the content into an Excel file.
For example, my Excel file should look like this:


Code:
1st Column in Excel    2nd Column in Excel
HLD_EA_0001X           HLD_DOORS_002X
HLD_EA_0231X           HLD_DOORS_003X
                       HLD_DOORS_0021
HLD_EA_0232X           HLD_DOORS_003X
                       HLD_DOORS_ijkl


Hope you have understood my question.

I do not know whether this is possible in a shell script, but as far as I know it is possible in Perl. Please share your ideas and suggestions.

---------- Post updated at 08:48 AM ---------- Previous update was at 02:40 AM ----------

Any suggestions, guys?

Can I do this in Perl with "use Spreadsheet::WriteExcel"?

Or is a simpler solution possible?

Last edited by Scrutinizer; 05-04-2013 at 10:14 AM.. Reason: icode to code tags
# 6  
Old 05-03-2013
Code:
perl -ne 'if (/documentation/) { print join(" ", /(HLD_\w+)/g), "\n" }' file

And yes, you may use Spreadsheet::WriteExcel module to write to an xls file.
# 7  
Old 05-04-2013
Hello Balajesuri,

Thanks for your reply. It works fine.

I am trying to do the same using Spreadsheet::WriteExcel.

HLD_DOORS_XXX needs to be in the first column of the xls file.
The corresponding HLD_EA_XXX needs to be in the second column.

Any ideas or suggestions are always welcome.
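
If a real .xls file is not strictly required (the first post allowed any file type), one possible shortcut is a CSV file, which Excel opens with one field per column. The sketch below is only one reading of the request: it assumes each documentation tag sits on one line and carries at most one HLD_EA ID, that this ID pairs with every HLD_DOORS ID on the same line, and that a header row is wanted. The helper name `doors_ea_csv` and the header text are made up for illustration:

```shell
# Hypothetical helper: emit CSV rows "HLD_DOORS id,HLD_EA id",
# one row per DOORS ID found in a documentation tag.
# Reads the named files, or stdin when no file is given.
doors_ea_csv() {
    awk '
    BEGIN { print "HLD_DOORS,HLD_EA" }      # header row (an assumption)
    /tag="documentation"/ {
        ea = ""; n = 0
        # Collect the IDs on this line, split by prefix.
        while (match($0, /HLD_[0-9A-Za-z_]+/)) {
            id = substr($0, RSTART, RLENGTH)
            $0 = substr($0, RSTART + RLENGTH)
            if (id ~ /^HLD_EA_/) ea = id
            else if (id ~ /^HLD_DOORS_/) doors[++n] = id
        }
        # First column: DOORS ID, second column: its EA ID.
        for (i = 1; i <= n; i++) print doors[i] "," ea
    }' "$@"
}

# Demo on an inline sample rather than a real file.
printf '%s\n' \
    '<UML:TaggedValue tag="documentation" value="This sequence&#xA;&#xA;HLD_EA_0231X&#xA;HLD_DOORS_003X;HLD_DOORS_0021"/>' |
    doors_ea_csv
# prints:
# HLD_DOORS,HLD_EA
# HLD_DOORS_003X,HLD_EA_0231X
# HLD_DOORS_0021,HLD_EA_0231X
```

On a real file this would be called as `doors_ea_csv file > hld.csv`, with the output name again only an example. Writing a native .xls would still need something like Spreadsheet::WriteExcel, as suggested above.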