Extracting content from xml file


 
# 8  
Old 05-04-2013
Give it a try, suvendu4urs. Take a look at the sample programs for Spreadsheet::WriteExcel on CPAN. Let us know what you come up with and where you get stuck.
This User Gave Thanks to balajesuri For This Post:
# 9  
Old 05-05-2013
Hello Balajesuri,

Things have started working as required. I am just missing a few ideas. Here is my approach:

- Extracted the content from the XML file using your Perl script:
Code:
perl -ne 'if (/documentation/){while(/(HLD_\w+)/g){print "$1 "};print "\n"}' file

- Wrote the script below, which takes two files as command-line arguments:
the first is the output of the step above,
the second is the name of the new .xls file.
Code:
#!/usr/bin/perl

use strict;
use warnings;
use Spreadsheet::WriteExcel;

# Check for valid number of arguments
die "Usage: tab2xls tabfile.txt newfile.xls\n" unless @ARGV == 2;

# Open the whitespace-delimited input file
open(my $tabfile, '<', $ARGV[0]) or die "$ARGV[0]: $!";

# Create a new Excel workbook
my $workbook = Spreadsheet::WriteExcel->new($ARGV[1])
    or die "$ARGV[1]: $!";
my $worksheet = $workbook->add_worksheet();

# Row and column are zero indexed
my $row = 0;

while (<$tabfile>) {
    chomp;
    # Split on whitespace; split(' ') also ignores leading whitespace
    my @Fld = split(' ', $_);

    my $col = 0;
    foreach my $token (@Fld) {
        $worksheet->write($row, $col, $token);
        $col++;
    }
    $row++;
}

close($tabfile);
$workbook->close();

Wherever there is a space, I split the line. The data is now fed into the Excel sheet,
but multiple columns are created.

I will try to figure out a solution for this. Your ideas and guidance are always welcome.
# 10  
Old 05-05-2013
Quote:
Originally Posted by suvendu4urs
while (<TABFILE>) {
chomp;
# Split on single space
my @Fld = split(' ', $_);
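It's generally a good idea to split on whitespace (\s+) explicitly, so the code works whether the columns are separated by spaces or tabs. Note that Perl's special-case split(' ', $_) already splits on runs of whitespace and skips leading whitespace; /\s+/ makes the intent explicit, but it yields an empty first field if the line starts with whitespace.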
Code:
# Split on white spaces
my @Fld = split(/\s+/, $_);

# 11  
Old 05-05-2013
Thanks for your input.

Now I have all the required details in different columns. Here is the result:
Code:
1st column     2nd column       3rd column       4th column
HLD_EA_0001X   HLD_DOORS_002X
HLD_EA_003X    HLD_DOORS_003X   HLD_DOORS_0021   HLD_DOORS_XXX
HLD_EA_0232X   HLD_DOORS_003X   HLD_DOORS_ijkl   HLD_DOORS_CDKL

But I don't want the 3rd and 4th columns.
All the HLD_EA tags need to be in the 1st column.
All the corresponding HLD_DOORS tags need to be in the 2nd column, so the data now in the 3rd and 4th columns should move into the 2nd column.

I hope my question is clear.

---------- Post updated at 11:12 AM ---------- Previous update was at 11:10 AM ----------

The output should look like this:

Code:
1st column     2nd column
HLD_EA_0001X   HLD_DOORS_002X
HLD_EA_003X    HLD_DOORS_003X
               HLD_DOORS_0021
               HLD_DOORS_XXX
HLD_EA_0232X   HLD_DOORS_003X
               HLD_DOORS_ijkl
               HLD_DOORS_CDKL

# 12  
Old 05-10-2013
Hello guys,
Anyway, I have got what I needed.

Now I am trying something different from the pattern extraction.

Below is my XML:
Code:
<UML:TaggedValue tag="documentation" value="This sequence HLD_EA_0001X SRS_DOORS_002X"/>
<UML:TaggedValue tag="documentation" value="This sequence HLD_EA_0231X SRS_DOORS_003X;SRS_DOORS_0021"/>
<UML:TaggedValue tag="documentation" value="This sequence HLD_EA_0232X SRS_DOORS_003X;SRS_DOORS_ijkl"/>
<UML:TaggedValue tag="documentation" value="This sequence HLD_EA_0345X SRS_DOORS_05762X;SRS_DOORS_aasja"/>
<UML:TaggedValue tag="documentation" value="This sequence HLD_EA_0001X SRS_DOORS_002X"/>
<UML:TaggedValue tag="documentation" value="This sequence HLD_EA_0001X SRS_DOORS_002X"/>
<UML:TaggedValue tag="documentation" value="This sequence HLD_EA_0001X SRS_DOORS_002X"/>
<UML:TaggedValue tag="documentation" value="This sequence HLD_EA_0001X SRS_DOORS_002X"/>

I know that the code below extracts the HLD tag:
Code:
perl -ne 'if (/documentation/){while(/(HLD_\w+)/g){print "$1 "};print "\n"}' file

I want to change the code above so that both the HLD tag and the corresponding SRS tags are extracted.

Some change in the while loop is needed to achieve this, but I can't work out exactly what.