Reading XML data in a FLAT FILE


 
# 1  
Old 07-27-2011

I need to read an XML file and split it into two different files using a Unix shell script. Could anyone please help me with this requirement?

Sample file
---------------
Code:
0,<?xml version="1.0" encoding="UTF-8" standalone="yes"?> 
<Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> 
<s>
<Name>aaa</Name>
<age>12</age>
</s>
</Information>,<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><s>
<Name>aaa</Name><age>12</age></s></Information>
1,<?xml version="1.0" encoding="UTF-8" standalone="yes"?> 
<Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> 
<s>
<Name>bbb</Name>
<age>12</age>
</s>
</Information>,<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><s>
<Name>bbb</Name><age>12</age></s></Information>

---------------

Expected output:
output1.xml
---------------
Code:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?> 
<Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> 
<s>
<Name>aaa</Name>
<age>12</age>
</s>
</Information>

---------------

output2.xml
---------------
Code:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><s>
<Name>aaa</Name><age>12</age></s></Information>

---------------

After output1.xml and output2.xml have been processed, both files need to be purged.

The second record of the sample file (the one starting with "1,") then has to be read and written to output1.xml and output2.xml, which are created again during this step.

Expected output:
output1.xml
---------------
Code:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?> 
<Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"> 
<s>
<Name>bbb</Name>
<age>12</age>
</s>
</Information>

---------------

output2.xml
---------------
Code:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><s>
<Name>bbb</Name><age>12</age></s></Information>

---------------

The whole sample file has to be processed this way in a loop.

Could anyone help me to resolve this?

Thanks in Advance

Krishnakanth Manivannan

Last edited by pludi; 07-27-2011 at 06:01 PM..
# 2  
Old 07-27-2011
How about this? It assumes each XML document begins with "<?xml" and writes the documents to output1.xml, output2.xml, and so on.

Code:
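awk '
# "N,<?xml" at the start of a line begins the first document of a set:
# drop the "N," prefix and start a new output file.
/^[0-9][0-9]*,<\?xml/ {
    if (OUT) close(OUT)
    FNUM++
    OUT = "output" FNUM ".xml"
    sub(/^[0-9]*,/, "")
    print > OUT
    next
}
# ",<?xml" in the middle of a line ends one document and begins the next:
# the text before the comma closes the current file, the rest opens a new one.
FNUM && /,<\?xml/ {
    FIRST = $0
    sub(/,<\?xml.*/, "", FIRST)
    print FIRST > OUT
    close(OUT)
    FNUM++
    OUT = "output" FNUM ".xml"
    sub(/.*,<\?xml/, "<?xml")
    print > OUT
    next
}
# Any other line belongs to the document currently being written.
FNUM { print > OUT }' sample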

# 3  
Old 07-28-2011
Thanks for your reply. It works fine.

One more thing: I need to do this in a loop.

sample.txt
----
0,<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<s>
<Name>aaa</Name>
<age>12</age>
</s>
</Information>,<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><s>
<Name>aaa</Name><age>12</age></s></Information>
1,<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<s>
<Name>bbb</Name>
<age>12</age>
</s>
</Information>,<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><s>
<Name>bbb</Name><age>12</age></s></Information>
----

For example, in this case output1.xml will contain
----
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<s>
<Name>aaa</Name>
<age>12</age>
</s>
</Information>
----

and output2.xml will contain
----
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><s>
<Name>aaa</Name><age>12</age></s></Information>
----

Once output1.xml and output2.xml are generated, the script will invoke the DataStage job for further processing.

Once the DataStage job completes, control has to come back to the Unix script, which has to read the next set and generate (overwrite) the same output1.xml and output2.xml with the following content.

next set
----
1,<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<s>
<Name>bbb</Name>
<age>12</age>
</s>
</Information>,<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><s>
<Name>bbb</Name><age>12</age></s></Information>
----

output1.xml
----
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<s>
<Name>bbb</Name>
<age>12</age>
</s>
</Information>
----

output2.xml should have
----
<?xml version="1.0" encoding="UTF-8" standalone="yes"?><Information xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><s>
<Name>bbb</Name><age>12</age></s></Information>
----

The code snippet you gave works fine:
----
awk '/^[0-9][0-9]*\,<\?xml/ {
    FNUM++
    gsub(/^[0-9]*\,/, "")
    print $0 > "output" FNUM ".xml"
    next
}
/.*,<\?xml/ {
    FIRST=$0
    gsub(/,<\?xml.*/, "",FIRST)
    print FIRST >> "output" FNUM ".xml"
    FNUM++
    gsub(/.*,<\?xml/, "<?xml")
    print $0 > "output" FNUM ".xml"
    next
}
FNUM { print $0 >> "output" FNUM ".xml" }' sample
----

But I need to perform this in a loop until the end of the file sample.txt.

Could you please help me out?

Thanks
Krishnakanth Manivannan
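
One possible way to get the loop described above, as a minimal sketch and not the solution the poster ended up using: split every set into a scratch directory first, then walk the sets in order, overwriting output1.xml and output2.xml each time. The names process_sets and run_job are hypothetical; run_job is only a placeholder, since the thread does not show how the DataStage job is started. The sketch also assumes mktemp -d is available and that every set has the "N,<?xml ... ,<?xml ..." shape shown in the sample.

```shell
#!/bin/sh
# Hypothetical placeholder for the DataStage invocation; replace the body
# with whatever command actually starts the job at your site.
run_job() {
    echo "processing $1 and $2"
}

# process_sets FILE: for each set in FILE, overwrite output1.xml and
# output2.xml, call run_job, then purge both files before the next set.
process_sets() {
    input=$1
    workdir=$(mktemp -d) || return 1

    # Step 1: split set N into $workdir/setN.1 and $workdir/setN.2.
    awk -v dir="$workdir" '
        function start(part) {
            if (out) close(out)
            out = sprintf("%s/set%06d.%d", dir, set, part)
        }
        /^[0-9]+,<\?xml/ {          # "N,<?xml" begins a new set
            set++
            start(1)
            sub(/^[0-9]+,/, "")
            print > out
            next
        }
        set && /,<\?xml/ {          # end of document 1, start of document 2
            head = $0
            sub(/,<\?xml.*/, "", head)
            print head > out
            start(2)
            sub(/.*,<\?xml/, "<?xml")
            print > out
            next
        }
        set { print > out }         # continuation of the current document
    ' "$input" || { rm -rf "$workdir"; return 1; }

    # Step 2: hand the sets to the job one at a time, in input order.
    for first in "$workdir"/set*.1; do
        [ -f "$first" ] || continue
        second=${first%.1}.2
        if [ ! -f "$second" ]; then
            echo "skipping incomplete set: $first" >&2
            continue
        fi
        cp "$first" output1.xml
        cp "$second" output2.xml
        run_job output1.xml output2.xml   # returns before the next set is written
        rm -f output1.xml output2.xml     # purge, as asked in the first post
    done

    rm -rf "$workdir"
}
```

With these functions defined, the loop would be started as: process_sets sample.txt. Splitting into a scratch directory first keeps the awk part close to the snippet above and leaves the job call in plain shell, where its exit status is easy to check.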
# 4  
Old 08-10-2011
Thank you, Chubler_XL. The logic you gave works.

Thanks for your help!!

Krishnakanth Manivannan