Split XML file based on tags


 
# 1  
Old 11-20-2013

Hello all,

Please help me with the requirement below.

I want to split an XML file based on its tags. Here is the file format:

Code:
<data-set>
some-information
</data-set>
<data-set1>
some-information
</data-set1>
<data-set2>
some-information
</data-set2>

I want to split the above file into 3 files based on the <data-set> tags.
# 2  
Old 11-20-2013
Try:

Code:
$ awk '/^<d/{close(f);f=$1;gsub(/[[:punct:]]/,x,f); f=f".txt"}{print >f}' file

Resulting files:

Code:
$ ls -1 dataset*.txt
dataset1.txt
dataset2.txt
dataset.txt

Code:
$ cat dataset.txt
<data-set>
some-information
</data-set>

$ cat dataset1.txt 
<data-set1>
some-information
</data-set1>

$ cat dataset2.txt 
<data-set2>
some-information
</data-set2>

Or, to keep the hyphens in the file names (data-set.txt, data-set1.txt, data-set2.txt):
Code:
$ awk -F'[<>]' '/^<d/{close(f);f=$2".txt"}{print >f}' file


Last edited by Akshay Hegde; 11-20-2013 at 10:20 AM..
# 3  
Old 11-20-2013
Thanks Akshay,

However, I am getting the error below:

awk: A print or getline function must have a file name.
The input line number is 1. The file is HSBC_new.xml.
The source line number is 1.
# 4  
Old 11-20-2013
Which OS are you using?

If you are running this on a Solaris/SunOS system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of awk.
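
One way to apply this advice in a script is to probe for the candidates in order and fall back to plain awk. This is only a sketch: the variable name AWK is made up for illustration, and the search order is an assumption based on the list above.

```shell
# Pick the first awk implementation that exists on this system.
# The candidate paths are the ones named in this post; AWK is a
# hypothetical variable name used only in this sketch.
AWK=
for candidate in /usr/xpg4/bin/awk /usr/xpg6/bin/awk nawk awk; do
  if command -v "$candidate" >/dev/null 2>&1; then
    AWK=$candidate
    break
  fi
done

# Quick sanity check that the chosen awk actually runs.
"$AWK" 'BEGIN { print "using a working awk" }'
```

Any of the one-liners in this thread can then be run as "$AWK" '...' file instead of awk '...' file.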

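Your original file seems to start with a blank line or some other line that does not begin with <d. The file name variable f is therefore still empty when awk reaches print on input line 1, which is why you are getting this error.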
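
A guarded variant of the -F'[<>]' one-liner from post #2 might look like the sketch below. It assumes the stray lines (a blank line, an XML declaration) come before the first <d...> tag; the file name sample.xml and its contents are made up for illustration.

```shell
# Hypothetical sample input: a blank line and an XML declaration
# precede the first tag, which is what triggers the error.
printf '%s\n' '' '<?xml version="1.0"?>' \
  '<data-set>' 'some-information' '</data-set>' \
  '<data-set1>' 'some-information' '</data-set1>' > sample.xml

# Two guards are added to the original one-liner:
#   - close(f) is only called once an output file has been named
#   - lines seen before the first <d...> tag are skipped rather than
#     printed to an empty file name
awk -F'[<>]' '
  /^<d/ { if (f) close(f); f = $2 ".txt" }
  f     { print > f }
' sample.xml

ls -1 data-set*.txt
```

The lines before the first tag are dropped, not copied into any output file. On Solaris the note above about which awk to use still applies.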

Last edited by Akshay Hegde; 11-20-2013 at 10:31 AM..
# 5  
Old 11-20-2013
Another awk approach, which assumes that every block is exactly three lines long, as in the sample:
Code:
awk 'NR%3==1{close(f);f="file"++c".xml"}{print > f}' file.xml
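
If the blocks are not all the same length, counting lines no longer works. A possible sketch that starts a new numbered file at each opening tag instead is shown below. It assumes that only top-level opening tags begin a line with "<" (nested tags at the start of a line would also trigger a split); blocks.xml and the part prefix are hypothetical names.

```shell
# Hypothetical sample input with blocks of different lengths.
printf '%s\n' \
  '<data-set>' 'line one' 'line two' '</data-set>' \
  '<data-set1>' 'only line' '</data-set1>' > blocks.xml

# Start a new numbered file at every line that begins with "<" but
# not "</", instead of relying on a fixed line count per block.
awk '
  $0 ~ "^<[^/]" { if (f) close(f); f = "part" (++c) ".xml" }
  f             { print > f }
' blocks.xml
```

With this input, the first block goes to part1.xml and the second to part2.xml.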

This User Gave Thanks to Yoda For This Post:
# 6  
Old 01-29-2014
Hello,

Here is another approach to the same problem.

Code:
awk 'NR==FNR{a[$1];next} ($1 in a) {if($1 ~ /\<.*/) {f=1;j=$0} {if($1 ~ /\<\/.*/) { f=0;k=$0}} {if(f==1 && $1 !~ /\<.*/)  val=j"\n"$0 } {if(f==0 && $1 ~ /\<\/.*/) {val=val"\n"k}}} !f{print val > "file_"i++".txt"}' split_files_accordingly split_files_accordingly

Output will be 3 files named file_0.txt, file_1.txt and file_2.txt.

Code:
$ cat file_0.txt
<data-set>
some-information
</data-set>

$ cat file_1.txt
<data-set1>
some-information
</data-set1>

$ cat file_2.txt
<data-set2>
some-information
</data-set2>


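NOTE: split_files_accordingly is the input file. It is named twice on the command line because the script reads it in two passes; the NR==FNR block handles the first pass.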


Thanks,
R. Singh