Using AWK to separate data from a large XML file into multiple files
I have a 500 MB XML file from a FileMaker database export. It's formatted horribly (no line breaks at all). The node structure is basically:
There are two things I need to get out of that file:
1. I'd like to generate an XML file that contains just everything within the <METADATA> nodes (the <FIELD> nodes), and I'll name it fields.xml.
2. Then I'd like to generate an XML file for each individual <ROW> node, named incrementally: row1.xml, row2.xml, etc.
I'm using AWK via Terminal in OS X Leopard. I'm not sure how to go about item #1, but for #2 I tried the following:
This produces a syntax error at line 1 when executed.
Can anyone help me out with these issues? What am I doing wrong?
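One way to approach both items at once, given that the file has no line breaks, is to make awk treat ">" as the record separator so that every tag becomes its own record. The following is only a minimal sketch under stated assumptions, not a tested solution for the real export: the file name sample.xml, the wrapper elements, and the attribute names are hypothetical stand-ins, and it assumes the opening tags appear literally as `<METADATA` and `<ROW` (followed by a space or the closing ">") and that none of the data sits in CDATA sections.

```shell
# Build a tiny single-line sample (hypothetical data) to try the idea on.
printf '%s' '<FMPXMLRESULT><METADATA><FIELD NAME="id"/><FIELD NAME="name"/></METADATA><RESULTSET><ROW RECORDID="1"><COL><DATA>a</DATA></COL></ROW><ROW RECORDID="2"><COL><DATA>b</DATA></COL></ROW></RESULTSET></FMPXMLRESULT>' > sample.xml

awk '
BEGIN { RS = ">"; ORS = ">" }                  # one record per tag, since there are no newlines
/<METADATA( |$)/     { out = "fields.xml" }     # opening METADATA tag: start fields.xml
/<ROW( |$)/          { out = "row" (++n) ".xml" }  # opening ROW tag: start the next row file
out != ""            { print > out }            # inside a node: copy the record plus its ">"
/<\/(ROW|METADATA)$/ { close(out); out = "" }   # closing tag: finish and close the file
' sample.xml
```

To try it on the real export, replace sample.xml with db.xml. The same awk program (everything between the single quotes) can also be saved in a file such as split.awk and run with `awk -f split.awk db.xml`. Note that fields.xml keeps the surrounding <METADATA> element so that the result has a single root node, and that each file is closed as soon as its node ends, which avoids running out of open file handles on a large export.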
Sorry, I'm a complete AWK beginner. I've been programming for about 8 years, but only learned of AWK about an hour before I posted.
Let me make sure I understand everything completely. This is what I'm trying, step by step; please correct me where I'm wrong:
1. I have my working directory, and in it I have the db.xml file.
2. I create a file called split.awk inside my working directory and put the following contents in it:
3. I open up Terminal, cd to my working directory, and then execute:
When I execute that, I just get an error saying awk can't find the file.
Again, sorry for being such a beginner. Now that I know AWK exists, I plan to purchase a few books on it and dive into how I can apply it in my day-to-day programming.
What is the exact output of awk? This seems to happen mostly when invisible characters are introduced into the awk file while copying the text into the .awk file. Also make sure all the files are readable and the directory is writable by the account you use to open the terminal window.
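The checks suggested above can be sketched as follows. This is only an illustration: pasted.awk and clean.awk are hypothetical file names, and carriage returns are just one example of an invisible character that can slip in when copying a script from a web page.

```shell
# Simulate a script pasted with Windows line endings (hypothetical file name).
printf 'BEGIN { n = 0 }\r\n{ n++ }\r\nEND { print n }\r\n' > pasted.awk

# cat -v makes control characters visible; a carriage return shows up as ^M.
cat -v pasted.awk

# Write a cleaned copy with the carriage returns removed.
tr -d '\r' < pasted.awk > clean.awk

# Confirm the script is readable and the working directory is writable.
[ -r clean.awk ] && [ -w . ] && echo "permissions look fine"
```

If `cat -v` shows anything other than the characters you typed, retyping the script by hand or cleaning it as above is usually quicker than hunting for the stray byte.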
Split large XML into multiple files, with header and footer in each file
I tried the command below.
It splits unevenly, and I also need help adding a header and footer.
Command:
csplit -s -k -f my_XML_split.xml extrfile.xml "/<Document>/" {1}
Sample XML:
<?xml version="1.0" encoding="UTF-8"?><Recipient>... (36 Replies)
Greetings experts,
I have 2 input files: the first has 1 record per line; in the second, multiple lines constitute 1 record, hence I declared RS=";".
Now, in the first file, each line ends with ";", but \n is also being considered part of the data, due to which
I am... (1 Reply)
Hi,
I have an XML file with multiple XML headers, so I want to split it into multiple files.
Sample.xml contains multiple headers; how can we split it at these headers into multiple files in Unix?
E.g.:
<?xml version="1.0" encoding="UTF-8"?>
<ml:individual... (3 Replies)
Hi there, I'm camor and I'm trying to process huge files with bash scripting and awk.
I've got a dataset folder with 10 files (16 million rows and 600 MB each), and I've got a sorted file with all the keys inside.
For example:
a sample_1 200
a.b sample_2 10
a sample_3 10
a sample_1 10
a... (4 Replies)
Hi,
I have one requirement: create separate ".csv" files from one Excel file (xlsx) with multiple sheets. These ".csv" files are my source files. Could anybody please suggest a process?
Thanks in Advance.
Regards,
Harris (3 Replies)
Hi,
I have a data file xyz.dat similar to the one given below,
2345|98|809||x|969|0
2345|98|809||y|0|537
2345|97|809||x|544|0
2345|97|809||y|0|651
9685|98|809||x|321|0
9685|98|809||y|0|357
9685|98|709||x|687|0
9685|98|709||y|0|234
2315|98|809||x|564|0
2315|98|809||y|0|537... (2 Replies)
Howdy Folks,
I have a list that looks like this:
(file2.txt)
AAA
BBB
CCC
DDD
and there are 24 of these short words.
I am matching these patterns to another file with 755795 lines (file1.txt).
I have this code for matching:
awk -v f2=file2.txt '
BEGIN {
while(... (2 Replies)
Hi,
I'd like to process multiple files. For example:
file1.txt
file2.txt
file3.txt
Each file contains several lines of data. I want to extract a piece of data and output it to a new file.
file1.txt ----> newfile1.txt
file2.txt ----> newfile2.txt
file3.txt ----> newfile3.txt
Here is... (3 Replies)
Hi all,
I am new to the world of shell scripting.
I want to extract two columns from multiple files, say around 25 files,
and I want a separate output file for each input file.
I tried using the following command to extract two columns from the 25 files:
awk... (2 Replies)
I have a file with a simple list of IDs, 750,000 rows. I have to break it down into multiple 50,000-row files to submit in a batch process. Is there an easy script I could write to accomplish this task? (2 Replies)