06-17-2008
splitting huge xml into multiple files
Hi all,
I have some huge HTML files (500 MB to 1 GB). Each file contains multiple
<html></html> blocks, one after another:
<html>
.................
....................
....................
</html>
<html>
.................
....................
....................
</html>
<html>
.................
....................
....................
</html>
..........
..........
I want to split these HTML files into smaller files, each beginning with <html> and ending with </html>:
<html>
.......
.......
</html>
<html>
..........
..........
</html>
Kindly suggest a Perl/awk/sed or any other shell script solution.
Thanks and Regards,
uttam hoode
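One possible approach, sketched in Python rather than the Perl/awk/sed asked for above. It streams the input line by line, so a 1 GB file is never held in memory. Everything here is an assumption for illustration: the function name `split_html_blocks`, the `blocks_per_file` and `prefix` parameters, the output naming scheme, and above all the assumption that each closing `</html>` tag sits on a line of its own, as in the sample.

```python
import os

def split_html_blocks(src_path, out_dir, blocks_per_file=1, prefix="part"):
    """Stream src_path and write complete <html>...</html> blocks to
    numbered files in out_dir. Returns the list of paths created.

    Assumes (hypothetically) that every closing </html> tag is alone on
    its line. Files are handled as bytes so encoding never matters.
    """
    os.makedirs(out_dir, exist_ok=True)
    written = []        # paths of the output files created so far
    out = None          # currently open output file, if any
    blocks_in_out = 0   # complete blocks already written to `out`

    with open(src_path, "rb") as src:
        for line in src:
            if out is None:
                if not line.strip():
                    continue  # skip blank lines between output files
                path = os.path.join(
                    out_dir, f"{prefix}_{len(written) + 1:05d}.html"
                )
                out = open(path, "wb")
                written.append(path)
                blocks_in_out = 0
            out.write(line)
            # A line holding only </html> (any case) ends one block.
            if line.strip().lower() == b"</html>":
                blocks_in_out += 1
                if blocks_in_out >= blocks_per_file:
                    out.close()
                    out = None

    # Trailing content without a closing tag stays in the last file.
    if out is not None:
        out.close()
    return written
```

Calling `split_html_blocks("big.html", "out")` would write one block per file as `out/part_00001.html`, `out/part_00002.html`, and so on; passing `blocks_per_file=1000` would group blocks into fewer, larger files. If the tags can share a line with other content, the line-based test above is not sufficient and a different boundary check would be needed.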