It's easy enough with low-level programming in C (or Perl). The principle is the same as in the "tail" command. You seek to byte FILESIZE/16, search backward for the nearest <page>, and remember its byte position; then you seek to 2*FILESIZE/16 and do the same, and so on. Once you have the positions, you split the file with dd (or in the same program).
If no one comes up with a solution, I'll write a program, but a little later.
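A minimal sketch of that seek-and-search idea in C, not the program promised in this thread. The names find_tag_before, split_offsets and WINDOW are made up for illustration, and it assumes a POSIX system (fseeko/ftello with off_t) so that files larger than 2 GB work:

```c
#define _FILE_OFFSET_BITS 64      /* 64-bit off_t on 32-bit systems */
#define _POSIX_C_SOURCE 200809L   /* for fseeko() and ftello() */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

#define WINDOW 4096               /* bytes read per backward step (arbitrary) */

/* Return the offset of the last occurrence of `tag` that starts at or
 * before `target`, or -1 if there is none. */
off_t find_tag_before(FILE *f, off_t target, const char *tag)
{
    size_t tlen = strlen(tag);
    char buf[WINDOW];
    off_t end = target + (off_t)tlen;      /* exclusive end of search region */

    if (tlen == 0 || tlen >= WINDOW)
        return -1;

    while (end > 0) {
        off_t start = end > WINDOW ? end - WINDOW : 0;
        size_t want = (size_t)(end - start);
        size_t got;

        if (fseeko(f, start, SEEK_SET) != 0)
            return -1;
        got = fread(buf, 1, want, f);      /* may be short near end of file */

        /* Scan the window right to left; memcmp copes with '\0' bytes. */
        if (got >= tlen) {
            for (size_t i = got - tlen + 1; i-- > 0; ) {
                if (memcmp(buf + i, tag, tlen) == 0)
                    return start + (off_t)i;
            }
        }
        if (start == 0)
            break;                         /* reached the start of the file */
        /* Overlap by tlen-1 bytes so a tag spanning two windows is seen. */
        end = start + (off_t)tlen - 1;
    }
    return -1;
}

/* Fill offsets[0 .. parts-2] with split points near i*FILESIZE/parts.
 * Returns 0 on success, -1 if a tag could not be found or on I/O error.
 * With too few tags, two split points can coincide (an empty piece). */
int split_offsets(FILE *f, int parts, const char *tag, off_t *offsets)
{
    off_t size;

    assert(parts >= 2);
    if (fseeko(f, 0, SEEK_END) != 0 || (size = ftello(f)) < 0)
        return -1;

    for (int i = 1; i < parts; i++) {
        off_t pos = find_tag_before(f, size / parts * i, tag);
        if (pos < 0)
            return -1;
        offsets[i - 1] = pos;
    }
    return 0;
}
```

Calling split_offsets(f, 16, "<page>", offsets) on a file opened with fopen(name, "rb") would fill 15 offsets; each piece could then be copied out with dd or with a plain read/write loop.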
So, it means that this task cannot be handled very well with simple shell scripting. It will take some time but will be worth trying in C. I'll write a program for it and put it here.
I wrote an awk script to do the job; check whether it is what you need.
In the directory containing your bigFile:
This will create 16 empty files, 1.txt through 16.txt. Then run this:
The code can be optimized, but first check whether it works for you.
(You can change the output file name in the code.)
Last edited by sk1418; 08-29-2011 at 07:00 AM..
Reason: the print line was removed
Quick and dirty, and not tested thoroughly, but it prints something. If there are problems with '\0' bytes or with 64-bit sizes or offsets, it would be better to translate this to Perl.
You can test this with:
Yes, there is a bug. Too quick... ))) Thanks, Corona688!
Last edited by yazu; 08-29-2011 at 12:59 PM..
Reason: Bug
Nice-looking program, though I would note one problem:
Replace NBUF with 4 and follow along:
If you're lucky, this will do nothing.
If you're unlucky, it will crash your program.
If you're very unlucky, it will corrupt stack values in strange ways that alter other local variables and cause unpredictable misbehavior.
This often results in programs that work fine when compiled for debugging, but do strange things when optimized -- suddenly memory values which didn't matter get stripped out and you're only stomping on ones that do.
That should be all it needs, I think.
Yes, a baby mistake. It has one more problem: if there are not enough WORDs in the file, it will crash (or loop forever), so you cannot test it on an arbitrary file. But I think that is impossible in your situation.
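For what it's worth, the crash-or-infinite-loop case can be avoided by making the search report failure at end of file. The following is only a sketch, not the program posted above; find_word and its 64-byte window are assumptions made for illustration. It compares with memcmp, so '\0' bytes in the input are handled as ordinary data:

```c
#include <stdio.h>
#include <string.h>

/* Scan forward from the stream's current position. Return the number of
 * bytes that precede the first occurrence of `word`, or -1 if end of file
 * is reached first. On success the stream is left just past the match. */
long find_word(FILE *f, const char *word)
{
    size_t wlen = strlen(word);
    char win[64];                 /* sliding window of the last wlen bytes */
    size_t filled = 0;
    long consumed = 0;
    int c;                        /* int, so EOF differs from byte 0xFF */

    if (wlen == 0 || wlen > sizeof win)
        return -1;

    while ((c = fgetc(f)) != EOF) {
        if (filled == wlen) {     /* window full: drop the oldest byte */
            memmove(win, win + 1, wlen - 1);
            filled--;
        }
        win[filled++] = (char)c;
        consumed++;
        if (filled == wlen && memcmp(win, word, wlen) == 0)
            return consumed - (long)wlen;
    }
    return -1;                    /* EOF: the caller must stop searching */
}
```

A caller that checks for -1 and stops can be tested on any file, including one with fewer WORDs than requested pieces.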