Perl or Python looping over a set of file handles would seem like the most i efficient approach. For a more pedestrian solution, an awk script run four times with different parameters might be acceptable even if the file is big.
Does file four only contain every tenth line, and then 11, 14, and 17 go to the first file again?
csplit has some fairly versatile options, you might be able to pull this off simply with a suitable csplit pattern as well.
Last edited by era; 09-30-2008 at 01:56 PM..
Reason: csplit note
Dear all,
I have a large file which is composed of 8000 frames, what i would like to do is split the file into 8000 single files names file.pdb.1, file.pdb.2 etc etc
each frame in the large file is seperated by a "ENDMDL" flag so my thinking is to use this flag a a point to split the files... (4 Replies)
Hello, I am using awk to split a file into multiple files using command:
nawk '{
if ( $1 == "<process" )
{
n=split($2, arr, "\"");
file=arr
}
print > file }' processes.xml
<process name="Process1.process">
... (3 Replies)
Hello, I have a large file (2GB) that I would like to split based on pattern and size.
I've used the following command to split the file (token is "HELLO")
awk '/HELLO/{i++}{print > "file"i}' input.txt
and the output is similar to the following (i included filesize in KB):
10 ... (2 Replies)
Hi ,
I have huge files around 400 mb, which has clob data and have diffeent scenarios:
I am trying to pass scenario number as parameter and and get required modified file based on the scenario number and criteria.
Scenario 1:
file name : scenario_1.txt
... (2 Replies)
Hi,
I have a Huge 7 GB file which has around 1 million records, i want to split this file into 4 files to contain around 250k messages each.
Please help me as Split command cannot work here as it might miss tags..
Format of the file is as below
<!--###### ###### START-->... (6 Replies)
Hi
i have requirement like below
M <form_name> sdasadasdMklkM
D ......
D .....
M form_name> sdasadasdMklkM
D ......
D .....
D ......
D .....
M form_name> sdasadasdMklkM
D ......
M form_name> sdasadasdMklkM
i want split file based on line number by finding... (10 Replies)
Hello All ,
I have a file which needs to split based on the blank lines
Name ABC
Address London
Age 32
(4 blank new line)
Name DEF
Address London
Age 30
(4 blank new line)
Name DEF
Address London (8 Replies)
Hi,
I have a text file (attached the sample). I have also, attached the way the way the files need to be split.
We get this file, that will either have 24 Jurisdictions, or will miss some and retain some.
Like in the attached sample file, there are only Jurisdictions 03,11,14,15, 20 and 30.... (3 Replies)
Hello All,
I have records in a file in a pattern A,B,B,B,B,K,A,B,B,K
Is there any command or simple logic I can pull out records into multiple files based on A record? I want output as
File1: A,B,B,B,B,K
File2: A,B,B,K (9 Replies)
Hi
I have a requirement, where i will receive multiple files in a folder (say: /fol1/fol2/). There will be at least 14 to 16 files. The size of the files will different, some may be 80GB or 90GB, some may be less than 5 GB (and the size of the files are very unpredictable). But the names of the... (10 Replies)
Discussion started by: kpk_ds
10 Replies
LEARN ABOUT FREEBSD
csplit
CSPLIT(1) BSD General Commands Manual CSPLIT(1)NAME
csplit -- split files based on context
SYNOPSIS
csplit [-ks] [-f prefix] [-n number] file args ...
DESCRIPTION
The csplit utility splits file into pieces using the patterns args. If file is a dash ('-'), csplit reads from standard input.
Files are created with a prefix of ``xx'' and two decimal digits. The size of each file is written to standard output as it is created. If
an error occurs whilst files are being created, or a HUP, INT, or TERM signal is received, all files previously written are removed.
The options are as follows:
-f prefix
Create file names beginning with prefix, instead of ``xx''.
-k Do not remove previously created files if an error occurs or a HUP, INT, or TERM signal is received.
-n number
Create file names beginning with number of decimal digits after the prefix, instead of 2.
-s Do not write the size of each output file to standard output as it is created.
The args operands may be a combination of the following patterns:
/regexp/[[+|-]offset]
Create a file containing the input from the current line to (but not including) the next line matching the given basic regular
expression. An optional offset from the line that matched may be specified.
%regexp%[[+|-]offset]
Same as above but a file is not created for the output.
line_no
Create containing the input from the current line to (but not including) the specified line number.
{num} Repeat the previous pattern the specified number of times. If it follows a line number pattern, a new file will be created for each
line_no lines, num times. The first line of the file is line number 1 for historic reasons.
After all the patterns have been processed, the remaining input data (if there is any) will be written to a new file.
Requesting to split at a line before the current line number or past the end of the file will result in an error.
ENVIRONMENT
The LANG, LC_ALL, LC_COLLATE and LC_CTYPE environment variables affect the execution of csplit as described in environ(7).
EXIT STATUS
The csplit utility exits 0 on success, and >0 if an error occurs.
EXAMPLES
Split the mdoc(7) file foo.1 into one file for each section (up to 21 plus one for the rest, if any):
csplit -k foo.1 '%^.Sh%' '/^.Sh/' '{20}'
Split standard input after the first 99 lines and every 100 lines thereafter:
csplit -k - 100 '{19}'
SEE ALSO sed(1), split(1), re_format(7)STANDARDS
The csplit utility conforms to IEEE Std 1003.1-2001 (``POSIX.1'').
HISTORY
A csplit command appeared in PWB UNIX.
BUGS
Input lines are limited to LINE_MAX (2048) bytes in length.
BSD February 6, 2014 BSD