Perl or Python looping over a set of file handles would seem like the most efficient approach. For a more pedestrian solution, an awk script run four times with different parameters might be acceptable even if the file is big.
Does file four only contain every tenth line, and then 11, 14, and 17 go to the first file again?
csplit has some fairly versatile options; you might be able to pull this off with a suitable csplit pattern as well.
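For what it's worth, here is a rough Python sketch of the "loop over a set of file handles" idea. It assumes plain round-robin distribution (line 1 to the first file, line 2 to the second, and so on, wrapping around after the last file), which may not be the scheme you actually need. The function name and the file names are made up for illustration.

```python
import os
import tempfile
from contextlib import ExitStack


def split_round_robin(src_path, out_paths):
    """Write line i of src_path (0-based) to out_paths[i % len(out_paths)]."""
    with ExitStack() as stack:
        src = stack.enter_context(open(src_path))
        # Keep every output handle open for the whole single pass over the input.
        outs = [stack.enter_context(open(p, "w")) for p in out_paths]
        for i, line in enumerate(src):
            outs[i % len(outs)].write(line)


if __name__ == "__main__":
    # Throwaway demo: a 10-line input split across four hypothetical files.
    workdir = tempfile.mkdtemp()
    source = os.path.join(workdir, "input.txt")
    with open(source, "w") as fh:
        fh.writelines(f"line {n}\n" for n in range(1, 11))
    targets = [os.path.join(workdir, f"part{k}.txt") for k in range(1, 5)]
    split_round_robin(source, targets)
    for path in targets:
        with open(path) as fh:
            print(os.path.basename(path), fh.read().split("\n")[:-1])
```

The input is read only once, which is the main advantage over running awk four times. A different distribution rule only needs a different index expression in place of `i % len(outs)`.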
Last edited by era; 09-30-2008 at 01:56 PM..
Reason: csplit note
Dear all,
I have a large file composed of 8000 frames. What I would like to do is split it into 8000 single files named file.pdb.1, file.pdb.2, etc.
Each frame in the large file is separated by an "ENDMDL" flag, so my thinking is to use this flag as a point to split the files... (4 Replies)
Hello, I am using awk to split a file into multiple files using command:
nawk '{
if ( $1 == "<process" )
{
n=split($2, arr, "\"");
file=arr[2]
}
print > file }' processes.xml
<process name="Process1.process">
... (3 Replies)
Hello, I have a large file (2GB) that I would like to split based on pattern and size.
I've used the following command to split the file (token is "HELLO")
awk '/HELLO/{i++}{print > "file"i}' input.txt
and the output is similar to the following (i included filesize in KB):
10 ... (2 Replies)
Hi ,
I have huge files of around 400 MB, which contain CLOB data and have different scenarios:
I am trying to pass the scenario number as a parameter and get the required modified file based on the scenario number and criteria.
Scenario 1:
file name : scenario_1.txt
... (2 Replies)
Hi,
I have a huge 7 GB file which has around 1 million records. I want to split this file into 4 files containing around 250k messages each.
Please help me, as the split command cannot work here because it might miss tags.
Format of the file is as below
<!--###### ###### START-->... (6 Replies)
Hi
I have a requirement like the one below
M <form_name> sdasadasdMklkM
D ......
D .....
M form_name> sdasadasdMklkM
D ......
D .....
D ......
D .....
M form_name> sdasadasdMklkM
D ......
M form_name> sdasadasdMklkM
I want to split the file based on line number by finding... (10 Replies)
Hello All ,
I have a file which needs to be split based on the blank lines
Name ABC
Address London
Age 32
(4 blank new line)
Name DEF
Address London
Age 30
(4 blank new line)
Name DEF
Address London (8 Replies)
Hi,
I have a text file (sample attached). I have also attached the way the files need to be split.
We get this file, that will either have 24 Jurisdictions, or will miss some and retain some.
Like in the attached sample file, there are only Jurisdictions 03,11,14,15, 20 and 30.... (3 Replies)
Hello All,
I have records in a file in a pattern A,B,B,B,B,K,A,B,B,K
Is there any command or simple logic I can use to pull out records into multiple files based on the A record? I want output as
File1: A,B,B,B,B,K
File2: A,B,B,K (9 Replies)
Hi
I have a requirement where I will receive multiple files in a folder (say: /fol1/fol2/). There will be at least 14 to 16 files. The sizes of the files will differ: some may be 80 GB or 90 GB, some may be less than 5 GB (and the sizes are very unpredictable). But the names of the... (10 Replies)
Discussion started by: kpk_ds
csplit
CSPLIT(1) User Commands CSPLIT(1)
NAME
csplit - split a file into sections determined by context lines
SYNOPSIS
csplit [OPTION]... FILE PATTERN...
DESCRIPTION
Output pieces of FILE separated by PATTERN(s) to files `xx00', `xx01', ..., and output byte counts of each piece to standard output.
Mandatory arguments to long options are mandatory for short options too.
-b, --suffix-format=FORMAT
use sprintf FORMAT instead of %02d
-f, --prefix=PREFIX
use PREFIX instead of `xx'
-k, --keep-files
do not remove output files on errors
-n, --digits=DIGITS
use specified number of digits instead of 2
-s, --quiet, --silent
do not print counts of output file sizes
-z, --elide-empty-files
remove empty output files
--help display this help and exit
--version
output version information and exit
Read standard input if FILE is -. Each PATTERN may be:
INTEGER
copy up to but not including specified line number
/REGEXP/[OFFSET]
copy up to but not including a matching line
%REGEXP%[OFFSET]
skip to, but not including a matching line
{INTEGER}
repeat the previous pattern specified number of times
{*} repeat the previous pattern as many times as possible
A line OFFSET is a required `+' or `-' followed by a positive integer.
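To make the /REGEXP/ plus {*} behaviour concrete, the following is a rough Python emulation of it, not csplit itself. It is a sketch under assumptions: the function name and the `elide_empty` parameter (standing in for -z) are hypothetical, offsets are not handled, and Python's `re` syntax differs from the regular expressions csplit accepts.

```python
import re


def split_on_pattern(lines, regexp, elide_empty=False):
    """Group lines into pieces; every matching line starts a new piece.

    Mirrors "copy up to but not including a matching line", repeated
    as many times as possible. The first piece holds whatever precedes
    the first match, so it is empty when the very first line matches.
    """
    pattern = re.compile(regexp)
    pieces = [[]]
    for line in lines:
        if pattern.search(line):
            pieces.append([])  # the matching line opens the next piece
        pieces[-1].append(line)
    if elide_empty:
        # Rough stand-in for -z / --elide-empty-files.
        pieces = [piece for piece in pieces if piece]
    return pieces


if __name__ == "__main__":
    records = ["A\n", "B\n", "B\n", "K\n", "A\n", "B\n", "K\n"]
    for number, piece in enumerate(split_on_pattern(records, r"^A", elide_empty=True)):
        # csplit would name these xx00, xx01, ... by default.
        print(f"xx{number:02d}", "".join(piece).split())
```

Each inner list corresponds to one output file; writing them out under the default `xx` prefix and two-digit suffix is left out to keep the sketch short.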
AUTHOR
Written by Stuart Kemp and David MacKenzie.
REPORTING BUGS
Report csplit bugs to bug-coreutils@gnu.org
GNU coreutils home page: <http://www.gnu.org/software/coreutils/>
General help using GNU software: <http://www.gnu.org/gethelp/>
Report csplit translation bugs to <http://translationproject.org/team/>
COPYRIGHT
Copyright (C) 2011 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
SEE ALSO
The full documentation for csplit is maintained as a Texinfo manual. If the info and csplit programs are properly installed at your site,
the command
info coreutils 'csplit invocation'
should give you access to the complete manual.
GNU coreutils 8.12.197-032bb September 2011 CSPLIT(1)