Split File Based on Line Number Pattern (Shell Programming and Scripting)
Post 302241854 by era on Tuesday 30th of September 2008 12:55:42 PM
Perl or Python looping over a set of file handles would seem like the most efficient approach. For a more pedestrian solution, an awk script run four times with different parameters might be acceptable even if the file is big.

Does file four only contain every tenth line, and then 11, 14, and 17 go to the first file again?

Code:
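perl -MIO::File -ne 'BEGIN {
  # Open file0..file3 for writing.
  map { $file[$_] = IO::File->new(">file$_") || die $! } 0..3;
  # Output file index for each position in a ten-line cycle.
  @m = (0, 1, 2, 0, 1, 2, 0, 1, 2, 3);
}
# $. is 1-based, so shift it before taking the modulus.
$file[$m[($. - 1) % 10]]->print($_) || die $!'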
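
For comparison, here is a rough Python sketch of the same file-handle loop. It assumes the pattern I asked about above: lines 1, 4, 7 go to the first file, 2, 5, 8 to the second, 3, 6, 9 to the third, every tenth line to the fourth, and then the cycle starts over. The function name `split_by_cycle` and the `prefix` parameter are made up for illustration.

```python
from contextlib import ExitStack

# Output file index for each position in the ten-line cycle
# (same values as the @m array in the Perl one-liner).
CYCLE = (0, 1, 2, 0, 1, 2, 0, 1, 2, 3)


def split_by_cycle(lines, prefix="file", cycle=CYCLE):
    """Distribute lines over prefix0..prefixN following the repeating cycle."""
    names = [f"{prefix}{i}" for i in range(max(cycle) + 1)]
    with ExitStack() as stack:
        # Keep every output file open at once, as the Perl version does.
        outputs = [stack.enter_context(open(name, "w")) for name in names]
        for number, line in enumerate(lines, start=1):
            # Line numbers are 1-based, so shift before taking the modulus.
            outputs[cycle[(number - 1) % len(cycle)]].write(line)
    return names
```

Calling `split_by_cycle(sys.stdin)` (after `import sys`) would read standard input and write file0 through file3 in the current directory, which is what the one-liner is meant to do. Each input line is read once and only four handles stay open, so this should scale to big files.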

csplit has some fairly versatile options; you might be able to pull this off with a suitable csplit pattern as well.

Last edited by era; 09-30-2008 at 01:56 PM. Reason: csplit note
 

CSPLIT(1)                 BSD General Commands Manual                CSPLIT(1)

NAME
     csplit -- split files based on context

SYNOPSIS
     csplit [-ks] [-f prefix] [-n number] file args ...

DESCRIPTION
     The csplit utility splits file into pieces using the patterns args.  If
     file is a dash ('-'), csplit reads from standard input.

     Files are created with a prefix of ``xx'' and two decimal digits.  The
     size of each file is written to standard output as it is created.  If an
     error occurs whilst files are being created, or a HUP, INT, or TERM
     signal is received, all files previously written are removed.

     The options are as follows:

     -f prefix
             Create file names beginning with prefix, instead of ``xx''.

     -k      Do not remove previously created files if an error occurs or a
             HUP, INT, or TERM signal is received.

     -n number
             Create file names beginning with number of decimal digits after
             the prefix, instead of 2.

     -s      Do not write the size of each output file to standard output as
             it is created.

     The args operands may be a combination of the following patterns:

     /regexp/[[+|-]offset]
             Create a file containing the input from the current line to (but
             not including) the next line matching the given basic regular
             expression.  An optional offset from the line that matched may
             be specified.

     %regexp%[[+|-]offset]
             Same as above but a file is not created for the output.

     line_no
             Create a file containing the input from the current line to (but
             not including) the specified line number.

     {num}   Repeat the previous pattern the specified number of times.  If
             it follows a line number pattern, a new file will be created for
             each line_no lines, num times.  The first line of the file is
             line number 1 for historic reasons.

     After all the patterns have been processed, the remaining input data (if
     there is any) will be written to a new file.

     Requesting to split at a line before the current line number or past the
     end of the file will result in an error.

ENVIRONMENT
     The LANG, LC_ALL, LC_COLLATE and LC_CTYPE environment variables affect
     the execution of csplit as described in environ(7).

EXIT STATUS
     The csplit utility exits 0 on success, and >0 if an error occurs.

EXAMPLES
     Split the mdoc(7) file foo.1 into one file for each section (up to 21
     plus one for the rest, if any):

           csplit -k foo.1 '%^\.Sh%' '/^\.Sh/' '{20}'

     Split standard input after the first 99 lines and every 100 lines
     thereafter:

           csplit -k - 100 '{19}'

SEE ALSO
     sed(1), split(1), re_format(7)

STANDARDS
     The csplit utility conforms to IEEE Std 1003.1-2001 (``POSIX.1'').

HISTORY
     A csplit command appeared in PWB UNIX.

BUGS
     Input lines are limited to LINE_MAX (2048) bytes in length.

BSD                            February 6, 2014                            BSD
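
To make the line-number pattern in the second manual page example concrete, here is a small Python model of how `csplit - 100 '{19}'` carves up its input. It is only a sketch of the documented behaviour, not csplit's implementation: the name `split_every` is made up, the pieces are returned as lists instead of being written to xx00, xx01 and so on, and where csplit would report an error for a split past the end of the input, the model simply stops splitting.

```python
def split_every(lines, line_no, repeat):
    """Model `csplit file line_no {repeat}`; return the pieces as lists."""
    lines = list(lines)
    pieces = []
    start = 0  # 0-based index of the first line of the current piece
    # The pattern is applied once, then repeated `repeat` more times.
    for k in range(1, repeat + 2):
        # Each piece ends just before line number k * line_no (1-based).
        stop = k * line_no - 1
        if stop >= len(lines):
            # Real csplit reports an error here; the model just stops.
            break
        pieces.append(lines[start:stop])
        start = stop
    # Whatever remains after the last split goes into one final piece.
    if start < len(lines):
        pieces.append(lines[start:])
    return pieces
```

With 2500 input lines, `split_every(lines, 100, 19)` gives a first piece of 99 lines, nineteen pieces of 100 lines each, and a final piece starting at line 2000. Every piece is a contiguous run of lines, whereas the pattern in this thread interleaves lines across the output files.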