Split a file into 10 different files
Post 303009730 by Don Cragun on Tuesday 19th of December 2017 08:07:35 AM
I agree with gull04 that split is a better way to do this (without reinventing the wheel). If you must do it with awk, you might want to try something more like:
Code:
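awk -v LN=500000 '
# Every LN lines (NR = 1, LN + 1, 2*LN + 1, ...), switch to the next output file.
!((NR - 1) % LN) {
	if(NR > 1) close(f)	# close the previous file before opening the next
	f = sprintf("huge_data%03d.txt", 1 + int((NR - 1) / LN))
}
# Write every input line to the current output file.
{	print > f
}' huge_data.txt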

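With the filenames generated by this script (huge_data001.txt, huge_data002.txt, and so on), you can split a file into up to 999 files and still process them in sequential order with a simple shell glob, because the zero-padded numbers sort lexically. A 1000th file would be named huge_data1000.txt and would no longer sort in sequence.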
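As a rough illustration of the point about sequential processing, here is a small self-contained sketch. It assumes a throwaway directory created with mktemp -d, a 10-line stand-in for huge_data.txt, and LN=3 instead of 500000, none of which come from the original post. The glob huge_data[0-9][0-9][0-9].txt matches only the generated pieces, not the input file, and expands in numeric order thanks to the zero padding.

```shell
# Work in a scratch directory (assumption: mktemp -d is available).
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Stand-in input: 10 numbered lines instead of a really huge file.
awk 'BEGIN { for (i = 1; i <= 10; i++) print i }' > huge_data.txt

# Same script as in the post, with LN=3 so the demo stays small.
awk -v LN=3 '
!((NR - 1) % LN) {
	if(NR > 1) close(f)
	f = sprintf("huge_data%03d.txt", 1 + int((NR - 1) / LN))
}
{	print > f
}' huge_data.txt

# Walk the pieces in order; the glob skips huge_data.txt itself.
for piece in huge_data[0-9][0-9][0-9].txt
do
	printf '%s: %s lines\n' "$piece" "$(wc -l < "$piece" | tr -d ' ')"
done
```

With these stand-in values the loop should report four pieces: three with 3 lines each and a last one holding the single remaining line.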

If someone wants to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
 

SPLIT(1)                  BSD General Commands Manual                 SPLIT(1)

NAME
     split -- split a file into pieces

SYNOPSIS
     split [-a suffix_length] [-b byte_count[k|m] | -l line_count | -n chunk_count] [file [name]]

DESCRIPTION
     The split utility reads the given file and breaks it up into files of 1000 lines each. If file is a single dash or absent, split reads from the standard input. file itself is not altered.

     The options are as follows:

     -a      Use suffix_length letters to form the suffix of the file name.

     -b      Create smaller files byte_count bytes in length. If 'k' is appended to the number, the file is split into byte_count kilobyte pieces. If 'm' is appended to the number, the file is split into byte_count megabyte pieces.

     -l      Create smaller files line_count lines in length.

     -n      Split file into chunk_count smaller files.

     If additional arguments are specified, the first is used as the name of the input file which is to be split. If a second additional argument is specified, it is used as a prefix for the names of the files into which the file is split. In this case, each file into which the file is split is named by the prefix followed by a lexically ordered suffix using suffix_length characters in the range ``a-z''. If -a is not specified, two letters are used as the suffix.

     If the name argument is not specified, 'x' is used.

STANDARDS
     The split utility conforms to IEEE Std 1003.1-2001 (``POSIX.1'').

HISTORY
     A split command appeared in Version 6 AT&T UNIX. The -a option was introduced in NetBSD 2.0. Before that, if name was not specified, split would vary the first letter of the filename to increase the number of possible output files. The -a option makes this unnecessary.

BSD                              May 28, 2007                              BSD
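
As a hedged sketch of how the -l and -a options and the name argument from the manual page fit together, the commands below split a small stand-in file into 3-line pieces with three-letter suffixes. The scratch directory, the file name sample.txt and the prefix "part." are made up for the illustration. For the file in the post above, the equivalent would presumably be something like split -l 500000 -a 3 huge_data.txt huge_data.

```shell
# Work in a scratch directory (assumption: mktemp -d is available).
tmp=$(mktemp -d) && cd "$tmp" || exit 1

# Stand-in input: 10 numbered lines.
awk 'BEGIN { for (i = 1; i <= 10; i++) print i }' > sample.txt

# -l 3  : three lines per output file
# -a 3  : three-letter suffixes (aaa, aab, ...), allowing 26^3 = 17576 pieces
# part. : hypothetical prefix for the output file names
split -l 3 -a 3 sample.txt part.

# The suffixes sort lexically, so a glob visits the pieces in order.
ls part.*
```

Because the suffixes are lexically ordered, the pieces can be processed or concatenated back together in their original sequence with a plain part.* glob.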