Post 303009722 by omega3 on Tuesday 19th of December 2017 06:54:26 AM
Split a file into 10 different files

OS : RHEL 6.7
Shell : bash

I have a text file with 5.97 million lines.

I want to split this big file into 12 different files (in sequential order) so that each file contains roughly 500K lines. I tried the following awk command, which I found by googling, but it created only 2 files (huge_data.txt11 and huge_data.txt12) from the source file.

Any idea how I can split the file into 12 different files?



Code:
$ wc -l huge_data.txt
5970387 huge_data.txt

$ awk -vLN=500000 '{print > ("huge_data.txt" 12-(NR>LN))}' huge_data.txt
$
$ ls -lh
total 6.5G
-rw-rw-r-- 1 appusr appusr 3.3G Dec 16 17:04 huge_data.txt
-rw-rw-r-- 1 appusr appusr 3.0G Dec 19 11:45 huge_data.txt11
-rw-rw-r-- 1 appusr appusr 276M Dec 19 11:45 huge_data.txt12
$
$
$ wc -l huge_data.txt11
5470387 huge_data.txt11
$
$ wc -l huge_data.txt12
500000 huge_data.txt12
$
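
The output above follows from the expression 12-(NR>LN): the comparison NR>LN is only ever 0 or 1, so the suffix can only be 12 (for the first 500000 lines) or 11 (for everything after). A minimal sketch of one way to get a new file every LN lines is below. The function name split_by_lines and the awk variable names base, out and n are made up for illustration, and the two-digit suffix starting at 01 is an assumption about the naming wanted.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the original post): split FILE into
# sequential chunks of at most LN lines, named FILE01, FILE02, ...
split_by_lines() {
    local file=$1 ln=$2
    awk -v LN="$ln" -v base="$file" '
        # Lines 1, LN+1, 2*LN+1, ... each start a new output file.
        (NR - 1) % LN == 0 {
            if (out != "") close(out)           # release the previous chunk
            out = sprintf("%s%02d", base, ++n)  # e.g. huge_data.txt01
        }
        { print > out }                         # append line to current chunk
    ' "$file"
}

# For the file in the question this would be called as:
#   split_by_lines huge_data.txt 500000
```

With 5970387 lines and LN=500000 this should give 12 files: eleven of 500000 lines and a last one of 470387 lines. If GNU split is available, split -l 500000 -d huge_data.txt huge_data.txt is a simpler route to much the same result, with suffixes running from 00 to 11.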

 

CSPLIT(1)                 BSD General Commands Manual                CSPLIT(1)

NAME
     csplit -- split files based on context

SYNOPSIS
     csplit [-ks] [-f prefix] [-n number] file args ...

DESCRIPTION
     The csplit utility splits file into pieces using the patterns args.
     If file is a dash ('-'), csplit reads from standard input.

     The options are as follows:

     -f prefix
             Give created files names beginning with prefix.  The default
             is ``xx''.

     -k      Do not remove output files if an error occurs or a HUP, INT
             or TERM signal is received.

     -n number
             Use number of decimal digits after the prefix to form the
             file name.  The default is 2.

     -s      Do not write the size of each output file to standard output
             as it is created.

     The args operands may be a combination of the following patterns:

     /regexp/[[+|-]offset]
             Create a file containing the input from the current line to
             (but not including) the next line matching the given basic
             regular expression.  An optional offset from the line that
             matched may be specified.

     %regexp%[[+|-]offset]
             Same as above but a file is not created for the output.

     line_no
             Create a file containing the input from the current line to
             (but not including) the specified line number.

     {num}   Repeat the previous pattern the specified number of times.
             If it follows a line number pattern, a new file will be
             created for each line_no lines, num times.  The first line
             of the file is line number 1 for historic reasons.

     After all the patterns have been processed, the remaining input data
     (if there is any) will be written to a new file.

     Requesting to split at a line before the current line number or past
     the end of the file will result in an error.

ENVIRONMENT
     The LANG, LC_ALL, LC_COLLATE and LC_CTYPE environment variables
     affect the execution of csplit as described in environ(7).

EXIT STATUS
     The csplit utility exits 0 on success, and >0 if an error occurs.

EXAMPLES
     Split the mdoc(7) file foo.1 into one file for each section (up to
     20):

           csplit -k foo.1 '%^.Sh%' '/^.Sh/' '{20}'

     Split standard input after the first 99 lines and every 100 lines
     thereafter:

           csplit -k - 100 '{19}'

SEE ALSO
     sed(1), split(1), re_format(7)

STANDARDS
     The csplit utility conforms to IEEE Std 1003.1-2001 (``POSIX.1'').

HISTORY
     A csplit command appeared in PWB UNIX.

BUGS
     Input lines are limited to LINE_MAX (2048) bytes in length.

BSD                            January 26, 2005                            BSD