Breaking large file into small files


 
# 1  
Old 03-05-2015
Breaking large file into small files

Dear all,
I have a huge txt file listing the input files for some setup_code. However, to run my setup_code I need txt files with a maximum of 1000 input files each.
Please suggest a way to break this big txt file down into small txt files of 1000 entries only.

Thanks and greetings,
Emily
# 2  
Old 03-05-2015
man split.
Did you consider the hints at the lower page border?
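
For readers who want to see what that looks like, here is a minimal sketch, assuming each entry sits on its own line. The file name split_demo_list.txt and the prefix split_demo_part_ are placeholders made up for the example, and the sample data is generated only so the snippet runs on its own.

```shell
#!/bin/sh
# Placeholder input: 2500 numbered lines standing in for the real list.
awk 'BEGIN { for (i = 1; i <= 2500; i++) print "entry" i }' > split_demo_list.txt

# -l 1000 writes at most 1000 lines per piece.
# Pieces are named <prefix>aa, <prefix>ab, <prefix>ac, ...
split -l 1000 split_demo_list.txt split_demo_part_

# Show how many lines ended up in each piece.
wc -l split_demo_part_*
```

The last piece simply holds whatever is left over, so nothing has to be counted beforehand.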
This User Gave Thanks to RudiC For This Post:
# 3  
Old 03-05-2015
If each entry is a line and you want 10,000 lines try using the
head and tail commands to chop up the file in 10,000 line chunks.
You can probably put it in a for loop with the wc command to see
how many lines are in the file, hence the maximum number to extract.

Code:
head -n 10000 file_name.txt                  > file1.txt
head -n 20000 file_name.txt | tail -n 10000 > file2.txt
head -n 30000 file_name.txt | tail -n 10000 > file3.txt
head -n 40000 file_name.txt | tail -n 10000 > file4.txt
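
The loop mentioned above could be sketched roughly like this. It is only an illustration of the idea, not a tuned solution: the names loop_demo_input.txt and loop_demo_fileN.txt are placeholders, the sample data is generated so the snippet runs by itself, and the file is read again from the start on every pass.

```shell
#!/bin/sh
input=loop_demo_input.txt   # placeholder name for the big file
chunk=10000                 # lines per output file

# Sample data: 25000 numbered lines standing in for the real file.
awk 'BEGIN { for (i = 1; i <= 25000; i++) print i }' > "$input"

# wc tells us where to stop; the arithmetic expansion strips any padding.
total=$(( $(wc -l < "$input") ))

n=1
start=1
while [ "$start" -le "$total" ]; do
    end=$((start + chunk - 1))
    # head keeps lines 1..end, tail -n +start then drops the lines before start.
    head -n "$end" "$input" | tail -n +"$start" > "loop_demo_file$n.txt"
    n=$((n + 1))
    start=$((end + 1))
done
```

Using tail -n +start instead of a fixed count means the last, shorter chunk comes out right without special handling.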

This User Gave Thanks to gandolf989 For This Post:
# 4  
Old 03-05-2015
Well, gandolf989, that appears to be a poor suggestion given that split will do this all in a single operation. You are wasting IO and CPU resources so it will be quite slow. Also, how would you know when to stop writing code or running your loop? There is no point reinventing a process that works very well already and lets you customise the beginning and end of the output files if you need to.


If the splitting depends on any condition other than record count, then csplit may be the tool for you. However, without some sample data and rules to follow, it's impossible to really know what you need.
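
Purely as an illustration of the csplit case, here is a sketch that starts a new piece at every line beginning with BEGIN. The marker, the file name csplit_demo.txt and the prefix csplit_demo_part_ are invented for the example, and the -z option and the '{*}' repeat count are GNU extensions, so this assumes GNU coreutils.

```shell
#!/bin/sh
# Made-up sample: three sections, each starting with a "BEGIN" line.
printf '%s\n' \
    'BEGIN a' 'a1' 'a2' \
    'BEGIN b' 'b1' \
    'BEGIN c' 'c1' 'c2' 'c3' > csplit_demo.txt

# -s  suppress the byte counts csplit normally prints
# -z  do not keep empty pieces (GNU extension)
# -f  prefix for the output names (csplit_demo_part_00, _01, ...)
# '/^BEGIN/' '{*}'  split before every line matching the pattern (GNU extension)
csplit -s -z -f csplit_demo_part_ csplit_demo.txt '/^BEGIN/' '{*}'

wc -l csplit_demo_part_*
```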

Please wrap all code, file, input & output/errors in CODE tags as it makes them far easier to read and preserves multiple spaces and long lines in case these are important.


Thanks, in advance,
Robin
This User Gave Thanks to rbatte1 For This Post:
# 5  
Old 03-05-2015
Hello emily/gandolf989,

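If we need to split a file by line count, let's say 10,000 lines per file, then the following may help. It closes each output file before starting the next one, so awk does not run out of open files on a very large input.
Code:
awk 'NR%10000==1{close(out); out="file" int((NR-1)/10000)} {print > out}' Input_file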

Thanks,
R. Singh
This User Gave Thanks to RavinderSingh13 For This Post:
# 6  
Old 03-05-2015
Quote:
Originally Posted by RavinderSingh13
If we need to split a file according to lines let's say 10'000 lines per file then following may help....
There seems to be a magic awk command for almost every problem.

---------- Post updated at 12:46 PM ---------- Previous update was at 12:44 PM ----------

Quote:
Originally Posted by rbatte1
Well, gandolf989, that appears to be a poor suggestion given that split will do this all in a single operation....
csplit might be a far better tool; I just haven't used it. There is certainly a file size beyond which head/tail just won't work, but for many files it might work well enough.
This User Gave Thanks to gandolf989 For This Post:
# 7  
Old 03-06-2015
Thanks all for the useful input, it works fine.

Greetings,

---------- Post updated at 03:13 AM ---------- Previous update was at 02:45 AM ----------

Hello,
I am not able to pass an external parameter (here $3, the script's third argument) into the awk command, so the output files do not get the desired names. The problem is in this line:
Code:
awk '{FILENAME="$3_"int((NR-1)/200)".txt";print >> FILENAME}' $3


Code:
#!/bin/bash
#usage ./copyTextFromCastor.sh $PATH $GREP $OUTPUTFILE

PATHNAME=$1
CONSTANT=rfio:
GREP=$2
OUTPUT=$3

echo "Copying fileName \"$1 | grep $2\" to $3"
srmls "$PATHNAME" --count 99999 --offset 2 | grep "$2" | awk -F'tier2' '{print string path $2}' string="" path=""  > "$3"

echo "progressing ... please be patient..."

## split $3 into small size files, name InputFileN.txt
awk '{FILENAME="$3_"int((NR-1)/200)".txt";print >> FILENAME}' $3
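
A likely cause, offered as a sketch rather than a confirmed fix: inside single quotes the shell does not expand $3, and awk treats "$3_" as an ordinary string, so every output name starts with the literal text $3_. One common way around this is to hand the shell value to awk with -v. The function name split_list and the demo file awkv_demo_list are made up for the example; in the script above the call would use "$3" instead.

```shell
#!/bin/sh
# split_list is a hypothetical helper: split the file named in $1 into
# pieces of 200 lines called <name>_0.txt, <name>_1.txt, ...
split_list() {
    list=$1
    # -v copies the shell value into an awk variable before awk starts.
    awk -v prefix="$list" -v size=200 '
        # Every "size" lines: close the previous piece and pick the next name.
        (NR - 1) % size == 0 {
            if (out != "") close(out)
            out = prefix "_" int((NR - 1) / size) ".txt"
        }
        { print > out }
    ' "$list"
}

# Demo input: 450 placeholder entries, one per line.
awk 'BEGIN { for (i = 1; i <= 450; i++) print "entry" i }' > awkv_demo_list
split_list awkv_demo_list
```

Keeping the awk program in single quotes and passing values in with -v avoids mixing shell quoting into the awk code. Note that -v interprets backslash escapes in the value, which only matters if the file name contains backslashes.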
