Splitting large file into small files


 
# 1  
Old 06-07-2005

Hi,

I need to split a large file into small files based on a string.

The string ^Job appears at different places in the large file.
I need to split the file into separate files, each starting at ^Job and running to the last character before the next ^Job.
The small files should also be named automatically.

Please suggest.

Thanks,
Chandra
# 2  
Old 06-07-2005
Quote:
Originally Posted by dncs
Hi,

I need to split a large file into small files based on a string.

The string ^Job appears at different places in the large file.
I need to split the file into separate files, each starting at ^Job and running to the last character before the next ^Job.
The small files should also be named automatically.

Please suggest.

Thanks,
Chandra
man csplit
# 3  
Old 06-07-2005
I have gone through the csplit man page, but I am not sure how to use it for my requirement.
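
For what it's worth, here is a minimal sketch of how csplit could be used for this. It assumes that ^Job always starts a line and that GNU csplit is installed: the -z option and the '{*}' repeat count are GNU extensions, so a POSIX-only csplit would need an explicit count such as '{99}' together with -k. The file name orders.txt and the prefix part_ are made up for the illustration.

```shell
#!/bin/sh
# Sketch only: orders.txt and the part_ prefix are hypothetical names.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Small sample input with three ^Job sections.
cat > orders.txt <<'EOF'
^Job
PO one
^Job
PO two
detail line
^Job
PO three
EOF

# /^\^Job/  split before every line that starts with a literal ^Job
# '{*}'     repeat the pattern as often as possible (GNU extension)
# -z        drop the empty piece in front of the first ^Job (GNU extension)
# -s        do not print the size of each piece
# -f part_  name the pieces part_00, part_01, ...
csplit -s -z -f part_ orders.txt '/^\^Job/' '{*}'

ls part_*
```

Each piece starts with its own ^Job line, and csplit generates the names itself, which covers the "automatically named" part of the question.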
# 4  
Old 06-08-2005
You have omitted some necessary information:

1. Can the string appear only at the beginning of a line, or in the middle of lines too?

2. Should the split string appear in the output too, or should it be stripped?

3. Does the whole line containing the string go to the new part of the file, or is the file to be split exactly at the search string? That is, in the following example:

Code:
This is the first part
still first part ^Job This is the new part

does "still first part" go to the first or to the second part? Is this case even possible, given the answer to question 1?

Assuming the split string can appear anywhere in the file, and lines are to be split exactly where it occurs, the solution is:

Code:
#!/bin/ksh

typeset srcfile="file"
typeset -i cnt=1
typeset line=""

exec 3>"${srcfile}.part${cnt}"              # define our output file
while IFS= read -r line ; do                # read raw lines, keep whitespace
                                            # handle every splitter on the line
     while [[ $line = *^Job* ]] ; do
          print -u3 -r -- "${line%%^Job*}"  # put the part of the line before
                                            # the splitter to the old output
          exec 3>&-                         # close output
          (( cnt += 1 ))
          exec 3>"${srcfile}.part${cnt}"    # open the next part
          line="${line#*^Job}"              # keep the part of the line after
                                            # the occurrence of the splitter
     done
     print -u3 -r -- "$line"                # regular line or remainder, just print
done < "$srcfile"
exec 3>&-                                   # close last output file

exit 0

I use the "exec"s to open and close file descriptor 3 because, IMHO, that makes writing to a series of output files easier than attaching a redirection to every single print.

To make the script more general, I'd prefer putting the split string into a variable and setting it from a command-line option. This is left as an exercise to the reader.
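
One possible take on that exercise, as a rough sketch rather than a finished script: it is written for plain POSIX sh instead of ksh, and it passes the split string as the first argument of a function instead of parsing a real option. The function name split_on_marker and the file demo.txt are hypothetical.

```shell
#!/bin/sh
# Sketch only: split_on_marker and demo.txt are hypothetical names.
# Usage: split_on_marker MARKER FILE
# Writes FILE.part1, FILE.part2, ... and strips MARKER from the output.
split_on_marker() {
    marker=$1
    srcfile=$2
    cnt=1
    exec 3>"${srcfile}.part${cnt}"          # open the first part
    while IFS= read -r line || [ -n "$line" ]; do
        while :; do                         # handle every marker on the line
            case $line in
                *"$marker"*) ;;             # marker present, split below
                *) break ;;                 # no marker left on this line
            esac
            printf '%s\n' "${line%%"$marker"*}" >&3   # text before the marker
            exec 3>&-                       # close the current part
            cnt=$((cnt + 1))
            exec 3>"${srcfile}.part${cnt}"  # open the next part
            line=${line#*"$marker"}         # continue with text after it
        done
        printf '%s\n' "$line" >&3           # regular line or remainder
    done < "$srcfile"
    exec 3>&-                               # close the last part
}

# Small demonstration in a scratch directory.
workdir=$(mktemp -d) && cd "$workdir" || exit 1
cat > demo.txt <<'EOF'
first part
still first ^Job second part
more second
a ^Job b ^Job c
EOF

split_on_marker '^Job' demo.txt
ls demo.txt.part*
```

Quoting "$marker" inside the case pattern and the parameter expansions makes the shell treat it as a literal string, so characters such as * or ? in the split string are not interpreted as wildcards.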

bakunin
# 5  
Old 06-08-2005
Hi bakunin,
Thanks for your reply.
I will try the solution you provided.
To give more details: the file I am going to split contains multiple purchase orders, and after running the script I should have n files, one for each PO.
It looks like this:
^Job
PO No1:
<Po Details>
^Job
PO No1:
<Po Details>
......
I think the logic you provided should fit my requirement.
Thanks for your help.
Chandra
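
With a layout like this, where ^Job always sits on a line of its own, an awk one-pass split may also be enough. A minimal sketch, assuming exactly that layout; the input name purchase_orders.txt and the output names po_001.txt, po_002.txt, ... are made up, and the sample data is invented for the demonstration.

```shell
#!/bin/sh
# Sketch only: purchase_orders.txt and the po_NNN.txt names are hypothetical.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Invented sample data in the layout described above.
cat > purchase_orders.txt <<'EOF'
^Job
PO No1:
details of the first order
^Job
PO No2:
details of the second order
EOF

# A line starting with a literal ^Job closes the current output file and
# starts the next one; every line is then written to the current file.
awk '
    /^\^Job/ {
        if (out != "") close(out)
        out = sprintf("po_%03d.txt", ++n)
    }
    out != "" { print > out }
' purchase_orders.txt

ls po_*.txt
```

Unlike the ksh script above, this keeps the ^Job line at the top of each output file. Closing each file before opening the next avoids running out of file descriptors when the input holds many purchase orders.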