Sponsored Content
Top Forums Shell Programming and Scripting Splitting large file into small files Post 74146 by bakunin on Wednesday 8th of June 2005 05:24:23 AM
Old 06-08-2005
You have omitted some necessary information:

1. can the string appear only on the begin or in the middle of lines too?

2. Is the split-string to appear in the output too or is it to be stripped?

3. is the line containing the string considered to go to the new part of the file or is the file to be split exactly at the search string? That is, in the following example:

Code:
This is the first part
still first part ^Job This is the new part

is "still first part" considered to go to first or second part? Is it even possible, according to question 1?

Assuming the split-string can appear anywhere in the file and lines are to be split exactly where the split-string occurs the solution is:

Code:
#!/bin/ksh

typeset srcfile="file"
typeset -i cnt=1
typeset line=""

exec 3>${srcfile}.part${cnt}                # define our output file
cat $srcfile | while read line ; do
                                            # we have a line with a splitter
     if [ $(print - "$line" | grep -c "\^Job") -gt 0 ] ; then
          print -u3 "${line%%^Job*}"        # put the part of the line before
                                            # the splitter to the old output
          exec 3>&-                         # close output
          (( cnt += 1 ))
          exec 3>${srcfile}.part${cnt}      # open the next part
          print -u3 "${line##*^Job}"        # output part of line after the
                                            # occurence of the splitter
     else                                   # this is a regular line, just print
          print -u3 "$line"
     fi
done
exec 3>&-                                   # close last output file

exit 0

The reason for the "exec"s is them making output to various files easier than the hassle with redirections IMHO.

To make the script more general I'd prefer putting the split-string into a variable and feed that by a commandline option. This is left as an exercise to the reader.

bakunin
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Splitting a large log file

Okay, absolute newbie here... I'm on a Mac trying to split an almost 2 Gig log file on a Unix box into manageable chunks for my web-based log analysis tool. What do I need to do, what programs do I need to do it? All and any help appreciated/needed :-) Cheers (8 Replies)
Discussion started by: simmonet
8 Replies

2. Shell Programming and Scripting

Splitting large files

Hi Unix gurus, We have a masterfile which is to be split into smallerfiles with names as masterfile00,masterfile01,masterfile03...etal I was able to split the file using the "Split" cmd but as masterfileaa,masterfileab.. Is it posiible to change the default suffix? or is there any other... (2 Replies)
Discussion started by: Rvbs
2 Replies

3. Shell Programming and Scripting

Split large file and add header and footer to each small files

I have one large file, after every 200 line i have to split the file and the add header and footer to each small file? It is possible to add different header and footer to each file? (7 Replies)
Discussion started by: ashish4422
7 Replies

4. Shell Programming and Scripting

script to splite large file to number of small files

Dear All, Could you please help me to split a file contain around 240,000,000 line to 4 files all equally likely , note that we need to maintain that the end of each file should started by start flage (MSISDN) and ended by end flag (End), also the number of the line between the... (10 Replies)
Discussion started by: ahmed.gad
10 Replies

5. UNIX for Dummies Questions & Answers

splitting the large file into smaller files

hi all im new to this forum..excuse me if anythng wrong. I have a file containing 600 MB data in that. when i do parse the data in perl program im getting out of memory error. so iam planning to split the file into smaller files and process one by one. can any one tell me what is the code... (1 Reply)
Discussion started by: vsnreddy
1 Replies

6. Shell Programming and Scripting

Splitting large file into multiple files in unix based on pattern

I need to write a shell script for below scenario My input file has data in format: qwerty0101TWE 12345 01022005 01022005 datainala alanfernanded 26 qwerty0101mXZ 12349 01022005 06022008 datainalb johngalilo 28 qwerty0101TWE 12342 01022005 07022009 datainalc hitalbert 43 qwerty0101CFG 12345... (19 Replies)
Discussion started by: jimmy12
19 Replies

7. UNIX for Advanced & Expert Users

Splitting a file into small files

Hi Folks, Please help me in solving the problem. I want to write script in order to split a file into small pieces and send it automatically through mail. Ex. The file name is CALM*.txt . It is around 50 MB. I want to split the file into 20 MB 2-3 smaller files and send (like uuencode) it... (6 Replies)
Discussion started by: piyushbhashkar
6 Replies

8. Shell Programming and Scripting

Sed: Splitting A large File into smaller files based on recursive Regular Expression match

I will simplify the explaination a bit, I need to parse through a 87m file - I have a single text file in the form of : <NAME>house........ SOMETEXT SOMETEXT SOMETEXT . . . . </script> MORETEXT MORETEXT . . . (6 Replies)
Discussion started by: sumguy
6 Replies

9. Shell Programming and Scripting

Breaking large file into small files

Dear all, I have huge txt file with the input files for some setup_code. However for running my setup_code, I require txt files with maximum of 1000 input files Please help me in suggesting way to break down this big txt file to small txt file of 1000 entries only. thanks and Greetings, Emily (12 Replies)
Discussion started by: emily
12 Replies

10. UNIX for Beginners Questions & Answers

Split large file into 24 small files on one hour basis

I Have a large file with 24hrs log in the below format.i need to split the large file in to 24 small files on one hour based.i.e ex:from 09:55 to 10:55,10:55-11:55 can any one help me on this.! ... (20 Replies)
Discussion started by: Raghuram717
20 Replies
split(n)						       Tcl Built-In Commands							  split(n)

__________________________________________________________________________________________________________________________________________________

NAME
split - Split a string into a proper Tcl list SYNOPSIS
split string ?splitChars? _________________________________________________________________ DESCRIPTION
Returns a list created by splitting string at each character that is in the splitChars argument. Each element of the result list will con- sist of the characters from string that lie between instances of the characters in splitChars. Empty list elements will be generated if string contains adjacent characters in splitChars, or if the first or last character of string is in splitChars. If splitChars is an empty string then each character of string becomes a separate element of the result list. SplitChars defaults to the standard white-space char- acters. EXAMPLES
Divide up a USENET group name into its hierarchical components: split "comp.lang.tcl.announce" . -> comp lang tcl announce See how the split command splits on every character in splitChars, which can result in information loss if you are not careful: split "alpha beta gamma" "temp" -> al {ha b} {} {a ga} {} a Extract the list words from a string that is not a well-formed list: split "Example with {unbalanced brace character" -> Example with {unbalanced brace character Split a string into its constituent characters split "Hello world" {} -> H e l l o { } w o r l d PARSING RECORD-ORIENTED FILES Parse a Unix /etc/passwd file, which consists of one entry per line, with each line consisting of a colon-separated list of fields: ## Read the file set fid [open /etc/passwd] set content [read $fid] close $fid ## Split into records on newlines set records [split $content " "] ## Iterate over the records foreach rec $records { ## Split into fields on colons set fields [split $rec ":"] ## Assign fields to variables and print some out... lassign $fields userName password uid grp longName homeDir shell puts "$longName uses [file tail $shell] for a login shell" } SEE ALSO
join(n), list(n), string(n) KEYWORDS
list, split, string Tcl split(n)
All times are GMT -4. The time now is 01:05 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy