Splitting large file into small files
Post 74146 by bakunin on Wednesday 8th of June 2005 05:24:23 AM
You have omitted some necessary information:

1. Can the string appear only at the beginning of a line, or in the middle of lines too?

2. Is the split-string to appear in the output too, or is it to be stripped?

3. Does the whole line containing the string go to the new part of the file, or is the file to be split exactly at the search string? That is, in the following example:

Code:
This is the first part
still first part ^Job This is the new part

Does "still first part" go to the first or to the second part? Can this case even occur, given the answer to question 1?

Assuming the split-string can appear anywhere in a line, lines are to be split exactly where the split-string occurs, and the split-string itself is to be stripped, a solution is:

Code:
#!/bin/ksh

typeset srcfile="file"
typeset -i cnt=1
typeset line=""

exec 3>"${srcfile}.part${cnt}"              # define our output file
while IFS= read -r line ; do                # IFS= and -r keep blanks and backslashes
                                            # loop, a line may hold several splitters
     while [[ "$line" = *^Job* ]] ; do
          print -r -u3 -- "${line%%^Job*}"  # put the part of the line before
                                            # the splitter to the old output
          exec 3>&-                         # close output
          (( cnt += 1 ))
          exec 3>"${srcfile}.part${cnt}"    # open the next part
          line="${line#*^Job}"              # keep the part of the line after the
                                            # occurrence of the splitter
     done
     print -r -u3 -- "$line"                # regular line or remainder, just print
done < "$srcfile"
exec 3>&-                                   # close last output file

exit 0

I use the "exec"s because, IMHO, keeping one file descriptor open and re-pointing it to the next file makes writing to several files easier than putting a redirection on every single command.
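The descriptor technique on its own can be sketched as follows. This is only an illustration, not part of the script: the directory and file names are made up for the demo, and `echo` is used instead of `print` so that the fragment runs in any POSIX shell.

```shell
# Illustration only: demo_dir and the out.part* names are assumptions.
demo_dir=$(mktemp -d)

exec 3>"$demo_dir/out.part1"    # open descriptor 3 once for the first file
echo "first"  >&3               # every write just names the descriptor
echo "second" >&3
exec 3>&-                       # close descriptor 3

exec 3>"$demo_dir/out.part2"    # re-point the same descriptor to the next file
echo "third" >&3
exec 3>&-
```

The writing commands never need to know the current file name; only the `exec` lines do.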

To make the script more general, I'd prefer putting the split-string into a variable and setting it via a command-line option. This is left as an exercise for the reader.
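One possible answer to the exercise, as a rough sketch rather than a finished tool: the function name `split_at` and the option letter `-s` are assumptions chosen for this example, and it is written in plain POSIX shell (`printf` instead of `print`, a subshell body instead of `typeset`) so it should run under ksh, bash, or sh alike. Unlike a fixed `^Job`, an arbitrary split-string may contain pattern characters, so it is quoted inside the expansions to be taken literally.

```shell
# Sketch: split_at [-s splitter] srcfile
# split_at and -s are hypothetical names, not from the original script.
split_at() (                               # subshell body keeps variables local
     splitter="^Job"                       # default split-string
     OPTIND=1
     while getopts "s:" opt ; do
          case $opt in
               s) splitter=$OPTARG ;;
               *) echo "usage: split_at [-s splitter] srcfile" >&2 ; exit 2 ;;
          esac
     done
     shift $(( OPTIND - 1 ))
     [ $# -eq 1 ] || { echo "usage: split_at [-s splitter] srcfile" >&2 ; exit 2 ; }
     [ -n "$splitter" ] || { echo "empty split-string" >&2 ; exit 2 ; }
     srcfile=$1
     cnt=1

     exec 3>"${srcfile}.part${cnt}"        # open the first output file
     while IFS= read -r line ; do
          while : ; do                     # handle every splitter in the line
               case $line in
                    *"$splitter"*) ;;      # splitter found, fall through
                    *) break ;;
               esac
               printf '%s\n' "${line%%"$splitter"*}" >&3   # text before it
               exec 3>&-
               cnt=$(( cnt + 1 ))
               exec 3>"${srcfile}.part${cnt}"              # open the next part
               line=${line#*"$splitter"}                   # text after it
          done
          printf '%s\n' "$line" >&3        # regular line or remainder
     done < "$srcfile"
     exec 3>&-                             # close last output file
)
```

Called as `split_at -s '##' file`, it would write file.part1, file.part2 and so on; without `-s` it splits at `^Job` like the script above.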

bakunin
 
