Post 302381960 by jwillis0720 on Monday 21st of December 2009 02:43:45 PM
Splitting a large file, split command will not do.

Hello Everyone,

I have a large file that needs to be split into many separate files; however, the text between the blank lines needs to stay intact. The file looks like this:

SomeText
SomeText
SomeText

SomeOtherText
SomeOtherText

....


Since the number of lines of text is different for each entry, my only real marker is a blank line. I have tried the following:
cat largetxtfile.txt | awk -f
Code:
BEGIN { i = 0 }
{
    if ($0 == "") {
        ++linecount
    }
    if (linecount % 500 != 0) {
        print $0 >> i".txt"
    } else {
        ++i
    }
}

This should start a new output file after every 500 entries. It sort of works, but it doubles up the files; I'm not sure whether my logic is wrong.

Please Help.

J
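
Looking at the script above, two things stand out. `linecount` starts at 0, and `0 % 500` is 0, so every line of the first entry falls into the `else` branch: nothing is printed and `i` is bumped once per line rather than once per chunk. The same thing happens to the entry that follows every 500th blank line. Also, `>>` appends to whatever is already on disk, so running the script a second time doubles the contents of each output file, which may be the doubling described. Below is a minimal sketch of an alternative, assuming awk's paragraph mode (`RS = ""`), in which each blank-line-delimited entry is read as one record. The function name `split_entries` and its arguments are made up for illustration.

```shell
# Sketch: split a blank-line-delimited file into chunks of N entries each.
# split_entries is a hypothetical helper; output files are 0.txt, 1.txt, ...
split_entries() {
    # $1 = input file, $2 = entries per output file (defaults to 500)
    awk -v n="${2:-500}" '
        BEGIN { RS = ""; ORS = "\n\n" }      # paragraph mode: one record per entry
        {
            out = int((NR - 1) / n) ".txt"   # entries 1..n -> 0.txt, n+1..2n -> 1.txt
            if (out != prev) {               # moved on to a new chunk
                if (prev != "") close(prev)  # keep the number of open files small
                prev = out
            }
            print > out                      # ">" truncates on first open, then appends
        }
    ' "$1"
}
```

For example, `split_entries largetxtfile.txt 500` would write entries 1 to 500 to 0.txt, entries 501 to 1000 to 1.txt, and so on, keeping each entry whole. Existing output files are overwritten rather than appended to, so repeated runs do not accumulate. Note that paragraph mode only treats truly empty lines as separators; lines containing spaces or tabs count as text.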
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Splitting a large log file

Okay, absolute newbie here... I'm on a Mac trying to split an almost 2 Gig log file on a Unix box into manageable chunks for my web-based log analysis tool. What do I need to do, what programs do I need to do it? All and any help appreciated/needed :-) Cheers (8 Replies)
Discussion started by: simmonet
8 Replies

2. Shell Programming and Scripting

Splitting large file into small files

Hi, I need to split a large file into small files based on a string. At different places in the large file I have the string ^Job. I need to split the file into different files, each starting from ^Job and running to the last character before the next ^Job. Also, all the small files should be automatically named.... (4 Replies)
Discussion started by: dncs
4 Replies

3. UNIX for Dummies Questions & Answers

splitting the large file into smaller files

Hi all, I'm new to this forum, so excuse me if anything is wrong. I have a file containing 600 MB of data. When I parse the data in a Perl program I get an out-of-memory error, so I am planning to split the file into smaller files and process them one by one. Can anyone tell me what is the code... (1 Reply)
Discussion started by: vsnreddy
1 Reply

4. Shell Programming and Scripting

Help with splitting a large text file into smaller ones

Hi Everyone, I am using a CentOS 5.2 server as an sflow log collector on my network. Currently I am using InMon's free sflowtool to collect the packets sent by my switches. I have a bash script running in an infinite loop to stop and start the log collection at set intervals - currently one... (2 Replies)
Discussion started by: lord_butler
2 Replies

5. Shell Programming and Scripting

awk - splitting 1 large file into multiple based on same key records

Hello gurus, I am new to "awk" and trying to break a large file of 4 million records into several output files of half a million records each, but at the same time I want to keep records with the same key in the same output file, not spread across files. e.g. my data is like: Row_Num,... (6 Replies)
Discussion started by: kam66
6 Replies

6. Shell Programming and Scripting

splitting a large text file into paragraphs

Hello all, newbie here. I've searched the forum and found many "how to split a text file" topics but none that are what I'm looking for. I have a large text file (~15 MB) in size. It contains a variable number of "paragraphs" (for lack of a better word) that are each of variable length. A... (3 Replies)
Discussion started by: lupin..the..3rd
3 Replies

7. Shell Programming and Scripting

Problem with splitting large file based on pattern

Hi Experts, I have to split huge file based on the pattern to create smaller files. The pattern which is expected in the file is: Master..... First... second.... second... third.. third... Master... First.. second... third... Master... First... second.. second.. second..... (2 Replies)
Discussion started by: saisanthi
2 Replies

8. Shell Programming and Scripting

Splitting large file and renaming based on field

I am trying to update an older program on a small cluster. It uses individual files to send jobs to each node. However the newer database comes as one large file, containing over 10,000 records. I therefore need to split this file. It looks like this: HMMER3/b NAME 1-cysPrx_C ACC ... (2 Replies)
Discussion started by: fozrun
2 Replies

9. Shell Programming and Scripting

Help with Splitting a Large XML file based on size AND tags

Hi All, This is my first post here. Hoping to share and gain knowledge from this great forum !!!! I've scanned this forum before posting my problem here, but I'm afraid I couldn't find any thread that addresses this exact problem. I'm trying to split a large XML file (with multiple tag... (7 Replies)
Discussion started by: Aviktheory11
7 Replies

10. Shell Programming and Scripting

Splitting a large file as per date

Hi, I need a suggestion for an issue with a UNIX file. I have a log file on my system to which data is appended every day, and as a consequence the file grows heavily every day. Now I need logic to split this file on a daily basis and remove the files older than 15 days. Request you to... (3 Replies)
Discussion started by: bhaski2012
3 Replies
split(1)                         User Commands                         split(1)

NAME
     split - split a file into pieces

SYNOPSIS
     split [-linecount | -l linecount] [-a suffixlength] [file [name]]

     split [-b n | nk | nm] [-a suffixlength] [file [name]]

DESCRIPTION
     The split utility reads file and writes it in linecount-line pieces into a set of output-files. The name of the first output-file is name with aa appended, and so on lexicographically, up to zz (a maximum of 676 files). The maximum length of name is 2 characters less than the maximum filename length allowed by the filesystem. See statvfs(2). If no output name is given, x is used as the default (output-files will be called xaa, xab, and so forth).

OPTIONS
     The following options are supported:

     -linecount | -l linecount
          Number of lines in each piece. Defaults to 1000 lines.

     -a suffixlength
          Uses suffixlength letters to form the suffix portion of the filenames of the split file. If -a is not specified, the default suffix length is 2. If the sum of the name operand and the suffixlength option-argument would create a filename exceeding NAME_MAX bytes, an error will result; split will exit with a diagnostic message and no files will be created.

     -b n
          Splits a file into pieces n bytes in size.

     -b nk
          Splits a file into pieces n*1024 bytes in size.

     -b nm
          Splits a file into pieces n*1048576 bytes in size.

OPERANDS
     The following operands are supported:

     file
          The path name of the ordinary file to be split. If no input file is given or file is -, the standard input will be used.

     name
          The prefix to be used for each of the files resulting from the split operation. If no name argument is given, x will be used as the prefix of the output files. The combined length of the basename of prefix and suffixlength cannot exceed NAME_MAX bytes. See OPTIONS.

USAGE
     See largefile(5) for the description of the behavior of split when encountering files greater than or equal to 2 Gbyte (2^31 bytes).

ENVIRONMENT VARIABLES
     See environ(5) for descriptions of the following environment variables that affect the execution of split: LANG, LC_ALL, LC_CTYPE, LC_MESSAGES, and NLSPATH.

EXIT STATUS
     The following exit values are returned:

     0    Successful completion.

     >0   An error occurred.

ATTRIBUTES
     See attributes(5) for descriptions of the following attributes:

     +-----------------------------+-----------------------------+
     |       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
     +-----------------------------+-----------------------------+
     |Availability                 |SUNWesu                      |
     +-----------------------------+-----------------------------+
     |CSI                          |Enabled                      |
     +-----------------------------+-----------------------------+
     |Interface Stability          |Committed                    |
     +-----------------------------+-----------------------------+
     |Standard                     |See standards(5).            |
     +-----------------------------+-----------------------------+

SEE ALSO
     csplit(1), statvfs(2), attributes(5), environ(5), largefile(5), standards(5)

SunOS 5.11                         16 Apr 1999                         split(1)
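
As a rough illustration of the options described in this page, here is a sketch that assumes a split accepting the -l, -a and -b forms listed above. The names sample.txt, part_ and chunk_ are made up for the example.

```shell
# Work in a scratch directory so the generated pieces are easy to inspect.
dir=$(mktemp -d) && cd "$dir" || exit 1

# Build a 2500-line sample file.
awk 'BEGIN { for (i = 1; i <= 2500; i++) print "line " i }' > sample.txt

# Default behaviour: 1000-line pieces named xaa, xab, xac.
split sample.txt

# 500-line pieces, 3-letter suffix, prefix "part_": part_aaa through part_aae.
split -l 500 -a 3 sample.txt part_

# Pieces of 10*1024 bytes each, prefix "chunk_".
split -b 10k sample.txt chunk_
```

Because split cuts at fixed line or byte counts, a piece can end in the middle of a blank-line-delimited entry, which is why it does not suit the question in this thread on its own.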