spliting 4gb files to 4*1 gb each
Post 302211917 by Vi-Curious on Saturday 5th of July 2008 06:47:24 AM
I just saw another post of yours that concerned a problem opening large files. Based on that, I'm going to assume that you will process the log file in these split pieces. If so, you need to split it based on line count.

You can use split or csplit. To determine the line count for the split files, first determine how many lines are in the logfile. You could form an expression to have the shell do the calculation and give you the final number but I won't bother with that here since I don't know which shell you use and you only asked for syntax.

> wc -l biglogfile
Whatever number is returned, divide it by 4 and round up.

Let's say that biglogfile has 7607255 lines. Dividing by 4 gives 1901813.75, which rounds up to 1901814 lines per split file.
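If you do want the shell to do the arithmetic, here is a minimal sketch assuming a POSIX-style shell (sh, ksh, bash). The helper name ceil_div is made up for illustration; it adds (pieces - 1) before dividing so that the integer division rounds up.

```shell
#!/bin/sh
# ceil_div TOTAL PIECES: print TOTAL / PIECES rounded up.
# (ceil_div is a hypothetical helper name, not a standard utility.)
ceil_div() {
    echo $(( ($1 + $2 - 1) / $2 ))
}

# Example figures from this post: 7607255 lines, 4 pieces.
lines_per_piece=$(ceil_div 7607255 4)
echo "$lines_per_piece"
```

With a real file you could feed it the count directly, e.g. ceil_div "$(wc -l < biglogfile)" 4. Redirecting the file into wc makes it print only the number, without the filename.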

Using split:
> split -l 1901814 biglogfile splitfile
will give you splitfileaa, splitfileab, splitfileac and splitfilead.
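To convince yourself before running this on the 4GB file, you can try the same thing on a tiny stand-in. This is only a sketch: the 10-line sample and the temporary directory are assumptions for the demo, and mktemp -d is widely available but not guaranteed on every system.

```shell
#!/bin/sh
# Sketch: split a small sample the same way and check the pieces.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# 10-line stand-in for biglogfile.
awk 'BEGIN { for (i = 1; i <= 10; i++) print "line " i }' > biglogfile

# 10 lines / 4 pieces, rounded up = 3 lines per piece.
split -l 3 biglogfile splitfile

# Show the line count of each piece (wc also prints a total line).
wc -l splitfile*

# Concatenating the pieces in name order should reproduce the original.
cat splitfile* | cmp - biglogfile && echo "pieces match original"
```

The last piece simply holds whatever is left over, so it can be shorter than the others.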


Using csplit:
> csplit -k -f csplitfile biglogfile 1901814 {2}
will give you csplitfile00, csplitfile01, csplitfile02 and csplitfile03.

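In this case, I don't think the -k is really necessary, but it doesn't hurt to include it: if csplit does hit an error, -k keeps the files it has already created instead of removing them. On my system, the first file generated by csplit actually has 1 less line in it (1901813). That is expected, because a line number operand tells csplit to write up to, but not including, that line.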
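The one-line difference is easy to see on a small file. A sketch, assuming a 15-line sample (so 15 / 4 rounded up = 4) in a temporary directory; the sample and directory are made up for the demo, and -s only suppresses the size messages.

```shell
#!/bin/sh
# Sketch: show that csplit's line number operand is exclusive.
set -e
workdir=$(mktemp -d)
cd "$workdir"

# 15-line stand-in for biglogfile.
awk 'BEGIN { for (i = 1; i <= 15; i++) print "line " i }' > biglogfile

# Split before line 4, then repeat twice more (before lines 8 and 12).
csplit -s -k -f csplitfile biglogfile 4 '{2}'

# First piece holds lines 1-3; the other three hold 4 lines each.
wc -l csplitfile*

# The pieces still add up to the original file.
cat csplitfile* | cmp - biglogfile && echo "pieces match original"
```

Quoting the {2} is not required in most shells, but it does no harm.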


Whichever method you use, unless there is some anomaly with your log file (line lengths that vary wildly from one part of the file to another, for example), each piece should be pretty close to 1GB.
 

csplit(1)                        User Commands                        csplit(1)

NAME
     csplit - split files based on context

SYNOPSIS
     csplit [-ks] [-f prefix] [-n number] file arg1... argn

DESCRIPTION
     The csplit utility reads the file named by the file operand, writes
     all or part of that file into other files as directed by the arg
     operands, and writes the sizes of the files.

OPTIONS
     The following options are supported:

     -f prefix    Names the created files prefix00, prefix01, ..., prefixn.
                  The default is xx00 ... xxn. If the prefix argument would
                  create a file name exceeding 14 bytes, an error results.
                  In that case, csplit exits with a diagnostic message and
                  no files are created.

     -k           Leaves previously created files intact. By default,
                  csplit removes created files if an error occurs.

     -n number    Uses number decimal digits to form filenames for the file
                  pieces. The default is 2.

     -s           Suppresses the output of file size messages.

OPERANDS
     The following operands are supported:

     file         The path name of a text file to be split. If file is -,
                  the standard input will be used.

     The operands arg1 ... argn can be a combination of the following:

     /rexp/[offset]
                  Create a file using the content of the lines from the
                  current line up to, but not including, the line that
                  results from the evaluation of the regular expression
                  with offset, if any, applied. The regular expression rexp
                  must follow the rules for basic regular expressions.
                  Regular expressions can include the use of '\/' and '\%'.
                  These forms must be properly quoted with single quotes,
                  since "\" is special to the shell. The optional offset
                  must be a positive or negative integer value representing
                  a number of lines. The integer value must be preceded by
                  + or -. If the selection of lines from an offset
                  expression of this type would create a file with zero
                  lines, or one with greater than the number of lines left
                  in the input file, the results are unspecified. After the
                  section is created, the current line will be set to the
                  line that results from the evaluation of the regular
                  expression with any offset applied. The pattern match of
                  rexp always is applied from the current line to the end
                  of the file.

     %rexp%[offset]
                  This operand is the same as /rexp/[offset], except that
                  no file will be created for the selected section of the
                  input file.

     line_no      Create a file from the current line up to (but not
                  including) the line number line_no. Lines in the file
                  will be numbered starting at one. The current line
                  becomes line_no.

     {num}        Repeat operand. This operand can follow any of the
                  operands described previously. If it follows a rexp type
                  operand, that operand will be applied num more times. If
                  it follows a line_no operand, the file will be split
                  every line_no lines, num times, from that point.

     An error will be reported if an operand does not reference a line
     between the current position and the end of the file.

USAGE
     See largefile(5) for the description of the behavior of csplit when
     encountering files greater than or equal to 2 Gbyte (2^31 bytes).

EXAMPLES
     Example 1 Splitting and combining files

     This example creates four files, cobol00...cobol03.

       example% csplit -f cobol filename '/procedure division/' /par5./ /par16./

     After editing the split files, they can be recombined as follows:

       example% cat cobol0[0-3] > filename

     This example overwrites the original file.

     Example 2 Splitting a file into equal parts

     This example splits the file at every 100 lines, up to 10,000 lines.
     The -k option causes the created files to be retained if there are
     less than 10,000 lines; however, an error message would still be
     printed.

       example% csplit -k filename 100 {99}

     Example 3 Creating a file for separate C routines

     If prog.c follows the normal C coding convention (the last line of a
     routine consists only of a } in the first character position), this
     example creates a file for each separate C routine (up to 21) in
     prog.c.

       example% csplit -k prog.c '%main(%' '/^}/+1' {20}

ENVIRONMENT VARIABLES
     See environ(5) for descriptions of the following environment variables
     that affect the execution of csplit: LANG, LC_ALL, LC_COLLATE,
     LC_CTYPE, LC_MESSAGES, and NLSPATH.

EXIT STATUS
     The following exit values are returned:

     0            Successful completion.

     >0           An error occurred.

ATTRIBUTES
     See attributes(5) for descriptions of the following attributes:

     +-----------------------------+-----------------------------+
     |       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
     +-----------------------------+-----------------------------+
     |Availability                 |SUNWesu                      |
     +-----------------------------+-----------------------------+
     |CSI                          |Enabled                      |
     +-----------------------------+-----------------------------+
     |Interface Stability          |Standard                     |
     +-----------------------------+-----------------------------+

SEE ALSO
     sed(1), split(1), attributes(5), environ(5), largefile(5),
     standards(5)

DIAGNOSTICS
     The diagnostic messages are self-explanatory, except for the
     following:

     arg - out of range
                  The given argument did not reference a line between the
                  current position and the end of the file.

SunOS 5.11                        4 Dec 2003                          csplit(1)