03-06-2012
Split file based on size
Hi Friends,
Below is my requirement. I have a file with the below structure.
0001A1....
0001B1..
....
0001L1
0002A1
0002B1
......
0002L1
..
The first 4 characters are the sequence number of a record. A record starts with an A1 line and ends with an L1 line, all carrying the same sequence number. The generated file is too big, so I need to split it into multiple files, none exceeding 50 MB. For example, if the file is 332 MB in total, I need 7 files of at most 50 MB each, with one condition: records must not be broken. That is, the lines A1 through L1 of one record must not be split across two files.
Please let me know your thoughts and help me out with this.
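One possible approach, as a minimal sketch rather than a tested solution: split -b cuts at exact byte offsets and would break records, so the lines can instead be grouped by their 4-character sequence number in awk, starting a new output file only between records. The function name split_records, the output naming (prefix plus a 3-digit counter) and the byte-limit argument are assumptions made for illustration. The sketch also assumes newline-terminated lines, and a single record larger than the limit is written to a file of its own, which then exceeds the limit.

```shell
#!/bin/sh
# split_records INPUT PREFIX MAXBYTES   (hypothetical helper, not a standard tool)
# Writes PREFIX001, PREFIX002, ... without splitting a record across files.
split_records() {
    # LC_ALL=C makes length() count bytes rather than multibyte characters.
    LC_ALL=C awk -v prefix="$2" -v max="$3" '
    function emit() {
        if (buf == "") return
        # Start a new file when this record would not fit into the current one.
        if (n == 0 || (size > 0 && size + bufsize > max)) {
            if (out != "") close(out)
            n++
            out = sprintf("%s%03d", prefix, n)
            size = 0
        }
        printf "%s", buf > out
        size += bufsize
        buf = ""
        bufsize = 0
    }
    {
        seq = substr($0, 1, 4)            # first 4 characters: sequence number
        if (NR > 1 && seq != prev) emit() # sequence changed: record is complete
        prev = seq
        buf = buf $0 "\n"
        bufsize += length($0) + 1         # +1 for the newline
    }
    END { emit() }
    ' "$1"
}

# Example call for a 50 MB limit (50 * 1048576 bytes):
# split_records bigfile.txt part_ 52428800
```

Only one record is buffered at a time, so memory use stays small even for a very large input. Because the cut points fall on record boundaries, the number of output files can differ from a plain size/50 MB calculation.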
split(1) User Commands split(1)
NAME
split - split a file into pieces
SYNOPSIS
split [-linecount | -l linecount] [-a suffixlength] [ file [name]]
split [ -b n | nk | nm] [-a suffixlength] [ file [name]]
DESCRIPTION
The split utility reads file and writes it in linecount-line pieces into a set of output-files. The name of the first output-file is name
with aa appended, and so on lexicographically, up to zz (a maximum of 676 files). The maximum length of name is 2 characters less than the
maximum filename length allowed by the filesystem. See statvfs(2). If no output name is given, x is used as the default (output-files will
be called xaa, xab, and so forth).
OPTIONS
The following options are supported:
-linecount | -l linecount Number of lines in each piece. Defaults to 1000 lines.
-a suffixlength Uses suffixlength letters to form the suffix portion of the filenames of the split file. If -a is not specified,
the default suffix length is 2. If the sum of the name operand and the suffixlength option-argument would create a
filename exceeding NAME_MAX bytes, an error will result; split will exit with a diagnostic message and no files
will be created.
-b n Splits a file into pieces n bytes in size.
-b nk Splits a file into pieces n*1024 bytes in size.
-b nm Splits a file into pieces n*1048576 bytes in size.
OPERANDS
The following operands are supported:
file The path name of the ordinary file to be split. If no input file is given or file is -, the standard input will be used.
name The prefix to be used for each of the files resulting from the split operation. If no name argument is given, x will be used as
the prefix of the output files. The combined length of the basename of prefix and suffixlength cannot exceed NAME_MAX bytes. See
OPTIONS.
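EXAMPLES
The following is an illustrative sketch of the options described above, not part of the original manual page. The file names sample.txt, lines_ and bytes_ are hypothetical. Note that -b cuts at exact byte offsets, so a line or record can end up divided between two pieces.

```shell
# Work in a scratch directory so the pieces do not clutter the current one.
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Hypothetical 10-line sample input.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 > sample.txt

# Line-based: 4 lines per piece, giving lines_aa, lines_ab, lines_ac.
split -l 4 sample.txt lines_

# Byte-based: pieces of at most 1024 bytes with 3-letter suffixes (bytes_aaa, ...).
# For the 50 MB case the option would be -b 50m (50*1048576 bytes per piece).
split -b 1k -a 3 sample.txt bytes_
```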
USAGE
See largefile(5) for the description of the behavior of split when encountering files greater than or equal to 2 Gbyte ( 2**31 bytes).
ENVIRONMENT VARIABLES
See environ(5) for descriptions of the following environment variables that affect the execution of split: LANG, LC_ALL, LC_CTYPE,
LC_MESSAGES, and NLSPATH.
EXIT STATUS
The following exit values are returned:
0 Successful completion.
>0 An error occurred.
ATTRIBUTES
See attributes(5) for descriptions of the following attributes:
+-----------------------------+-----------------------------+
| ATTRIBUTE TYPE | ATTRIBUTE VALUE |
+-----------------------------+-----------------------------+
|Availability |SUNWesu |
+-----------------------------+-----------------------------+
|CSI |enabled |
+-----------------------------+-----------------------------+
|Interface Stability |Standard |
+-----------------------------+-----------------------------+
SEE ALSO
csplit(1), statvfs(2), attributes(5), environ(5), largefile(5), standards(5)
SunOS 5.10 16 Apr 1999 split(1)