Split a file with no pattern -- Split, Csplit, Awk
I have gone through all the threads in the forum and tested out different things. I am trying to split a 3GB file into multiple files. Some files are even larger than this.
For example:
This is very slow, and it splits the file into pieces of 3 million records each. I would like to pass the number of output files as a parameter and use user-defined file names instead of xaa, xab and so on.
I am also trying awk, which I expect to be fast and simple. The threads I read in the forum all split files on a specific pattern, and I don't need any pattern.
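One way to approach the request above (a fixed number of output files, custom names, no pattern) is to count the lines first and let awk switch output files every so many records. This is only a minimal sketch: input.txt, nfiles, prefix and the part_N.txt naming are hypothetical placeholders, and the sample file is generated just so the example can be run in an empty scratch directory.

```shell
#!/bin/sh
# Hypothetical sample input: 10 records.
awk 'BEGIN { for (i = 1; i <= 10; i++) print "record " i }' > input.txt

nfiles=3        # assumed parameter: how many output files are wanted
prefix=part     # assumed parameter: user-defined output name prefix

# Count the records, then round up so that at most nfiles files are written.
total=$(wc -l < input.txt)
per_file=$(( (total + nfiles - 1) / nfiles ))

awk -v n="$per_file" -v prefix="$prefix" '
    (NR - 1) % n == 0 {                 # first record of a new piece
        if (out != "") close(out)       # keep only one output file open
        out = sprintf("%s_%d.txt", prefix, ++i)
    }
    { print > out }
' input.txt

wc -l part_*.txt
```

With 10 records and nfiles=3 this writes part_1.txt and part_2.txt with 4 records each and part_3.txt with the remaining 2. The extra wc -l pass means the input is read twice, which may matter for multi-gigabyte files.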
I have an Excel file with more than 65K records... Since Excel does not take more than 65K records, I want to split the file and send it as two Excel files... Could someone help me use csplit by specifying the number of records? (7 Replies)
Hi All,
Can someone please help me write a script for the following requirement in awk, grep, sed or perl.
Buuuu xxx bbb
Kmmmm rrr ssss uuuu
Kwwww zzzz ccc
Roooowwww eeee
Bxxxx jjjj dddd
Kuuuu eeeee nnnn
Rpppp cccc vvvv cccc
Rhhhhhhyyyy tttt
Lhhhh rrrrrssssss
Bffff mmmm iiiii
Ktttt... (5 Replies)
Dear all,
I have a large file composed of 8000 frames. What I would like to do is split the file into 8000 single files named file.pdb.1, file.pdb.2, etc.
Each frame in the large file is separated by an "ENDMDL" flag, so my thinking is to use this flag as a point to split the files... (4 Replies)
Hello!
I have a problem extracting files from a saved session.
The file contains all kinds of special/printable characters.
DATA NumberA DATA
DATA Begin
DATA1.1
DATA1.2 NumberB1 DATA1.3
DATA1.4
End DATA
DATA
DATA Begin
DATA2.1
DATA2.2 NumberB2 DATA2.3
DATA2.4
End DATA
DATA
... (4 Replies)
Hi all,
I'm pretty new to Shell scripting and I need some help to split a source text file into multiple files. The source has a row with pattern where the file needs to be split, and the pattern row also contains the file name of the destination for that specific piece. Here is an example:
... (2 Replies)
Hello;
I have a file consisting of 4 columns separated by tabs. The problem is the third field. Some of them are very long but can be split on the vertical bar "|". Also, some of them do not contain the string "UniProt", but I could ignore that for the moment and sort the file afterwards. Here is... (5 Replies)
Hello, I have a large file (2GB) that I would like to split based on pattern and size.
I've used the following command to split the file (token is "HELLO")
awk '/HELLO/{i++}{print > "file"i}' input.txt
and the output is similar to the following (i included filesize in KB):
10 ... (2 Replies)
Hello,
I want to split a big file into smaller ones with certain "counts". I am aware this type of job has been asked quite often, but I posted again when I came to csplit, which may be simpler to solve the problem.
Input file (fasta format):
>seq1
agtcagtc
agtcagtc
ag
>seq2
agtcagtcagtc... (8 Replies)
Hi ,
I have huge files of around 400 MB, which contain CLOB data and have different scenarios:
I am trying to pass the scenario number as a parameter and get the required modified file based on the scenario number and criteria.
Scenario 1:
file name : scenario_1.txt
... (2 Replies)
Hello All,
I have records in a file in a pattern A,B,B,B,B,K,A,B,B,K
Is there any command or simple logic to pull the records out into multiple files based on the A record? I want the output as
File1: A,B,B,B,B,K
File2: A,B,B,K (9 Replies)
Discussion started by: deal1dealer
split
split(1) User Commands split(1)
NAME
split - split a file into pieces
SYNOPSIS
split [-linecount | -l linecount] [-a suffixlength] [ file [name]]
split [ -b n | nk | nm] [-a suffixlength] [ file [name]]
DESCRIPTION
The split utility reads file and writes it in linecount-line pieces into a set of output-files. The name of the first output-file is name
with aa appended, and so on lexicographically, up to zz (a maximum of 676 files). The maximum length of name is 2 characters less than the
maximum filename length allowed by the filesystem. See statvfs(2). If no output name is given, x is used as the default (output-files will
be called xaa, xab, and so forth).
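The default behavior described above can be checked with a small experiment. The sketch below assumes an empty scratch directory and a hypothetical 2500-line file named sample.txt; with no options and no name operand, split should write 1000-line pieces named xaa, xab and xac. The 676-file ceiling follows from the two-letter suffix: 26 × 26 = 676.

```shell
#!/bin/sh
# Hypothetical sample input: 2500 lines.
awk 'BEGIN { for (i = 1; i <= 2500; i++) print "line " i }' > sample.txt

# No options and no name operand: 1000-line pieces with the default prefix x.
split sample.txt

# Expect xaa and xab with 1000 lines each, and xac with the last 500.
wc -l xaa xab xac
```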
OPTIONS
The following options are supported:
-linecount | -l linecount Number of lines in each piece. Defaults to 1000 lines.
-a suffixlength Uses suffixlength letters to form the suffix portion of the filenames of the split file. If -a is not specified,
the default suffix length is 2. If the sum of the name operand and the suffixlength option-argument would create a
filename exceeding NAME_MAX bytes, an error will result; split will exit with a diagnostic message and no files
will be created.
-b n Splits a file into pieces n bytes in size.
-b nk Splits a file into pieces n*1024 bytes in size.
-b nm Splits a file into pieces n*1048576 bytes in size.
OPERANDS
The following operands are supported:
file The path name of the ordinary file to be split. If no input file is given or file is -, the standard input will be used.
name The prefix to be used for each of the files resulting from the split operation. If no name argument is given, x will be used as
the prefix of the output files. The combined length of the basename of prefix and suffixlength cannot exceed NAME_MAX bytes. See
OPTIONS.
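As an illustration of the options and operands listed above, the following sketch combines -l, -a and a name operand, then uses -b with the k multiplier. The file names rows.txt and bytes.txt and the prefixes chunk. and blk. are hypothetical, and the sample data is generated only so the commands can be run in an empty scratch directory.

```shell
#!/bin/sh
# Hypothetical 10-line input.
awk 'BEGIN { for (i = 1; i <= 10; i++) print "row " i }' > rows.txt

# -l 4: four lines per piece; -a 3: three-letter suffixes;
# name operand "chunk." gives chunk.aaa, chunk.aab, chunk.aac.
split -l 4 -a 3 rows.txt chunk.

# Hypothetical 3300-byte input: 300 lines of 10 characters plus a newline.
awk 'BEGIN { for (i = 1; i <= 300; i++) print "0123456789" }' > bytes.txt

# -b 1k: pieces of 1*1024 bytes, cut without regard to line boundaries.
split -b 1k bytes.txt blk.

ls -l chunk.* blk.*
```

Here the last line-based piece holds the 2 leftover lines, and the last byte-based piece holds the 228 bytes left after three full 1024-byte pieces.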
USAGE
See largefile(5) for the description of the behavior of split when encountering files greater than or equal to 2 Gbyte ( 2**31 bytes).
ENVIRONMENT VARIABLES
See environ(5) for descriptions of the following environment variables that affect the execution of split: LANG, LC_ALL, LC_CTYPE,
LC_MESSAGES, and NLSPATH.
EXIT STATUS
The following exit values are returned:
0 Successful completion.
>0 An error occurred.
ATTRIBUTES
See attributes(5) for descriptions of the following attributes:
+-----------------------------+-----------------------------+
|       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
+-----------------------------+-----------------------------+
|Availability                 |SUNWesu                      |
+-----------------------------+-----------------------------+
|CSI                          |enabled                      |
+-----------------------------+-----------------------------+
|Interface Stability          |Standard                     |
+-----------------------------+-----------------------------+
SEE ALSO
csplit(1), statvfs(2), attributes(5), environ(5), largefile(5), standards(5)
SunOS 5.10 16 Apr 1999 split(1)