Here is a demonstration of a gathering technique that might be useful here:
producing
The short awk script in the file gather collects the lines belonging to a sequence into a single super line. The newlines are replaced by a character that does not occur in the data; here I used "=".
Then the super lines are split into groups of 2.
Another awk script expands the super lines by replacing "=" with a real newline and re-writes the files.
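The three steps described above (gather, split, expand) can be sketched roughly as follows. This is not the original gather script, only a minimal illustration under assumptions: each sequence is assumed to start with a line beginning with ">", and the file names input.txt, super.txt and part.* are made up for the example.

```shell
#!/bin/sh
# Sketch only: the ">" record marker and all file names are assumptions.
set -eu
work=$(mktemp -d)
cd "$work"

# Sample input: three records, each starting with a ">" line.
cat > input.txt <<'EOF'
>rec1
aaa
bbb
>rec2
ccc
>rec3
ddd
eee
EOF

# Gather: join all lines of a record into one super line,
# using "=" where the newlines were.
awk '
/^>/ { if (buf != "") print buf; buf = $0; next }
     { buf = buf "=" $0 }
END  { if (buf != "") print buf }
' input.txt > super.txt

# Split the super lines into groups of 2 (part.aa, part.ab, ...).
split -l 2 super.txt part.

# Expand: turn "=" back into real newlines, one output file per group.
for f in part.??; do
    awk '{ gsub(/=/, "\n"); print }' "$f" > "$f.txt"
    rm "$f"
done
```

With this sample, part.aa.txt holds the first two records and part.ab.txt holds the third. If "=" can occur in the real data, pick a different placeholder character in both awk scripts.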
Hi,
I need to split a string, using awk, cut, or other basic unix commands (no programming), with a multibyte character as a delimiter.
Ex:
abcd-efgh-ijkl
split by -efgh- to get two segments abcd & ijkl
Is it possible?
Thanks
A.H.S (1 Reply)
I have an Excel file with more than 65K records... Since Excel does not take more than 65K records, I want to split the file and send it as two Excel files... Could someone help me use csplit by specifying the number of records? (7 Replies)
I have gone through all the threads in the forum and tested out different things. I am trying to split a 3GB file into multiple files. Some files are even larger than this.
For example:
split -l 3000000 filename.txt
This is very slow and it splits the file with 3 million records in each... (10 Replies)
Hi;
I want to write a shell script that will split a string with no delimiter.
Basically the script will read a line from a file.
For example the line it read from the file contains:
99234523
These values are never the same but the length will always be 8.
How do i split this... (8 Replies)
I’m new to Linux scripting and not sure how to filter out bad records from huge flat files (over 1.3GB each). The delimiter is a semicolon “;”
Here is the sample of 5 lines in the file:
Name1;phone1;address1;city1;state1;zipcode1
Name2;phone2;address2;city2;state2;zipcode2;comment... (7 Replies)
Hi,
I have a file which has many URLs delimited by spaces. Now I want to move them into separate files, each one holding 10 URLs.
http://3276.e-printphoto.co.uk/guardian http://abdera.apache.org/ http://abdera.apache.org/docs/api/index.html
I have used the below code to arrange... (6 Replies)
Hi, all.
I have an input file. I would like to generate 3 types of output files.
Input:
LG10_PM_map_19_LEnd_1000560
LG10_PM_map_6-1_27101856
LG10_PM_map_71_REnd_20597718
LG12_PM_map_5_chr_118419232
LG13_PM_map_121_24341052
LG14_PM_1a_456799
LG1_MM_scf_5a_opt_abc_9029993
... (5 Replies)
Hi,
I have received a file which is 20 GB. We would like to split the file into 4 equal parts and process it to avoid memory issues.
If the record delimiter were a unix newline, I could use the split command with either the -l or the -b option.
The problem is that the line terminator is |##|
How to use... (5 Replies)
I have a large semicolon-delimited file with thousands of columns and many thousands of lines. It looks like:
ID1;ID2;ID3;ID4;A_1;B_1;C_1;A_2;B_2;C_2;A_3;B_3;C_3
AA;ax;ay;az;01;02;03;04;05;06;07;08;09
BB;bx;by;bz;03;05;33;44;15;26;27;08;09
I want to split this table into multiple files:
... (1 Reply)
Discussion started by: trymega
csplit
CSPLIT(1)                        User Commands                        CSPLIT(1)
NAME
csplit - split a file into sections determined by context lines
SYNOPSIS
csplit [OPTION]... FILE PATTERN...
DESCRIPTION
Output pieces of FILE separated by PATTERN(s) to files `xx00', `xx01', ..., and output byte counts of each piece to standard output.
Mandatory arguments to long options are mandatory for short options too.
-b, --suffix-format=FORMAT
       use sprintf FORMAT instead of %02d
-f, --prefix=PREFIX
       use PREFIX instead of `xx'
-k, --keep-files
       do not remove output files on errors
-n, --digits=DIGITS
       use specified number of digits instead of 2
-s, --quiet, --silent
       do not print counts of output file sizes
-z, --elide-empty-files
       remove empty output files
--help
       display this help and exit
--version
       output version information and exit
Read standard input if FILE is -. Each PATTERN may be:
INTEGER
       copy up to but not including specified line number
/REGEXP/[OFFSET]
       copy up to but not including a matching line
%REGEXP%[OFFSET]
       skip to, but not including a matching line
{INTEGER}
       repeat the previous pattern specified number of times
{*}
       repeat the previous pattern as many times as possible
A line OFFSET is a required `+' or `-' followed by a positive integer.
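As an illustration of how a /REGEXP/ pattern combines with {*}, here is a minimal sketch. The file name sample.txt, the prefix chunk and the marker line BEGIN are assumptions made up for the example; the options used are the ones listed above.

```shell
#!/bin/sh
# Sketch only: sample.txt, the "chunk" prefix and the BEGIN marker are assumptions.
set -eu
work=$(mktemp -d)
cd "$work"

# Sample input: a header line followed by three sections, each opened by BEGIN.
printf '%s\n' 'header' 'BEGIN' 'a' 'BEGIN' 'b' 'BEGIN' 'c' > sample.txt

# -s: do not print byte counts; -z: drop empty pieces; -f: use "chunk" as prefix.
# /^BEGIN$/ ends the current piece just before each matching line,
# and {*} repeats that pattern as many times as possible.
csplit -s -z -f chunk sample.txt '/^BEGIN$/' '{*}'

ls chunk*
```

This should leave chunk00 with the header line and chunk01 through chunk03 with one BEGIN section each. Without -z, an input that starts with a matching line would also produce an empty first piece.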
AUTHOR
Written by Stuart Kemp and David MacKenzie.
REPORTING BUGS
Report csplit bugs to bug-coreutils@gnu.org
GNU coreutils home page: <http://www.gnu.org/software/coreutils/>
General help using GNU software: <http://www.gnu.org/gethelp/>
COPYRIGHT
Copyright (C) 2009 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>.
This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
SEE ALSO
The full documentation for csplit is maintained as a Texinfo manual. If the info and csplit programs are properly installed at your site, the command
info coreutils 'csplit invocation'
should give you access to the complete manual.
GNU coreutils 7.1 July 2010 CSPLIT(1)