$ ./s1
Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution : Debian 8.6 (jessie)
bash GNU bash 4.3.30
dixf (local) 1.12
gate (local) 1.10
gate Group, collect, sift lines, separate lines into buckets. (what)
Path : ~/bin/gate
Length : 278 lines
Type : Perl script, ASCII text executable
Shebang : #!/usr/bin/perl
Help : probably available with [ ]-h
-----
Input data file data1, lines = 50:
year,min,max,Instrument,Testcase
2016,201,1003,GEOTROPH-02116,TATA7980
2016,53,1011,GEOTROPH-01963,TATA7980
2016,3,1024,GEOTROPH-02067,TATA7980
2016,203,1027,GEOTROPH-02011,TATA7980
2016,135,1016,GEOTROPH-01932,TATA3405
2016,71,1036,GEOTROPH-02030,TATA7980
2016,173,1046,GEOTROPH-02033,TATA7980
2016,183,1015,GEOTROPH-02120,TATA7980
2016,47,1037,GEOTROPH-02154,TATA7980
-----
Expected output:
TATA79802016__testcase.csv
and TATA34052016__testcase.csv
-----
Results, dry run:
9 entries in group (good) TATA3405
40 entries in group (good) TATA7980
1 entries in group (good) Testcase
-----
Results, real run:
1 Testcase
9 TATA3405
40 TATA7980
50 total
-----
Documentation for gate:
NAME
gate - Group And Transfer Entries
VERSION
1.0
USAGE
gate [-d] [-c m,n] [-f n] [-h] [-s c] [files]
REQUIRED ARGUMENTS
None.
OPTIONS
-d <off>
dry-run, determine filenames and counts. A basic sanity check on
filenames is performed. This is recommended for a first run.
-c m,n <0>
Select column mode: define columns m through n as the filenames to
which lines will be written. The columns are numbered beginning with
1 on the left.
-f number <0>
Choose field number as the group. Data will be written to filenames
corresponding to the contents of this field. The field numbering
begins with 1 on the left.
-h <off>
Print this man page.
-s character string <,>
Select one or more characters as the separator character.
files <STDIN>
Specify files to process.
DESCRIPTION
For each line read, the id value from the field or columns specified is
isolated from each and used as a filename, to which that line is
written. As many files are opened and written as are needed.
This may not be the final approach, but it would be a start for us.
Best wishes ... cheers, drl
---------- Post updated at 10:18 ---------- Previous update was at 09:46 ----------
Hi.
Quote:
Originally Posted by RavinderSingh13
... I would like to mention here wouldn't be it good to close the opened files or if there are too many files opened it may be an issue with it. ...
I seem to recall testing the timing between closing and not closing files in this kind of operation. My recollection is that it added a large amount of time to do the close-open-close... operation.
While there may be situations where this might be needed. I'd be tempted to consider splitting the input file, and do separate operations on the split pieces (for example). Multiple cores/CPU can help with parallelization in such cases, say with command parallel.
I have one large file, after every 200 line i have to split the file and the add header and footer to each small file?
It is possible to add different header and footer to each file? (1 Reply)
I have one large file, after every 200 line i have to split the file and the add header and footer to each small file?
It is possible to add different header and footer to each file? (7 Replies)
So, I have a massive file with thousands of columns
I want a list of the headers in one column in another file.
So I need to strip off the top line (can use head-1)
But how can I convert from this format:
A B C D E F G
to
A
B
C
D
E
F
G (6 Replies)
Hi All,
I have a file example.csv which looks like this
GrpID,TargetID,Signal,Avg_Num
CSCH74_1_1,2007,61,256
CSCH74_1_1,212007,647,679
CSCH74_1_1,12007,3,32
CSCH74_1_1,207,299,777
I want the output as
GrpID,TragetID,Signal-CSCH74_1_1,Avg_Num
CSCH74_1_1,2007,61,256... (4 Replies)
Hi All,
I have a file example.csv which looks like this
GrpID,TargetID,Signal,Avg_Num
CSCH74_1_1,2007,61,256
CSCH74_1_1,212007,647,679
CSCH74_1_1,12007,3,32
CSCH74_1_1,207,299,777
I want the output as
GrpID,TragetID,Signal-CSCH74_1_1,Avg_Num
CSCH74_1_1,2007,61,256... (1 Reply)
Hi All,
I have a problem with the column heading while spooling the data from the database. Since i want the column header, therefore i am not using SET HEADING OFF, and i am getting the header, but for the longer column name, some of the character from the column name is missing.
for... (7 Replies)
I need to split the file based on pattern from position 34-37 while retaining the header and trailer records in each individual split file
Also is it possible to output the TOM and PAT records in the same output file ?
I need the output file names same as xyz_pattern_Datetimestamp.txt
... (23 Replies)
Hi,
I am using SUN SOLARIS (SunOS sun4v sparc SUNW, T5240).
I have a huge data file with header and trailer. This file gets used into an ETL process. ETL skips the header record (which is the first record of the file) and loads the rest of the record. The file can be delimited (comma,... (5 Replies)
I need to split a file based on headers found
Input file
file1 content:
ADD
john
mickey
DROP
matt
sam
output of file F1
john
mickey
output of file F2
matt
sam (5 Replies)
Split large xml into mutiple files and with header and footer in file
tried below
it splits unevenly and also i need help in adding header and footer
command :
csplit -s -k -f my_XML_split.xml extrfile.xml "/<Document>/" {1}
sample xml
<?xml version="1.0" encoding="UTF-8"?><Recipient>... (36 Replies)