Thanks for the solutions, but they don't fulfil my requirement. As I mentioned, my data file contains approximately 4 million records, and I want to create output files of 500,000 records each, named file1, file2....file10.
When splitting the file, once the 500,000-record mark is reached, I want to make sure I am not splitting records with the same key, e.g. (100), across two output files. All records with the same key should stay in the same output file; whether that is file1 or file2 doesn't matter.
It is not very neat coding, but I was able to split at every 500,000 records with the following code. Keeping same-key records together is the challenge for me.
Best regards,
K
Last edited by Scott; 01-19-2011 at 11:06 AM..
Reason: Please use code tags
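One possible way to approach the requirement above, offered as a rough sketch rather than a tested solution for the real 4-million-record file. It assumes two things the post does not state: the key is the first comma-separated field, and the input is already sorted (or at least grouped) by key so that same-key records sit next to each other. The function name `split_by_key` and its arguments are made up for illustration.

```shell
# split_by_key INPUT LIMIT PREFIX   (hypothetical helper, not from the thread)
# Writes PREFIX1, PREFIX2, ... Each file gets at least LIMIT records
# (except possibly the last) and is only closed at a key boundary.
# Assumption: key = first comma-separated field, input grouped by key.
split_by_key() {
    awk -F',' -v limit="$2" -v prefix="$3" '
        BEGIN { n = 1; out = prefix n }
        # Switch files only when the limit is reached AND the key changes,
        # so a run of equal keys is never cut in two.
        count >= limit && $1 != prev {
            close(out)
            n++
            out = prefix n
            count = 0
        }
        { print > out; count++; prev = $1 }
    ' "$1"
}

# Example call (file names are placeholders):
#   split_by_key datafile 500000 file
```

With this approach a file can run past the limit by however many records remain in the current key group, which seems acceptable given the "approx" wording. If the data is not grouped by key, something like `sort -t, -k1,1 datafile` would be needed first.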
All,
We receive a file with a large number of records (the count can vary) and we have to split it into two files based on another file. e.g.
File1:
UHDR 2008112
"25187","00000022","00",21-APR-1991,"" ,"D",-000000519,+0000000000,"C", ,+000000000,+000000000,000000000,"2","" ... (2 Replies)
For example suppose I have a file which contains data as:
$ cat data
800,2
100,9
700,3
100,9
200,8
100,3
Now I want the output as
200,8
700,3
800,2
The key is the first three characters; I don't want any records that have duplicate keys.
Like sort +0.0 -0.3 data can we use... (9 Replies)
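A minimal sketch of one way to get that output, assuming (as the post says) that the key is the first three characters of each line. The function name `unique_keys` is hypothetical. It reads the file twice: the first pass counts each key, the second prints only records whose key was seen exactly once, and `sort` orders the result.

```shell
# unique_keys FILE   (hypothetical helper name)
# Drops every record whose 3-character key occurs more than once.
unique_keys() {
    awk '
        # Pass 1 (NR == FNR holds only while reading the first copy):
        # count how often each 3-character key appears.
        NR == FNR { seen[substr($0, 1, 3)]++; next }
        # Pass 2: keep records whose key appeared exactly once.
        seen[substr($0, 1, 3)] == 1
    ' "$1" "$1" | sort
}

# Example call:
#   unique_keys data
```

The old-style `sort +0.0 -0.3` corresponds to `sort -k1.1,1.3` in current syntax, but sorting with `-u` on that key would keep one record per key instead of dropping all of them, so a counting step like the one above still seems necessary.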
I need to write a shell script for below scenario
My input file has data in format:
qwerty0101TWE 12345 01022005 01022005 datainala alanfernanded 26
qwerty0101mXZ 12349 01022005 06022008 datainalb johngalilo 28
qwerty0101TWE 12342 01022005 07022009 datainalc hitalbert 43
qwerty0101CFG 12345... (19 Replies)
Hi Experts,
I have to split a huge file based on a pattern to create smaller files. The pattern expected in the file is:
Master.....
First...
second....
second...
third..
third...
Master...
First..
second...
third...
Master...
First...
second..
second..
second..... (2 Replies)
Hello,
For the input file, I am trying to split those records that have multiple values separated by '|' in the last input field into multiple records, where each record consists of the common input fields plus one of the values from the last field.
I was trying with an example on this forum... (4 Replies)
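The expansion described above could be sketched roughly as follows. The post shows no sample data, so this assumes whitespace-separated fields; the function name `expand_last_field` is invented for illustration.

```shell
# expand_last_field FILE   (hypothetical helper name)
# Assumption: fields are separated by whitespace; only the last field
# may hold several values joined with "|".
expand_last_field() {
    awk '{
        # Split the last field on a literal "|" ("[|]" avoids regex surprises).
        n = split($NF, vals, "[|]")
        # Rebuild the fields that are common to every output record.
        common = ""
        for (i = 1; i < NF; i++) common = common $i OFS
        # Emit one record per value.
        for (j = 1; j <= n; j++) print common vals[j]
    }' "$1"
}

# Example call (file name is a placeholder):
#   expand_last_field input.txt
```

Records without a '|' in the last field pass through as a single record. Note that runs of spaces or tabs between fields come out as single spaces, which may or may not matter for the real data.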
I am trying to update an older program on a small cluster. It uses individual files to send jobs to each node. However the newer database comes as one large file, containing over 10,000 records. I therefore need to split this file. It looks like this:
HMMER3/b
NAME 1-cysPrx_C
ACC ... (2 Replies)
A text file has 2 fields (Data, Filename) delimited by # as below,
Data,Filename
Row1 -> abc#Test1.xml
Row2 -> xyz#Test2.xml
Row3 -> ghi#Test3.xml
The content in first field has to be written into a file where filename should be considered from second field.
So from... (4 Replies)
I will simplify the explanation a bit: I need to parse through an 87m file -
I have a single text file in the form of :
<NAME>house........
SOMETEXT
SOMETEXT
SOMETEXT
.
.
.
.
</script>
MORETEXT
MORETEXT
.
.
. (6 Replies)
Hi All,
This is my first post here. Hoping to share and gain knowledge from this great forum !!!!
I've scanned this forum before posting my problem here, but I'm afraid I couldn't find any thread that addresses this exact problem.
I'm trying to split a large XML file (with multiple tag... (7 Replies)
Hello, I have a file in the following format
HDR 1234 abc qwerty
abc def ghi jkl
HDR 4567 xyz qwerty
abc def ghi jkl
HDR 890 mno qwerty
abc def ghi jkl
HDR 1234 abc qwerty
abc def ghi jkl
HDR 1234 abc qwerty
abc def ghi jkl
-Need to split this into multiple files based on tag... (8 Replies)
Discussion started by: wincrazy
LEARN ABOUT OSF1
merge
merge(1)
NAME
merge - three-way file merge
SYNOPSIS
merge [-Llabel1 [-Llabel3]] [-p] [-q] file1 file2 file3
DESCRIPTION
merge incorporates all changes that lead from file2 to file3 into file1. The result goes to standard output if -p is present, into file1
otherwise. merge is useful for combining separate changes to an original. Suppose file2 is the original, and both file1 and file3 are
modifications of file2. Then merge combines both changes.
An overlap occurs if both file1 and file3 have changes in a common segment of lines. On a few older hosts where diff3 does not support the
-E option, merge does not detect overlaps, and merely supplies the changed lines from file3. On most hosts, if overlaps occur, merge
outputs a message (unless the -q option is given), and includes both alternatives in the result. The alternatives are delimited as follows:
    <<<<<<< file1
    lines in file1
    =======
    lines in file3
    >>>>>>> file3
If there are overlaps, the user should edit the result and delete one of the alternatives. If the -L label1 and -L label3 options are
given, the labels are output in place of the names file1 and file3 in overlap reports.
DIAGNOSTICS
Exit status is 0 for no overlaps, 1 for some overlaps, 2 for trouble.
IDENTIFICATION
Author: Walter F. Tichy.
Revision Number: 1.1.6.2; Release Date: 1993/10/07.
Copyright (C) 1982, 1988, 1989 by Walter F. Tichy.
Copyright (C) 1990, 1991 by Paul Eggert.
SEE ALSO
diff3(1), diff(1), rcsmerge(1), co(1)