10-17-2008
splitting the files
Hi,
I have some files with roughly 2 million records each, which I need to split into chunks of 0.5 million records. Each file is sorted on a key column, and the same key value can appear in 4 or 5 records.
So after splitting, we check that all records for a given key value ended up in the same file.
The split files will be processed in parallel, so all records with the same key value must be in one file.
For example, with column A as the key column:
column A column B
--------- --------
AAAAAAA 1234244
BBBBBBBB 8734793
BBBBBBBB 3925873
BBBBBBBB 9085000
CCCCCCC 3094823
DDDDDDD 9084509
Here, when I split this file into two, I need the first 4 rows in the same file, since column A is the key column and I want all records with the key value BBBBBBBB in the same file.
When I split the file using the split command, it looks like it doesn't split the file in its original order.
Is there a way to split the file in the same order as it is?
i.e.
AAAAAAA 1234244
BBBBBBBB 8734793
BBBBBBBB 3925873
in one file and
BBBBBBBB 9085000
CCCCCCC 3094823
DDDDDDD 9084509
in another file. After this I can manually check the key column and move records up or down as needed.
How can I split it in the same order as it is?
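One possible approach, as a rough sketch rather than something tested on the real data: split -l writes its pieces in input order (the default suffixes aa, ab, ... sort in that order), which can be confirmed by concatenating the pieces and comparing with the original. A short awk script can then do the key-aware split directly, starting a new chunk only once the record limit is reached and the key in column 1 changes. The names input.txt, part_ and chunk_, and the limit of 3, are placeholders chosen for the sample data in the post; for the real files the limit would be 500000.

```shell
#!/bin/sh
# Sample data from the post, written out so the sketch is self-contained.
# input.txt, part_, chunk_ and limit=3 are placeholder names/values.
cat > input.txt <<'EOF'
AAAAAAA 1234244
BBBBBBBB 8734793
BBBBBBBB 3925873
BBBBBBBB 9085000
CCCCCCC 3094823
DDDDDDD 9084509
EOF

# Plain split by line count: pieces are part_aa, part_ab, ... in input order.
split -l 3 input.txt part_

# Sanity check: concatenating the pieces in name order reproduces the input.
cat part_* | cmp - input.txt && echo "split kept the original order"

# Key-aware split: open a new chunk only when the limit has been reached
# AND the key (column 1) differs from the previous record's key, so records
# sharing a key never straddle two files.
awk -v limit=3 -v prefix=chunk_ '
    NR == 1 || (count >= limit && $1 != prev) {
        if (out != "") close(out)
        out = sprintf("%s%02d", prefix, ++n)
        count = 0
    }
    { print > out; count++; prev = $1 }
' input.txt
```

With the sample data, the plain split cuts the BBBBBBBB group across part_aa and part_ab, while the awk version keeps the first four records in chunk_01 and puts the last two in chunk_02, so no manual moving is needed. A chunk can run slightly over the limit, by at most the size of one key group.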