11-19-2014
Split File based on number of rows
Hi
I have a requirement where I will receive multiple files in a folder (say, /fol1/fol2/). There will be at least 14 to 16 files. The file sizes will differ and are very unpredictable: some may be 80 GB or 90 GB, others less than 5 GB. The file names follow a particular format, such as Table1_Insert.dat, Table1_Update.dat, Table1_delete.dat, Table2_ins.dat, Table2_upd.dat, Table2_del.dat, and so on.
I have to read one file at a time and check its size (in GB). If the file is larger than 90 GB (the size will never be more than 100 GB), it should be split into 5 GB pieces. So if the file size is 90 GB, the source file should be split into 18 sub-files (such as TT_table1_ins.dataa, TT_Table1_ins.datab, TT_Table1_ins.datac, etc.).
I want my script to take only one input argument: the file name (with the path).
I know we can do this using the split -l command, but I need some help. Can somebody help me with a script? I'm very new to shell scripting. I can understand the commands but cannot write a script.
Thanks
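One way to approach this, as a minimal sketch rather than a finished solution: check the size with wc -c, work out how many 5 GB pieces are needed, convert that into a row count, and hand the row count to split -l. The function name split_if_large and the variables THRESHOLD_BYTES and CHUNK_BYTES are made up for this example. The sketch also assumes that rows are of roughly similar length (so an equal number of rows per piece gives pieces of about 5 GB), that "90 GB" means 90 GiB, that a file of exactly 90 GB should be split (as in the 18-file example), and that the pieces go into the same folder as the source file.

```shell
#!/bin/sh
# Sketch only: split_if_large, THRESHOLD_BYTES and CHUNK_BYTES are
# hypothetical names chosen for this example, not part of any tool.

split_if_large() {
    file=$1
    # Assumed defaults: 90 GiB threshold, 5 GiB target piece size (bytes).
    threshold=${THRESHOLD_BYTES:-96636764160}
    chunk=${CHUNK_BYTES:-5368709120}

    [ -f "$file" ] || { echo "not a regular file: $file" >&2; return 1; }

    # $(( ... )) strips the padding some wc implementations print.
    size=$(( $(wc -c < "$file") ))
    if [ "$size" -lt "$threshold" ]; then
        echo "skip: $file ($size bytes, below threshold)"
        return 0
    fi

    # Number of pieces (rounded up), then rows per piece (rounded up).
    pieces=$(( (size + chunk - 1) / chunk ))
    rows=$(( $(wc -l < "$file") ))
    rows_per_piece=$(( (rows + pieces - 1) / pieces ))
    [ "$rows_per_piece" -gt 0 ] || rows_per_piece=1

    # split appends aa, ab, ac, ... to the prefix,
    # e.g. TT_Table1_ins.dataa, TT_Table1_ins.datab, ...
    prefix="$(dirname "$file")/TT_$(basename "$file")"
    split -l "$rows_per_piece" "$file" "$prefix"
    echo "split: $file, up to $rows_per_piece rows per piece"
}

# Entry point: the script takes exactly one argument, the file path.
if [ "$#" -eq 1 ]; then
    split_if_large "$1"
fi
```

Saved as, say, split_big.sh (a hypothetical name), it would be run as sh split_big.sh /fol1/fol2/Table1_ins.dat. Counting rows with wc -l reads the whole file once, which takes a while at 90 GB. If row lengths vary a lot and GNU split is available, split -C 5G caps each piece at 5 GB without breaking rows and avoids the row count entirely.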