Working with Franklin52's suggestion, this is probably all you need:
I note that in your sample, file2 isn't actually a comma-separated list. If that is true, then the previous command will be fine. However, if file2 is indeed a comma-separated list (as the name and your description imply), then you'll need to take a different approach.
I have huge txt file having millions of trade data.
For e.g
Trade.txt (first 8 lines in the file is header info)
COB_DATE,TRADE_ID,SOURCE_SYSTEM_TRADE_ID,TRADE_GROUP_ID,
TRADE_TYPE,DEALER_NAME,EXTERNAL_COUNTERPARTY_ID,
EXTERNAL_COUNTERPARTY_NAME,DB_COUNTERPARTY_ID,... (6 Replies)
Can anyone help me remove duplicate records from 2 separate files in UNIX?
Please find the sample records for both the files
cat Monday.dat
3FAHP0JA1AR319226MOHMED ATEK 966504453742 SAU2010DE
3LNHL2GC6AR636361HEA DEUK CHOI 821057314531 KOR2010LE
3MEHM0JG7AR652083MUTLAB NAL-NAFISAH... (4 Replies)
Hi Unix gurus,
Maybe it is too much to ask, but please take a moment and help me out. A very humble request to you gurus. I'm new to Unix and have just started learning it. I have this project which is way too advanced for me.
File format: CSV file
File has four columns with no header... (8 Replies)
Hi,
I want to remove duplicate records including the first line based on column1. For example
inputfile(filer.txt):
-------------
1,3000,5000
1,4000,6000
2,4000,600
2,5000,700
3,60000,4000
4,7000,7777
5,999,8888
expected output:
----------------
3,60000,4000
4,7000,7777... (5 Replies)
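For the column-1 case quoted above, one common approach is a two-pass awk: count every key on the first pass, then print only the lines whose key was seen once. This is just a sketch, assuming comma-separated input and using the sample from the post; the scratch directory and the `result` variable are only there for the demo and are not from the original thread.

```shell
# Work in a scratch directory so the demo does not touch real files.
workdir=$(mktemp -d)
cat > "$workdir/filer.txt" <<'EOF'
1,3000,5000
1,4000,6000
2,4000,600
2,5000,700
3,60000,4000
4,7000,7777
5,999,8888
EOF

# Pass 1 (NR==FNR): count how often each column-1 value occurs.
# Pass 2: print only the lines whose column-1 value occurred exactly once.
result=$(awk -F, 'NR==FNR { count[$1]++; next } count[$1] == 1' \
    "$workdir/filer.txt" "$workdir/filer.txt")
printf '%s\n' "$result"
rm -r "$workdir"
```

Reading the file twice avoids holding every line in memory, which matters for large inputs; only the per-key counts are kept.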
I was reading this thread. It looks like a simpler way to say this is to only keep uniq lines based on field or column 1.
https://www.unix.com/shell-programming-scripting/165717-removing-duplicate-records-file-based-single-column.html
Can someone explain this command please? How are there no... (5 Replies)
Hello
I have been trying to remove a row from a file which has the same first three columns as another row - I have tried lots of different combinations of suggestion on this forum but can't get it exactly right.
what I have is
900 - 1000 = 0
900 - 1000 = 2562
1000 - 1100 = 0
1000 - 1100... (7 Replies)
I have csv file with 30, 40 columns
Pasting just three column for problem description
I want to filter record if column 1 matches CN or DN then,
check for values in column 2 if column contain 1235, 1235 then in column 3 values must be sequence of 2345, 2345
and if column 2 contains 6789, 6789... (5 Replies)
Hi Experts,
I have csv file with 30, 40 columns
Pasting just 2 column for problem description.
Need to print error if below combination is not present in file
check for column-1 (DocumentNumber) and filter columns where value in DocumentNumber field is same.
For all such rows, the field... (7 Replies)
Discussion started by: as7951
h5repack
h5repack(1) General Commands Manual h5repack(1)
NAME
h5repack - Copy an HDF5 file to a new file with or without compression/chunking
SYNOPSIS
h5repack -i file1 -o file2 [-h] [-v] [-f 'filter'] [-l 'layout'] [-m number] [-e file]
DESCRIPTION
h5repack is a command line tool that applies HDF5 filters to an input file file1, saving the output in a new file, file2.
'filter' is a string with the format <list of objects> : <name of filter> = <filter parameters>.
<list of objects> is a comma separated list of object names, meaning that compression is applied only to those objects. If no object names are specified, the filter is applied to all objects.
<name of filter> can be:
GZIP, to apply the HDF5 GZIP filter (GZIP compression)
SZIP, to apply the HDF5 SZIP filter (SZIP compression)
SHUF, to apply the HDF5 shuffle filter
FLET, to apply the HDF5 checksum filter
NONE, to remove the filter
<filter parameters> contains the optional compression information:
SHUF (no parameter)
FLET (no parameter)
GZIP=<deflation level> from 1-9
SZIP=<pixels per block,coding> (pixels per block is an even number in 2-32 and coding method is 'EC' or 'NN')
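The filter string format described above can be assembled mechanically. The following is a minimal sketch, not part of h5repack itself: `h5_filter_arg` is a hypothetical helper name, and it only builds the text passed to -f, without validating filter names or parameters.

```shell
# Hypothetical helper: build the argument for -f from its three parts.
#   $1 = comma separated object names (empty means all objects)
#   $2 = filter name (GZIP, SZIP, SHUF, FLET or NONE)
#   $3 = filter parameters (empty for filters that take none)
h5_filter_arg() {
    arg=$2
    if [ -n "$3" ]; then arg="$arg=$3"; fi   # append "=<parameters>" when given
    if [ -n "$1" ]; then arg="$1:$arg"; fi   # prepend "<objects>:" when given
    printf '%s\n' "$arg"
}

h5_filter_arg ''    GZIP 1
h5_filter_arg dset1 SZIP 8,NN
h5_filter_arg ''    SHUF ''
```

The result could then be passed on the command line, for example `h5repack -i file1 -o file2 -f "$(h5_filter_arg dset1 SZIP 8,NN)" -v`.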
'layout' is a string with the format
<list of objects> : <layout type> = <layout parameters>
<list of objects> is a comma separated list of object names, meaning that layout information is supplied for those objects. If no object
names are specified, the layout is applied to all objects.
<layout type> can be:
CHUNK, to apply chunking layout
COMPA, to apply compact layout
CONTI, to apply contiguous layout
<layout parameters> is present for the chunk case only; it is the chunk size of each dimension: <dim_1 x dim_2 x ... dim_n>
OPTIONS
file1,file2
The input and output HDF5 files
-h Print a help message
-f filter
Filter type
-l layout
Layout type
-v Verbose mode. Print output (list of objects in the file, filters and layout applied).
-e file
File with the -f and -l options (only filter and layout flags)
-d delta
Print only differences that are greater than the limit delta. delta must be a positive number. The comparison criterion is whether
the absolute value of the difference of two corresponding values is greater than delta (e.g., |a-b| > delta, where a is a value in
file1 and b is a value in file2).
-m number
Do not apply the filter to objects whose size in bytes is smaller than number. If no size is specified, a minimum of 1024 bytes is assumed.
EXAMPLES
Apply GZIP compression to all objects in file1 and save the output in file2:
h5repack -i file1 -o file2 -f GZIP=1 -v
Apply SZIP compression only to object 'dset1':
h5repack -i file1 -o file2 -f dset1:SZIP=8,NN -v
Apply a chunked layout to objects 'dset1' and 'dset2':
h5repack -i file1 -o file2 -l dset1,dset2:CHUNK=20x10 -v
SEE ALSO
h5dump(1), h5ls(1), h5diff(1), h5import(1), gif2h5(1), h52gif(1), h5perf(1), h5repart(1).
h5repack(1)