I think you'll need the -g option to enable full numeric sorting. Here's a quick & dirty one-liner to sort clusters of duplicate files found by duff according to the size of the files in the cluster:
Last edited by radoulov; 08-04-2011 at 11:03 AM..
Reason: Code tags.
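The one-liner itself is not reproduced here, so the following is only a minimal sketch of why -g can matter. It assumes a sort that supports -g (general numeric sort, as in GNU coreutils), and the three input values are made up for illustration.

```shell
# Sketch: compare sort -n with sort -g (general numeric sort).
# The helper name and the input values are hypothetical.
numbers() { printf '1e3\n200\n5\n'; }

# -n stops parsing at the "e", so 1e3 counts as 1 and sorts first.
numbers | LC_ALL=C sort -n

# -g parses 1e3 as 1000, so it sorts last.
numbers | LC_ALL=C sort -g
```

For plain integers such as byte counts, -n and -g give the same order; -g only makes a difference for values in floating-point or exponent notation.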
Dear All,
Good day. I am facing the problem described below.
The file contains:
12345 0001 090112
14385 0001 090112
13255 0001 090112
11345 0001 090112
....
I want to sort in ascending order by the first column. What would the shell script be? (4 Replies)
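For a question like this, a numeric sort restricted to the first field is probably enough. This is a minimal sketch; the function name is made up, and the sample rows are the ones quoted in the question.

```shell
# Sort whitespace-separated records numerically on the first column only.
# With no file argument the function reads standard input.
sort_by_first_column() { sort -n -k1,1 "$@"; }

printf '%s\n' '12345 0001 090112' '14385 0001 090112' \
              '13255 0001 090112' '11345 0001 090112' | sort_by_first_column
```

Writing the key as -k1,1 rather than -k1 keeps the comparison from running on into the following columns.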
Hi,
My input file is
$cat samp
1 siva
1 raja
2 siva
1 siva
2 raja
4 venkat
I want to sort this by name, and I also need to remove duplicate lines.
I am using
cat samp|awk '{print $2,$1}'|sort -u
It shows
raja 1 (3 Replies)
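The question is cut off, so the wanted output is unknown. Assuming the goal is to keep the original column order, sort by name and then by number, and drop repeated lines, sort can do this without awk. The function name below is made up.

```shell
# Sort by name (field 2), then numerically by field 1.
# -u keeps one line per distinct key combination, which removes the
# repeated "1 siva" line; -b ignores leading blanks in the name key.
sort_by_name_unique() { sort -b -u -k2,2 -k1,1n "$@"; }

printf '%s\n' '1 siva' '1 raja' '2 siva' '1 siva' '2 raja' '4 venkat' |
    sort_by_name_unique
```

Because both columns are used as keys, two lines count as duplicates only when name and number are both equal.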
Hi Everybody,
I am new to UNIX as well as to this forum. I have a text file with 10,000 columns, and each column contains values separated by spaces. I want to separate them into new columns. The file is something like this:
as ad af 1 A
as ad af 1 D
...
...
1 and A are in one... (7 Replies)
Hello,
I've done
ls -ls >fileout1
When I run the sort command with +4, it sorts by group. When I use +5, it sorts by date. But it's skipping the file size column. Example:
rwxr-xr-x 1 Grueben sup 65 16 Sep 13:58 cdee
How can I sort it by file size? It doesn't... (2 Replies)
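In the example line the size (65) is the fifth whitespace-separated field, so a numeric sort on that field is one way to do it. This is a minimal sketch: the function name and the two extra listing lines are made up, and with `ls -ls` the leading block-count column would shift the size to field 6.

```shell
# Sort "ls -l"-style lines numerically on the size column (field 5).
sort_ls_by_size() { sort -n -k5,5; }

# First line is the example from the post; the other two are hypothetical.
printf '%s\n' \
    'rwxr-xr-x 1 Grueben sup 65 16 Sep 13:58 cdee' \
    'rwxr-xr-x 1 Grueben sup 1024 16 Sep 14:02 big' \
    'rwxr-xr-x 1 Grueben sup 7 16 Sep 14:05 tiny' | sort_ls_by_size
```

Without -n the sizes would be compared as text, so 1024 would sort before 65.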
Hello ,
I have a text file like this:
1 a1 ,AB ,AC ;AD ,EE
2 a2 ,WE ;TR ,YT ,WW
3 a3 ;AS ,UY ;RF ,YT
I want to sort this text file row by row, excluding the 2nd column from the sorting and ignoring the commas and semicolons, so it will become like this... (12 Replies)
Hi, I have a single-column file and I need to reformat it so that it creates a new line every time it comes to an IP address, and the following lines become the corresponding row until it comes to the next IP address.
I want to turn this
172.xx.xx.xx
gwpusprdrp02_pv
seinwnprd03... (7 Replies)
Hello All,
I have a file with the content below.
03/09/2014 10:35 AM 618 Admin\rick pqr_ klm2_pog12_20140309_c.xlsx
03/10/2014 10:35 AM 618 user\test01 mplz_ fgh2_lal12_20140310_c.xlsx
03/17/2014 10:35 AM 618 Admin\vick abc_ xyz2_bc12_20140317_c.xlsx
03/18/2014 ... (2 Replies)
Hello,
Here is my text data excerpted from the webpage:
input
My target is to get:
What I tried is:
sed 's/.*\(connector\)/1/' input > output
but all characters coming before the word "connector" are deleted which is not good for me.
My question: (9 Replies)
I have to sort an Excel/CSV file on its 4th column. I tried the following command:
sort -u --field-separator=, --numeric-sort -k 2 -n dinesh.csv > test.csv
But it's not working. Moreover, I have to do the same for more than 30 Excel/CSV files, so please help me with that. (6 Replies)
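One thing that stands out in the command above is that it sorts on field 2 although the 4th column is wanted. A minimal sketch, assuming plain comma-separated text with no header row and no quoted commas; the function name, the sample rows and the `sorted` directory are all made up.

```shell
# Sort a comma-separated file numerically on its 4th field.
# -u is left out on purpose: combined with -k it would keep only one
# line per distinct value of the key.
sort_csv_by_col4() { sort -t, -k4,4n "$@"; }

# Hypothetical sample rows, for illustration only.
printf '%s\n' 'a,x,1,30' 'b,y,2,4' 'c,z,3,100' | sort_csv_by_col4

# For many files, write the results to a separate directory, e.g.:
#   mkdir -p sorted
#   for f in *.csv; do sort_csv_by_col4 "$f" > "sorted/$f"; done
```

Writing to a separate directory avoids overwriting an input file while sort is still reading it.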
I have a csv file as shown below,
xop_thy 80 avr_njk 50 str_nyu 60
avr_irt 70 str_nhj 60 avr_ngt 50
str_tgt 80 xop_nmg 50 xop_nth 40
cyv_gty 40 cop_thl 40 vir_tyk 80
vir_plo 20 vir_thk 40 ijk_yuc 70
cop_thy 70 ijk_yuc 80 irt_hgt 80
I need to align/sort the csv file based... (7 Replies)
Discussion started by: dineshkumarsrk
clmmate
clm mate(1) USER COMMANDS clm mate(1)
NAME
clm mate - compute best matches between two clusterings
clmmate is not in actual fact a program. This manual page documents the behaviour and options of the clm program when invoked in mode mate.
The options -h, --apropos, --version, -set, --nop are accessible in all clm modes. They are described in the clm manual page.
SYNOPSIS
clm mate [-o fname (output file name)] [-b (omit headers)] [--one-to-many (require multiple hits in <clfile1>)] [-h (print synopsis, exit)]
[--apropos (print synopsis, exit)] [--version (print version, exit)] <clfile1> <clfile2>
DESCRIPTION
clm mate computes for each cluster X in clfile1 all clusters Y in clfile2 that have non-empty intersection and outputs a line with the data
points listed below.
overlap(X,Y) # 2 * size(meet(X,Y)) / (size(X)+size(Y))
index(X) # name of cluster
index(Y) # name of cluster
size(meet(X,Y))
size(X-Y) # size of left difference
size(Y-X) # size of right difference
size(X)
size(Y)
projection(X, clfile2) # see below
projection(Y, clfile1) # see below
The projected size of a cluster X relative to a clustering K is simply the sum of all the nodes shared between any cluster Y in K and X,
duplications allowed. For example, the projected size of (0,1) relative to {(0,2,4), (1,4,9), (1,3,5)} equals 3.
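The projected size can be checked by hand with a few lines of awk. This is only a sketch of the definition above, not how clm computes it; the function name and the input layout (one cluster per line, nodes separated by spaces, no node repeated within a cluster) are assumptions.

```shell
# projection "X nodes" < clustering
#   $1    : cluster X as space-separated node labels
#   stdin : clustering K, one cluster per line
projection() {
    awk -v x="$1" '
        BEGIN { n = split(x, a, " "); for (i = 1; i <= n; i++) inx[a[i]] = 1 }
        # Count every node of every cluster in K that also occurs in X.
        { for (i = 1; i <= NF; i++) if ($i in inx) total++ }
        END { print total + 0 }'
}

# The example from the text: (0,1) relative to {(0,2,4), (1,4,9), (1,3,5)}.
printf '0 2 4\n1 4 9\n1 3 5\n' | projection '0 1'   # prints 3
```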
The overlap between X and Y is exactly 1.0 if the two clusters are identical, and for nearly identical clusters the score will be close
to 1.0.
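The overlap formula listed above, 2 * size(meet(X,Y)) / (size(X)+size(Y)), can be evaluated directly. A minimal sketch; the function name and the sizes passed to it are made up for illustration.

```shell
# overlap MEET SIZE_X SIZE_Y
# Evaluates 2 * size(meet(X,Y)) / (size(X) + size(Y)).
overlap() {
    awk -v m="$1" -v x="$2" -v y="$3" \
        'BEGIN { printf "%.3f\n", 2 * m / (x + y) }'
}

overlap 3 3 3   # identical clusters of size 3: prints 1.000
overlap 4 5 5   # two clusters of size 5 sharing 4 nodes: prints 0.800
```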
All of this information can also be obtained from the contingency matrix defined for two clusterings. The [i,j] row-column entry in a
contingency matrix between two clusterings gives the number of entries in the intersection between cluster i and cluster j from the respective
clusterings. The other information is implicitly present; the total number of nodes in clusters i and j for example can be obtained as the
sum of entries in row i and column j respectively, and the difference counts can then be obtained by subtracting the intersection count.
The contingency matrix can easily be computed using mcx; e.g.
mcx /clfile2 lm /clfile1 lm tp mul /ting wm
will create the contingency matrix in mcl matrix format in the file ting, where columns range over the clusters in clfile1.
The output can be put to good use by sorting it numerically on the first score field. It is advisable to use a stable sort routine (use the
-s option for UNIX sort). From this information one can quickly extract the closest clusters between two clusterings.
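The sorting advice can be sketched as follows. The exact layout of clm mate output is an assumption here (whitespace-separated columns in the order listed under DESCRIPTION, headers omitted with -b); the function name and the three sample lines are made up and are not real clm output.

```shell
# Highest overlap first; -s keeps lines with equal scores in input order.
sort_mate_output() { sort -s -k1,1nr "$@"; }

# Hypothetical lines: overlap, index(X), index(Y), meet, X-Y, Y-X,
# size(X), size(Y), projection(X), projection(Y).
printf '%s\n' \
    '0.500 0 1 2 2 2 4 4 4 4' \
    '1.000 1 0 3 0 0 3 3 3 3' \
    '0.500 2 2 2 2 2 4 4 4 4' | sort_mate_output

# With real data this might be used as:
#   clm mate -b clfile1 clfile2 | sort_mate_output
```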
OPTIONS
-o fname (output file name)
Specify the name of the output file.
-b (omit headers)
Batch mode, omit column names.
--one-to-many (require multiple hits in <clfile1>)
Do not output information for clusters in the first file that are a subset of a cluster in the second file.
AUTHOR
Stijn van Dongen.
SEE ALSO
mclfamily(7) for an overview of all the documentation and the utilities in the mcl family.
clm mate 12-068 8 Mar 2012 clm mate(1)