There is no point in specifying a numeric sort because uniq only understands lexicographic sorts. This approach will only work when a data set's numeric sort is identical to its lexicographic sort. This is true in this case because all numbers have the same number of digits and consist of nothing but digits (no signs, no radix point).
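To illustrate the point, here is a minimal sketch with made-up sample data (the numbers and temporary file names are assumptions, not the poster's files). It compares the two orderings for values of differing widths and then for zero-padded values of equal width:

```shell
# Hypothetical sample data: numbers of differing widths.
tmp=$(mktemp -d)
printf '%s\n' 9 10 100 2 > "$tmp/nums"

# Lexicographic order in the C locale compares byte by byte.
lex=$(LC_ALL=C sort "$tmp/nums" | tr '\n' ' ')
# Numeric order compares values.
num=$(LC_ALL=C sort -n "$tmp/nums" | tr '\n' ' ')
echo "lexicographic: $lex"
echo "numeric:       $num"

# With equal-width, digits-only keys the two orders coincide.
printf '%s\n' 0009 0010 0100 0002 > "$tmp/fixed"
lex_fixed=$(LC_ALL=C sort "$tmp/fixed" | tr '\n' ' ')
num_fixed=$(LC_ALL=C sort -n "$tmp/fixed" | tr '\n' ' ')
echo "fixed width, lexicographic: $lex_fixed"
echo "fixed width, numeric:       $num_fixed"

rm -rf "$tmp"
```

The first pair of lines should differ, while the fixed-width pair should match, which is the situation described for the 16-digit data.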
Regards,
Alister
---------- Post updated at 07:10 PM ---------- Previous update was at 07:09 PM ----------
Quote:
Originally Posted by Scottie1954
I've got two files that each contain a 16-digit number in positions 1-16. The first file has 63,120 entries all sorted numerically. The second file has 142,479 entries, also sorted numerically.
I want to read through each file and output the entries that appear in both. So far I've had no success with comm -12
Is there anything in positions beyond 16, or any trailing whitespace in either file? Since the lexicographic sort of the data sample in post #4 is identical to its numeric sort, comm -12 should work well.
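One way to check for that, sketched with invented sample data (the file names, the stray space, and the CRLF line endings are assumptions for illustration only): count the lines that are not exactly 16 characters long, then compare only positions 1-16.

```shell
tmp=$(mktemp -d)
# Hypothetical inputs: file a has a trailing space on one line,
# file b has DOS (CRLF) line endings.
printf '1000000000000001\n1000000000000002 \n1000000000000003\n' > "$tmp/a"
printf '1000000000000002\r\n1000000000000003\r\n1000000000000004\r\n' > "$tmp/b"

# Lines that are not exactly 16 characters long hint at hidden characters.
bad=$(awk 'length($0) != 16' "$tmp/a" "$tmp/b" | wc -l)
echo "lines with unexpected length: $bad"

# Comparing the raw files finds nothing because of the extra characters.
raw=$(LC_ALL=C comm -12 "$tmp/a" "$tmp/b")

# Keep only positions 1-16, re-sort in the C locale, then compare.
for f in a b; do
  cut -c1-16 "$tmp/$f" | LC_ALL=C sort > "$tmp/$f.key"
done
common=$(LC_ALL=C comm -12 "$tmp/a.key" "$tmp/b.key")
printf '%s\n' "$common"

rm -rf "$tmp"
```

Sorting and comparing under the same locale (LC_ALL=C here) keeps the ordering comm expects consistent with the ordering sort produced.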
comm(1)                          User Commands                          comm(1)
NAME
comm - select or reject lines common to two files
SYNOPSIS
comm [-123] file1 file2
DESCRIPTION
The comm utility reads file1 and file2, which must be ordered in the current collating sequence, and produces three text columns as output:
lines only in file1; lines only in file2; and lines in both files.
If the input files were ordered according to the collating sequence of the current locale, the lines written will be in the collating
sequence of the original lines. If not, the results are unspecified.
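As an illustration of the ordering requirement, the following sketch sorts two made-up files and runs comm under the same locale. The file names and contents are hypothetical, and LC_ALL=C is only one possible choice of locale:

```shell
tmp=$(mktemp -d)
# Hypothetical unsorted inputs with mixed case.
printf '%s\n' cherry apple Banana > "$tmp/f1"
printf '%s\n' Banana date apple > "$tmp/f2"

# Sort both files in the locale that comm will run in.
LC_ALL=C sort "$tmp/f1" > "$tmp/f1.sorted"
LC_ALL=C sort "$tmp/f2" > "$tmp/f2.sorted"

# Lines present in both files, in C-locale order (uppercase first).
both=$(LC_ALL=C comm -12 "$tmp/f1.sorted" "$tmp/f2.sorted")
printf '%s\n' "$both"

rm -rf "$tmp"
```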
OPTIONS
The following options are supported:
-1 Suppresses the output column of lines unique to file1.
-2 Suppresses the output column of lines unique to file2.
-3 Suppresses the output column of lines duplicated in file1 and file2.
OPERANDS
The following operands are supported:
file1 A path name of the first file to be compared. If file1 is -, the standard input is used.
file2 A path name of the second file to be compared. If file2 is -, the standard input is used.
USAGE
See largefile(5) for the description of the behavior of comm when encountering files greater than or equal to 2 Gbyte ( 2^31 bytes).
EXAMPLES
Example 1 Printing a list of utilities specified by files
If file1, file2, and file3 each contain a sorted list of utilities, the command
example% comm -23 file1 file2 | comm -23 - file3
prints a list of utilities in file1 not specified by either of the other files. The entry:
example% comm -12 file1 file2 | comm -12 - file3
prints a list of utilities specified by all three files. And the entry:
example% comm -12 file2 file3 | comm -23 - file1
prints a list of utilities specified by both file2 and file3, but not specified in file1.
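The three pipelines can be tried on small sample files. The sketch below uses invented file contents, and the lone - operand stands for standard input as described under OPERANDS:

```shell
tmp=$(mktemp -d)
# Hypothetical sorted lists of utilities.
printf '%s\n' awk cat grep sed > "$tmp/file1"
printf '%s\n' cat cut grep sort > "$tmp/file2"
printf '%s\n' cat grep ls sort > "$tmp/file3"

# In file1 but in neither of the other files.
only1=$(cd "$tmp" && LC_ALL=C comm -23 file1 file2 | LC_ALL=C comm -23 - file3)
# In all three files.
all3=$(cd "$tmp" && LC_ALL=C comm -12 file1 file2 | LC_ALL=C comm -12 - file3)
# In both file2 and file3, but not in file1.
not1=$(cd "$tmp" && LC_ALL=C comm -12 file2 file3 | LC_ALL=C comm -23 - file1)

echo "only in file1:        $(echo $only1)"
echo "in all three:         $(echo $all3)"
echo "file2+file3, not 1:   $(echo $not1)"

rm -rf "$tmp"
```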
ENVIRONMENT VARIABLES
See environ(5) for descriptions of the following environment variables that affect the execution of comm: LANG, LC_ALL, LC_COLLATE,
LC_CTYPE, LC_MESSAGES, and NLSPATH.
EXIT STATUS
The following exit values are returned:
0 All input files were successfully output as specified.
>0 An error occurred.
ATTRIBUTES
See attributes(5) for descriptions of the following attributes:
+-----------------------------+-----------------------------+
|       ATTRIBUTE TYPE        |       ATTRIBUTE VALUE       |
+-----------------------------+-----------------------------+
|Availability                 |SUNWesu                      |
+-----------------------------+-----------------------------+
|CSI                          |enabled                      |
+-----------------------------+-----------------------------+
|Interface Stability          |Standard                     |
+-----------------------------+-----------------------------+
SEE ALSO
cmp(1), diff(1), sort(1), uniq(1), attributes(5), environ(5), largefile(5), standards(5)
SunOS 5.11                         3 Mar 2004                         comm(1)