06-24-2010
compare the similar files
I have many pairs of files that differ only slightly: extra spaces, extra empty lines, and some unreadable characters.
If I list them with the "diff" command, I see a great many differences.
So I'd like to write a script to compare each pair of files: if 95% of the contents are the same, I will treat them as similar.
Any suggestions?
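One possible approach, offered as a rough sketch rather than a finished solution: normalize both files first (drop unreadable bytes, collapse whitespace, delete empty lines), then let diff count the lines that still differ and turn that into a percentage. The function names normalize and similarity are made up for illustration, "same" is measured per line here (an assumption about what 95% should mean), and mktemp is assumed to be available on the system.

```shell
#!/bin/sh
# Hypothetical helpers; the names normalize and similarity are illustrative.

# normalize FILE: delete bytes other than tab, newline, and printable ASCII,
# collapse runs of whitespace to one space, trim line ends, drop empty lines.
normalize() {
    LC_ALL=C tr -cd '\11\12\40-\176' < "$1" |
        sed -e 's/[[:space:]][[:space:]]*/ /g' -e 's/^ //' -e 's/ $//' -e '/^$/d'
}

# similarity FILE1 FILE2: print the integer percentage of normalized lines
# that diff does not flag as belonging to one side only.
similarity() {
    a=$(mktemp) && b=$(mktemp) || return 2
    normalize "$1" > "$a"
    normalize "$2" > "$b"
    total=$(( $(wc -l < "$a") + $(wc -l < "$b") ))
    # In default diff output, "<" and ">" lines exist in only one file.
    changed=$(diff "$a" "$b" | grep -c '^[<>]' || true)
    rm -f "$a" "$b"
    if [ "$total" -eq 0 ]; then
        echo 100        # two empty files count as identical
        return 0
    fi
    echo $(( (total - changed) * 100 / total ))
}
```

A pair could then be tested with something like `[ "$(similarity old.txt new.txt)" -ge 95 ] && echo similar`, where old.txt and new.txt stand in for the real file names. The percentage is truncated to an integer, so a stricter threshold may need finer arithmetic.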
cmp(1) General Commands Manual cmp(1)
NAME
cmp - Compares two files
SYNOPSIS
cmp [-l | -s] file1 file2
STANDARDS
Interfaces documented on this reference page conform to industry standards as follows:
cmp:XCU5.0
Refer to the standards(5) reference page for more information about industry standards and associated tags.
OPTIONS
-l  Prints the byte number (decimal) and the differing bytes (octal) for each difference.
-s  Does not print data for differing files; returns only an exit value.
OPERANDS
file1  The path name of a file to be compared.
file2  The path name of a file to be compared.
DESCRIPTION
The cmp command compares two files.
If file1 or file2 is - (dash), standard input is used for that file. It is an error to specify - for both files.
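Because one operand may be standard input, a file can be filtered before it is compared. The sketch below assumes the only unwanted difference is carriage-return characters (for example, a DOS-format copy of a UNIX file); the function name same_ignoring_cr is hypothetical.

```shell
#!/bin/sh
# same_ignoring_cr FILE1 FILE2 (hypothetical name): succeed when FILE1,
# with every carriage return removed, is byte-for-byte equal to FILE2.
same_ignoring_cr() {
    # The filtered copy of FILE1 reaches cmp on standard input ("-").
    tr -d '\r' < "$1" | cmp -s - "$2"
}
```

The function's exit status is that of cmp, so it can be used directly in an if statement.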
By default, the cmp command prints no information if the files are the same. If the files differ, cmp prints the byte and line number
where the difference occurred.
The cmp command also specifies whether one file is an initial subsequence of the other (that is, if the cmp command reads an End-of-File
character in one file before finding any differences). Usually, you use the cmp command to compare nontext files and the diff command to
compare text files.
Note that bytes and lines reported by cmp are numbered from 1.
EXIT STATUS
The following exit values are returned:
0   The files are identical.
1   The files differ. This includes files of different lengths that are identical in the first part of both files.
>1  An error occurred.
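In a script, the exit value is usually all that is needed. cmp returns 0 for identical files, 1 for differing files, and a value greater than 1 on error. The following sketch maps those values to words; the function name same_bytes and the status variable are illustrative only.

```shell
#!/bin/sh
# same_bytes FILE1 FILE2 (hypothetical name): print how cmp classified the pair.
same_bytes() {
    status=0
    # -s suppresses the report on differing files; only the exit value matters.
    cmp -s "$1" "$2" 2>/dev/null || status=$?
    case $status in
        0) echo identical ;;
        1) echo different ;;
        *) echo error ;;
    esac
}
```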
EXAMPLES
To determine whether two files are identical, enter:
cmp prog.o.bak prog.o
The preceding command compares the files prog.o.bak and prog.o. If the files are identical, a message is not displayed. If the files differ, the location of the first difference is displayed. For instance:
prog.o.bak prog.o differ: byte 5, line 1
If the message cmp: EOF on prog.o.bak is displayed, then the first part of prog.o is identical to prog.o.bak, but there is additional data in prog.o.
If the message cmp: EOF on prog.o is displayed, it is prog.o.bak that is the same as prog.o but also contains additional data.
To display each pair of bytes that differ, enter:
cmp -l prog.o.bak prog.o
This compares the files and then displays the byte number (in decimal) and the differing bytes (in octal) for each difference. For example, if the fifth byte is octal 101 in prog.o.bak and 141 in prog.o, then the cmp command displays:
5 101 141
.
.
.
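Since cmp -l prints one line per differing byte, counting those lines gives a crude byte-level similarity figure. This is only a sketch under stated assumptions: the function name byte_similarity is hypothetical, and bytes present in only the longer file are counted as differences.

```shell
#!/bin/sh
# byte_similarity FILE1 FILE2 (hypothetical name): print the integer
# percentage of byte positions, measured against the longer file, that match.
byte_similarity() {
    size1=$(( $(wc -c < "$1") ))
    size2=$(( $(wc -c < "$2") ))
    if [ "$size1" -gt "$size2" ]; then
        max=$size1; min=$size2
    else
        max=$size2; min=$size1
    fi
    if [ "$max" -eq 0 ]; then
        echo 100        # two empty files count as identical
        return 0
    fi
    # One output line per differing byte within the common length;
    # the EOF notice for a shorter file goes to standard error.
    differing=$(cmp -l "$1" "$2" 2>/dev/null | wc -l)
    echo $(( (min - differing) * 100 / max ))
}
```

A single inserted or deleted byte shifts every later position, so two text files that differ by one extra space can score very low with this measure. For that kind of difference a line-based comparison with diff is usually the better fit.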
ENVIRONMENT VARIABLES
The following environment variables affect the execution of cmp:
LANG         Provides a default value for the internationalization variables that are unset or null. If LANG is unset or null, the corresponding value from the default locale is used. If any of the internationalization variables contain an invalid setting, the utility behaves as if none of the variables had been defined.
LC_ALL       If set to a non-empty string value, overrides the values of all the other internationalization variables.
LC_CTYPE     Determines the locale for the interpretation of sequences of bytes of text data as characters (for example, single-byte as opposed to multibyte characters in arguments).
LC_MESSAGES  Determines the locale for the format and contents of diagnostic messages written to standard error.
NLSPATH      Determines the location of message catalogues for the processing of LC_MESSAGES.
SEE ALSO
Commands: comm(1), bdiff(1), diff(1), diff3(1), sdiff(1)
Standards: standards(5)
cmp(1)