I'm not aware of any package that does this. How about this script to find identical files:
Code:
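#!/bin/bash
# Report pairs of identical files below a directory (needs bash for arrays).
if [ $# -ne 1 ]
then
    echo "usage: $0 <directory>" >&2
    exit 1
fi
# Group non-empty files by size (field 7 of find -ls); print only sizes seen more than once.
# Limitation: $11 holds only the first word of a name, so names containing whitespace are not handled.
find "$1" -type f -ls | awk '$7 > 0 { if($7 in sizes) { sizes[$7]=sizes[$7] SUBSEP $11; dup[$7]++} else sizes[$7]=$11 } END {for(i in dup) print sizes[i] }' | while read
do
    # Split the SUBSEP-separated (octal 034) list into array F, without glob expansion
    IFS=$(printf '\034') read -r -a F <<< "$REPLY"
    i=0
    while [ $i -lt ${#F[@]} ]
    do
        j=$((i+1))
        while [ $j -lt ${#F[@]} ]
        do
            # cmp -s is silent; exit status 0 means the two files are identical
            if cmp -s "${F[i]}" "${F[j]}"
            then
                echo "${F[i]} and ${F[j]} are identical"
            fi
            j=$((j+1))
        done
        i=$((i+1))
    done
done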
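It runs find dir -type f -ls and stores each filename in an awk array indexed by file size. When another file with the same size turns up, its name is appended to that entry with SUBSEP as the separator, and the size is recorded in the dup array. At the end, the names for each duplicated size are printed as one SUBSEP-separated list per line.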
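To see what that first stage produces on its own, here is a small sketch of just the awk part. group_by_size is a name I made up for illustration, the sample listing is invented, and "|" stands in for SUBSEP only so the result is readable on a terminal.

```shell
# group_by_size (hypothetical helper): reads lines in "find -ls" layout
# (size in field 7, name in field 11) and prints one line per duplicated
# size, with the names joined by "|" instead of SUBSEP.
group_by_size() {
    awk '$7 > 0 {
        if ($7 in sizes) { sizes[$7] = sizes[$7] "|" $11; dup[$7]++ }
        else sizes[$7] = $11
    } END { for (i in dup) print sizes[i] }'
}

# Invented sample listing: two files of 120 bytes, one of 300, one empty.
group_by_size <<'EOF'
11 4 -rw-r--r-- 1 user group 120 Mar 28 19:20 ./a.txt
12 4 -rw-r--r-- 1 user group 300 Mar 28 19:20 ./b.txt
13 4 -rw-r--r-- 1 user group 120 Mar 28 19:20 ./c.txt
14 0 -rw-r--r-- 1 user group 0 Mar 28 19:20 ./empty
EOF
# prints: ./a.txt|./c.txt
```

Only the 120-byte pair survives: the 300-byte file has no partner and the empty file is skipped by the $7 > 0 test.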
Each list is read into the F array, and cmp with the -s flag (i.e. no output, just exit status) is used to compare the files pairwise. cmp stops comparing as soon as it finds the first difference, which is usually faster than calculating a CRC for every file.
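A quick way to convince yourself of the cmp -s behaviour is a throwaway test like the one below. same_content is just an illustrative wrapper name, and the temporary files are created only for the demonstration.

```shell
# same_content (hypothetical helper): succeeds when both files have the
# same bytes. cmp -s prints nothing and reports the result via exit status.
same_content() {
    cmp -s "$1" "$2"
}

tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/a"
printf 'hello\n' > "$tmp/b"
printf 'hellp\n' > "$tmp/c"   # same size as a, different content

if same_content "$tmp/a" "$tmp/b"; then echo "a and b are identical"; fi
if ! same_content "$tmp/a" "$tmp/c"; then echo "a and c differ"; fi
rm -rf "$tmp"
```

Files a and c have the same size, so they would land in the same list; it is the cmp call that tells them apart.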
Note: if files A, B and C are identical, you will get this output:
A and B are identical
A and C are identical
B and C are identical
This can be fixed with more logic, but I don't consider it an issue.
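For anyone who does want one line per set, the extra logic could look roughly like this. It is only a sketch: report_groups is a made-up name, it expects the files of one same-size list as arguments, and it uses plain positional parameters so it does not depend on arrays.

```shell
# report_groups (hypothetical helper): takes candidate files as arguments
# and prints each set of identical files once.
report_groups() {
    while [ $# -gt 0 ]; do
        first=$1; shift
        group=$first
        n=$#
        # Walk the remaining files once: matches join the group and are
        # dropped, non-matches are re-queued at the end of "$@".
        while [ "$n" -gt 0 ]; do
            f=$1; shift
            if cmp -s "$first" "$f"; then
                group="$group, $f"
            else
                set -- "$@" "$f"
            fi
            n=$((n - 1))
        done
        # Only report sets with at least two members
        [ "$group" != "$first" ] && echo "$group are identical"
    done
    return 0
}
```

In the script above, calling report_groups "${F[@]}" in place of the two nested while loops should give a single "A, B, C are identical" line for the example in the note.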
Last edited by Chubler_XL; 03-28-2011 at 07:20 PM..