07-16-2010
Quote:
Originally Posted by rdcwayx
I guess there are no duplicate lines within the same file.
2 means the line comes from 2 files, 3 means it comes from 3 files.
[...]
Just to clarify: I wrote all that code only because of this requirement:
Quote:
Getting which files have the common-line would be nice too.
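A minimal sketch of that approach (my own reconstruction for illustration, not necessarily the exact code posted in the thread; the filenames are placeholders): record, for every line, the list of files it appears in, then report lines seen in more than one file together with those filenames.

awk '{
    # remember each filename only once per file/line combination
    if (!seen[FILENAME, $0]++)
        rec[$0] = rec[$0] ? rec[$0] "/" FILENAME : FILENAME
}
END {
    for (R in rec) {
        n = split(rec[R], t, "/")
        if (n > 1)                  # n = number of files containing the line
            printf "%s\t(in %d files: %s)\n", R, n, rec[R]
    }
}' file1 file2 file3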
10 More Discussions You Might Find Interesting
1. Shell Programming and Scripting
Hi,
I have one situation: I have some 6-7 files in one directory, and I have to extract all the lines which exist in all of these files. That means I need to extract all common lines from all these files and put them in a separate file.
Please help. I know it could be done with the help of... (11 Replies)
Discussion started by: The Observer
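A hedged sketch for this kind of request (the filenames and output file are illustrative): count how many distinct files each line appears in, and keep the lines whose count equals the number of input files.

awk '
!seen[FILENAME, $0]++ { cnt[$0]++ }   # count each line at most once per file
END {
    for (line in cnt)
        if (cnt[line] == ARGC - 1)    # ARGC-1 = number of input files
            print line
}' file1 file2 file3 file4 file5 file6 > common_lines.txt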
2. Shell Programming and Scripting
Hello guys,
I need a script to get the common lines from two files, with the criterion that if the first two columns match, I keep the maximum value of the 3rd column (tab-separated columns).
Sample input:
file1:
111 222 0.1
333 444 0.5
555 666 0.4
file2:
111 222 0.7
555 666... (5 Replies)
Discussion started by: jaysean
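A hedged sketch (tab-separated input assumed, and assuming each two-column key occurs at most once per file): load file1 into an array keyed on the first two columns, then for keys that also appear in file2 print the larger third column.

awk 'BEGIN { FS = OFS = "\t" }
NR == FNR { v[$1 FS $2] = $3; next }        # file1: value per key
($1 FS $2) in v {                           # key present in both files
    k = $1 FS $2
    print $1, $2, ($3 > v[k] ? $3 : v[k])   # keep the maximum value
}' file1 file2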
3. Shell Programming and Scripting
Hello guys,
I need a script to get the common lines from two files, with the criterion that if the first two columns match, I keep the maximum value of the 5th column (tab-separated columns). The 3rd and 4th columns correspond to the row which has the highest value in the 5th column.
Sample... (2 Replies)
Discussion started by: jaysean
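A hedged variant of the sketch above for this follow-up thread (tab-separated input assumed): the comparison moves to the 5th column and the entire winning row is kept, so the 3rd and 4th columns travel with it.

awk 'BEGIN { FS = OFS = "\t" }
NR == FNR {                                 # file1: best row per key
    k = $1 FS $2
    if (!(k in v) || $5 > v[k]) { v[k] = $5; row[k] = $0 }
    next
}
{ k = $1 FS $2 }
k in v {                                    # key present in both files
    keep[k] = 1
    if ($5 > v[k]) { v[k] = $5; row[k] = $0 }
}
END { for (k in keep) print row[k] }' file1 file2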
4. Shell Programming and Scripting
I've been a Unix admin for nearly 30 years and never learned AWK. I've seen several similar posts here, but haven't been able to adapt the answers to my situation. AWK is so damn cryptic! ;)
I have a single file with ~900 lines (CSV list). Each line starts with an ID, but with different stuff... (6 Replies)
Discussion started by: protosd
5. Shell Programming and Scripting
Hello everyone
A few years ago the user radoulov posted a fancy solution to a problem about finding common lines (gene variation names) between multiple samples (files). The code was:
awk 'END {
    for (R in rec) {
        n = split(rec[R], t, "/")
        if (n > 1)
            dup[n] = dup[n] ?... (5 Replies)
Discussion started by: bibb
6. Shell Programming and Scripting
I have this code
awk 'NR==FNR{a[$1]=$1;next} a[$1]' file1 file2
which does what I need it to do, but for only two files. I want to make it so that I can have multiple files (for example 30) and the code will return only the items that are in every single one of those files and ignore the ones... (7 Replies)
Discussion started by: castrojc
7. UNIX for Dummies Questions & Answers
Thanks everyone. I got that problem solved.
I need a bit more help here. (Yes, UNIX definitely seems to be fun and useful, and I WILL eventually learn it for myself. But I am now on a different project and don't really have time to go through all the basics. So, I will really appreciate some... (6 Replies)
Discussion started by: latsyrc
8. Shell Programming and Scripting
Hi
I have a file like
1 2
1 2 3
1 5 6
11 12
10 2
7 5
17 12
I would like to have an output as
1 2 3 5 6 10 7
11 12 17
any help would be highly appreciated
Thanks (4 Replies)
Discussion started by: Harrisham
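This is the classic "connected components" merge. A hedged sketch using a small union-find in awk (my own construction, not from the thread; the order of groups and of members within a group is not guaranteed to match the example):

awk '
function find(x) {
    while (parent[x] != x)
        x = parent[x] = parent[parent[x]]   # path halving
    return x
}
function union(a, b,    ra, rb) {
    ra = find(a); rb = find(b)
    if (ra != rb) parent[ra] = rb           # merge the two sets
}
{
    for (i = 1; i <= NF; i++) {
        if (!($i in parent)) parent[$i] = $i   # new element: its own root
        union($1, $i)                          # link all fields on the line
    }
}
END {
    for (x in parent) {                        # collect members per root
        r = find(x)
        groups[r] = (r in groups) ? groups[r] " " x : x
    }
    for (r in groups) print groups[r]
}' file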
9. Shell Programming and Scripting
Hello,
I have a file with 2 columns (tableName, ColumnName) delimited by a pipe, like below. The file is sorted by ColumnName.
Table1|Column1
Table2|Column1
Table5|Column1
Table3|Column2
Table2|Column2
Table4|Column3
Table2|Column3
Table2|Column4
Table5|Column4
Table2|Column5
From... (6 Replies)
Discussion started by: nv186000
10. Shell Programming and Scripting
Could it be possible to find common lines between all of the files in one folder, just like comm -12? So all of the files, two at a time. I would like all of the outcomes to be written to different files, and the file names could simply be numbers: 1, 2, 3, etc. All of the file names contain... (19 Replies)
Discussion started by: Eve
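A hedged sketch of the pairwise approach (bash; assumes the files are already sorted, as comm(1) requires, and writes the numbered results into a separate directory so reruns do not pick them up as input):

#!/bin/bash
files=( * )                                 # every file in the folder
mkdir -p results
n=1
for (( i = 0; i < ${#files[@]}; i++ )); do
    for (( j = i + 1; j < ${#files[@]}; j++ )); do
        comm -12 "${files[i]}" "${files[j]}" > "results/$n"   # common lines of this pair
        n=$(( n + 1 ))
    done
done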
FDUPES(1) General Commands Manual FDUPES(1)
NAME
fdupes - finds duplicate files in a given set of directories
SYNOPSIS
fdupes [ options ] DIRECTORY ...
DESCRIPTION
Searches the given path for duplicate files. Such files are found by comparing file sizes and MD5 signatures, followed by a byte-by-byte
comparison.
OPTIONS
-r --recurse
include files residing in subdirectories
-s --symlinks
follow symlinked directories
-H --hardlinks
normally, when two or more files point to the same disk area they are treated as non-duplicates; this option will change this behavior
-n --noempty
exclude zero-length files from consideration
-f --omitfirst
omit the first file in each set of matches
-1 --sameline
list each set of matches on a single line
-S --size
show size of duplicate files
-q --quiet
hide progress indicator
-d --delete
prompt user for files to preserve, deleting all others (see CAVEATS below)
-v --version
display fdupes version
-h --help
displays help
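For illustration (this example is not part of the original manual page; the directory name is a placeholder), a typical invocation that recurses into subdirectories and shows the size of each duplicate set:
       fdupes -r -S ~/photos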
SEE ALSO
md5sum(1)
NOTES
Unless -1 or --sameline is specified, duplicate files are listed together in groups, each file displayed on a separate line. The groups are
then separated from each other by blank lines.
When -1 or --sameline is specified, spaces and backslash characters (\) appearing in a filename are preceded by a backslash character.
CAVEATS
If fdupes returns with an error message such as fdupes: error invoking md5sum it means the program has been compiled to use an external
program to calculate MD5 signatures (otherwise, fdupes uses internal routines for this purpose), and an error has occurred while attempting
to execute it. If this is the case, the specified program should be properly installed prior to running fdupes.
When using -d or --delete, care should be taken to guard against accidental data loss.
When used together with options -s or --symlinks, a user could accidentally preserve a symlink while deleting the file it points to.
Furthermore, when specifying a particular directory more than once, all files within that directory will be listed as their own duplicates,
leading to data loss should a user preserve a file without its "duplicate" (the file itself!).
AUTHOR
Adrian Lopez <adrian2@caribe.net>
FDUPES(1)