01-29-2007
Could it be as simple as
sort -u
?
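A minimal sketch of that approach (the sample file and its contents are invented for illustration):

```shell
# create a small sample file, then print its unique lines, sorted
printf 'apple\nbanana\napple\ncherry\nbanana\n' > fruits.txt
sort -u fruits.txt
# apple
# banana
# cherry
```

Note that sort -u also sorts the output; to drop duplicates while keeping the original line order, the usual alternative is awk '!seen[$0]++' file.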
10 More Discussions You Might Find Interesting
1. UNIX for Dummies Questions & Answers
Hi,
I have a scenario here where I have created a flat file with the below-mentioned information. As you can see, the file is displayed in three columns:
1st column is FileNameString
2nd column is Report_Name (this has spaces)
3rd column is Flag
The result file needed is the removal of duplicate... (1 Reply)
Discussion started by: Student37
1 Reply
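The question is truncated, but assuming whole repeated rows are to be dropped while keeping the original order, a common one-pass sketch (the sample data is invented) is:

```shell
# keep only the first occurrence of each full line, preserving order
cat > flatfile.txt <<'EOF'
fileA.dat Daily Sales Report Y
fileB.dat Weekly Summary N
fileA.dat Daily Sales Report Y
EOF
awk '!seen[$0]++' flatfile.txt
# fileA.dat Daily Sales Report Y
# fileB.dat Weekly Summary N
```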
2. Shell Programming and Scripting
Hi,
Please help!
I have a file having duplicate words in some line and I want to remove the duplicate words.
The order of the words in the output file doesn't matter.
INPUT_FILE
pink_kite red_pen ball pink_kite ball
yellow_flower white no white no
cloud nine_pen pink cloud pink nine_pen... (6 Replies)
Discussion started by: sam_2921
6 Replies
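One way to sketch this in awk, using the thread's own sample data (the seen hash is reset for every line, and the surviving words keep their original order):

```shell
cat > INPUT_FILE <<'EOF'
pink_kite red_pen ball pink_kite ball
yellow_flower white no white no
cloud nine_pen pink cloud pink nine_pen
EOF
# drop repeated words within each line, keeping first occurrences
awk '{
    split("", seen)                  # reset the per-line hash
    out = ""
    for (i = 1; i <= NF; i++)
        if (!seen[$i]++)
            out = out (out == "" ? "" : " ") $i
    print out
}' INPUT_FILE
# pink_kite red_pen ball
# yellow_flower white no
# cloud nine_pen pink
```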
3. Shell Programming and Scripting
I've been working on a script (/bin/sh) for which I have requested and received help here (and I am very grateful for it!). The client has modified their requirements (a tad), so without messing up the script too much, I come once again for assistance.
Here are the file.dat contents:
ABC1... (4 Replies)
Discussion started by: petersf
4 Replies
4. UNIX for Dummies Questions & Answers
Hi All,
I need some help.
I need to retain only the first =035 line if there are two of them before =040; if there is only one, take that one.
Eg:
Input
=035 (ABC)12324141241
=035 (XYZPQR)704124
=040 AB$QS$WEWR
=035 (ABC)08080880809
=035 (XYZPQR)9809314
=040 ... (4 Replies)
Discussion started by: umapearl
4 Replies
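A sketch of one way to do this in awk: print only the first =035 line of each consecutive run, and let any other line reset the run. The second =040 record below is invented to complete the sample.

```shell
cat > records.txt <<'EOF'
=035    (ABC)12324141241
=035    (XYZPQR)704124
=040    AB$QS$WEWR
=035    (ABC)08080880809
=035    (XYZPQR)9809314
=040    CD$EF$GH
EOF
# keep the first =035 of each consecutive run; other lines pass through
awk '/^=035/ { if (!in035) print; in035 = 1; next }
     { in035 = 0; print }' records.txt
# =035    (ABC)12324141241
# =040    AB$QS$WEWR
# =035    (ABC)08080880809
# =040    CD$EF$GH
```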
5. Shell Programming and Scripting
So, I've been working on a project which takes layer 7 metadata from pcap dumps and archives it. However, there is a lot of dataless information that I don't want in my output. I know of ways to produce the output I want from the input file below, but I want a method of doing this, regardless of... (2 Replies)
Discussion started by: eh3civic
2 Replies
6. Shell Programming and Scripting
Hi All,
I'm trying to figure out, using a script, which are the trusted IPs and which are not. I have a file named 'ip-list.txt' which contains some IP addresses and another file named 'trusted-ip-list.txt' which also contains some IP addresses. I want to read a line from... (4 Replies)
Discussion started by: mjavalkar
4 Replies
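Assuming both files hold one IP address per line, grep can split the list without any explicit read loop (the file names follow the thread; the addresses are invented):

```shell
cat > trusted-ip-list.txt <<'EOF'
10.0.0.1
192.168.1.5
EOF
cat > ip-list.txt <<'EOF'
10.0.0.1
172.16.0.9
192.168.1.5
EOF
# -F fixed strings, -x whole-line match, -f read patterns from a file
grep -Fxf trusted-ip-list.txt ip-list.txt     # lines also in the trusted list
grep -Fvxf trusted-ip-list.txt ip-list.txt    # lines not in the trusted list
```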
7. UNIX for Dummies Questions & Answers
I want a command equivalent to
sed 'p;p' file
My input file contains
1
2
3
If I want the output file to look like this:
1
1
1
2
2
2
3
3
3
I use (2 Replies)
Discussion started by: nsuresh316
2 Replies
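One awk equivalent of sed 'p;p' on the thread's sample input:

```shell
printf '1\n2\n3\n' > file
# print every input line three times, like sed 'p;p'
awk '{ for (i = 0; i < 3; i++) print }' file
```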
8. Shell Programming and Scripting
Dear All,
I have file input like this:
INP901 5173 4114
INP902 5227
INP903 5284
INP904 5346
INP905 5400
INP906 5456
INP907 5511
INP908 5572
INP909 5622
INP910 5678
INP911 5739
INP912 5796
INP913 5845
INP914 5910
INP915 5965 (2 Replies)
Discussion started by: attila
2 Replies
9. Shell Programming and Scripting
I have a script that builds a database: ~30 million lines, a ~3.7 GB .csv file. After multiple optimizations it takes about 62 min to bring in and parse all the files, and it used to take 10 min to remove duplicates until I was requested to add another column. I am using the highly optimized awk code:
awk... (34 Replies)
Discussion started by: Michael Stora
34 Replies
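The thread's code isn't shown in full, but the usual one-pass awk idiom for this job is the hash lookup below; keying on selected fields instead of the whole line is a sketch of how an extra column can be left out of the duplicate check (the four-column layout is an assumption for illustration):

```shell
cat > big.csv <<'EOF'
a,1,x,2021
a,1,x,2022
b,2,y,2021
EOF
# dedupe on the whole line: one pass, memory grows with the unique lines
awk '!seen[$0]++' big.csv
# dedupe on columns 1-3 only, ignoring the added fourth column
awk -F, '!seen[$1 FS $2 FS $3]++' big.csv
```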
10. UNIX for Beginners Questions & Answers
Input file:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18 (6 Replies)
Discussion started by: Sagar Singh
6 Replies
LEARN ABOUT DEBIAN
fdupes
FDUPES(1) General Commands Manual FDUPES(1)
NAME
fdupes - finds duplicate files in a given set of directories
SYNOPSIS
fdupes [ options ] DIRECTORY ...
DESCRIPTION
Searches the given path for duplicate files. Such files are found by comparing file sizes and MD5 signatures, followed by a byte-by-byte
comparison.
OPTIONS
-r --recurse
for every directory given follow subdirectories encountered within
-R --recurse:
for each directory given after this option follow subdirectories encountered within (note the ':' at the end of option; see the
Examples section below for further explanation)
-s --symlinks
follow symlinked directories
-H --hardlinks
normally, when two or more files point to the same disk area they are treated as non-duplicates; this option will change this behavior
-n --noempty
exclude zero-length files from consideration
-f --omitfirst
omit the first file in each set of matches
-A --nohidden
exclude hidden files from consideration
-1 --sameline
list each set of matches on a single line
-S --size
show size of duplicate files
-m --summarize
summarize duplicate files information
-q --quiet
hide progress indicator
-d --delete
prompt user for files to preserve, deleting all others (see CAVEATS below)
-N --noprompt
when used together with --delete, preserve the first file in each set of duplicates and delete the others without prompting the user
-v --version
display fdupes version
-h --help
displays help
SEE ALSO
md5sum(1)
NOTES
Unless -1 or --sameline is specified, duplicate files are listed together in groups, each file displayed on a separate line. The groups are
then separated from each other by blank lines.
When -1 or --sameline is specified, spaces and backslash characters (\) appearing in a filename are preceded by a backslash character.
EXAMPLES
fdupes a --recurse: b
will follow subdirectories under b, but not those under a.
fdupes a --recurse b
will follow subdirectories under both a and b.
CAVEATS
If fdupes returns with an error message such as fdupes: error invoking md5sum it means the program has been compiled to use an external
program to calculate MD5 signatures (otherwise, fdupes uses internal routines for this purpose), and an error has occurred while attempting
to execute it. If this is the case, the specified program should be properly installed prior to running fdupes.
When using -d or --delete, care should be taken to insure against accidental data loss.
When used together with options -s or --symlinks, a user could accidentally preserve a symlink while deleting the file it points to.
Furthermore, when specifying a particular directory more than once, all files within that directory will be listed as their own duplicates,
leading to data loss should a user preserve a file without its "duplicate" (the file itself!).
AUTHOR
Adrian Lopez <adrian2@caribe.net>