Proceed in two steps:
1. Log the size and filename of each file in a tempfile (removing the path from the filename).
2. Then sort the tempfile and extract the duplicates.
Note that for processing such a number of objects it would be advisable to use a database instead.
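The code from the original post did not survive here, so what follows is only a rough sketch of the two steps, not the poster's actual commands. It assumes POSIX sh, file names without embedded newlines, and that "duplicate" means same size and same base name; list_dupes is a made-up helper name.

```shell
#!/bin/sh
# Sketch only: list_dupes is a hypothetical helper, not from the thread.
# Assumes file names contain no newline characters.
list_dupes() {
    dir=$1
    tmp=$(mktemp) || return 1

    # Step 1: log "size name" for every regular file.
    # ${f##*/} strips the leading path, leaving only the file name.
    find "$dir" -type f -exec sh -c '
        for f do
            size=$(wc -c < "$f" | tr -d " ")
            printf "%s %s\n" "$size" "${f##*/}"
        done' sh {} + > "$tmp"

    # Step 2: sort so that identical entries become adjacent, then
    # print one copy of every entry that occurs more than once.
    sort "$tmp" | uniq -d

    rm -f "$tmp"
}
```

Calling list_dupes /some/dir prints one "size name" line per duplicated entry. Running wc once per file is slow on very large trees; with GNU find, -printf '%s %f\n' would produce the same log in a single process.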
Last edited by ctsgnb; 01-01-2014 at 05:32 PM..
Reason: Remove sed clause (Thx Rudi)
I have files logged in the file system with names in the format: filename_ordernumber_date_time
e.g.:
file_1_12012007_1101.txt
file_2_12022007_1101.txt
file_1_12032007_1101.txt
I need to find out all the files that are logged multiple times with same order number. In the above eg, I... (1 Reply)
I want to duplicate a row when a particular comma-delimited column in that row holds two or more values.
Input
abc,line one,value1
abc,line two, value1, value2
abc,line three,value1
needs to be converted to
abc,line one,value1
abc,line two, value1
abc,line... (8 Replies)
I have a few txt files in a directory and I need to check their sizes one by one. If any of them is larger than 5 MB, I need to split that file in two.
Can someone help?
Thanks. (6 Replies)
Hi
I have been struggling with a script for removing duplicate messages from a shared mailbox.
I would like to search for duplicate messages based on the “Message-ID” string within the messages files.
I have managed to find the duplicate “Message-ID” strings and (if I would like) delete... (1 Reply)
I have several files in a folder and I would like to delete the ones that do not contain all the required information (judged by size), let's say 1 KB.
Any ideas? (4 Replies)
Hi!
I want to find duplicate files (criteria: file size) in my download folder.
I try it like this:
find /Users/frodo/Downloads \! -type d -exec du {} \; | sort > /Users/frodo/Desktop/duplicates_1.txt;
cut -f 1 /Users/frodo/Desktop/duplicates_1.txt | uniq -d | grep -hif -... (9 Replies)
I am new to this forum and this is my first post.
I am looking at an old post with exactly the same name. I cannot paste the URL because I do not have 5 posts.
My requirement is exactly opposite.
I want to get rid of duplicate rows and try to append the values of columns in those rows
... (10 Replies)
Hello Community!
I'm a newbie at shell programming and this is my first post.
I'm trying to write a bash shell script that removes files from a subdirectory.
It is called as: rms -{g|l|b} size1 dir
-g means: remove the file or files in dir that are above size1
-l means: remove file or files in dir that... (1 Reply)
Hi,
In a file, I have to mark duplicate records as 'D' and the latest record alone as 'C'.
In the below file, I have to identify if duplicate records are there or not based on Man_ID, Man_DT, Ship_ID and I have to mark the record with latest Ship_DT as "C" and other as "D" (I have to create... (7 Replies)
Gents,
I have a file like this.
1 1
1 2
2 3
2 4
2 5
3 6
3 7
4 8
5 9
I would like to get something like this:
1 1 2
2 3 4 5
3 6 7
Thanks in advance for your support (8 Replies)
Discussion started by: jiam912
LEARN ABOUT NETBSD
uniq
UNIQ(1) BSD General Commands Manual UNIQ(1)
NAME
uniq -- report or filter out repeated lines in a file
SYNOPSIS
uniq [-cdu] [-f fields] [-s chars] [input_file [output_file]]
DESCRIPTION
The uniq utility reads the standard input comparing adjacent lines, and writes a copy of each unique input line to the standard output. The
second and succeeding copies of identical adjacent input lines are not written. Repeated lines in the input will not be detected if they are
not adjacent, so it may be necessary to sort the files first.
The following options are available:
-c Precede each output line with the count of the number of times the line occurred in the input, followed by a single space.
-d Don't output lines that are not repeated in the input.
-f fields
Ignore the first fields in each input line when doing comparisons. A field is a string of non-blank characters separated from
adjacent fields by blanks. Field numbers are one based, i.e. the first field is field one.
-s chars
Ignore the first chars characters in each input line when doing comparisons. If specified in conjunction with the -f option, the
first chars characters after the first fields fields will be ignored. Character numbers are one based, i.e. the first character is
character one.
-u Don't output lines that are repeated in the input.
If additional arguments are specified on the command line, the first such argument is used as the name of an input file, the second is used
as the name of an output file.
The uniq utility exits 0 on success, and >0 if an error occurs.
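As an illustration of the options described above (not part of the manual page), the following sketch runs uniq on a few invented "size name" lines. The sample helper and its data are hypothetical; the expected results in the comments follow from the adjacency rule in the DESCRIPTION.

```shell
#!/bin/sh
# Hypothetical sample data: "size name" pairs, deliberately unsorted.
sample() {
    printf '%s\n' '120 a.txt' '300 b.txt' '120 a.txt' '120 c.txt'
}

# Unsorted input: the two "120 a.txt" lines are not adjacent,
# so uniq does not detect the repetition.
sample | uniq -d            # prints nothing

# Sorted input: repeated lines are adjacent.
sample | sort | uniq -d     # prints: 120 a.txt
sample | sort | uniq -u     # prints: 120 c.txt, then 300 b.txt

# -f 1 skips the first field (the size) and compares the rest,
# so entries with the same name but different sizes count as repeats.
printf '%s\n' '120 a.txt' '300 a.txt' '50 b.txt' | uniq -f 1 -d
# prints: 120 a.txt
```

With -f or -s, the line that gets printed is the first one of each group, so the order produced by the preceding sort decides which representative appears.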
COMPATIBILITY
The historic +number and -number options have been deprecated but are still supported in this implementation.
SEE ALSO
sort(1)
STANDARDS
The uniq utility is expected to be IEEE Std 1003.2 (``POSIX.2'') compatible.
BSD January 6, 2007 BSD