Removing duplicates depending on file size
Post 302830547 by krishmaths on Tuesday 9th of July 2013 05:25:08 AM
@Error404, please try the solution below.

cd to the directory that contains the files and run the command below. You can redirect the output to a temporary file.


Code:
ls -l | sort -k9 | awk 'NF>=9{OFS=".";print $5,$9}' | awk -F"." 'NR==1{row=$0;max=$1+0;T=$2;next} {if($2==T){if($1+0>max){max=$1+0;row=$0}}else{print row;row=$0;max=$1+0};T=$2} END{if(NR)print row}'

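The command first lists all the files in the directory and picks out the filename ($9) and the size ($5). Adjust these field numbers if your ls -l output has the filename and size in different positions. Because awk splits on whitespace, a filename containing spaces will be truncated at the first space.

The file size is output as the first field, followed by the filename. I used "." as the output delimiter so that the second awk, which also splits on ".", sees the size as $1 and the part of the filename before its first dot as $2. Files that share that prefix are treated as duplicates, and only the largest one is kept.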
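The same idea can be sketched without parsing ls -l, which also avoids the problem with spaces in filenames. This is only a minimal sketch, not the command above: the function name largest_per_prefix is made up for illustration, the size is taken from wc -c instead of the ls column, and it assumes filenames contain no newlines.

```shell
# Hypothetical helper: print "size.filename" for the largest file of each
# prefix (the part of the name before the first dot) in directory $1.
largest_per_prefix() {
    dir=${1:-.}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue              # skip directories and an unmatched glob
        size=$(wc -c < "$f")                 # size in bytes (may carry leading blanks)
        printf '%s.%s\n' "$((size))" "${f##*/}"
    done |
    awk -F. '
        # keep the first line seen for a prefix, replace it only by a strictly larger one
        !($2 in max) || $1 + 0 > max[$2] { max[$2] = $1 + 0; row[$2] = $0 }
        END { for (p in row) print row[p] }
    ' |
    sort -t. -k2                             # awk array order is unspecified, so sort by name
}
```

Call it as largest_per_prefix tempdir, or with no argument for the current directory. When two files of the same prefix have the same size, the one that comes first in glob order is kept.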

I created the files below in a directory called tempdir (filename, then size in bytes):
Code:
LAJ.g.gif-1.JPEG            4
LAJ.g.gif-2.JPEG           12
LKJFDA01.gf.gif-1.JPEG      0
LKJFDA01.gf.gif-2.JPEG      0
LKJFDA01.gif-3.JPEG         4
OLUSDN.gf.gif-1.JPEG        0


The output was as follows.
Code:
12.LAJ.g.gif-2.JPEG
4.LKJFDA01.gif-3.JPEG
0.OLUSDN.gf.gif-1.JPEG

The first field of each output line is the size, in bytes, of the largest file whose name starts with the second field (LAJ, LKJFDA01, and so on).
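If the goal is to remove the smaller duplicates, the complement of that output is what matters: every file that is not the largest for its prefix. The following is a hedged sketch that only prints those candidates and deletes nothing. The function name smaller_duplicates is hypothetical, sizes come from wc -c, and filenames are assumed to contain no newlines.

```shell
# Hypothetical helper: list every file in directory $1 that is NOT the
# largest one for its prefix (the part of the name before the first dot).
smaller_duplicates() {
    dir=${1:-.}
    for f in "$dir"/*; do
        [ -f "$f" ] || continue              # skip directories and an unmatched glob
        size=$(wc -c < "$f")
        printf '%s.%s\n' "$((size))" "${f##*/}"
    done |
    sort -t. -k2,2 -k1,1nr |                 # group by prefix, largest size first
    awk -F. 'seen[$2]++ { sub(/^[0-9]+\./, ""); print }'   # skip the first of each group, strip the size
}
```

Review the list before passing it to rm. If two files tie for the largest size, one of them is kept and the other is listed.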