Remove duplicate rows when >10 based on single column value Post: 302590795

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Remove duplicate rows of a file based on a value of a column

Hi, I am processing a file and would like to delete duplicate records as indicated by one of its columns. e.g.
COL1 COL2 COL3
A 1234 1234
B 3k32 2322
C Xk32 TTT
A NEW XX22
B 3k32 ...

2. Shell Programming and Scripting

how to delete duplicate rows based on last column

Hi, I have a huge amount of data stored in a file. In this file I need to remove duplicate rows in such a way that, where the last column has different data, I check for the greatest value in the last column and print the largest value along with the other entries, but just one of the other duplicate entries is...

3. Shell Programming and Scripting

Remove duplicate line detail based on column one data

My input file:
AVI.out <detail>named as the RRM .</detail>
AVI.out <detail>Contains 1 RRM .</detail>
AR0.out <detail>named as the tellurite-resistance.</detail>
AWG.out <detail>Contains 2 HTH .</detail>
ADV.out <detail>named as the DENR family.</detail>
ADV.out ...

4. Shell Programming and Scripting

duplicate row based on single column

I am a newbie to shell scripting. I have a .csv file with 1000-some rows and about 7 columns, but before I insert this data into a table I have to parse and clean it based on the value of the first column, which is a phone-number-type string. Example below: column 1 ...

5. Shell Programming and Scripting

remove duplicates based on single column

Hello, I am new to shell scripting. I have a huge file with multiple columns for example: I have 5 columns below.
HWUSI-EAS000_29:1:105 + chr5 76654650 AATTGGAA HHHHG
HWUSI-EAS000_29:1:106 + chr5 76654650 AATTGGAA B@HYL
HWUSI-EAS000_29:1:108 + ...

6. Shell Programming and Scripting

Removing duplicate records in a file based on single column

Hi, I want to remove duplicate records including the first line based on column1. For example

inputfile(filer.txt):
-------------
1,3000,5000
1,4000,6000
2,4000,600
2,5000,700
3,60000,4000
4,7000,7777
5,999,8888

expected output:
----------------
3,60000,4000
4,7000,7777...

7. Shell Programming and Scripting

Removing duplicate records in a file based on single column explanation

I was reading this thread. It looks like a simpler way to say this is to only keep uniq lines based on field or column 1. https://www.unix.com/shell-programming-scripting/165717-removing-duplicate-records-file-based-single-column.html Can someone explain this command please? How are there no...

8. UNIX for Dummies Questions & Answers

merging rows into new file based on rows and first column

I have 2 files, file01= 7 columns, row unknown (but few) file02= 7 columns, row unknown (but many) now I want to create an output with the first field that is shared in both of them and then subtract the results from the rest of the fields and print there e.g. file 01 James|0|50|25|10|50|30...

9. Shell Programming and Scripting

Converting Single Column into Multiple rows, but with strings to specific tab column

Dear fellows, I need your help. I'm trying to write a script to convert a single column into multiple rows, but it needs to recognize the beginning of the string and set it to its specific column number. Each line (loop) begins with a digit (RANGE). At this moment it's kind of working, but it...

10. Shell Programming and Scripting

Remove duplicate rows based on one column

Dear members, I need to filter a file based on the 8th column (that is, the id). The other columns do not matter, because I want just one line for each id, removing the duplicate lines based on this id (8th column); it does not matter which duplicate is removed. Example of my file...
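
The threads listed above all circle around the same idea: keep one line per distinct value of a chosen column. A minimal sketch of that idea follows, assuming the first line seen for each key is the one to keep. The function name `dedupe_by_column` and its parameters are made up for illustration and do not come from any of the threads.

```python
def dedupe_by_column(lines, column, sep=None):
    """Keep the first line seen for each value in the given 1-based column.

    sep=None splits on runs of whitespace; pass "," for CSV-like input.
    """
    seen = set()
    kept = []
    for line in lines:
        fields = line.split(sep)
        if len(fields) < column:
            # Assumption: lines with too few fields pass through unchanged.
            kept.append(line)
            continue
        key = fields[column - 1]
        if key not in seen:
            seen.add(key)
            kept.append(line)
    return kept


if __name__ == "__main__":
    rows = ["1,3000,5000", "1,4000,6000", "2,4000,600"]
    for row in dedupe_by_column(rows, 1, sep=","):
        print(row)
```

For whitespace-separated input and the 8th column, this should behave much like the common awk idiom `!seen[$8]++`. It does not cover the variants that keep the largest value or drop every line whose key repeats.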

LEARN ABOUT DEBIAN

fdupes

FDUPES(1)						      General Commands Manual							 FDUPES(1)

NAME

       fdupes - finds duplicate files in a given set of directories

SYNOPSIS

       fdupes [ options ] DIRECTORY ...

DESCRIPTION

       Searches  the  given  path for duplicate files. Such files are found by comparing file sizes and MD5 signatures, followed by a byte-by-byte
       comparison.
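
The three-stage comparison described above (size, then MD5, then byte-by-byte) can be sketched in Python. This is an illustration of the general technique only, not fdupes' actual implementation. The names `md5_of` and `find_duplicates` are hypothetical.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths):
    """Group paths with identical content: size, then MD5, then bytes."""
    by_size = defaultdict(list)
    for path in paths:
        by_size[os.path.getsize(path)].append(path)

    groups = []
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue  # a file with a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in same_size:
            by_hash[md5_of(path)].append(path)
        for candidates in by_hash.values():
            # Final byte-by-byte check guards against hash collisions.
            remaining = list(candidates)
            while remaining:
                first, rest = remaining[0], remaining[1:]
                same = [first] + [
                    p for p in rest if filecmp.cmp(first, p, shallow=False)
                ]
                if len(same) > 1:
                    groups.append(same)
                remaining = [p for p in rest if p not in same]
    return groups
```

Comparing sizes first is cheap and rules out most files before any content is read. The hash narrows the candidates further, and the final comparison confirms the match.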

OPTIONS

       -r --recurse
	      for every directory given follow subdirectories encountered within

       -R --recurse:
	      for each directory given after this option follow subdirectories encountered within (note the ':' at the	end  of  option;  see  the
	      Examples section below for further explanation)

       -s --symlinks
	      follow symlinked directories

       -H --hardlinks
	      normally, when two or more files point to the same disk area they are treated as non-duplicates; this option will change this
	      behavior

       -n --noempty
	      exclude zero-length files from consideration

       -f --omitfirst
	      omit the first file in each set of matches

       -A --nohidden
	      exclude hidden files from consideration

       -1 --sameline
	      list each set of matches on a single line

       -S --size
	      show size of duplicate files

       -m --summarize
	      summarize duplicate files information

       -q --quiet
	      hide progress indicator

       -d --delete
	      prompt user for files to preserve, deleting all others (see CAVEATS below)

       -N --noprompt
	      when used together with --delete, preserve the first file in each set of duplicates and delete the others without prompting the user

       -v --version
	      display fdupes version

       -h --help
	      displays help

SEE ALSO

       md5sum(1)

NOTES

       Unless -1 or --sameline is specified, duplicate files are listed together in groups, each file displayed on a separate line. The groups are
       then separated from each other by blank lines.

       When -1 or --sameline is specified, spaces and backslash characters (\) appearing in a filename are preceded by a backslash character.
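
The default layout described in these notes (one file per line, groups separated by blank lines) is easy to consume from a script. The sketch below assumes output captured without -1, -S or -m. The function name `parse_groups` and the sample paths are hypothetical.

```python
def parse_groups(output):
    """Split blank-line-separated output into lists of paths."""
    groups, current = [], []
    for line in output.splitlines():
        if line:
            current.append(line)
        elif current:
            # A blank line closes the group collected so far.
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


if __name__ == "__main__":
    sample = "a/x.txt\nb/x.txt\n\nc/y.txt\nd/y.txt\n\n"
    for group in parse_groups(sample):
        print(group)
```

Each inner list is one set of duplicates. File names that contain newlines would confuse this parser.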

EXAMPLES

       fdupes a --recurse: b
	      will follow subdirectories under b, but not those under a.

       fdupes a --recurse b
	      will follow subdirectories under both a and b.

CAVEATS

       If  fdupes  returns  with  an error message such as fdupes: error invoking md5sum it means the program has been compiled to use an external
       program to calculate MD5 signatures (otherwise, fdupes uses internal routines for this purpose), and an error has occurred while attempting
       to execute it. If this is the case, the specified program should be properly installed prior to running fdupes.

       When using -d or --delete, care should be taken to insure against accidental data loss.

       When used together with options -s or --symlinks, a user could accidentally preserve a symlink while deleting the file it points to.

       Furthermore, when specifying a particular directory more than once, all files within that directory will be listed as their own duplicates,
       leading to data loss should a user preserve a file without its "duplicate" (the file itself!).

AUTHOR

       Adrian Lopez <adrian2@caribe.net>

																	 FDUPES(1)
