Awk: Remove Duplicates
Post 302884997 by siramitsharma on Thursday 23rd of January 2014 01:52:46 AM

I have the following code, which removes duplicate records from inputfile based on the first eight fields. The first awk appends the duplicate records to the Duplicates file. The second awk writes the non-duplicate entries of inputfile to a tmp file, and mv then replaces the original file with it.

Requirement:
Can both awk calls be combined into a single call, or is there a more efficient way to do the same thing?

Code:
awk -F, 'dupentries[$1,$2,$3,$4,$5,$6,$7,$8]++' inputfile >> Duplicates
awk -F, '!dupentries[$1,$2,$3,$4,$5,$6,$7,$8]++' inputfile > inputfile.tmp
mv inputfile.tmp inputfile
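
One way the two passes might be combined, as a minimal sketch rather than a definitive answer: a single awk call keeps one array and routes each record to one of two output files. The file names and the eight-field key come from the code above; the sample data, the scratch directory, and the names seen, dup and keep are assumptions made for the demonstration.

```shell
#!/bin/sh
# Work in a scratch directory so the sketch does not touch real data.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: lines 1 and 3 share the same fields 1-8.
cat > inputfile <<'EOF'
a,b,c,d,e,f,g,h,first
x,b,c,d,e,f,g,h,other
a,b,c,d,e,f,g,h,second
EOF

# Single pass over inputfile:
#  - a record whose 8-field key was already seen is appended to Duplicates
#  - the first occurrence of each key is written to inputfile.tmp
# The BEGIN rule creates inputfile.tmp even when the input is empty,
# so the mv that follows still has a file to move.
awk -F, -v dup=Duplicates -v keep=inputfile.tmp '
    BEGIN { printf "" > keep }
    seen[$1,$2,$3,$4,$5,$6,$7,$8]++ { print >> dup; next }
    { print > keep }
' inputfile && mv inputfile.tmp inputfile

cat inputfile
cat Duplicates
```

As in the original two-step version, Duplicates is opened in append mode, so it keeps growing across runs. The mv only happens if awk exits successfully, which leaves inputfile untouched on failure.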

 
