The UNIX and Linux Forums  
#1  12-09-2005  moxxx68
getting rid of duplicate files

I have a bad problem with multiple occurrences of the same file in
different directories. How this happened I am not sure, but I know
that I can use awk to scan multiple directory trees for occurrences
of the same file. Some of these files differ somewhat, but that does
not matter: the file names are the same and the content is basically
the same.
I have seen an awk script that can be run on the command line, using
syntax along the lines of var=file:r and dup=var++ and var < 1, but I
cannot remember exactly how it works. I am using the C shell.
I need to find the occurrences of var and, if there is more than one,
remove the extras, leaving one occurrence.
Any examples or clues as to how to piece this together would be
appreciated, since I don't use awk that often.
moxxx68
#2  12-09-2005  jim mcnamara (Forum Staff)
Start with something like this to find the file names that actually occur more than once.
Then use the resulting file to look up the paths and get the full file names.
Code:
find /path -type f -exec basename {} \; | awk 'arr[$0]++ == 1' > file
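The second step (going from duplicated names back to full paths) could be sketched roughly as below. This is only an illustration, not something posted in the thread: `list_dup_paths` is a made-up helper name, and the sketch assumes file names contain no newline characters.

```shell
# list_dup_paths DIR (hypothetical helper, not from the thread):
# print the full path of every regular file under DIR whose basename
# occurs more than once. Assumes no newlines in file names.
list_dup_paths() {
    find "$1" -type f -print | awk -F/ '
        # With "/" as the field separator, $NF is the basename.
        { name[NR] = $NF; path[NR] = $0; count[$NF]++ }
        END {
            # Second pass: keep only paths whose basename was seen twice or more.
            for (i = 1; i <= NR; i++)
                if (count[name[i]] > 1) print path[i]
        }'
}
```

Calling `list_dup_paths /path | sort` gives one line per copy, which makes it easier to review what would be touched before anything is moved or removed.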
#3  12-09-2005  moxxx68
Thanks, will try!
moxxx68
#4  12-09-2005  moxxx68
There are too many occurrences to rm manually!
Will this work?

find ./path -iname "basename" | awk 'BEGIN{arr[$0]++}{var=[$0];var > 0; var++}END{i=[var++]}' | xargs mv --target-directory=dup-dir

% rm -r dup-dir

It looks like it would work if I could use an array to count the
occurrences starting with occurrence 1 instead of 0. The surviving
copy could obviously be in any directory, but the way the tree is
configured it doesn't really matter, as long as I have one occurrence left!

I could really use some help.
moxxx68
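One way the "keep one, move the rest" idea might be put together is sketched here. It is an assumption-laden illustration rather than a tested answer from the thread: `move_extra_copies` is a hypothetical name, the holding directory is assumed to live outside the tree being scanned, file names are assumed to contain no newlines, and "first" simply means first in sorted path order. Each moved file gets a numeric suffix so that copies with the same name do not overwrite each other in the holding directory.

```shell
# move_extra_copies SRC DUPDIR (hypothetical helper, not from the thread):
# for each basename found more than once under SRC, leave the first copy
# (in sorted path order) where it is and move every later copy to DUPDIR.
# Assumes DUPDIR is outside SRC and file names contain no newlines.
move_extra_copies() {
    src=$1
    dupdir=$2
    n=0                       # counter used to keep moved names unique
    mkdir -p "$dupdir" || return 1
    find "$src" -type f -print | LC_ALL=C sort |
    awk -F/ 'seen[$NF]++' |   # prints only the 2nd, 3rd, ... path per basename
    while IFS= read -r path; do
        n=$((n + 1))
        # ${path##*/} is the basename; ".$n" avoids clobbering in DUPDIR.
        mv -- "$path" "$dupdir/${path##*/}.$n"
    done
}
```

Something like `move_extra_copies ./path /tmp/dup-dir` followed by a look through `/tmp/dup-dir` would let the extras be checked before a final `rm -r`.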
#5  12-09-2005  moxxx68
find ./path -print -exec basename {} \; | awk -v var=arr 'arr[$0]++;i >= 2; i$var' | xargs mv --target-directory=test..


This worked to a certain extent, although I am getting some error
messages from the cp command, and the basename is common to all
names of the same type (e.g. file.txt-{1,2,3,4}). I am not sure
whether this gives the exact result as far as leaving one file goes,
although when I diffed two test directories against each other the
result seemed really close. Please confirm any correct syntax
used (if any?).
moxxx68
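On the syntax question, a small experiment may help show what the awk program in this post does. As far as I can tell it contains three separate pattern rules: `arr[$0]++` prints a line when it has been seen before, `i >= 2` never matches because `i` is never set, and `i$var` concatenates the empty `i` with `$var` (which is `$0`, since the string "arr" converts to the number 0), so it matches every non-empty line. The function names below are made up for the demonstration.

```shell
# post5_filter: the awk program from the post above, unchanged, wrapped in a
# function (hypothetical name) so it can be fed test input.
post5_filter() {
    awk -v var=arr 'arr[$0]++;i >= 2; i$var'
}

# repeats_only: a single rule that prints only the 2nd, 3rd, ... occurrence
# of each line, which seems to be what the post is aiming for.
repeats_only() {
    awk 'seen[$0]++'
}
```

With the input lines `a`, `b`, `a`, `post5_filter` prints `a`, `b`, `a`, `a` (every line once, plus the repeat a second time), while `repeats_only` prints just `a`. That would be consistent with the pipeline moving every file rather than only the extras, and with error messages for the bare basenames that `-print -exec basename` adds to the stream.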
All times are GMT -4.