Remove duplicate files in same directory - Post 302392155 by cmf1985 on Wednesday 3rd of February 2010 12:39:32 PM

coolatt, if I'm reading this right, you want to delete from the backup server, for each xyz:2,* group, all but the most recent file belonging to that group.

I recreated your directory with the following files. Please note the order in which they were created (ls -rt):

1265199975.P6583Q0M174865.ecs,S=623:2,
1265199975.P6583Q0M174865.ecs,S=623:2,F
1265198625.P6233Q0M875762.ecs,S=639:2,S
1265199975.P6583Q0M174865.ecs,S=623:2,S
1265198625.P6233Q0M875762.ecs,S=639:2,FS
1265198625.P6233Q0M875762.ecs,S=639:2,F

I created a script that looks like this ($FILEDIR being the directory on the backup server that holds the files to be checked):

Code:
#!/bin/bash

# Run this from outside $FILEDIR, so the two work files below
# do not show up in the listing themselves.

# List the files oldest-first; the newest file of each group
# then comes last within its group.
ls -rt "$FILEDIR" > filelist.txt

# Strip the :2,<flags> suffix to get one unique group name per line.
awk -F ':2,' '{print $1}' filelist.txt | sort -u > searchbase.txt

# For every group, print all matching files except the last (newest).
# head --lines=-1 needs GNU coreutils; grep -F takes the group name
# literally, since it contains dots and commas.
while read -r line
do
        grep -F "$line" filelist.txt | head --lines=-1
done < searchbase.txt

Please let me know if this outputs the correct files (it does for me).
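
For the sample directory above, the loop should print these four older files, holding back the newest one of each group (1265198625.P6233Q0M875762.ecs,S=639:2,F and 1265199975.P6583Q0M174865.ecs,S=623:2,S):

Code:
1265198625.P6233Q0M875762.ecs,S=639:2,S
1265198625.P6233Q0M875762.ecs,S=639:2,FS
1265199975.P6583Q0M174865.ecs,S=623:2,
1265199975.P6583Q0M174865.ecs,S=623:2,F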

If it does, I reckon simply appending | xargs rm after the head command should delete the older files.
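
One caveat before bolting on the rm: ls prints bare filenames without the leading directory, so a plain | xargs rm only works if the script happens to run inside $FILEDIR. Here is a minimal sketch of the delete step that prepends the path explicitly (assuming, as with these maildir-style names, no whitespace in the filenames):

Code:
# Same loop as above, but removing the older files instead of
# printing them. Prefix each name with $FILEDIR, since
# filelist.txt holds bare filenames.
while read -r line
do
        grep -F "$line" filelist.txt | head --lines=-1 |
        while read -r victim
        do
                rm -- "$FILEDIR/$victim"
        done
done < searchbase.txt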

BTW, the script also works if a group has just one file (say the original xyz:2, file): head --lines=-1 then outputs nothing, so the lone backup is not deleted.
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

remove duplicate files in a directory

Hi ppl. I have to check for duplicate files in a directory. The directory has the following files /the/folder /containing/the/file a1.yyyymmddhhmmss a1.yyyyMMddhhmmss b1.yyyymmddhhmmss b2.yyyymmddhhmmss c.yyyymmddhhmmss d.yyyymmddhhmmss d.yyyymmddhhmmss where the date time stamp can be... (1 Reply)
Discussion started by: asinha63

2. Shell Programming and Scripting

script that detects duplicate files in directory

I need help with a script which accepts one argument and goes through all the files under a directory and prints a list of possible duplicate files. As its output, it prints zero or more lines, each one containing a space-separated list of filenames. All the files listed on one line have the same... (1 Reply)
Discussion started by: trueman82
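For a content-based duplicate check like the one discussion 2 asks for, hashing and grouping is the usual approach. A rough sketch, not that thread's actual solution, assuming the argument is a directory of regular files whose names contain no spaces:

Code:
#!/bin/bash
# Print one line per group of files with identical content:
# hash every file, sort by hash, then collect names per hash.
md5sum "$1"/* | sort |
awk '{names[$1] = names[$1] " " $2; count[$1]++}
     END {for (h in count) if (count[h] > 1) print substr(names[h], 2)}'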

3. Shell Programming and Scripting

remove all duplicate lines from all files in one folder

Hi, is it possible to remove all duplicate lines from all txt files in a specific folder? This is too hard for me, maybe someone could help. Let's say we have an amount of textfiles, 1 or 2 or 3 or... maximum 50; each textfile has lines with text. I want all lines of all textfiles... (8 Replies)
Discussion started by: lowmaster
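For discussion 3, removing duplicate lines from every text file in a folder, the standard awk idiom does it in one pass per file. A sketch, with /path/to/folder as a placeholder and a temp file for the in-place rewrite:

Code:
# Keep only the first occurrence of each line in every .txt file.
for f in /path/to/folder/*.txt
do
        awk '!seen[$0]++' "$f" > "$f.tmp" && mv "$f.tmp" "$f"
done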

4. Shell Programming and Scripting

Remove duplicate files based on text string?

Hi I have been struggling with a script for removing duplicate messages from a shared mailbox. I would like to search for duplicate messages based on the “Message-ID” string within the messages files. I have managed to find the duplicate “Message-ID” strings and (if I would like) delete... (1 Reply)
Discussion started by: spangberg

5. Shell Programming and Scripting

Remove Duplicate Files On Remote Servers

Hello, I wrote a basic script that works, however I was wondering if it could be sped up. I am comparing files over ssh to remove the file from the source server directory if a match occurs. Please advise me on my mistakes. #!/bin/bash for file in `ls /export/home/podcast2/"$1" ` ; do ... (5 Replies)
Discussion started by: jaysunn

6. Shell Programming and Scripting

perl/shell need help to remove duplicate lines from files

Dear All, I have multiple files having a number of records, consisting of more than 10 columns. Some column values are duplicates and I want to remove these duplicate values from these files. Duplicate values may come in different files.... all files lying in a single directory.. Need help to... (3 Replies)
Discussion started by: arvindng

7. Shell Programming and Scripting

[uniq + awk?] How to remove duplicate blocks of lines in files?

Hello again, I am wanting to remove all duplicate blocks of XML code in a file. This is an example: input: <string-array name="threeItems"> <item>item1</item> <item>item2</item> <item>item3</item> </string-array> <string-array name="twoItems"> <item>item1</item> <item>item2</item>... (19 Replies)
Discussion started by: raidzero

8. Shell Programming and Scripting

Remove duplicate files

Hi, In a directory, e.g. ~/corpus is a lot of files and subdirectories. Some of the files are named: 12345___PP___0902___AA.txt 12346___PP___0902___AA.txt 12347___PP___0902___AA.txt The amount of files varies. I need to keep the highest (12347___PP___0902___AA.txt) and remove... (5 Replies)
Discussion started by: corfuitl
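Discussion 8 is a close cousin of this thread's problem, keyed on the numeric prefix rather than on file age. A minimal sketch, assuming the numeric prefixes are all the same width so a plain lexical sort ranks them correctly (head --lines=-1 and xargs -d are GNU extensions):

Code:
# ls sorts lexically, so the highest-numbered file comes last
# and is the one head holds back; the rest are removed.
ls ~/corpus/*___PP___0902___AA.txt | head --lines=-1 | xargs -d '\n' rm --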

9. Windows & DOS: Issues & Discussions

Remove duplicate lines from text files.

So, I have text files, one "fail.txt" and one "color.txt". I now want to use a command line (DOS) to remove ANY line that is PRESENT IN BOTH from each text file. Afterwards there shall be no duplicate lines. (1 Reply)
Discussion started by: pasc
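Discussion 9 asks for DOS, but for comparison, the same operation in a Unix shell is a pair of grep calls; -vxFf drops every whole line that also appears in the other file. A sketch:

Code:
# Remove lines common to both files from each file.
# Both greps run before either file is overwritten.
grep -vxFf color.txt fail.txt > fail.new
grep -vxFf fail.txt color.txt > color.new
mv fail.new fail.txt
mv color.new color.txt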

10. Shell Programming and Scripting

Remove all but newest two files (Not a duplicate post)

TARGET_DIR='/media/andy/MAXTOR_SDB1/Ubuntu_Mate_18.04/'
REGEX='[0-9]{4}-[0-9]{2}-[0-9]{2}_[0-9]{2}:[0-9]{2}' # regular expression that matches: date '+%Y-%m-%d_%H:%M'
LATEST_FILE="$(ls "$TARGET_DIR" | egrep "^${REGEX}$" | tail -1)"
find "$TARGET_DIR" ! -name "$LATEST_FILE" -type f -regextype egrep -regex... (7 Replies)
Discussion started by: drew77