Fastest way to delete duplicates from a large filelist
Post 302537465 by alister on Friday 8th of July 2011 09:26:54 AM
Perhaps this may be of use:
Code:
$ cat filenames
no1
no2
$ cat paths
yes0
/path/to/file/yes1
/path/to/file/yes2
/path/to/file/no1
/path/to/file/no2
/path/to/file/no2/yes3
$ awk -F/ 'FNR==NR {fn[$0]; next} !($NF in fn)' filenames paths
yes0
/path/to/file/yes1
/path/to/file/yes2
/path/to/file/no2/yes3
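
For readers who want the one-liner unpacked, here is a minimal sketch of the same awk program with comments, wrapped in a small shell function. The function name `filter_paths` and the scratch directory are illustrative assumptions, not part of the original post; the sample data is the same as above.

```shell
#!/bin/sh
# filter_paths is a hypothetical helper name used only for this sketch.
#   $1: file of base names to drop, one per line
#   $2: file of paths to filter, one per line
filter_paths() {
    awk -F/ '
        # While reading the first file (FNR == NR), store each whole
        # line as an array key, then skip to the next line.
        FNR == NR { fn[$0]; next }
        # While reading the second file, $NF is the text after the
        # last "/". Print the line only if that name was not stored.
        !($NF in fn)
    ' "$1" "$2"
}

# Recreate the sample data in a scratch directory (an assumption of
# this sketch; any two files with one entry per line will do).
tmpdir=$(mktemp -d)
printf '%s\n' no1 no2 > "$tmpdir/filenames"
printf '%s\n' yes0 /path/to/file/yes1 /path/to/file/yes2 \
    /path/to/file/no1 /path/to/file/no2 \
    /path/to/file/no2/yes3 > "$tmpdir/paths"

result=$(filter_paths "$tmpdir/filenames" "$tmpdir/paths")
printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Only the names from the first file are held in memory; the path list is streamed line by line. One caveat: if the first file is empty, `FNR == NR` stays true while the second file is read, so nothing is printed.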

Regards,
Alister