List duplicated files into a file


 
# 1  
Old 11-19-2015

Dear All,

I have many files in a directory with names like those below (the groups are separated by blank lines here only to make them easier to tell apart). I want to find the duplicate file names and write them to a new file, one per line. I tried several scripts without success.

Do you have any suggestions?

Code:
2222.00.AAA.AHE.DAT
2222.00.AAA.AHN.DAT
2222.00.AAA.AHZ.DAT

2222.01.BBB.AHE.DAT
2222.02.BBB.AHE.DAT
2222.03.BBB.AHE.DAT
2222.01.BBB.AHN.DAT
2222.02.BBB.AHN.DAT
2222.03.BBB.AHN.DAT
2222.04.BBB.AHN.DAT
2222.01.BBB.AHZ.DAT
2222.02.BBB.AHZ.DAT

2222.00.CCC.AHE.DAT
2222.00.CCC.AHN.DAT
2222.00.CCC.AHZ.DAT

2222.01.DDD.AHE.DAT
2222.02.DDD.AHE.DAT
2222.03.DDD.AHE.DAT
2222.04.DDD.AHE.DAT
2222.01.DDD.AHN.DAT
2222.02.DDD.AHN.DAT
2222.01.DDD.AHZ.DAT
2222.02.DDD.AHZ.DAT
2222.03.DDD.AHZ.DAT

The output should look like this:

Code:
2222.01.BBB.AHE.DAT
2222.02.BBB.AHE.DAT
2222.03.BBB.AHE.DAT
2222.01.BBB.AHN.DAT
2222.02.BBB.AHN.DAT
2222.03.BBB.AHN.DAT
2222.04.BBB.AHN.DAT
2222.01.BBB.AHZ.DAT
2222.02.BBB.AHZ.DAT
2222.01.DDD.AHE.DAT
2222.02.DDD.AHE.DAT
2222.03.DDD.AHE.DAT
2222.04.DDD.AHE.DAT
2222.01.DDD.AHN.DAT
2222.02.DDD.AHN.DAT
2222.01.DDD.AHZ.DAT
2222.02.DDD.AHZ.DAT
2222.03.DDD.AHZ.DAT

# 2  
Old 11-19-2015
You can try something like this:

Code:
 find . | awk -F"/" '{print $NF}' | sort | uniq -d > dupfiles.txt

The output will not be in the required order, so you may need a second sort.

Last edited by mjf; 11-19-2015 at 01:42 PM..
# 3  
Old 11-19-2015
There's not a single duplicate file name in your sample. The file system wouldn't allow it, btw.
# 4  
Old 11-19-2015
Dear RudiC,

You are right; there are no duplicate file names. But the files whose names contain the BBB and DDD strings are actually parts of a single file, so for my purposes they are duplicates. They were created by a conversion program that added sequence numbers to the file names. For example:

The files below are parts of 2222.00.BBB.AHE.DAT:
Code:
2222.01.BBB.AHE.DAT 
2222.02.BBB.AHE.DAT 
2222.03.BBB.AHE.DAT

I want to find files of this kind and write their names to a file as a list.

Thanks
# 5  
Old 11-19-2015
Expressing this a bit differently: the second "field" must be non-zero. Would this be of some use?
Code:
while IFS="." read -r A B C D E REST; do [ 0"$B" -gt 0 ] && printf "%s.%s.%s.%s.%s\n" "$A" "$B" "$C" "$D" "$E"; done < file4
2222.01.BBB.AHE.DAT
2222.02.BBB.AHE.DAT
2222.03.BBB.AHE.DAT
2222.01.BBB.AHN.DAT
2222.02.BBB.AHN.DAT
2222.03.BBB.AHN.DAT
2222.04.BBB.AHN.DAT
2222.01.BBB.AHZ.DAT
2222.02.BBB.AHZ.DAT
2222.01.DDD.AHE.DAT
2222.02.DDD.AHE.DAT
2222.03.DDD.AHE.DAT
2222.04.DDD.AHE.DAT
2222.01.DDD.AHN.DAT
2222.02.DDD.AHN.DAT
2222.01.DDD.AHZ.DAT
2222.02.DDD.AHZ.DAT
2222.03.DDD.AHZ.DAT
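
The same "second field is non-zero" test can also be sketched in awk. This is only an illustration, not part of the original reply: the function name nonzero_seq is hypothetical, and it assumes the list file holds one file name per line, as file4 does above.

```shell
# Hypothetical helper: print the names whose second dot-separated field is non-zero.
# Assumes "$1" is a text file with one file name per line.
nonzero_seq() {
    # "$2 + 0" forces a numeric comparison, so "00" is treated as 0 and "01" as 1.
    # Blank lines have an empty second field, which also evaluates to 0.
    awk -F. '$2 + 0 > 0' "$1"
}
```

Calling nonzero_seq file4 > parts.txt would write the matching names to parts.txt (an assumed output name), keeping them in input order.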

# 6  
Old 11-19-2015
This code works, but is there another way that considers only the BBB.AHE, BBB.AHN, and BBB.AHZ strings? The number of BBB.AHE files (and of the others) shows that those are duplicate files.
Maybe the [ 0"$B" -gt 0 ] part of your script can be modified, but how?

Thanks again
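
One possible reading of this request, offered only as a sketch: treat fields 3 and 4 (for example BBB.AHE) as a key, count how many names share each key, and print the names whose key occurs more than once. The function name list_split_parts is hypothetical, and the sketch assumes that the first field (2222 in the sample) is the same for every name; if it varies, it would need to be added to the key.

```shell
# Hypothetical helper: list names whose fields 3 and 4 (e.g. BBB.AHE) occur
# more than once in the list file "$1". The file is deliberately read twice.
list_split_parts() {
    awk -F. '
        NF < 5    { next }                  # skip blank or malformed lines
                  { key = $3 FS $4 }        # grouping key, e.g. "BBB.AHE"
        NR == FNR { count[key]++; next }    # first pass: count names per key
        count[key] > 1                      # second pass: print repeated keys
    ' "$1" "$1"
}
```

With the sample listing saved in a file, list_split_parts file4 > dupfiles.txt would keep the BBB and DDD names in their original order and drop the AAA and CCC names, because each AAA and CCC key occurs only once.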
# 7  
Old 11-20-2015
As much as I would like to help, I can't as I don't understand what you want. Show meticulously what input becomes what output and describe the algorithm/logics/reasoning behind it.