Finding duplicate lines and deleting folders based on them

10-31-2008


Hi,

I have research data organized into 100 folders numbered 00-99. I have many such sets of 100 folders, one set for each choice of initial parameter values. For some reason, the computer that ran the data-gathering program didn't always create a unique seed for each folder. I anticipated that this could happen, so each folder's seed number is saved in a file called seed.txt.
I need to delete the folders that have duplicate seeds, so that each remaining folder has a unique seed. I've used this kind of command

cat */seed.txt | sort | uniq -c | grep '2 '

to find the duplicate seeds. There are some problems with this command. Firstly, it won't find seeds that appear more than twice. Secondly, it doesn't tell me which folders those duplicate seeds are in.
How should I proceed from here? I guess I'll have to start learning some AWK. Could I do this by saving the seeds to an array, then looping through the folders and looking up each seed? When a seed has already been seen, delete the folder in which it was found and proceed with the next seed.

Thank you for your help.

Jopi
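
The array idea from the question can be sketched roughly as follows. This is only a minimal sketch under some assumptions: the seed is the first line of each seed.txt, folder names contain no tabs or newlines, and the function name list_duplicate_seed_folders is made up for illustration. It keeps the first folder for every seed and prints the later folders that repeat it, so seeds that occur three or more times are covered as well.

```shell
#!/bin/sh
# Hypothetical helper: print every folder whose seed.txt repeats a seed
# already seen in an earlier folder. The first folder with each seed is kept.
list_duplicate_seed_folders() {
    for f in */seed.txt; do
        # Skip the unexpanded pattern if no folder has a seed.txt
        [ -f "$f" ] || continue
        # Emit "seed<TAB>folder"; assumes the seed is the first line of the file
        printf '%s\t%s\n' "$(head -n 1 "$f")" "${f%/seed.txt}"
    done | awk -F '\t' 'seen[$1]++ { print $2 }'   # true from the 2nd occurrence on
}

# Dry run: only lists the folders, nothing is deleted.
list_duplicate_seed_folders

# Once the list looks right, the folders could be removed with, for example:
# list_duplicate_seed_folders | while IFS= read -r d; do rm -r -- "$d"; done
```

Run it from inside one set of 100 folders. The delete loop is left commented out on purpose, so the list can be checked first.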


10-31-2008


Instead of uniq -c, try uniq -d.
For that to work, the input has to be sorted first:
cat file | sort | uniq -d > outputfile
Then open outputfile and insert rmdir at the start of each line (^),
and run the file with ' sh outputfile '.
I hope that is what you wanted.

paresh n doshi
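
Note that uniq -d prints the duplicated seeds themselves, not the folders that hold them, and rmdir only removes empty directories. A possible way to map each duplicated seed back to its folders is sketched below, assuming every seed.txt holds the seed on a line of its own; the function name report_duplicate_seeds is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: for each seed that occurs in more than one seed.txt,
# print the seed followed by the folders that contain it.
report_duplicate_seeds() {
    sort */seed.txt 2>/dev/null | uniq -d |
    while IFS= read -r seed; do
        [ -n "$seed" ] || continue          # ignore empty lines
        printf 'seed %s:' "$seed"
        # -l lists matching files, -x matches whole lines, -F takes the seed literally
        grep -lxF -e "$seed" */seed.txt | while IFS= read -r f; do
            printf ' %s' "${f%/seed.txt}"
        done
        printf '\n'
    done
}

# Report only; nothing is deleted.
report_duplicate_seeds
```

Each output line names one duplicated seed and every folder that uses it, so it is easy to decide which folder to keep before removing the rest with rm -r.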
