Finding duplicate lines and deleting folders based on them Post: 302253123


Post 302253123 by Jopi on Friday 31st of October 2008 04:05:17 AM

10-31-2008


Finding duplicate lines and deleting folders based on them

Hi,

I have research data organized into 100 folders numbered 00-99. I have many such sets of 100 folders, for different values of the initial parameters. For some reason, the computer that ran the program to gather the data didn't always create a unique seed for each folder. I anticipated that this could happen, so the seed number is saved in a file called seed.txt.
I need to delete the folders that have duplicate seeds, so that each remaining folder has a unique seed. I've used this kind of command

cat */seed.txt | sort | uniq -c | grep '2 '

to find the duplicate seeds. There are some problems with this command. First, it won't find seeds that appear more than twice. Second, it doesn't tell me which folders contain the duplicate seeds.
How should I proceed from here? I guess I'll have to start learning some AWK. Could I do this by saving the seeds to an array, then looping through the seeds and searching for each one? When a seed is found, I would delete the folder in which it was found and proceed with the next seed.
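
For illustration, here is a rough sketch of how the array idea might look with a shell loop feeding awk. The function name list_duplicate_seed_dirs is made up for this example. It assumes that each seed.txt holds the seed on its first line and that folder names contain no tabs or newlines. It only prints the folders whose seed has already been seen in an earlier folder (in glob order, so 00 is checked before 01, and the first folder with a given seed is kept). It deletes nothing by itself.

```shell
# Hypothetical helper: print every folder under ROOT whose seed
# already appeared in an earlier folder. Nothing is deleted here.
list_duplicate_seed_dirs() {
    root=${1:-.}
    for f in "$root"/*/seed.txt; do
        [ -f "$f" ] || continue          # skip if the glob matched nothing
        seed=$(head -n 1 "$f")           # assumption: seed is on the first line
        # Emit "seed<TAB>folder" so awk can see both values.
        printf '%s\t%s\n' "$seed" "${f%/seed.txt}"
    done | awk -F '\t' 'seen[$1]++ { print $2 }'   # true from the 2nd occurrence on
}

# Usage (review the printed list before deleting anything):
#   list_duplicate_seed_dirs .
#   list_duplicate_seed_dirs . | while IFS= read -r d; do rm -r -- "$d"; done
```

The awk array seen counts how many times each seed has appeared, so a seed that occurs three or more times is handled the same way as one that occurs twice: every folder after the first is listed.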

Thank you for your help.

Jopi


