10-31-2008
Finding duplicate lines and deleting folders based on them
Hi,
I have research data organized into 100 folders numbered 00-99. I have many such sets of 100 folders, one set for each choice of initial parameters. For some reason, the computer that ran the program to gather the data didn't always create a unique seed for each folder. I anticipated that this could happen, so the seed number is saved in a file called seed.txt in each folder.
I need to delete the folders that have duplicate seeds, so that each remaining folder has a unique seed. I've used this kind of command
cat */seed.txt | sort | uniq -c | grep '2 '
to find the duplicate seeds. There are some problems with this command. Firstly, it won't find any seeds that appear more than twice. Secondly, it doesn't tell me which folders contain those duplicate seeds.
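Both problems can be worked around by letting awk count the seeds and remember the file each one came from. The following is only a sketch under some assumptions: the function name list_duplicate_seeds is made up for illustration, each seed.txt is assumed to hold the seed as the first word of its first line, and it is meant to be run from the directory that contains one set of 00-99 folders.

```shell
#!/bin/sh
# Sketch: print every seed that occurs in more than one folder, followed by
# the seed.txt files that contain it.
# Assumption: the seed is the first word on the first line of each seed.txt.
list_duplicate_seeds() {
    awk 'FNR == 1 && NF {
             count[$1]++                        # how many folders use this seed
             files[$1] = files[$1] " " FILENAME # remember where it was seen
         }
         END {
             for (s in count)
                 if (count[s] > 1)              # catches 2, 3, 4, ... occurrences
                     print s ":" files[s]
         }' */seed.txt
}
```

Calling list_duplicate_seeds prints one line per duplicated seed, for example a seed followed by 00/seed.txt 02/seed.txt. The order of the output lines is not defined by awk, so pipe the result through sort if a stable order matters.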
How should I proceed from here? I guess I'll have to start learning some AWK. Could I do this by saving the seeds to an array and looping through the folders, checking each folder's seed against the array? When a seed has already been seen, delete the folder in which it was found and proceed with the next one.
Thank you for your help.
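The array idea described above could be sketched roughly like this. It is not a tested production script: the function names duplicate_seed_dirs and remove_duplicate_seed_dirs are invented for illustration, folder names are assumed to contain no whitespace (true for 00-99), and the choice to keep the first folder that holds a given seed is an assumption, since the post does not say which copy should survive.

```shell
#!/bin/sh
# Sketch: list the folders whose seed was already seen in an earlier folder.
# The first folder holding a given seed is kept; later ones are printed.
duplicate_seed_dirs() {
    for f in */seed.txt; do
        [ -f "$f" ] || continue
        # Emit "seed folder" pairs, one per line.
        printf '%s %s\n' "$(head -n 1 "$f")" "${f%/seed.txt}"
    done | awk 'seen[$1]++ { print $2 }'   # seen[] is the array of known seeds
}

# Dry run by default: only report what would be removed.
# Pass "delete" as the first argument to actually remove the folders.
remove_duplicate_seed_dirs() {
    duplicate_seed_dirs | while IFS= read -r dir; do
        if [ "$1" = "delete" ]; then
            rm -r -- "$dir"
        else
            echo "would remove: $dir"
        fi
    done
}
```

Run remove_duplicate_seed_dirs without arguments first and check the list, then run remove_duplicate_seed_dirs delete from inside each set of 100 folders. The dry-run default is deliberate, because rm -r cannot be undone.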
Math::Random::OO(3pm) User Contributed Perl Documentation Math::Random::OO(3pm)
NAME
Math::Random::OO - Consistent object-oriented interface for generating random numbers
SYNOPSIS
# Using factory functions
use Math::Random::OO qw( Uniform UniformInt );
push @prngs, Uniform(), UniformInt(1,6);
# Explicit creation of subclasses
use Math::Random::OO::Normal;
push @prngs, Math::Random::OO::Normal->new(0,2);
$_->seed(23) for (@prngs);
print( $_->next(), "\n") for (@prngs);
DESCRIPTION
CPAN contains many modules for generating random numbers in various ways and from various probability distributions using pseudo-random
number generation algorithms or other entropy sources. (The "SEE ALSO" section has some examples.) Unfortunately, no standard interface
exists across these modules. This module defines an abstract interface for random number generation. Subclasses of this model will
implement specific types of random number generators or will wrap existing random number generators.
This consistency will come at the cost of some efficiency, but will enable generic routines to be written that can manipulate any provided
random number generator that adheres to the interface. E.g., a stochastic simulation could take a number of user-supplied parameters, each
of which is a Math::Random::OO subclass object and which represent a stochastic variable with a particular probability distribution.
USAGE
Factory Functions
use Math::Random::OO qw( Uniform UniformInt Normal Bootstrap );
$uniform = Uniform(-1,1);
$uni_int = UniformInt(1,6);
$normal = Normal(1,1);
$boot = Bootstrap( 2, 3, 3, 4, 4, 4, 5, 5, 5 );
In addition to defining the abstract interface for subclasses, this module imports subclasses and exports factory functions upon request to
simplify creating many random number generators at once without typing "Math::Random::OO::Subclass->new()" each time. The factory function
names are the same as the suffix of the subclass following "Math::Random::OO". When called, they pass their arguments directly to the
"new" constructor method of the corresponding subclass and return a new object of the subclass type. Supported functions and their
subclasses include:
o "Uniform" -- Math::Random::OO::Uniform (uniform distribution over a range)
o "UniformInt" -- Math::Random::OO::UniformInt (uniform distribution of integers over a range)
o "Normal" -- Math::Random::OO::Normal (normal distribution with specified mean and standard deviation)
o "Bootstrap" -- Math::Random::OO::Bootstrap (bootstrap resampling from a non-parametric distribution)
INTERFACE
All Math::Random::OO subclasses must follow a standard interface. They must provide a "new" method, a "seed" method, and a "next" method.
Specific details are left to each interface.
"new"
This is the standard constructor. Each subclass will define parameters specific to the subclass.
"seed"
$prng->seed( @seeds );
This method takes a seed (or a list of seeds) and uses it to set the initial state of the random number generator. As some subclasses may
optionally use/require a list of seeds, the interface mandates that a list must be acceptable. Generators requiring a single seed must use
the first value in the list.
As seeds may be passed to the built-in "srand()" function, they may be truncated as integers, so 0.12 and 0.34 would be the same seed.
"next"
$rnd = $prng->next();
This method returns the next random number from the random number generator. It does not take (and must not use) any parameters.
BUGS
Please report bugs using the CPAN Request Tracker at http://rt.cpan.org/NoAuth/Bugs.html?Dist=Math-Random-OO
AUTHOR
David A Golden <dagolden@cpan.org>
http://dagolden.com/
COPYRIGHT
Copyright (c) 2004, 2005 by David A. Golden
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
The full text of the license can be found in the LICENSE file included with this module.
SEE ALSO
This is not an exhaustive list -- search CPAN for that -- but represents some of the more common or established random number generators
that I've come across.
Math::Random -- multiple random number generators for different distributions (a port of the C randlib)
Math::Rand48 -- perl bindings for the drand48 library (according to perl56delta, this may already be the default after perl 5.005_52 if
available)
Math::Random::MT -- The Mersenne Twister PRNG (good and fast)
Math::TrulyRandom -- an interface to random numbers from interrupt timing discrepancies
perl v5.10.0 2009-05-02 Math::Random::OO(3pm)