Finding items that occur more than once in a vector
I have some vectors to evaluate and develop a list of values that occur in more than one element.
This is the simplest case. I have a vector,
I need to know that that 5 occurs twice in the vector. If I do a double loop,
The vector occurs_twice contains the value 5 in four elements.
I have identified the repeat value, but I don't have a unique list. I suppose I could check and see if 5 is already a member of occurs_twice before adding it.
This works, and could be cleaned up so it stops checking when m != 0, etc, but I expect there is a more efficient way to do this. Is there a library function to give values that occur more than once in a vector, or a function that would take my occurs_twice vector and return only the unique elements?
Thanks,
LMHmedchem
Last edited by LMHmedchem; 05-07-2011 at 05:15 PM..
Do you want to eliminate duplicate elements in your vector, right? Do the duplicate elements appear consecutively or can they appear somewhere in the vector?
Would be OK to get a vector with unique elements, but whose order may change, i.e.
Thanks, Loïc
Last edited by Loic Domaigne; 05-08-2011 at 04:09 PM..
Reason: grammar
Do you want to eliminate duplicate elements in your vector, right?
I need to remove elements that occur more than once, but I need to remove all the instances, not just one of a duplicate pair, etc. I also need the duplicate elements in a separate vector.
If I have,
I need,
The basic issue is that elements that appear two or more times get processed in a different way than elements that appear once, so I need separate lists.
Quote:
Originally Posted by Loic Domaigne
Do the duplicate elements appear consecutively or can they appear somewhere in the vector?
The duplicates can appear in any order and are not necessarily consecutive. The order is more or less random.
Quote:
Originally Posted by Loic Domaigne
Would be OK to get a vector with unique elements, but whose order may change, i.e.
Thanks, Loïc
The final order doesn't matter at all. I will probably sort the vector at some point anyway.
These data structures are quite small. I would be very surprised to find more than 3-5 elements in any of them, so the solution doesn't need to be ultra efficient.
These are graph verticies that get accumulated for various reasons. Those that occur on the same list more than once have different significance and get processed differently. I need to split up these vectors into a list of verticies that occur only once, and a second list of verticies that occur more than once. Theses are very small data structures, so I am trying to keep things simple. I was hoping there was a library function that I could send the vector to and get back a list of dups, but no such luck it would appear. I can do this with a double loop, but that seems un-elegant.
A map<int,int>, or a multiset are two ideas at the moment. I tried briefly with a multiset, but couldn't get it to compile. I will try again in a bit.
If you can, store the vertices list using a map<int,int>. The key would be the vertex (that may have duplicate), and the data would be the number of occurrence. Using this data
structure, you have readily all information that you need:
the element without duplicate has only one occurrence
I think this is the solution I want, but I'm not really sure how to store the data in the map to begin with. All the maps I have used are static, meaning I have hard coded data there to use as a lookup. If I have some int in scope, myInt, what is the syntax for dynamically adding a map element on that key? I presume I have to do find on the map to see if the key already exists, if not add the key and set the value to 1, if it does exist, increment the value, etc?
The map code I have written in the past looks like,
If I call checkType("06-2-20-2000"), I get the return value of 3, etc.
I'm not sure how to create an empty map and add to it.
LMHmedchem
Last edited by LMHmedchem; 05-09-2011 at 06:20 PM..
Hello everyone. Upon reading the recent news about the NSA paying RSA to use a faulty cipher suite for it's default, it got me thinking... During a connection say for SSL, what is it that "knows" the rules for ciphers? Are these rules stored on the NIC? can they be edited, changed or appended? ... (3 Replies)
I'm a scripting newbie and I'm trying to learn. No better way than being assigned a project.
So basically, I'm trying to come up with a script that can periodically check a URL and then notify when changes occur to the file.
So what I'm thinking is that I need to devise a PERL script that... (3 Replies)
Hello,
I want to only print lines where variables occur more than once using grep.
For eg:
Input:
$this is a comment
int a,b,c,b;
int b,c;
int d,e;
int f,g,f;
x=y+5;
For the above input, the output would be
int a,b,c,b;
int f,g,f;
I have done grep... (3 Replies)
Hi all,
i create 2 file
Config
path1 5 group1
path2 6 group2
path3 10 group1
path4 15 group2
Confine
group1 andrew
group2 alan
In my C shell script i write like this:
set line_array = (`cat $app_dir/config`)
set line_array_2 =... (0 Replies)
To find out number of "|" symbol is available in file:
Input:
a|b|c|d|z
Ouput:
4
I am using below set of commands,It is working... Anybody have anyother solution using sed / awk.
cnt=`wc -c <1.txt`
cnt1=`tr -d "|" <1.txt >c.dat`
cnt2=`wc -c <c.dat`
outp=`expr $cnt... (19 Replies)
I have a need to compare 2 files, then print results to file. Need to find items from file2 that are not found in file 1. thanks in advance!
example:
file 1:
abcde=12
fffff=6
bbbb=35
file2:
abcde=12
fffff=6
bbbb=35
ccccc=10
kkkkk=45 (8 Replies)
OS=HP-UX ksh
The following works, except I want to include the <start> and <end> in the output.
awk -F '<start>' 'BEGIN{RS="<end>"; OFS="\n"; ORS=""} {print $2} somefile.log'
The following work in bash but not in ksh
sed -n '/^<start>/,/^<end>/{/LABEL$/!p}' somefile.log (4 Replies)
i want a shell script program for how many times a word occur in a file.
i need not the line number but i want the counts of the particular word
for eg:-
hai how r u..
i am from andhra pradesh..
i am from tenali.i need this answer.i need it urgently..
i hope u will answer this ...
... (9 Replies)
Hi Gurus,
I am executing my Datastage jobs on UNIX operating System. While running the jobs i am getting the following error:
main_program: Unexpected termination by Unix signal 9(SIGKILL)
Can any one please let me know what are the possible situations where this SIGKILL will arrise?
... (9 Replies)