I need to scan my 'vector<string> mystrings' for duplicate entries.


 
# 1  
Old 12-31-2010

Basically, I'm trying to avoid writing paragraphs of code to do it. I could convert over to a vector<int> if that would help. Does anyone know how to do this without turning it into a huge brainbuster? I have considered using a for() loop, but I haven't been able to put anything into practice yet.

Thanks so much for any help.
Eric Smilie
# 2  
Old 12-31-2010
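The only way to avoid the N^2/2 cost of comparing every pair is to remember the items in a more efficient structure, or to sort them all, which is almost saying the same thing. For instance, put them in a hash map or a tree. Both approaches, using a set (a tree) as the lookup structure:
Code:
#include <set>
#include <string>
#include <vector>
using namespace std;

// Compare every pair: about N^2/2 string comparisons.
void dedupe_pairwise( vector<string> &mystrings ){
  for ( size_t i = 0 ; i < mystrings.size() ; i++ ){
    for ( size_t j = i + 1 ; j < mystrings.size() ; ){
      if ( mystrings[i] == mystrings[j] )
        mystrings.erase( mystrings.begin() + j ) ; // next item slides into slot j
      else
        j++ ;
    }
  }
}

// Remember the items already seen in a set.
void dedupe_with_set( vector<string> &mystrings ){
  set<string> seen ;
  for ( size_t i = 0 ; i < mystrings.size() ; ){
    if ( !seen.insert( mystrings[i] ).second )  // .second is false if the key is already in the set
      mystrings.erase( mystrings.begin() + i ) ;
    else
      i++ ;
  }
}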
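
If the goal is only to scan for duplicates and report them, rather than delete them, the same "remember what you have seen" idea applies. The following is a minimal sketch, assuming a C++11 compiler; the function name find_duplicates is made up for illustration and is not from the original post.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper: returns each string that occurs more than once in
// `items`, listed once, in the order its first repeat is encountered.
// The input vector is left unchanged.
std::vector<std::string> find_duplicates(const std::vector<std::string>& items) {
    std::unordered_set<std::string> seen;      // every value met so far
    std::unordered_set<std::string> reported;  // values already added to the result
    std::vector<std::string> duplicates;

    for (const std::string& s : items) {
        // insert().second is false when s was already in the set
        if (!seen.insert(s).second && reported.insert(s).second) {
            duplicates.push_back(s);
        }
    }
    return duplicates;
}
```

An empty result means the vector has no duplicate entries. The second set is there only so that a string repeated many times is reported once.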

Now, if the original order is moot, you can sort them in place, but that does not beat a hash map for large cases. Since the keys of a map are unique by nature, the obvious thing is not to have a vector at all: it is the wrong container for the requirement. If sorted output is needed, a tree can provide that at the slight disadvantage of log(n) lookups. (In C++, std::map and std::set are trees; the hashed containers are the unordered_ ones.) Since hash maps hash, they do not store things in order: abcd might hash to bucket 137 and bcdef to bucket 136.
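
The two options described above can be sketched roughly as follows. This assumes C++11 and the standard library only; the function names dedupe_sorted and dedupe_keep_order are invented for the example. The first version gives up the original order, and the second keeps the first occurrence of each string where it was.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Order does not matter: sort in place, then drop adjacent repeats.
void dedupe_sorted(std::vector<std::string>& v) {
    std::sort(v.begin(), v.end());
    v.erase(std::unique(v.begin(), v.end()), v.end());
}

// Order matters: keep the first occurrence of each string.
// Survivors are compacted toward the front in a single pass.
void dedupe_keep_order(std::vector<std::string>& v) {
    std::unordered_set<std::string> seen;
    std::size_t out = 0;  // next slot for a string we are keeping
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (seen.insert(v[i]).second) {  // true when v[i] is new
            if (out != i) {
                v[out] = std::move(v[i]);
            }
            ++out;
        }
    }
    v.resize(out);  // chop off the leftover tail
}
```

If the data never needs to be a vector in the first place, inserting straight into a std::set (sorted) or std::unordered_set (unsorted) avoids the cleanup step entirely, which is the point made above about the wrong container.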
This User Gave Thanks to DGPickett For This Post:
