Counting duplicate chars in C


 
# 1  
Old 04-25-2010

Hi, I'm trying to write a C program that counts the number of characters, duplicate characters and non-duplicate characters in a file and prints the results to the screen. Here is my code so far:
Code:
#include <stdio.h>

int main( void )

{
char c;
int duplicate = 0;
int nonduplicate = 0;
int count = 0;
char lastChar;

while ( (c = getchar()) != EOF)
       count ++;
       lastChar += c;
{
  if(c = lastChar)
    duplicate ++;
  else( duplicate == duplicate);
}

nonduplicate = (count - duplicate);
printf( "%d %d %d \n", count, duplicate, nonduplicate );

}

This outputs the answer: 67 1 66

It counts the total characters correctly, but my test file has 6 duplicates, so I need the output to be 67 6 61. My main problem is that I can't seem to store the last character read so that I can check whether the next character is the same. I'm probably forgetting something simple here.
Any help would be appreciated.
# 2  
Old 04-25-2010
Is this a duplicate letter "A"?
ABCDEA

Or is this your meaning of duplicate?
AABCDE
# 3  
Old 04-25-2010
Quote:
Originally Posted by jim mcnamara
Is this a duplicate letter "A"?
ABCDEA

Or is this your meaning of duplicate?
AABCDE
I think the latter...
Quote:
Originally Posted by DavoMan
.... my main problem is i cant seem to be able to store the last read character so it can find out if the next char is the same.
To compare the input with the previous character, the loop should be something like:
Code:
while ((c = getchar()) != EOF) {
  count++;
  if(c == lastChar){
    duplicate++;
  }
  else {
    lastChar = c;
  }
}

# 4  
Old 04-25-2010
Lots of problems in that code.

This is one:
Code:
while ( (c = getchar()) != EOF)
       count ++;
       lastChar += c;
{
  if(c = lastChar)
    duplicate ++;
  else( duplicate == duplicate);
}

What that really does is this:
Code:
while ( (c = getchar()) != EOF)
{
       count ++;
}

       lastChar += c;
{
  if(c = lastChar)
    duplicate ++;
  else( duplicate == duplicate);
}

That's a big difference.

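Note also that getchar() returns an int, not a char. If you store the result in a char, a valid input character whose value matches the truncated integer value of EOF becomes indistinguishable from EOF itself, so you can't read that character, whatever its value happens to be on your system. (And on a system where plain char is unsigned, the comparison with EOF can never be true, so the loop never ends.)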

Also, you're not properly initializing your lastChar variable, so you never know what you're comparing it to when you start. Because you could read any character from the stream, your lastChar variable should probably be an int instead of a char, and you could initialize it to something like EOF to make sure it starts with a value that isn't a valid char value.

Not only that, you're doing an assignment instead of a comparison in your if statement. This:

Code:
  if(c = lastChar)
    duplicate ++;

should be:

Code:
  if(c == lastChar)
    duplicate ++;

I'd do something like this:
Code:
#include <stdio.h>

int main( void )
{
    int c;
    int last = EOF;
    int dupes = 0;
    int count = 0;

    while ( 1 )
    {
        c = getchar();
        if ( EOF == c )
        {
            break;
        }

        count++;

        if ( last == c )
        {
            dupes++;
        }

        last = c;
    }

    printf( "%d %d %d\n", count, dupes, count - dupes );

    return( 0 );
}

You will note that I deliberately avoided placing ANY assignment statement inside a conditional clause. Most compilers in my experience provide for a warning when they encounter an assignment in a conditional clause. If you make a point of never putting assignment statements in a conditional clause, you can then use that warning capability to find and fix mistakes in your code.

Because you will make mistakes when you code. Anyone who says they write mistake-free and bug-free code is lying or deluded, plain and simple.

Note also that I always used { and } after my conditional clauses. Always using braces even for one-line conditional code makes mistakes like the one you made with your while-loop impossible to make.

Don't pay attention to how code is laid out in textbooks. In textbooks, whitespace and extra lines cost the publisher money.