| Help | unix | grep | sort | uniq - Different output from what I thought would be the same
Hello,
I'm having a consistency issue....
1) This will highlight every 'a' in each word.
2) This will highlight 'a' if it occurs at least once in a sequence. So every 'a'.
I would think the output of 1) would be identical to 2).
I output both results into separate text files:
8647903 Oct 12 21:34 holding2a.txt this is where 1) went
8642625 Oct 12 21:34 holding2b.txt this is where 2) went
Why is there a difference in file size?
more holding2a.txt | wc -l results 276975
This holds true for holding2b.txt as well.
So I wanted to compare the text in each file.
more holding2a.txt | sort revealed this output
more holding2b.txt | sort revealed this output
I checked the word and line count on both files again and they were the same.
Finally, I merged the two files:
The result is 798 for both -w and -l.
Sample output:
Top part.
Bottom few.
1) I don't know why uniq -u isn't removing what appear to be duplicates.
2) I don't know why sort isn't sorting properly; 'aa' appears in two different places at the top of the list, when I would expect those lines to be together.
I tried sort -d and sort -s, which resulted in what appeared to be the same order. sort -d did take noticeably longer to finish.
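For what it's worth, two things commonly cause both symptoms: locale-aware collation (many UTF-8 locales largely ignore spaces and punctuation when comparing, so visually identical lines can land apart in the output), and the fact that uniq -u only collapses *adjacent* identical lines. A sketch with invented sample lines, where 'aa ' (trailing space) and 'aa' look the same in more but differ byte-wise:

```shell
# 'aa ' (trailing space) and 'aa' print identically on screen but are
# different bytes. LC_ALL=C forces plain byte-order collation, so
# truly identical lines end up adjacent, which is what uniq needs.
result=$(printf 'aa \naa\nab\naa\n' | LC_ALL=C sort | uniq -c)
echo "$result"
```

That sort -d (dictionary order) changed the apparent order is another hint that non-alphanumeric bytes are involved.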
Hello all,
Need to pick your brains,
I have a 10 GB file where each row is a name, and I am expecting only about 50 distinct names in total, so there are a lot of repetitions in clusters.
So I want to do a
sort -u file
Would it be considerably faster or slower to run uniq on the file first, before piping it to sort... (3 Replies)
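Since the repeats sit in adjacent clusters, a plain uniq (which needs no prior sort) can collapse each run cheaply before the expensive sort -u ever sees the data. A sketch on stand-in input:

```shell
# Stand-in for the 10 GB file: a few distinct names repeated in runs.
# uniq collapses each adjacent run first; sort -u then only has to
# sort and dedupe the much smaller remainder.
uniq_first=$(printf 'bob\nbob\nbob\nann\nann\nbob\nbob\n' | uniq | sort -u)
echo "$uniq_first"
```

Whether this actually beats a single sort -u on the real 10 GB file depends on how much the clustered repeats shrink the input, so it is worth timing both on a sample.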
Hi !
I am trying to remove doubled entries in a text file, but only between delimiters.
Like the example below, but I don't know how to do that with sort or similar.
input:
{
aaa
aaa
}
{
aaa
aaa
}
output:
{
aaa
}
{ (8 Replies)
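sort and uniq alone can't scope deduplication to a block, but awk can reset its seen-set at each opening delimiter. A sketch, assuming each block starts with a line containing only '{':

```shell
# Clear the seen-set whenever a '{' line starts a new block, so
# duplicates are removed within a block but not across blocks.
# split("", seen) is the portable way to empty an awk array.
deduped=$(printf '{\naaa\naaa\n}\n{\naaa\naaa\n}\n' |
  awk '/^{/ { split("", seen) } !seen[$0]++')
echo "$deduped"
```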
Hi again,
I have files with the following contents
datetime,ip1,port1,ip2,port2,number
How would I find out how many times a given ip1 value shows up in a particular file? Then how would I find out how many times an ip1 and port2 combination shows up?
Please mind the file may contain 100k lines. (8 Replies)
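Counting occurrences of one field is the classic cut | sort | uniq -c pipeline; counting a field pair just means cutting both fields before counting. A sketch on invented rows in the stated layout (100k lines is no problem for this approach):

```shell
# Invented sample rows: datetime,ip1,port1,ip2,port2,number
data='2020-01-01,10.0.0.1,80,10.0.0.9,443,1
2020-01-02,10.0.0.1,81,10.0.0.9,443,2
2020-01-03,10.0.0.2,80,10.0.0.9,443,3'

# Occurrences of each ip1 (field 2):
ip_counts=$(printf '%s\n' "$data" | cut -d, -f2 | sort | uniq -c)
echo "$ip_counts"

# Occurrences of each (ip1, port2) pair (fields 2 and 5):
pair_counts=$(printf '%s\n' "$data" | cut -d, -f2,5 | sort | uniq -c)
echo "$pair_counts"
```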
I have a flatfile A.txt
2012/12/04 14:06:07 |trees|Boards 2, 3|denver|mekong|mekong12
2012/12/04 17:07:22 |trees|Boards 2, 3|denver|mekong|mekong12
2012/12/04 17:13:27 |trees|Boards 2, 3|denver|mekong|mekong12
2012/12/04 14:07:39 |rain|Boards 1|tampa|merced|merced11
How do I sort and get... (3 Replies)
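The post is cut off before saying what "sort and get..." should produce, but if the goal is chronological order, the YYYY/MM/DD HH:MM:SS timestamp at the start of each line means a plain lexical sort is already chronological. A sketch on the sample rows:

```shell
# The zero-padded date/time leads each line, so byte order equals
# time order and no key options are needed.
sorted=$(printf '%s\n' \
  '2012/12/04 17:07:22 |trees|Boards 2, 3|denver|mekong|mekong12' \
  '2012/12/04 14:06:07 |trees|Boards 2, 3|denver|mekong|mekong12' \
  '2012/12/04 14:07:39 |rain|Boards 1|tampa|merced|merced11' | sort)
echo "$sorted"
```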
Hi All,
I have a text file with the format shown below. Some of the records are duplicated, with the only difference being the date (field 15). I want to compare all duplicate records using subscriber number (field 7) and keep only the record with the greater date.
... (1 Reply)
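The actual record layout was elided from the post, so as a sketch only: assuming comma-separated fields and sortable YYYYMMDD dates (both assumptions on my part), a two-pass awk can first learn the greatest date per subscriber and then print only the records carrying it:

```shell
# Toy records: subscriber in field 7, date in field 15, other fields
# padded just to reach the right positions (layout is assumed).
cat > records.csv <<'EOF'
a,b,c,d,e,f,SUB1,h,i,j,k,l,m,n,20230101
a,b,c,d,e,f,SUB1,h,i,j,k,l,m,n,20230215
a,b,c,d,e,f,SUB2,h,i,j,k,l,m,n,20230110
EOF

# Pass 1 (NR==FNR): remember the greatest date per subscriber.
# Pass 2: print only the records whose date matches that maximum.
latest=$(awk -F, 'NR==FNR { if ($15 > max[$7]) max[$7] = $15; next }
                  $15 == max[$7]' records.csv records.csv)
echo "$latest"
```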
Input File is :
-------------
25060008,0040,03,
25136437,0030,03,
25069457,0040,02,
80303438,0014,03,1st
80321837,0009,03,1st
80321977,0009,03,1st
80341345,0007,03,1st
84176527,0047,03,1st
84176527,0047,03,
20000735,0018,03,1st
25060008,0040,03,
I am using the following in the script... (5 Replies)
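The post is cut off before showing the script, but if the goal is one record per account number (field 1, which is my assumption from the duplicated 84176527 and 25060008 rows), an awk seen-set keeps the first occurrence without reordering the file the way sort -u -t, -k1,1 would:

```shell
# Keep only the first record seen for each value of field 1,
# preserving the original line order (unlike sort -u).
unique_accts=$(printf '%s\n' \
  '25060008,0040,03,' \
  '84176527,0047,03,1st' \
  '84176527,0047,03,' \
  '25060008,0040,03,' |
  awk -F, '!seen[$1]++')
echo "$unique_accts"
```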
Does anyone have a quick and dirty way of performing a sort and uniq in Perl?
On an array with data like:
this is bkupArr BOLADVICE_VN
this is bkupArr MLT6800PROD2A
this is bkupArr MLT6800PROD2A
this is bkupArr BOLADVICE_VN_7YR
this is bkupArr MLT6800PROD2A
I want to sort it... (4 Replies)
Using the last, uniq, sort and cut commands, determine how many times the different users have logged in.
I know how to use the last command and cut command...
I came up with last | cut -f1 -d" " | uniq
I don't know if this is right; can someone please help me... thanks (1 Reply)
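Close: the missing pieces are a sort before uniq (uniq only collapses adjacent lines, and last's output is chronological, not grouped by user) and uniq -c to get the counts. A sketch using simulated last output, since real last output varies per machine:

```shell
# Stand-in for 'last' output; on a real system the pipeline would be:
#   last | cut -f1 -d" " | sort | uniq -c
login_counts=$(printf 'alice pts/0\nbob pts/1\nalice pts/2\n' |
  cut -f1 -d" " | sort | uniq -c)
echo "$login_counts"
```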
I have a file:
Fred
Fred
Fred
Jim
Fred
Jim
Jim
If sort is executed on the listed file, shouldn't the output be the following?
Fred
Fred
Fred
Fred
Jim
Jim
Jim (3 Replies)
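Yes: sort groups all identical lines together, so the four Freds come before the three Jims. A quick check:

```shell
# Feed the listed file's contents to sort; duplicates end up adjacent.
fred_jim=$(printf 'Fred\nFred\nFred\nJim\nFred\nJim\nJim\n' | sort)
echo "$fred_jim"
```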