uniq -c in the pipeline


 
# 8  
Old 05-20-2012
What's your definition of irony?

Mine is when the intended outcome or meaning diverges far enough from the actual one that I seek out the unix.com forums, become a member, post a problem, and read a WORKAROUND solution, only to be told that the WORKAROUND is the intended usage and that I subtly misread the man page (I missed that the lines must be ADJACENT) for uniq to do what its -c switch documents in the manpage:

-c Precede each output line with the count of the number of times the line occurred in the input, followed by a single space.


Imagine if I implemented "sort" and said "applies only to letters I R O N and Y" (but buried that subtly with one word in a man page)

At least the uniq man page should clarify this with a note, and offer a switch to presort (at the cost of performance) so that -c delivers the EXPECTED results?

I would like to see the uniq source code - is there a reference?

thanks
# 9  
Old 05-20-2012
You are making a really big issue out of this trivial matter and trying to blame the tool, instead of making it a learning experience.

This has nothing to do with the -c switch. -c just adds a number. This is a default behavior of uniq -- it filters only adjacent (consecutive) lines.
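A small sketch of that adjacency behavior, for anyone landing here from a search. The sample input and the function names are made up for illustration, and the awk step is only there to strip the count padding, which differs between uniq implementations:

```shell
# Hypothetical sample input: "a" occurs three times, but not all adjacent.
sample() { printf '%s\n' a a b a; }

# Unsorted: uniq -c counts each run of adjacent identical lines,
# so "a" is reported twice (a run of 2, then a run of 1).
unsorted_counts() { sample | uniq -c | awk '{print $1, $2}'; }

# Sorted first: identical lines become adjacent, so each distinct
# line gets exactly one total.
sorted_counts() { sample | sort | uniq -c | awk '{print $1, $2}'; }

unsorted_counts
sorted_counts
```

The first call prints 2 a, 1 b, 1 a; the second prints 3 a, 1 b.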

Quote:
Imagine if I implemented "sort" and said "applies only to letters I R O N and Y" (but buried that subtly with one word in a man page)
What are you trying to say with this comment? The fact that it operates on consecutive lines makes it more general and useful, not less.
So how would you write uniq, if you took the effort? How would you deal with the repeated lines? Would you rather slurp the whole file into memory and make this completely useless for large files? Or do you have a better solution? I'd be very interested to hear it.
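For comparison, here is a rough sketch of the "slurp it into memory" alternative, written with awk rather than by changing uniq. This is not how uniq works; count_all is a hypothetical helper name, and it has to hold one counter per distinct line in memory, which is exactly the tradeoff in question:

```shell
# Count every distinct line regardless of position.
# Memory use grows with the number of distinct lines.
# Output order is order of first appearance.
count_all() {
  awk '
    !($0 in n) { order[++k] = $0 }   # remember first-seen order
               { n[$0]++ }           # one counter per distinct line
    END        { for (i = 1; i <= k; i++) print n[order[i]], order[i] }
  '
}

printf '%s\n' a a b a | count_all
```

This prints 3 a and then 1 b without any sort, at the price of keeping every distinct line in the array until the input ends.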


Quote:
At least the uniq man page should clarify this with a note
But it does! Didn't you read my post?
Code:
Note: 'uniq' does not detect repeated lines unless they are adjacent. You may want to sort the input first, or use `sort -u' without `uniq'.

Which uniq do you have installed? What does your man page say?

Quote:
I would like to see the uniq source code - is there a reference?
Of course, help yourself:
GNU Project Archives
Again, I do not know whether it's GNU coreutils that you are using.
# 10  
Old 05-20-2012
uniq duplicates solution fix

No - no, I thank you for the solution - and I think you miss the irony - to use the uniq command for unique results one is required to presort the input.
What self-respecting computer scientist would ship such a 1/2 assed implementation, without fully disclosing the Big O tradeoff and the duplicate results, and without offering a switch for the slower yet accurate version of uniq -c, is beyond me.

Everyone:

If you have duplicates in uniq -c output, this is a feature, not a bug, since the lines must be ADJACENT to be counted together.

If you want your expected results, first sort, then uniq, then sort again.

May the google duplicate uniq sort fix solution find you

thanks to mirni - I owe you a vBeer.
# 11  
Old 05-20-2012
Quote:
If you want your expected results, first sort, then uniq, then sort again.
No need for the last sort, it's already sorted.
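To illustrate: after sort | uniq -c the output is already ordered by line content. A second sort only matters if you want a different order, for example ranking by count with sort -rn. A sketch with made-up sample data and hypothetical function names (the awk step just strips the count padding):

```shell
# Hypothetical sample input with three distinct lines.
sample() { printf '%s\n' b b b a c c; }

# sort | uniq -c: output is ordered by line content.
by_line() { sample | sort | uniq -c | awk '{print $1, $2}'; }

# Optional extra step: rank by count, highest first.
by_count() { sample | sort | uniq -c | sort -rn | awk '{print $1, $2}'; }

by_line
by_count
```

by_line prints 1 a, 3 b, 2 c; by_count prints 3 b, 2 c, 1 a.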

O(n log n) is close enough to O(n) that sorting does not make much difference at all. And everything is fully disclosed in the documentation; you just have to read carefully -- every word can have significant meaning.

It is not half-assed at all. Again, you are missing an important point -- this is so that you can filter huge outputs without worrying about memory limitations. It is cleverly designed to be as useful as possible.


Glad I could help. (And I don't drink, but thanks! :) )