Get first column value uniq


 
# 1  
Old 12-22-2016

Hi All,

I have a directory with sub-directories containing 'n' .log files, nearly 1 GB of data.
The files are comma-separated. I need to search them recursively and output only the unique values of the first column.
I did this in Perl, but I would like to know which other command-line utilities could do it, so that I can compare how long the grep and uniq steps take.

Sample contents of a *.log file:

Code:
value1,100,99,98
value1,99,97,98
value2,50,51,52
value3,10,11,12
value2,60,61,62
value3,70,71

Expected output:
Code:
value1
value2
value3

# 2  
Old 12-22-2016
Hi, try:
Code:
find . -name '*.log' -type f -exec cat {} + | awk -F, '!A[$1]++{print $1}'
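For anyone unfamiliar with the `!A[$1]++` idiom, here is a small sketch on the sample data from the first post. `A[$1]++` evaluates to 0 the first time a key is seen, so `!A[$1]++` is true only for the first occurrence and each value is printed once. The scratch directory and the file names `a.log` and `sub/b.log` are made up for illustration, and the trailing `sort` is only there to make the demo output stable, since the order in which find visits files is unspecified.

```shell
# Build a throwaway tree with the sample data (hypothetical file names).
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf '%s\n' 'value1,100,99,98' 'value1,99,97,98' 'value2,50,51,52' > "$demo/a.log"
printf '%s\n' 'value3,10,11,12' 'value2,60,61,62' 'value3,70,71' > "$demo/sub/b.log"

# A single awk process reads every file through the pipe, so the array A
# remembers first-column values across all files.
result=$(find "$demo" -name '*.log' -type f -exec cat {} + |
         awk -F, '!A[$1]++{print $1}' | sort)
printf '%s\n' "$result"

rm -rf "$demo"
```

Without the final `sort`, the values come out in order of first appearance, which depends on the order in which find hands over the files.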

# 3  
Old 12-22-2016
Could this be more efficient?
Code:
find . -name '*.log' -type f -exec awk -F, '!A[$1]++{print $1}' {} +

.... or if multiple input files to awk would confuse things, try:-
Code:
find . -name '*.log' -type f -exec awk -F, '!A[$1]++{print $1}' {} \;

An alternative (which may be horribly slow, I don't know) could be:-
Code:
cut -f1 -d, *.log | sort -u

.... although this will fail for an excessive number of input files because the command line grows too long, and a bare *.log does not descend into subdirectories. I suppose you could also wrap it in a find like this:-
Code:
find . -type f -name "*.log" -exec cut -f1 -d, {} + | sort -u

This is one of those cases where you have to try the variations to see which one works best for your data.
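One rough way to try the variations is to wrap each pipeline in a function and time them on the same tree. This is only a sketch: the function names `via_awk` and `via_sort` and the scratch directory are invented for the example, and `date +%s` only has one-second resolution (in bash or ksh, prefixing the call with the `time` keyword gives finer figures). Point `demo` at the real log directory to get meaningful numbers.

```shell
# Two of the variants from this thread, wrapped as functions (names are hypothetical).
via_awk()  { find "$1" -name '*.log' -type f -exec cat {} + | awk -F, '!A[$1]++{print $1}'; }
via_sort() { find "$1" -type f -name '*.log' -exec cut -f1 -d, {} + | sort -u; }

# Tiny stand-in data set; replace with the real directory for a fair test.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
printf '%s\n' 'value1,100,99,98' 'value2,50,51,52' > "$demo/a.log"
printf '%s\n' 'value3,10,11,12' 'value2,60,61,62' > "$demo/sub/b.log"

# Coarse wall-clock timing of each variant.
for variant in via_awk via_sort; do
    start=$(date +%s)
    "$variant" "$demo" > /dev/null
    end=$(date +%s)
    echo "$variant: $((end - start)) s"
done

# Sanity check: both variants should yield the same set of values.
awk_out=$(via_awk "$demo" | sort)
sort_out=$(via_sort "$demo")
[ "$awk_out" = "$sort_out" ] && echo "outputs match"

rm -rf "$demo"
```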




Robin
# 4  
Old 12-22-2016
Quote:
Originally Posted by rbatte1
Could this be more efficient?
Code:
find . -name '*.log' -type f -exec awk -F, '!A[$1]++{print $1}' {} +

.... or if multiple input files to awk would confuse things, try:-
Code:
find . -name '*.log' -type f -exec awk -F, '!A[$1]++{print $1}' {} \;

[..]
Hi Robin, it depends on how the OP's question should be interpreted. I interpreted it as asking for the unique values across all of the files in the directory and its subdirectories. In that case my solution would be the most efficient and it would provide the right answer.

If the idea is to list the unique values per file then your second option should be used, although I think for that to be of use the filename should be printed as well.
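A possible sketch of the per-file variant with the file name printed: keying the array on both awk's built-in `FILENAME` and the first field makes "seen before" mean "seen before in this file". The file names and the `name: value` output format are assumptions made for the example.

```shell
# Throwaway data (hypothetical file names).
demo=$(mktemp -d)
printf '%s\n' 'value1,100,99,98' 'value1,99,97,98' 'value2,50,51,52' > "$demo/a.log"
printf '%s\n' 'value2,60,61,62' 'value3,70,71' > "$demo/b.log"

# FILENAME is the name of the file awk is currently reading. Because it is
# part of the key, value2 is reported once for each file it occurs in.
per_file=$(cd "$demo" && find . -name '*.log' -type f \
    -exec awk -F, '!A[FILENAME,$1]++{print FILENAME ": " $1}' {} + | LC_ALL=C sort)
printf '%s\n' "$per_file"

rm -rf "$demo"
```

Since every file is handed to exactly one awk process, this form gives the same per-file result whether find runs awk once or several times.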

Your first option cannot be used in either case. It might happen to provide the right answer if the total number of files is small enough that awk is only called once for all of them. If awk is called multiple times, each call starts with an empty array, so the answer will be incorrect.
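The effect of several awk invocations can be reproduced on a small scale. In this sketch `xargs -n 1` is used to force one awk process per file, standing in for what `-exec ... {} +` does once the argument list outgrows the system limit. The data and file names are made up.

```shell
# Two small files that share the value "value2" (hypothetical names).
demo=$(mktemp -d)
printf '%s\n' 'value1,1' 'value2,2' > "$demo/a.log"
printf '%s\n' 'value2,3' 'value3,4' > "$demo/b.log"

# Forced split: each awk process starts with an empty array A,
# so value2 is printed once per process.
split_run=$(find "$demo" -name '*.log' -type f |
            xargs -n 1 awk -F, '!A[$1]++{print $1}' | sort)

# Single awk fed through a pipe: one array for all of the input.
single_run=$(find "$demo" -name '*.log' -type f -exec cat {} + |
             awk -F, '!A[$1]++{print $1}' | sort)

echo "split into several awk calls:"; printf '%s\n' "$split_run"
echo "one awk call:";                 printf '%s\n' "$single_run"

rm -rf "$demo"
```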
This User Gave Thanks to Scrutinizer For This Post:
# 5  
Old 12-22-2016
I wonder

Code:
grep -rhoE --include='*.log' '^\w+' . | sort -u
