Sponsored Content
Top Forums Shell Programming and Scripting Awk to print distinct col values Post 302224840 by summer_cherry on Thursday 14th of August 2008 04:46:13 AM
Old 08-14-2008
awk:

Code:
col=$1
nawk -v col="$col" 'BEGIN{FS=","}
{
	arr[$col]++
}
END{
print "Column "col" has distinct value:"
for (i in arr)
print i
}' file

perl:

Code:
$col=shift;
open(FH,"<file");
while(<FH>){
	@arr=split(",",$_);
	$hash{$arr[$col-1]}++;
}
close(FH);
print "Colum $col has distinct value:\n";
for $key (keys %hash){
	print $key,"\n";
}

 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Getting Distinct values from second field in a file....

Hi I have a pipe delimited file. I am trying to grab the DISTINCT value from the second field. The file is something like: 1233|apple|ron 1234|apple|elephant 1235|egg|man the output I am trying to get from second field is apple,egg (apple coming only once) Thanks simi (4 Replies)
Discussion started by: simi28
4 Replies

2. Shell Programming and Scripting

grep distinct values

this is a little more complex than that. I have a text file and I need to find all the distinct words that appear in a line after the word TABLESPACE when I grep for just the word tablespace, I get: how do i parse this a little better so i have a smaller file to read? This is just an... (4 Replies)
Discussion started by: guessingo
4 Replies

3. UNIX for Dummies Questions & Answers

distinct values of all the fields

I am a beginner to scripting, please help me in this regard. How do I create a script that provides a count of distinct values of all the fields in the pipe delimited file ? I have 20 different files with multiple columns in each file. I needed to write a generic script where I give the number... (1 Reply)
Discussion started by: vukkusila
1 Replies

4. Shell Programming and Scripting

distinct values of all the fields

I am a beginner to scripting, please help me in this regard. How do I create a script that provides a count of distinct values of all the fields in the pipe delimited file ? I have 20 different files with multiple columns in each file. I needed to write a generic script where I give the number... (2 Replies)
Discussion started by: vukkusila
2 Replies

5. Shell Programming and Scripting

average of distinct values with awk

Hi guys, I am not an expert in shell and I need help with awk command. I have a file with values like 200 1 1 200 7 2 200 6 3 200 5 4 300 3 1 300 7 2 300 6 3 300 4 4 I need resulting file with averages of... (3 Replies)
Discussion started by: saif
3 Replies

6. UNIX for Advanced & Expert Users

Print line based on highest value of col (B) and repetion of values in col (A)

Hello everyone, I am writing a script to process data from the ATP world tour. I have a file which contains: t=540 y=2011 r=1 p=N409 t=540 y=2011 r=2 p=N409 t=540 y=2011 r=3 p=N409 t=540 y=2011 r=4 p=N409 t=520 y=2011 r=1 p=N409 t=520 y=2011 r=2 p=N409 t=520 y=2011 r=3 p=N409 The... (4 Replies)
Discussion started by: imahmoud
4 Replies

7. UNIX for Dummies Questions & Answers

count number of distinct values in each column with awk

Hi ! input: A|B|C|D A|F|C|E A|B|I|C A|T|I|B As the title of the thread says, I would need to get: 1|3|2|4 I tried different variants of this command, but I don't manage to obtain what I need: gawk 'BEGIN{FS=OFS="|"}{for(i=1; i<=NF; i++) a++} END {for (b in a) print b}' input ... (2 Replies)
Discussion started by: beca123456
2 Replies

8. Shell Programming and Scripting

Find distinct values

Hi, I have two files of the following format file1 chr1:345-456 chr2:123-456 chr2:455-678 chr3:456-789 chr3:444-555 file2 chr1:345-456 chr2:123-456 chr3:456-789 output (2 Replies)
Discussion started by: jacobs.smith
2 Replies

9. Shell Programming and Scripting

Modifying col values based on another col

Hi, Please help with this. I have several excel files (with and .xlsx format) with 10-15 columns each. They all have the same type of data but the columns are not ordered in the same way. Here is a 3 column example. What I want to do add the alphabet from column 2 to column 3, provided... (9 Replies)
Discussion started by: newbie83
9 Replies

10. Emergency UNIX and Linux Support

Read values in each col starting 3rd row.Print occurrence value.

Hello Friends, Hope all are doing fine. Here is a tricky issue. my input file is like this 07 10 14 20 21 03 15 27 30 32 01 10 11 19 30 02 06 14 15 17 01 06 20 25 29 Logic: 1. Please print another column as "0-0-0-0-0" for the first and second rows. 2. Read the first column... (4 Replies)
Discussion started by: jacobs.smith
4 Replies
COL(1)							    BSD General Commands Manual 						    COL(1)

NAME
col -- filter reverse line feeds from input SYNOPSIS
col [-bfhpx] [-l num] DESCRIPTION
The col utility filters out reverse (and half reverse) line feeds so that the output is in the correct order with only forward and half for- ward line feeds, and replaces white-space characters with tabs where possible. This can be useful in processing the output of nroff(1) and tbl(1). The col utility reads from the standard input and writes to the standard output. The options are as follows: -b Do not output any backspaces, printing only the last character written to each column position. -f Forward half line feeds are permitted (``fine'' mode). Normally characters printed on a half line boundary are printed on the fol- lowing line. -h Do not output multiple spaces instead of tabs (default). -l num Buffer at least num lines in memory. By default, 128 lines are buffered. -p Force unknown control sequences to be passed through unchanged. Normally, col will filter out any control sequences from the input other than those recognized and interpreted by itself, which are listed below. -x Output multiple spaces instead of tabs. The control sequences for carriage motion that col understands and their decimal values are listed in the following table: ESC-7 reverse line feed (escape then 7) ESC-8 half reverse line feed (escape then 8) ESC-9 half forward line feed (escape then 9) backspace moves back one column (8); ignored in the first column carriage return (13) newline forward line feed (10); also does carriage return shift in shift to normal character set (15) shift out shift to alternate character set (14) space moves forward one column (32) tab moves forward to next tab stop (9) vertical tab reverse line feed (11) All unrecognized control characters and escape sequences are discarded. The col utility keeps track of the character set as characters are read and makes sure the character set is correct when they are output. If the input attempts to back up to the last flushed line, col will display a warning message. ENVIRONMENT
The LANG, LC_ALL and LC_CTYPE environment variables affect the execution of col as described in environ(7). EXIT STATUS
The col utility exits 0 on success, and >0 if an error occurs. SEE ALSO
colcrt(1), expand(1), nroff(1), tbl(1) STANDARDS
The col utility conforms to Version 2 of the Single UNIX Specification (``SUSv2''). HISTORY
A col command appeared in Version 6 AT&T UNIX. BSD
August 4, 2004 BSD
All times are GMT -4. The time now is 09:33 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy