Figure out the number of different values a certain row takes


 
# 1  
Old 12-10-2008

Hi,

If we are concerned with each column, this is usually very easy using the wc command in the shell. However, my problem is this:
I have a file with 8 rows but 100 columns.
1. I want to find the number of different values a certain "row" takes.
2. The values are in fact strings, for example "john", "david", "steve".
3. Sometimes the value is empty. When it is empty, I want to ignore it completely when counting the number of different values a certain row takes.

It seems awk is the most efficient tool.

Thanks,
john
# 2  
Old 12-10-2008
Could you post a small sample (two rows for example) of your data?
The solution will depend on your definition of empty (,, or ,"",).

You may consider something like this with Perl:

Code:
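perl -F, -lane'
  my %seen;                            # fresh hash for every input line
  $seen{$_} = 1 for grep { /\w/ } @F;  # keep fields that contain a word character
  print scalar keys %seen              # number of distinct non-empty fields
  ' infile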
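
To illustrate why the definition of empty matters, here is a minimal Python sketch. The function name `count_distinct` and the decision to treat both blank fields and quoted-empty fields (`""`) as empty are assumptions made for illustration; the thread does not specify them.

```python
def count_distinct(line, sep=","):
    """Count distinct non-empty fields in one delimited line.

    Assumption: a field is empty if nothing is left after trimming
    whitespace and one pair of surrounding double quotes.
    """
    seen = set()
    for field in line.split(sep):
        value = field.strip()
        # Treat a quoted-empty field such as "" like a blank one.
        if len(value) >= 2 and value[0] == value[-1] == '"':
            value = value[1:-1].strip()
        if value:
            seen.add(value)
    return len(seen)


# Blank, whitespace-only and quoted-empty fields are all ignored here.
print(count_distinct('john,,david,"",steve, ,john'))
```

If quoted-empty fields should count as a value of their own, the quote-stripping step can simply be dropped.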


Last edited by radoulov; 12-10-2008 at 04:25 PM..
# 3  
Old 12-10-2008
class1 john,alax,alex,wong, , ,paul,john
class2 mary,adel,yvone,mary, ,iona,wayne

so class1 has 4 different names
class2 has 5 different names



I don't know Perl at all.
Could you please do it using awk or Python?
# 4  
Old 12-10-2008
Here is one approach in bash

Code:
> cat file105
class1 john,alax,alex,wong, , ,paul,john 
class2 mary,adel,yvone,mary, ,iona,wayne 

> cat manip105.sh
while read mydata
   do
   myclass=$(echo "$mydata" | cut -d" " -f1)
   mynames=$(echo "$mydata" | cut -d" " -f2-)
   uniqnames=$(echo "$mynames" | tr "," "\n" | sort -u | grep "^[A-Za-z]" | wc -l)
   echo "${myclass} has ${uniqnames} unique names"
done< file105

> manip105.sh 
class1 has 5 unique names
class2 has 5 unique names
>

Note that class1 truly has five unique names (john, alax, alex, wong, paul), not four.
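
One way to double-check a count like this is to list the distinct names instead of only counting them. Below is a minimal Python sketch run on the two sample lines from post #3. The helper name `distinct_names` is made up for illustration, and the sketch assumes the class label is separated from the name list by the first space.

```python
def distinct_names(line):
    """Return (label, sorted list of distinct non-blank names)."""
    # Everything before the first space is the label, the rest is the list.
    label, _, rest = line.strip().partition(" ")
    names = {name.strip() for name in rest.split(",") if name.strip()}
    return label, sorted(names)


sample = [
    "class1 john,alax,alex,wong, , ,paul,john",
    "class2 mary,adel,yvone,mary, ,iona,wayne",
]
for line in sample:
    label, names = distinct_names(line)
    print(label, len(names), ", ".join(names))
```

Note that "alax" and "alex" are different strings, so both are counted.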
# 5  
Old 12-11-2008
AWK:

Code:
awk -F, '{
  cl = $1; sub(/ .*/, "", cl)
  sub(/[^ ]* /,"")
  while (++i <= NF) 
    $i ~ /^ *$/ || _[$i]++ || c++
  printf "%s has %d unique names\n",
    cl, c
  i = c = 0; split("",_)
  }' infile

Attempt with Python:

Code:
#!/usr/bin/env python3

import fileinput

for line in fileinput.input():
    line = line.strip()
    if not line:
        continue
    # The class label ends at the first space; the rest is the name list.
    cl, _, rest = line.partition(' ')
    # A set keeps each name once; blank fields are skipped.
    names = {n.strip() for n in rest.split(',') if n.strip()}
    print(cl, "has", len(names), "unique names")

Use scriptname infile to execute.

Last edited by radoulov; 12-11-2008 at 07:26 AM..