Find different column numbers among rows in data


 
# 1  
Old 03-26-2009

I want to find the rows in a file whose number of columns differs from the rest. For example:

A001 a b c d e ... N
A002 a b c d e ... N
A003 a b c d e ... N+1
A004 a b c d e ... N
A005 a b c d e ... N+2
:
:

Most lines (say 1000 rows) have N columns, but a few (say 50 rows), such as lines 3 and 5 above, have a different number. I want to find every row in my data file that has a different number of columns and print it as

A003 N+1
A005 N+2

Also, I want to delete the rows with a different number of columns. What command should I use?

I tried awk '{print NF}' filename (also with head) to see the column count of each row, but there are far too many rows and they all scroll past on the screen at once.
# 2  
Old 03-26-2009
How do you define "different" numbers of columns?
Anyway, this reports the distribution of column counts, using whitespace as the column delimiter:
Code:
awk '{ arr[NF]++}
       END { for (i in arr) {print i, arr[i]} } ' inputfilename

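To remove the offending rows you have to know what the column count should be:
Code:
# printf is more portable than echo with "\c" for a prompt without a newline
printf "allowed number of columns: "
read ccnt
# keep only the rows whose field count equals ccnt
awk -v ccnt="$ccnt" 'NF == ccnt' inputfilename > newfilename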

# 3  
Old 03-26-2009
To define "different" numbers of columns in my dataset:

I know most rows in my dataset are supposed to have N columns, but a few have N+2 or N+4. I want to find out which rows those are and delete them.
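If N is not known up front, one possible approach is to let awk work it out by treating the most common field count as the expected one. The following is only a sketch under that assumption: it reads the file twice, and the file names data.txt, clean.txt and outliers.txt are made up for the example.

```shell
# Sample data (hypothetical): most rows have 6 whitespace-separated fields.
cat > data.txt <<'EOF'
A0001 1 2 3 4 5
A0002 1 2 3 4 5
A0003 1 2 3 4 5 6
A0004 1 2 3 4 5
A0005 1 2 3 4 5 6 7
EOF

# The file is given twice so awk can make two passes over it.
awk '
# Pass 1 (NR == FNR only while reading the first copy): tally field counts.
NR == FNR { count[NF]++; next }

# Start of pass 2: take the most frequent field count as the expected one.
FNR == 1 {
    for (n in count)
        if (count[n] > best) { best = count[n]; expected = n + 0 }
}

# Pass 2: report rows that deviate, write the others to clean.txt.
NF != expected { print $1, NF; next }
{ print > "clean.txt" }
' data.txt data.txt > outliers.txt

cat outliers.txt
```

With the sample above, outliers.txt should list A0003 and A0005 with their field counts, and clean.txt should hold the remaining three rows. If two field counts are equally common, which one wins is not defined, so in that case it is safer to pass the expected count explicitly.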
# 4  
Old 03-27-2009
grep -E "N\+[1-9]" file
(This only matches rows that literally contain text like N+1; it does not count columns.)
# 5  
Old 03-27-2009
Code:
$ cat data.txt
A0001 1 2 3 4 5
A0002 1 2 3 4 5
A0003 1 2 3 4 5 6
A0004 1 2 3 4 5
A0005 1 2 3 4 5
A0006 1 2 3 4 5 6 7
A0007 1 2 3 4 5
A0008 1 2 3 4 5 6 7 8 9
A0009 1 2 3 4 5
A0010 1 2 3 4
$ awk '{ if(NF != 6){ print $1, NF } else { print >> "data2.txt" } }' data.txt
A0003 7
A0006 8
A0008 10
A0010 5
$ cat data2.txt
A0001 1 2 3 4 5
A0002 1 2 3 4 5
A0004 1 2 3 4 5
A0005 1 2 3 4 5
A0007 1 2 3 4 5
A0009 1 2 3 4 5
$

# 6  
Old 03-31-2009
Thank you, everyone! The code is great and it works!
 