Help with analysis data based on particular column content


 
# 1  
Old 03-22-2012

Input file:
Code:
Total_counts	1306726155	100%
Number_of_count_true	855020282
Number_of_count_true_1	160014283
Number_of_count_true_2	44002825
Number_of_count_true_3	18098424
Number_of_count_true_4	24693745
Number_of_count_false	115421870
Number_of_count_true	51048447
Total_number_of_false

Desired output file:
Code:
Total_counts	1306726155	100%
Number_of_count_true	855020282           65.43%
Number_of_count_true_1	160014283           12.25%
Number_of_count_true_2	44002825             3.37%
Number_of_count_true_3	18098424             1.39%
Number_of_count_true_4	24693745             1.89%
Number_of_count_false	115421870            8.83%
Number_of_count_true	51048447             3.91%
Total_number_of_false       1191304285   91.17%

Conditions:
1. For every row whose column 1 matches "Number_of_count", column 3 is that row's column 2 divided by column 2 of "Total_counts", multiplied by 100;
eg. 855020282/1306726155*100=65.43%

2. Column 2 of "Total_number_of_false" is column 2 of "Total_counts" minus column 2 of "Number_of_count_false";
eg. 1306726155-115421870=1191304285

3. Column 3 of "Total_number_of_false" is column 3 of "Total_counts" minus column 3 of "Number_of_count_false";
eg. 100%-8.83%=91.17%
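
The arithmetic behind the three conditions can be checked with a short awk sketch. The figures are copied from the input file; the function name check_conditions and the variable names total and false_count are made up here for illustration only.

```shell
# Worked check of the three conditions with the figures from the input file.
# check_conditions, total and false_count are illustrative names.
check_conditions() {
  awk 'BEGIN {
    total = 1306726155        # column 2 of Total_counts
    false_count = 115421870   # column 2 of Number_of_count_false
    printf "%.2f%%\n", 855020282 / total * 100           # condition 1
    printf "%d\n", total - false_count                    # condition 2
    printf "%.2f%%\n", 100 - false_count / total * 100   # condition 3
  }'
}

check_conditions
```

Run as written, this should print 65.43%, 1191304285 and 91.17%, matching the desired output file.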

Thanks for any advice.
# 2  
Old 03-22-2012
Not sure if there is a typo or if I misunderstood, but this might come close to what you wanted or can be used as a start:

Code:
awk 'NR==1 {a=$2; printf("%-30s%20s%11s\n", $1,$2,$3); next} !/^Total/ {s2+=$2; p3=$2*100/a; s3+=p3; printf("%-30s%20s%10.2f%s\n", $1, $2, p3, pr)} END {printf("%-30s%20s%10.2f%s\n", $1, s2, s3, pr)}' pr="%" infile
Total_counts                            1306726155       100%
Number_of_count_true                     855020282     65.43%
Number_of_count_true_1                   160014283     12.25%
Number_of_count_true_2                    44002825      3.37%
Number_of_count_true_3                    18098424      1.39%
Number_of_count_true_4                    24693745      1.89%
Number_of_count_false                    115421870      8.83%
Number_of_count_true                      51048447      3.91%
Total_number_of_false                   1268299876     97.06%
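
The last line differs from the desired output because the END block adds up every Number_of_count row instead of subtracting Number_of_count_false from Total_counts. A variant that follows the three conditions might look like the sketch below. It assumes whitespace-separated input and writes tab-separated output rather than the padded columns above; the function name add_percentages and the variables total, false_count and rest are hypothetical.

```shell
# add_percentages is an illustrative name. It reads the file(s) given as
# arguments (or stdin) and writes tab-separated output.
add_percentages() {
  awk '
    # Remember the grand total and pass the header line through unchanged.
    $1 == "Total_counts" { total = $2; print; next }
    # Condition 1: percentage of the total for every Number_of_count row.
    /^Number_of_count/ {
      if ($1 == "Number_of_count_false") false_count += $2
      printf "%s\t%s\t%.2f%%\n", $1, $2, $2 * 100 / total
      next
    }
    # Conditions 2 and 3: total minus the false count, and its percentage.
    $1 == "Total_number_of_false" {
      rest = total - false_count
      printf "%s\t%d\t%.2f%%\n", $1, rest, rest * 100 / total
    }
  ' "$@"
}
```

It would be called as add_percentages infile. Computing the final percentage from rest rather than from the rounded 8.83% avoids stacking rounding errors, and for this input both routes give 91.17%.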

Please keep in mind that it would be nice if you showed some effort yourself, as you already have 80 posts and are probably familiar with such tasks. Thanks.
# 3  
Old 03-22-2012
Code:
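perl -lane '
# Remember the grand total and print the header line unchanged.
if (/Total_counts/) { $tot = $F[1]; print }
# Append the percentage of the total to every Number_of_count row.
(/Number_of_count/) && printf "%s\t%s\t%6.2f%%\n", $F[0], $F[1], $F[1]*100/$tot;
# Accumulate the false counts for the final row.
(/Number_of_count_false/) && ($false += $F[1]);
(/Total_number_of_false/) && printf "%s\t%d\t%6.2f%%\n", $F[0], $tot-$false, ($tot-$false)*100/$tot' inputfile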


Last edited by balajesuri; 03-22-2012 at 09:43 AM..