Hi All,
I have the following input which i want to process using AWK.
Rows,NC,amount
1,1202,0.192387
2,1201,0.111111
3,1201,0.123456
i want the following output
count of rows = 3 ,sum of amount = 0.426954
Many thanks (2 Replies)
Dear All,
I have a data file input.csv like below. (Only five column shown here for example.)
Data1,StepNo,Data2,Data3,Data4
2,1,3,4,5
3,1,5,6,7
3,2,4,5,6
5,3,5,5,6
From this I want the below output
Data1,StepNo,Data2,Data3,Data4
2,1,3,4,5
3,1,5,6,7
where the second column... (4 Replies)
I have a following inputfile
MT,AP,CDM,TTML,MUM,GS,SUCC,3
MT,AP,CDM,TTSL,AP,GS,FAIL,9
MT,AP,CDM,RCom,MAH,GS,SUCC,3
MT,AP,CDM,RTL,HP,GS,SUCC,1
MT,AP,CDM,Uni,UPE,GS,SUCC,2
MT,AP,CDM,Uni,MUM,GS,SUCC,2
TTSL,AP,GS,MT,MAH,CDM,SUCC,20
TTML,AP,GS,MT,MAH,CDM,FAIL,10... (2 Replies)
Dear fellows, I need your help.
I'm trying to write a script to convert a single column into multiple rows.
But it need to recognize the beginning of the string and set it to its specific Column number.
Each Line (loop) begins with digit (RANGE).
At this moment it's kind of working, but it... (6 Replies)
Hi,
I have a similar input format-
A_1 2
B_0 4
A_1 1
B_2 5
A_4 1
and looking to print in this output format with headers. can you suggest in awk?awk because i am doing some pattern matching from parent file to print column 1 of my input using awk already.Thanks!
letter number_of_letters... (5 Replies)
Hi,
I have a table to be imported for R as matrix or data.frame but I first need to edit it because I've got several lines with the same identifier (1st column), so I want to sum the each column (2nd -nth) of each identifier (1st column)
The input is for example, after sorted:
K00001 1 1 4 3... (8 Replies)
Hi All,
I have a requirement where I need to find sum of values from column D through O present in a CSV file and check whether the sum of each Individual column matches with the value present for that corresponding column present in the trailer record.
For example, let's assume for column D... (9 Replies)
I have a file which need to be summed up using date column.
I/P:
2017/01/01 a 10
2017/01/01 b 20
2017/01/01 c 40
2017/01/01 a 60
2017/01/01 b 50
2017/01/01 c 40
2017/01/01 a 20
2017/01/01 b 30
2017/01/01 c 40
2017/02/01 a 10
2017/02/01 b 20
2017/02/01 c 30
2017/02/01 a 10... (6 Replies)
Hi Experts,
Please bear with me, i need help
I am learning AWk and stuck up in one issue.
First point : I want to sum up column value for column 7, 9, 11,13 and column15 if rows in column 5 are duplicates.No action to be taken for rows where value in column 5 is unique.
Second point : For... (1 Reply)
Hello,
I am trying to store sum of a column as a new column inside a file but have to find the column names dynamically
I/p
c1,c2,c3,c4,c5
10,20,30,40,50
20,30,40,50,60
If i want to find sum only column c1, c3 and output it as c6,c7
O/p
c1,c2,c3,c4,c5,c6,c7
10,20,30,40,50,30,70... (6 Replies)
Discussion started by: mkathi
6 Replies
LEARN ABOUT DEBIAN
fastx_quality_stats
FASTX_QUALITY_STATS(1) User Commands FASTX_QUALITY_STATS(1)NAME
fastx_quality_stats - FASTX Statistics
DESCRIPTION
usage: fastx_quality_stats [-h] [-N] [-i INFILE] [-o OUTFILE] Part of FASTX Toolkit 0.0.13.2 by A. Gordon (gordon@cshl.edu)
[-h] = This helpful help screen. [-i INFILE] = FASTQ input file. default is STDIN. [-o OUTFILE] = TEXT output file. default is
STDOUT. [-N] = New output format (with more information per nucleotide/cycle).
The *OLD* output TEXT file will have the following fields (one row per column):
column = column number (1 to 36 for a 36-cycles read solexa file)
count = number of bases found in this column.
min = Lowest quality score value found in this column.
max = Highest quality score value found in this column.
sum = Sum of quality score values for this column.
mean = Mean quality score value for this column.
Q1 = 1st quartile quality score.
med = Median quality score.
Q3 = 3rd quartile quality score.
IQR = Inter-Quartile range (Q3-Q1).
lW = 'Left-Whisker' value (for boxplotting).
rW = 'Right-Whisker' value (for boxplotting).
A_Count = Count of 'A' nucleotides found in this column. C_Count = Count of 'C' nucleotides found in this column. G_Count = Count
of 'G' nucleotides found in this column. T_Count = Count of 'T' nucleotides found in this column. N_Count = Count of 'N' nucleo-
tides found in this column. max-count = max. number of bases (in all cycles)
The *NEW* output format:
cycle (previously called 'column') = cycle number max-count For each nucleotide in the cycle (ALL/A/C/G/T/N):
count = number of bases found in this column.
min = Lowest quality score value found in this column.
max = Highest quality score value found in this column.
sum = Sum of quality score values for this column.
mean = Mean quality score value for this column.
Q1 = 1st quartile quality score.
med = Median quality score.
Q3 = 3rd quartile quality score.
IQR = Inter-Quartile range (Q3-Q1).
lW = 'Left-Whisker' value (for boxplotting).
rW = 'Right-Whisker' value (for boxplotting).
SEE ALSO
The quality of this automatically generated manpage might be insufficient. It is suggested to visit
http://hannonlab.cshl.edu/fastx_toolkit/commandline.html
to get a better layout as well as an overview about connected FASTX tools.
fastx_quality_stats 0.0.13.2 May 2012 FASTX_QUALITY_STATS(1)