I think I can produce the output you want, but I can't reconcile the output you have shown with your description of what you want done. The sum of the A's, B's, and C's in column 2 doesn't seem to play any role in the output. The two lines that you seem to have combined (the 1st line and the 3rd line in your input) have matching values in columns 1 and 6; not 6 and 4.
Would the following be a more accurate statement of your requirements?
For every line in the input file where the 1st field and the 6th field are the same, print the 1st field, the 6th field, and the sum of the values in the 3rd field.
Does the order of the output lines matter? If it does, please describe the required order.
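If that restatement is right, a minimal awk sketch could look like the following. It assumes whitespace-separated input with at least 6 fields, and it reads "the 1st field and the 6th field are the same" as "lines that share the same pair of values in fields 1 and 6". The function name sum_by_1_and_6 and the three sample lines are made up for illustration; they are not the data from this thread.

```shell
# Hypothetical helper: group lines by (field 1, field 6) and sum field 3.
sum_by_1_and_6() {
    awk '
    {
        key = $1 OFS $6          # combine fields 1 and 6 into one key
        if (!(key in sum))       # remember the first-seen order of each key
            order[++n] = key
        sum[key] += $3           # accumulate field 3 for this key
    }
    END {
        for (i = 1; i <= n; i++) # print groups in order of first appearance
            print order[i], sum[order[i]]
    }' "$@"
}

# Made-up sample input (not from the thread): lines 1 and 3 share fields 1 and 6.
result=$(printf '%s\n' \
    'x1 A 10 p q y1' \
    'x2 B 20 p q y2' \
    'x1 C  5 p q y1' | sum_by_1_and_6)
printf '%s\n' "$result"
```

The sketch prints groups in the order they first appear because the iteration order of `for (key in sum)` is unspecified in awk. If a sorted order is required instead, the output can be piped through sort.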
Last edited by Don Cragun; 07-19-2013 at 01:22 PM.
Reason: Sum of 3rd field; not 4th.
Hi,
I am new to this forum and new to awk.
I have a file that contains 2 columns.
Here's an example of what it looks like:
10 +
20 +
40 +
50 -
70 -
So the file is tab-delimited. What I want to do is add 10 to column 1 whenever column 2 is + and subtract 10 from column 1... (1 Reply)
I have a file; it will be sorted on column 1:
abc 0 1
abc 2 3
abc 3 5
def 1 7
def 0 1
--------
I'd like to get these results (with awk, maybe?). Any ideas?
abc 5 9
def 1 8 (2 Replies)
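Because the input is stated to be sorted on column 1, one possible approach is a single streaming pass that prints a group's totals whenever the key changes. This is only a sketch; the variable names key, s2, and s3 are illustrative, and the input is the five sample lines quoted above.

```shell
# Sum columns 2 and 3 for each run of identical values in column 1.
# Relies on the input being grouped (sorted) on column 1.
result=$(printf '%s\n' 'abc 0 1' 'abc 2 3' 'abc 3 5' 'def 1 7' 'def 0 1' |
awk '
NR > 1 && $1 != key {      # key changed: emit totals for the finished group
    print key, s2, s3
    s2 = s3 = 0
}
{
    key = $1               # current group key
    s2 += $2               # running total of column 2
    s3 += $3               # running total of column 3
}
END {
    if (NR > 0) print key, s2, s3   # emit the last group
}')
printf '%s\n' "$result"
```

For the sample lines this prints `abc 5 9` and `def 1 8`. If the input were not grouped on column 1, an array keyed on `$1` would be needed instead.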
Hi,
I have below as i/p file:
5ABC 36488989 K 000010000ASB BYTRES
5PQR 45757754 K 000200005KPC HGTRET
5ABC 36488989 K 000045000ASB HGTRET
5GTH 36488989 K 000200200ASB BYTRES
5FTU ... (2 Replies)
I have the following input file:
MT,AP,CDM,TTML,MUM,GS,SUCC,3
MT,AP,CDM,TTSL,AP,GS,FAIL,9
MT,AP,CDM,RCom,MAH,GS,SUCC,3
MT,AP,CDM,RTL,HP,GS,SUCC,1
MT,AP,CDM,Uni,UPE,GS,SUCC,2
MT,AP,CDM,Uni,MUM,GS,SUCC,2
TTSL,AP,GS,MT,MAH,CDM,SUCC,20
TTML,AP,GS,MT,MAH,CDM,FAIL,10... (2 Replies)
I have this input file:
aaa ccc,45567,rterw,1
bbb dcs,564543,hjghgh,1
aaa ccc,454,rterw,6
I want to sum based on column 1.
expected output
aaa ccc,7
bbb dcs,1 (4 Replies)
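One way to get that output, sketched under the assumption that the fields are comma-separated and the value to sum is the 4th field (which is what the expected output suggests). The array names total and order are illustrative.

```shell
# Sum field 4 per distinct value of field 1 in comma-separated input.
result=$(printf '%s\n' \
    'aaa ccc,45567,rterw,1' \
    'bbb dcs,564543,hjghgh,1' \
    'aaa ccc,454,rterw,6' |
awk -F, -v OFS=, '
{
    if (!($1 in total))    # record each key the first time it is seen
        order[++n] = $1
    total[$1] += $4        # accumulate field 4 for this key
}
END {
    for (i = 1; i <= n; i++)
        print order[i], total[order[i]]
}')
printf '%s\n' "$result"
```

Setting both `-F,` and `OFS=,` keeps the space inside `aaa ccc` as part of the first field and writes the result with the same delimiter as the input.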
Hi,
I have a similar input format-
A_1 2
B_0 4
A_1 1
B_2 5
A_4 1
and I am looking to print it in this output format with headers. Can you suggest a way to do this in awk? I ask for awk because I am already using it to do some pattern matching from a parent file to print column 1 of my input. Thanks!
letter number_of_letters... (5 Replies)
Hi,
I have a table to be imported into R as a matrix or data.frame, but I first need to edit it because I've got several lines with the same identifier (1st column), so I want to sum each column (2nd through nth) for each identifier (1st column).
For example, the input after sorting is:
K00001 1 1 4 3... (8 Replies)
Hi All,
I have a requirement where I need to find sum of values from column D through O present in a CSV file and check whether the sum of each Individual column matches with the value present for that corresponding column present in the trailer record.
For example, let's assume for column D... (9 Replies)
Hello,
I am trying to store sum of a column as a new column inside a file but have to find the column names dynamically
I/p
c1,c2,c3,c4,c5
10,20,30,40,50
20,30,40,50,60
If I want to find the sum of only columns c1 and c3 and output them as c6 and c7:
O/p
c1,c2,c3,c4,c5,c6,c7
10,20,30,40,50,30,70... (6 Replies)
Hi All,
I have a file as below and want to sum based on the id in the first column
Input
10264;ATE; 12
10265;SES;11
10266AUT;50
10264;ATE;10
10265;SES;13
10266AUT;89
10264;ATE;1
10265;SES;15
10266AUT;78
Output
10264;ATE; 23
10265;SES;39
10266AUT;139 (6 Replies)
Discussion started by: arunkumar_mca