02-22-2012
AWK, Perl or Shell? Unique strings and their maximum values from 3 column data file
I have a file containing data like so:
2012-01-02 GREEN 4
2012-01-02 GREEN 6
2012-01-02 GREEN 7
2012-01-02 BLUE 4
2012-01-02 BLUE 3
2012-01-02 GREEN 4
2012-01-02 RED 4
2012-01-02 RED 8
2012-01-02 GREEN 4
2012-01-02 YELLOW 5
2012-01-02 YELLOW 2
I can't always predict what the strings in the second column will be (the example above uses colours, but the data file could contain any string in column two). There is, however, always a number in the third column, and I want the maximum of that number for each particular string in column two. Is awk able to:
- Pull out each of the unique strings in column 2?
- For each of the unique strings, get the maximum associated value? Using the data above you'd end up with the following:
2012-01-02 GREEN 7
2012-01-02 BLUE 4
2012-01-02 RED 8
2012-01-02 YELLOW 5
Or would this be easier with Perl (or even shell)? Any code examples much appreciated!
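For what it's worth, awk looks able to do both steps in a single pass. Below is a minimal sketch, not a definitive answer. It assumes whitespace-separated fields, a numeric third column, and that the strings should come out in the order they first appear. The function name `max_per_key` and the file name `data.txt` are made up for illustration. It keys on column 2 only and prints the date from the row that held the maximum.

```shell
# max_per_key FILE (hypothetical helper name)
# For each distinct string in column 2, print the date, the string and
# the largest column 3 value, in first-seen order.
max_per_key() {
    awk '
    !($2 in max) {            # first time this string is seen
        order[++n] = $2       # remember first-seen order
        max[$2]  = $3
        date[$2] = $1
        next
    }
    $3 + 0 > max[$2] + 0 {    # numerically larger value for a known string
        max[$2]  = $3
        date[$2] = $1
    }
    END {
        for (i = 1; i <= n; i++)
            print date[order[i]], order[i], max[order[i]]
    }' "$1"
}
```

Called as `max_per_key data.txt`. If the same string can occur under several dates and each date should be reported separately, the array key would need to be `$1 FS $2` instead of `$2`.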
tokenmt(7ipp) IP Quality of Service Modules tokenmt(7ipp)
NAME
tokenmt - Single and Two Rate Three Conformance Level Meter
DESCRIPTION
The tokenmt module can be configured as a Single or a Two Rate meter. Packets are deemed to belong to one of the three levels - Red, Yellow
or Green - depending on the configured rate(s) and the burst sizes. When configured as a Single Rate meter, tokenmt can operate with just
the Green and Red levels.
Configuration parameters for tokenmt correspond to definitions in RFC 2697 and RFC 2698 as follows:
Configuring tokenmt as a Single Rate meter (from RFC 2697):
committed_rate - CIR
committed_burst - CBS
peak_burst - EBS
(thus 'peak_burst' for a single rate meter is actually the 'excess burst' in the RFC. However, throughout the text the parameter name
"peak burst" is used.)
Configuring tokenmt as a Two Rate meter (from RFC 2698):
committed_rate - CIR
peak_rate - PIR
committed_burst - CBS
peak_burst - PBS
The meter is implemented using token buckets C and P, which initially hold tokens equivalent to committed and peak burst sizes (bits)
respectively. When a packet of size B bits arrives at time t, the following occurs:
When operating as a Single Rate meter, the outcome (level)
is decided as follows:
- Update tokens in C and P
o Compute no. of tokens accumulated since the
last time a packet was seen at the committed rate as
T(t) = committed rate * (t - t')
(where t' is the time the last packet was seen)
o Add T tokens to C up to a maximum of committed burst
size. Add remaining tokens ((C+T) - Committed Burst),
if any, to P, to a maximum of peak burst size.
- Decide outcome
o If not color aware
o If B <= C, outcome is GREEN and C -= B.
o Else, if B <= P, outcome is YELLOW and P -= B.
o Else, outcome is RED.
o Else,
o obtain DSCP from packet
o obtain color from color_map, color_map[DSCP]
o if (color is GREEN) and (B <= C), outcome is
GREEN and C -= B.
o Else, if (color is GREEN or YELLOW) and
(B <= P), outcome is YELLOW and P -= B.
o Else, outcome is RED.
Note that if peak_burst and yellow_next_actions are
not specified (that is, a single rate meter with two
outcomes), the outcome is never YELLOW.
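The colour-blind branch of the Single Rate steps above can be sketched in awk as follows. This is an illustration only, not the module's implementation. It assumes one packet per input line in the form "arrival time in seconds, size in bits", that both buckets start full, and that the meter's clock starts at 0. The function name `srtcm_blind` and its argument order are hypothetical.

```shell
# srtcm_blind RATE CBS EBS FILE (hypothetical helper name)
# RATE is the committed rate in bits/s; CBS and EBS are the committed
# and peak (excess) burst sizes in bits.
srtcm_blind() {
    awk -v rate="$1" -v cbs="$2" -v ebs="$3" '
    BEGIN { C = cbs; P = ebs; last = 0 }   # full buckets, clock at 0 (assumption)
    {
        t = $1; B = $2
        C += rate * (t - last)             # T = committed rate * elapsed time
        last = t
        if (C > cbs) {                     # tokens beyond CBS spill into P
            P += C - cbs
            C = cbs
            if (P > ebs) P = ebs           # P is capped at the peak burst size
        }
        if (B <= C)      { C -= B; print t, B, "GREEN" }
        else if (B <= P) { P -= B; print t, B, "YELLOW" }
        else             { print t, B, "RED" }
    }' "$4"
}
```

The colour-aware branch would add a lookup of the packet's DSCP in a colour map before the comparison; it is left out here because the sketch has no packet headers to read.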
When operating as a Two Rate meter, the outcome (level) is decided as follows:
- Update tokens in C and P
o Compute no. of tokens accumulated since the last time a
packet was seen at the committed and peak rates as
Tc(t) = committed rate * (t - t')
Tp(t) = peak rate * (t - t')
(where t' is the time the last packet was seen)
o Add Tc to C up to a maximum of committed burst size
o Add Tp to P up to a maximum of peak burst size
- Decide outcome
o If not color aware
o If B > P, outcome is RED.
o Else, if B > C, outcome is YELLOW and P -= B
o Else, outcome is GREEN and C -= B & P -= B
o Else,
o obtain DSCP from packet
o obtain color from color_map, color_map[DSCP]
o if (color is RED) or (B > P), outcome is RED
o Else, if (color is YELLOW) or (B > C),
outcome is YELLOW and P -= B
o Else, outcome is GREEN and C -= B & P -= B
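The colour-blind branch of the Two Rate steps above can be sketched the same way. Again this is only an illustration under stated assumptions: one packet per line as "arrival time in seconds, size in bits", both buckets start full, and the clock starts at 0. The function name `trtcm_blind` and its argument order are hypothetical.

```shell
# trtcm_blind CIR PIR CBS PBS FILE (hypothetical helper name)
# CIR and PIR are the committed and peak rates in bits/s; CBS and PBS
# are the committed and peak burst sizes in bits.
trtcm_blind() {
    awk -v cir="$1" -v pir="$2" -v cbs="$3" -v pbs="$4" '
    BEGIN { C = cbs; P = pbs; last = 0 }   # full buckets, clock at 0 (assumption)
    {
        t = $1; B = $2
        C += cir * (t - last); if (C > cbs) C = cbs   # refill C, capped at CBS
        P += pir * (t - last); if (P > pbs) P = pbs   # refill P, capped at PBS
        last = t
        if (B > P)      { print t, B, "RED" }
        else if (B > C) { P -= B; print t, B, "YELLOW" }
        else            { C -= B; P -= B; print t, B, "GREEN" }
    }' "$5"
}
```

Unlike the single rate case, the two buckets here are refilled independently, and a GREEN packet draws tokens from both.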
STATISTICS
The tokenmt module exports the following statistics through kstat:
Global statistics:
module: tokenmt instance: <action id>
name: tokenmt statistics class <action name>
epackets <number of packets in error>
green_bits <number of bits in green>
green_packets <number of packets in green>
red_bits <number of bits in red>
red_packets <number of packets in red>
yellow_bits <number of bits in yellow>
yellow_packets <number of packets in yellow>
FILES
/kernel/ipp/sparcv9/tokenmt
64-bit module (SPARC only.)
ATTRIBUTES
See attributes(5) for descriptions of the following attributes:
+-----------------------------+-----------------------------+
| ATTRIBUTE TYPE | ATTRIBUTE VALUE |
+-----------------------------+-----------------------------+
|Availability |SUNWqos |
+-----------------------------+-----------------------------+
SEE ALSO
ipqosconf(1M), dlcosmk(7IPP), dscpmk(7IPP), flowacct(7IPP), ipqos(7IPP), ipgpc(7IPP), tswtclmt(7IPP)
RFC 2697, A Single Rate Three Color Marker J. Heinanen, R. Guerin -- The Internet Society, 1999
RFC 2698, A Two Rate Three Color Marker J. Heinanen, R. Guerin -- The Internet Society, 1999
SunOS 5.10 29 Sep 2004 tokenmt(7ipp)