Find highest records in table


 
Thread Tools Search this Thread
Top Forums Shell Programming and Scripting Find highest records in table
# 1  
Old 07-02-2013
Find highest records in table

Dear All,

I have table files with different measurements for the same sensors. The first column indicate the sensor (7 different sensors and 16 measurements in the example below). I would like to find the best measurement for each sensor. The best measurement is determined by the higher value of ($6 x $7). If two or more measurements are equally good I would need the first one. The number of measurements per sensor varies between 3-6.

Code:
cat infile.txt
M01072-1    FJ973371    238617596    400    1979    97.91    383    8    0    1    383    1105    723
M01072-1    DQ980471    116222186    400    3178    97.91    383    8    0    1    383    1070    688
M01072-1    AM901334    195944118    400    988    90.74    378    26    9    1    373    986    613
M01072-2    AM901325    195944109    400    988    90.74    378    26    9    1    373    986    613
M01072-2    AM901311    195944095    400    988    89.82    393    29    11    1    387    986    599
M01072-2    EF205010    147743297    443    1269    99.55    443    2    0    1    443    869    427
M01011-2    HQ670228    339699497    443    1313    93.72    446    22    6    1    443    871    429
M01011-2    HQ670227    339699496    443    1210    93.72    446    22    6    1    443    817    375
M01072-4    HM483399    299790116    443    3656    92.63    448    23    9    1    443    3314    2872
M01072-4    HM483396    299790113    443    3541    92.60    446    27    6    1    443    3298    2856
M01072-5    AF183464    6120004    403    2120    93.20    397    22    5    8    403    539    931
M01072-5    AF183460    6120000    403    2126    93.20    397    22    5    8    403    538    930
M01072-6    AF395513    15281673    403    2104    93.25    400    12    13    1    392    510    902
M01072-6    AF183469    6120009    403    2142    93.23    399    14    13    1    392    529    921
M01072-6    DQ015714    66864259    403    1684    92.73    399    20    9    8    403    196    588
M01072-7    EF205010    147743297    358    1269    99.43    351    2    0    8    358    791    441

Code:
cat outfile.txt
M01072-1    FJ973371    238617596    400    1979    97.91    383    8    0    1    383    1105    723
M01072-2    EF205010    147743297    443    1269    99.55    443    2    0    1    443    869    427
M01011-2    HQ670228    339699497    443    1313    93.72    446    22    6    1    443    871    429
M01072-4    HM483399    299790116    443    3656    92.63    448    23    9    1    443    3314    2872
M01072-6    AF395513    15281673    403    2104    93.25    400    12    13    1    392    510    902
M01072-7    EF205010    147743297    358    1269    99.43    351    2    0    8    358    791    441

The following solution works but it is slow.
Code:
a. Add new columns 
awk '{ print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$1,$6*$7}' infile.txt > infile.new
b. Get a list of all the sensors:
awk '{ print $1}' infile.txt | sort -u > sensor.list
c. Sort file according to the last column
for SENSOR in `cat sensor.list`
do
  grep "${SENSOR}" infile.new | sort -k 15 -n | head -n 1 >  ${SENSOR}_best.res
done

Does anybody have a better, faster solution? Thank you very much for considering my request.
# 2  
Old 07-02-2013
Code:
awk '
        {
                $1 = $1
                V = $6 * $7
                if ( $1 in A )
                {
                        if ( A[$1] < V )
                        {
                                A[$1] = V
                                R[$1] = $0 OFS V
                        }
                }
                else
                {
                        A[$1] = V
                        R[$1] = $0 OFS V
                }
        }
        END {
                for ( k in R )
                        print R[k]
        }
' OFS='\t' file

This User Gave Thanks to Yoda For This Post:
# 3  
Old 07-02-2013
Quote:
Originally Posted by GDC
Dear All,

I have table files with different measurements for the same sensors. The first column indicate the sensor (7 different sensors and 16 measurements in the example below). I would like to find the best measurement for each sensor. The best measurement is determined by the higher value of ($6 x $7). If two or more measurements are equally good I would need the first one. The number of measurements per sensor varies between 3-6.

Code:
cat infile.txt
M01072-1    FJ973371    238617596    400    1979    97.91    383    8    0    1    383    1105    723
M01072-1    DQ980471    116222186    400    3178    97.91    383    8    0    1    383    1070    688
M01072-1    AM901334    195944118    400    988    90.74    378    26    9    1    373    986    613
M01072-2    AM901325    195944109    400    988    90.74    378    26    9    1    373    986    613
M01072-2    AM901311    195944095    400    988    89.82    393    29    11    1    387    986    599
M01072-2    EF205010    147743297    443    1269    99.55    443    2    0    1    443    869    427
M01011-2    HQ670228    339699497    443    1313    93.72    446    22    6    1    443    871    429
M01011-2    HQ670227    339699496    443    1210    93.72    446    22    6    1    443    817    375
M01072-4    HM483399    299790116    443    3656    92.63    448    23    9    1    443    3314    2872
M01072-4    HM483396    299790113    443    3541    92.60    446    27    6    1    443    3298    2856
M01072-5    AF183464    6120004    403    2120    93.20    397    22    5    8    403    539    931
M01072-5    AF183460    6120000    403    2126    93.20    397    22    5    8    403    538    930
M01072-6    AF395513    15281673    403    2104    93.25    400    12    13    1    392    510    902
M01072-6    AF183469    6120009    403    2142    93.23    399    14    13    1    392    529    921
M01072-6    DQ015714    66864259    403    1684    92.73    399    20    9    8    403    196    588
M01072-7    EF205010    147743297    358    1269    99.43    351    2    0    8    358    791    441

Code:
cat outfile.txt
M01072-1    FJ973371    238617596    400    1979    97.91    383    8    0    1    383    1105    723
M01072-2    EF205010    147743297    443    1269    99.55    443    2    0    1    443    869    427
M01011-2    HQ670228    339699497    443    1313    93.72    446    22    6    1    443    871    429
M01072-4    HM483399    299790116    443    3656    92.63    448    23    9    1    443    3314    2872
M01072-6    AF395513    15281673    403    2104    93.25    400    12    13    1    392    510    902
M01072-7    EF205010    147743297    358    1269    99.43    351    2    0    8    358    791    441

The following solution works but it is slow.
Code:
a. Add new columns 
awk '{ print $1,$2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$1,$6*$7}' infile.txt > infile.new
b. Get a list of all the sensors:
awk '{ print $1}' infile.txt | sort -u > sensor.list
c. Sort file according to the last column
for SENSOR in `cat sensor.list`
do
  grep "${SENSOR}" infile.new | sort -k 15 -n | head -n 1 >  ${SENSOR}_best.res
done

Does anybody have a better, faster solution? Thank you very much for considering my request.
Why doesn't your desired output (as shown in outfile.txt) contain an entry for sensor M01072-5?
Yoda provided a solution that should be faster (and use a lot fewer resources) than your script. If the input for each of your sensors is grouped together by sensor (as shown in your example), it can be simplified even more.

So, is the input for each of your sensors always grouped as in your sample above?
This User Gave Thanks to Don Cragun For This Post:
# 4  
Old 07-02-2013
Dear Yoda,

thank you very much for your help. I am not quite sure if I understand your awk script. Would you please explain me the use of "if ( $1 in A )"?

Thanks again,
# 5  
Old 07-02-2013
if ( $1 in A ) statement scans if field $1 in present in array A index.

To know more about this expression check: Scanning an Array

Basically the code is checking if a 1st field is already part of array index, if yes then we will perform the comparison to determine whether to keep or discard existing record:
Code:
                        if ( A[$1] < V )
                        {
                                A[$1] = V
                                R[$1] = $0 OFS V
                        }

This User Gave Thanks to Yoda For This Post:
# 6  
Old 07-02-2013
Dear Don Cragun,

thanks for point out the missing value for one of the sensor. Finding the best values for each sensor is part of a larger script and at the end a few sensors have to be removed.

The list is concatenated and therefore the sensors are not sorted. I could, however, add a extra step to sort the file first.

Thanks for your suggestions!
# 7  
Old 07-02-2013
try also:
Code:
awk '
{if (($6 * $7) > b[$1]) {a[$1]=$0; b[$1]=$6 * $7}}
END { for (i in a) print a[i] }
' infile.txt

pipe into sort if needed.

Last edited by rdrtx1; 07-03-2013 at 10:44 AM.. Reason: corrected.
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Solaris

Find highest value of a particular property in multiple files

I have multiple files with pattern of "*.tps (example:tps-20170307170421560-1053.tps)" in my log directory(files are in different sub directories). entries in files are given below. I want to extract highest value of endtime accross all files. "endTime :1488902691462" ... (7 Replies)
Discussion started by: Agoyals1986
7 Replies

2. Shell Programming and Scripting

Delete the records from table

Hi, Can any one help me... the records are not deleting when I run the below script. But if I issue the same delete command manually, the records are getting deleted. script: #!/bin/ksh USAGE_STRING="USAGE $0 " if then echo "SORRY you need to be user 'mqm'. Only 'mqm' has... (5 Replies)
Discussion started by: zxcjggu708
5 Replies

3. Shell Programming and Scripting

Find highest number - working but need help!

Hello all, I am new to this and need some help or maybe steer me to the right direction! I wrote a script to get the highest number and prints it on the screen, the script basically asks the user to input numbers, and then prints the highest number! very simple it works like this $sh max.sh... (8 Replies)
Discussion started by: unknownsolo
8 Replies

4. Shell Programming and Scripting

Getting number of records from a table

I am doing a loading process. I am loading data from a Oracle source to Oracle target. For example there is an SQL statement: Insert into emp_1 Select * from emp_2 where deptno=20; In this case my source is emp_2 and loading into my target table emp_1. This process is automated. Now I... (3 Replies)
Discussion started by: karthikkasarla
3 Replies

5. Programming

Help with find highest and smallest number in a file with c

Input file: #data_1 AGDG #data_2 ADG #data_3 ASDDG DG #data_4 A Desired result: Highest 7 Slowest 1 code that I try but failed to archive my goal :( #include <stdio.h> (2 Replies)
Discussion started by: cpp_beginner
2 Replies

6. Shell Programming and Scripting

Total records in table

Hi, I am having two tables. A & B I want to know the total number of records in each table and need to store each value in some variables to process further in the code. How it can be done ? With Regards (2 Replies)
Discussion started by: milink
2 Replies

7. Shell Programming and Scripting

Verifying table records

Hi, Script copies records of two tables into another tables for backup before using oracle's import utility to restore from backup. Now, suppose if the import utility fails then the script will again copy those records from the backup table to the original ones. How to check whether all... (2 Replies)
Discussion started by: milink
2 Replies

8. Shell Programming and Scripting

How do I load records from table to a flat file!

Hi All, I need to load records from oracle table XYZ to a flat file say ABC.dat. could any one tell me how do i do this in UNXI, Regards Ann (1 Reply)
Discussion started by: Haque123
1 Replies

9. Shell Programming and Scripting

Inserting records from flat file to db table

I have 20000 numbers present in a file in each line like 25663, 65465, 74579, 56446, .. .. I have created a table in db with single number column in it. create table testhari (no number(9)); I want to insert all these numbers into that table. how can i do it? can anybody please... (4 Replies)
Discussion started by: Hara
4 Replies

10. Shell Programming and Scripting

find the highest number in the file

Hi, I have a file a.txt and it has values in it Eg :- I need to read through the file and find the number that is the greatest in them all. Can any one assit me on this. Thanks (30 Replies)
Discussion started by: systemali
30 Replies
Login or Register to Ask a Question