Median and max of duplicate rows
Post 302839189 by ripat on Wednesday 31st of July 2013 03:27:53 PM
OK, I see where the problem is. The ternary condition was not expecting zero values: a stored 0 tests as false, so it was treated like an entry that had not been filled yet.
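To illustrate the zero-value issue, here is a minimal sketch. It assumes the earlier condition tested the array element directly (the earlier code is not shown in this post), and the function names `join_truthy` and `join_nonempty` are made up for the demonstration. Input is two whitespace-separated columns: key and value.

```shell
# Hypothetical helpers, not part of the original script.

# Tests the element itself: a stored 0 is false, so it gets overwritten.
join_truthy() {
  awk '{ a[$1] = a[$1] ? a[$1] "@" $2 : $2 } END { for (k in a) print k, a[k] }'
}

# Tests against the empty string: a stored 0 is kept.
join_nonempty() {
  awk '{ a[$1] = (a[$1] != "") ? a[$1] "@" $2 : $2 } END { for (k in a) print k, a[k] }'
}

printf 'x 0\nx 5\nx 7\n' | join_truthy     # prints: x 5@7   (the 0 is lost)
printf 'x 0\nx 5\nx 7\n' | join_nonempty   # prints: x 0@5@7
```

The count and the sum are unaffected either way, so only the joined list (and therefore the median, max and min columns) goes wrong when the 0 is dropped.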

Try this:
Code:
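awk '
# Per key: count the rows, join the values with "@", and sum them.
# Comparing with "" instead of testing a[$1] directly keeps a value of 0.
{nbr[$1]++; a[$1] = (a[$1] != "") ? a[$1] "@" $2 : $2; sum[$1] += $2} # NEW

END {
  for (key in a) {
    len = nbr[key]
    if (len > 3) {
      split(a[key], b, "@")
      # One pass per key is enough; the inner loop over i was redundant.
      avg = sum[key] / len
      if (len % 2) {
        median = b[(len + 1) / 2]
      } else {
        median = (b[len / 2 + 1] + b[len / 2]) / 2
      }
      # b[len] and b[1] are max and min only if the input is sorted by value.
      printf "%s %s %s %s %s\n", key, b[len], avg, median, b[1]
    }
  }
}
'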
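Note that the script takes the median from the middle of b and prints b[len] and b[1] as the largest and smallest value, so each key's values have to arrive in ascending order. A minimal sketch of one way to guarantee that, assuming two whitespace-separated columns (key, value); the wrapper name `stats_by_key` is made up for this example:

```shell
# Hypothetical wrapper: sorts first, then runs the same per-key statistics.
stats_by_key() {
  # Sort by key, then numerically by value, so every group is ascending.
  sort -k1,1 -k2,2n "$@" |
  awk '
  { nbr[$1]++; a[$1] = (a[$1] != "") ? a[$1] "@" $2 : $2; sum[$1] += $2 }
  END {
    for (key in a) {
      len = nbr[key]
      if (len > 3) {                       # same threshold as the script above
        split(a[key], b, "@")
        avg = sum[key] / len
        median = (len % 2) ? b[(len + 1) / 2] : (b[len / 2] + b[len / 2 + 1]) / 2
        printf "%s %s %s %s %s\n", key, b[len], avg, median, b[1]
      }
    }
  }'
}

# Unsorted input with a zero value:
printf 'k 7\nk 0\nk 3\nk 10\nk 5\n' | stats_by_key   # prints: k 10 5 5 0
```

The output columns are key, max, average, median and min. With more than one key the order of the output lines is not defined, because `for (key in a)` visits keys in an unspecified order; pipe the result through `sort` if that matters.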


Last edited by ripat; 07-31-2013 at 04:40 PM.
 
