Shell Programming and Scripting: Extract and count number of Duplicate rows
Post 302777817 by franzpizzo on Friday 8th of March 2013 12:14:54 PM
Code:
awk '
{ s[$0]++ }                      # count every line: the keys of s are the distinct lines in the file (e.g. A, B, C)
END {                            # and each value is the count for that line, e.g. if i=A then s[i]=3
  for (i in s) {                 # for each distinct line i in s (awk visits the keys in no particular order)
    for (j = 1; j < s[i]; j++) { # s[i] is the count of line i, so this loop runs s[i]-1 times
      print i                    # print line i once for each extra occurrence
    }
  }
}' "${TMP_DUPE_RECS}" > "${TMP_BAD_DATA_DUPE_RECS}"
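
To see what the script above produces, here is a small sketch run against made-up input. The function names extra_copies and dupe_counts and the sample lines are only for illustration and are not from the original post. The second function is a possible variant, assuming you also want the count of each duplicated line. Both outputs go through sort only because awk's for (i in s) does not guarantee any order.

```shell
#!/bin/sh
# Hypothetical helper: same logic as the awk script above, reading
# standard input or the files given as arguments.
extra_copies() {
  awk '
  { s[$0]++ }                       # count each distinct line
  END {
    for (i in s)
      for (j = 1; j < s[i]; j++)    # runs s[i]-1 times
        print i
  }' "$@" | sort
}

# Hypothetical variant: print "count line" for every line seen more than once.
dupe_counts() {
  awk '
  { s[$0]++ }
  END {
    for (i in s)
      if (s[i] > 1)
        print s[i], i
  }' "$@" | sort -k2
}

# Made-up sample input: A appears 3 times, B twice, C once.
result=$(printf '%s\n' A B A C B A | extra_copies)
counts=$(printf '%s\n' A B A C B A | dupe_counts)

printf '%s\n' "$result"   # A, A, B on separate lines
printf '%s\n' "$counts"   # "3 A" then "2 B"
```

Note that a line seen n times is printed n-1 times by extra_copies, so lines that occur only once (C here) never appear in the output.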

 

array(n)                    Tcl Built-In Commands                    array(n)

NAME
       array - Manipulate array variables

SYNOPSIS
       array option arrayName ?arg arg ...?

DESCRIPTION
       This command performs one of several operations on the variable given by arrayName. Unless otherwise specified for individual commands below, arrayName must be the name of an existing array variable. The option argument determines what action is carried out by the command. The legal options (which may be abbreviated) are:

       array anymore arrayName searchId
              Returns 1 if there are any more elements left to be processed in an array search, 0 if all elements have already been returned. SearchId indicates which search on arrayName to check, and must have been the return value from a previous invocation of array startsearch. This option is particularly useful if an array has an element with an empty name, since the return value from array nextelement won't indicate whether the search has been completed.

       array donesearch arrayName searchId
              This command terminates an array search and destroys all the state associated with that search. SearchId indicates which search on arrayName to destroy, and must have been the return value from a previous invocation of array startsearch. Returns an empty string.

       array exists arrayName
              Returns 1 if arrayName is an array variable, 0 if there is no variable by that name or if it is a scalar variable.

       array get arrayName ?pattern?
              Returns a list containing pairs of elements. The first element in each pair is the name of an element in arrayName and the second element of each pair is the value of the array element. The order of the pairs is undefined. If pattern is not specified, then all of the elements of the array are included in the result. If pattern is specified, then only those elements whose names match pattern (using the matching rules of string match) are included. If arrayName isn't the name of an array variable, or if the array contains no elements, then an empty list is returned.

       array names arrayName ?pattern?
              Returns a list containing the names of all of the elements in the array that match pattern (using the matching rules of string match). If pattern is omitted then the command returns all of the element names in the array. If there are no (matching) elements in the array, or if arrayName isn't the name of an array variable, then an empty string is returned.

       array nextelement arrayName searchId
              Returns the name of the next element in arrayName, or an empty string if all elements of arrayName have already been returned in this search. The searchId argument identifies the search, and must have been the return value of an array startsearch command. Warning: if elements are added to or deleted from the array, then all searches are automatically terminated just as if array donesearch had been invoked; this will cause array nextelement operations to fail for those searches.

       array set arrayName list
              Sets the values of one or more elements in arrayName. list must have a form like that returned by array get, consisting of an even number of elements. Each odd-numbered element in list is treated as an element name within arrayName, and the following element in list is used as a new value for that array element. If the variable arrayName does not already exist and list is empty, arrayName is created with an empty array value.

       array size arrayName
              Returns a decimal string giving the number of elements in the array. If arrayName isn't the name of an array then 0 is returned.

       array startsearch arrayName
              This command initializes an element-by-element search through the array given by arrayName, such that invocations of the array nextelement command will return the names of the individual elements in the array. When the search has been completed, the array donesearch command should be invoked. The return value is a search identifier that must be used in array nextelement and array donesearch commands; it allows multiple searches to be underway simultaneously for the same array.

       array unset arrayName ?pattern?
              Unsets all of the elements in the array that match pattern (using the matching rules of string match). If arrayName isn't the name of an array variable or there are no matching elements in the array, then an empty string is returned. If pattern is omitted and arrayName is an array variable, then the command unsets the entire array.

KEYWORDS
       array, element names, search

Tcl                                  8.3                             array(n)