Extract and count number of Duplicate rows
Post 302777817 by franzpizzo on Friday 8th of March 2013 12:14:54 PM
Code:
awk '
{ s[$0]++ }                      # count every whole line; s gets one key per distinct line (e.g. A, B, C)
END {                            # s[i] is the number of times line i occurred: e.g. if i=A, s[i]=3
  for (i in s) {                 # for each distinct line i (iteration order is unspecified)
    for (j = 1; j < s[i]; j++) { # loop s[i]-1 times, so a line seen only once prints nothing
      print i                    # print one surplus copy of line i
    }
  }
}' "${TMP_DUPE_RECS}" > "${TMP_BAD_DATA_DUPE_RECS}"
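
To see what the script above produces, here is a small sketch run against made-up input. The sample lines, the temporary file and the `extra_copies` function name are only illustrative and are not part of the original post. The output is piped through `sort` because awk does not promise any particular order for `for (i in s)`.

```shell
#!/bin/sh
# Hypothetical sample data: A occurs 3 times, B twice, C once.
input=$(mktemp)
printf '%s\n' A B A C A B > "$input"

# Same counting technique as in the post, wrapped in a function
# (the function name is an assumption made for this example).
extra_copies() {
  awk '
    { s[$0]++ }                      # count each distinct line
    END {
      for (i in s)                   # order of keys is unspecified
        for (j = 1; j < s[i]; j++)   # count-1 iterations per key
          print i
    }' "$1" | sort                   # sort only to get a stable order
}

result=$(extra_copies "$input")
printf '%s\n' "$result"
rm -f "$input"
```

With this input, A is printed twice, B once and C not at all, so the output file holds only the surplus copies and never the first occurrence of a line.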

 
