Filter duplicate records from csv file with condition on one column Post: 303010264

Post 303010264 by as7951 on Saturday 30th of December 2017 12:41:11 AM

12-30-2017

Hi Rudic,

I don't want to modify the input data in the csv file, and I don't want the output written to a different file.
I just want to print an error for the rows in the csv file where the condition is not met.

The file should contain data in the format given below; the check is on two columns (2 and 3),
and the number of rows and columns may vary.
In short, the condition is good when column 2 contains duplicate values (1234, 1234) in rows 1-2 and column 3 also contains duplicate values (4567, 4567) in the same rows.
The condition is false when column 2 contains a duplicate value (0808, 0808, 0808) in rows 1-3 but column 3 does not contain a duplicate value in those rows (4567, 4567, 1234); the 1234 in row 3 of column 3 is what makes the condition false.

I hope I'm clear now.

Good condition

Code:

DT	DN	ON
CN	1234	4567
CN	1234	4567
CN	9876	6543
CN	9876	6543
CN	5678	4321
CN	5678	4321
CN	0909	3089
CN	0909	3089

False condition (the offending value, 1234 in the last row, was marked in red)

Code:

DT   DN     ON
CN   0808  4567
CN   0808  4567
CN   0808  1234
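
The rule described above can be sketched in awk. This is only a minimal sketch under stated assumptions, not the script posted earlier in the thread: it assumes whitespace-separated fields (tabs or spaces, as in the samples), a single header row, and that every row sharing a DN value (column 2) must carry the ON value (column 3) first seen for that DN. The function name check_dups is hypothetical. It only reads the file, prints one error line per offending row, and returns a non-zero status if any row failed.

```shell
# check_dups FILE (hypothetical helper): report rows whose ON value (column 3)
# differs from the first ON value seen for the same DN (column 2).
check_dups() {
    awk '
        NR == 1 { next }                      # skip the header row (DT DN ON)
        {
            dn = $2 ""; on = $3 ""            # force string values so 0808 keeps its leading zero
            if (!(dn in first)) {             # first row for this DN: remember its ON
                first[dn] = on
            } else if (on != first[dn]) {     # later row disagrees with the first ON seen
                printf "Error line %d: DN %s has ON %s, expected %s\n", NR, dn, on, first[dn]
                bad = 1
            }
        }
        END { exit bad + 0 }                  # status 1 if any row failed the check
    ' "$1"
}
```

Once the function is defined in a script or sourced into the shell, it could be called as check_dups input.csv, where input.csv is a placeholder name. For a file that really is comma-separated, adding -F, to the awk call should be enough. On systems where the default awk is an old implementation, nawk or /usr/xpg4/bin/awk may be needed instead.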

---------- Post updated at 03:31 AM ---------- Previous update was at 02:24 AM ----------

Hi chubler,

Could you please help me with how to execute this script?
When I put the code in a .sh file, no output is produced,
and when I run it from the command line I get a syntax error at the "next" command.

---------- Post updated 12-30-17 at 12:41 AM ---------- Previous update was 12-29-17 at 03:31 AM ----------

Hi chubler,

Thank you for the code. I will run and test it,
and will let you know if there are any issues.

Thanks

Last edited by Don Cragun; 12-29-2017 at 03:38 AM.. Reason: Add CODE tags again.
