Compare Values between column in the same file


 
Thread Tools Search this Thread
Top Forums UNIX for Beginners Questions & Answers Compare Values between column in the same file
# 15  
Old 09-28-2016
I agree with RudiC that your specification (in both post #1 [after 20 edits] and in post #12) saying:
Quote:
if Column 3 has value= P then check column 7,8,9,10,12,13 (values should be zero in each) and hence sum of these columns should be zero.
if Column 3 has value= N then check column 7,8,9,10,12,13 (values should be zero in each) and hence sum of these columns should be zero.
appears to be a mistake. And to get the output you said you want, it would seem that if field #3 is N, you really want to check 7, 8, 9, 10, 11, and 13 instead of checking exactly the same fields for both P and N.

Assuming that that is correct, here is another awk script using the same underlying logic as the code RudiC suggested. This script builds the array of fields to be skipped for each letter than appears in the string in field #3 manually. RudiC's script depends on an extension to the standards (using an empty ERE in split() to create an array from single characters in a string) that is available in GNU awk but is not often supported on versions of awk available on UNIX systems and BSD-based systems. The standards say that the behavior of awk is unspecified when split() is called with an empty string specified as the field separator.

Unlike RudiC's code, this adds a new column to the header line as shown in the output header you said you wanted. And, while RudiC's code adds up all of the fields being checked, this code looks at each field individually and breaks out of the loop immediately if a non-zero value is found in a field that is to be checked.

The following awk script:
Code:
awk '
BEGIN {	# Initialize skip array: S[fn] = char
	# If field #3 contains the character specified by char, do NOT check
	# the contents of field #fn.
	S[7] = "S"
	S[8] = "E"
	S[9] = "A"
	S[10] = "M"
	S[11] = "P"
	S[12] = "N"
	S[13] = "C"
}
NR < 2 {# Add the requested heading field to the header line...
	print $0, "Newcolumn"
	# and skip to the next input line.
	next
}
{	# For all other input lines, check fields 7-13 inclusive:
	for(i = 7; i <= 13; i++)
		# If the character corresponding to the field is not prsent in
		# field #3 AND the field contains a non-zero value...
		if($3 !~ S[i] && $i) 
			# break out of the loop.
			break
	# Print the input line followed by "WARNING" if a non-zero value was
	# found in a field to be checked; otherwise, print the input line
	# followed by "GOOD".
	print $0, (i <= 13) ? "WARNING" : "GOOD"
}' file

when file contains the sample input you provided, produces the output:
Code:
COLUMN1 COLUMN2 COLUMN3 COLUMN4 COLUMN5 COLUMN6 SMS Email AO Mail Post N Cell Newcolumn
VEGE Potato E W 396 12 0 384 0 0 0 0 0 GOOD
VEGE Onion S W 17 0 17 0 0 0 0 0 0 GOOD
FRUIT APPLE N W 549 61 0 0 0 0 0 488 0 GOOD
FRUIT APPLE SE W 291 14 239 38 0 10 0 0 0 WARNING
FRUIT APPLE EAMS W 397 32 309 56 309 309 0 0 0 GOOD
FRUIT APPLE SEA W 808 58 663 87 488 20 0 0 0 WARNING

which seems to match the output you said you wanted.

If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.
This User Gave Thanks to Don Cragun For This Post:
# 16  
Old 09-29-2016
@Don ..Yes you and Rudi were right about P and N . Your code is working fine only one issue when I am running it, it is giving me list of files in that directory and gives the output.

doing something like below

Code:
> }
> {# For all other input lines, check fields 7-13 inclusive:
> for(i = 7; i <= 13; i++)
>
.csv                          20160405_StarData/            BCSreplay/                    archive/                      neha.txt                      sync_query_results_26668.txt
20140327_ISP_AAID_migrations/ 25049_CT_DELIVERY2.txt        DSLOrderWTN/                  data_for_demos/               neha2.txt                     sync_query_results_29460.txt
20150818_AppleCreate/         32706_CT_DELIVERY2.txt        MIGRATION_DATA_DIRECTORIES/   dp332j/                       neha3.txt                     sync_query_results_6635.txt
20151008_OMSIssue_DeTitanize/ ActivateHSIA/                 OldProjectDirs/               igate_data/                   other_logs/                   text.txt
20151014_DetitanizeMobility/  AdHoc_Requests/               ProcessEDDFiles/              junk/                         replays/                      tmpUniqAsMob.out
20160314_FULCRUM/             AddOrder_data/                ProjectDirs/                  kr9850/                       sh2818/                       tmpUniqAsMob.sql
> # If the character corresponding to the field is not prsent in
>
.csv                          20160405_StarData/            BCSreplay/                    archive/                      neha.txt                      sync_query_results_26668.txt
20140327_ISP_AAID_migrations/ 25049_CT_DELIVERY2.txt        DSLOrderWTN/                  data_for_demos/               neha2.txt                     sync_query_results_29460.txt
20150818_AppleCreate/         32706_CT_DELIVERY2.txt        MIGRATION_DATA_DIRECTORIES/   dp332j/                       neha3.txt                     sync_query_results_6635.txt
20151008_OMSIssue_DeTitanize/ ActivateHSIA/                 OldProjectDirs/               igate_data/                   other_logs/                   text.txt
20151014_DetitanizeMobility/  AdHoc_Requests/               ProcessEDDFiles/              junk/                         replays/                      tmpUniqAsMob.out
20160314_FULCRUM/             AddOrder_data/                ProjectDirs/                  kr9850/                       sh2818/                       tmpUniqAsMob.sql
> # field #3 AND the field contains a non-zero value...
>
.csv                          20160405_StarData/            BCSreplay/                    archive/                      neha.txt                      sync_query_results_26668.txt
20140327_ISP_AAID_migrations/ 25049_CT_DELIVERY2.txt        DSLOrderWTN/                  data_for_demos/               neha2.txt                     sync_query_results_29460.txt
20150818_AppleCreate/         32706_CT_DELIVERY2.txt        MIGRATION_DATA_DIRECTORIES/   dp332j/                       neha3.txt                     sync_query_results_6635.txt
20151008_OMSIssue_DeTitanize/ ActivateHSIA/                 OldProjectDirs/               igate_data/                   other_logs/                   text.txt
20151014_DetitanizeMobility/  AdHoc_Requests/               ProcessEDDFiles/              junk/                         replays/                      tmpUniqAsMob.out
20160314_FULCRUM/             AddOrder_data/                ProjectDirs/                  kr9850/                       sh2818/                       tmpUniqAsMob.sql
> if($3 !~ S[i] && $i)
>

# 17  
Old 09-29-2016
Hello Nina2910,

Could you please create a script for example script.ksh, paste script there then give it executable permissions eg-->chmod 755 and then run Rudi's/Don's code, it should fly without showing anything else then.

Thanks,
R. Singh

Last edited by RavinderSingh13; 09-29-2016 at 12:19 PM..
This User Gave Thanks to RavinderSingh13 For This Post:
# 18  
Old 09-29-2016
@Ravinder...Rudi's code is working without script that's was only for Don's code
# 19  
Old 09-29-2016
Quote:
Originally Posted by Nina2910
@Rudi...yes you are right and code is working perfectly fine..Thank you.

would it be possible if you can explain the code ?


Yes:
Code:
awk '
BEGIN   {MX = split ("      SEAMPNC", CH, _)                    # create the CH array : CH[1] = " ",...,CH[7] = "S",...,CH[13] = "C" by splitting a string constant at the empty string ("_")
        }                                                       # MX variable could be skipped in this context; provided only for totally "dynamic data"
NR > 1  {SUM = 0                                                # initialize SUM for every input line
         for (i = 7; i<=13; i++) if ($3 !~ CH[i]) SUM += $i     # for the to be checked fields: test if the relevant char (CH[fieldNr]) is found in $3
                                                                # if yes, DON"T sum the field
         $(NF+1) = SUM?"WARNING":"GOOD"                         # add the respective info as the "last plus one" filed.
        }
1                                                               # default action: print
 ' file


Last edited by RudiC; 09-29-2016 at 01:34 PM..
This User Gave Thanks to RudiC For This Post:
# 20  
Old 09-29-2016
Quote:
Originally Posted by Nina2910
@Don ..Yes you and Rudi were right about P and N . Your code is working fine only one issue when I am running it, it is giving me list of files in that directory and gives the output.

doing something like below

Code:
> }
> {# For all other input lines, check fields 7-13 inclusive:
> for(i = 7; i <= 13; i++)
>
.csv                          20160405_StarData/            BCSreplay/                    archive/                      neha.txt                      sync_query_results_26668.txt
20140327_ISP_AAID_migrations/ 25049_CT_DELIVERY2.txt        DSLOrderWTN/                  data_for_demos/               neha2.txt                     sync_query_results_29460.txt
20150818_AppleCreate/         32706_CT_DELIVERY2.txt        MIGRATION_DATA_DIRECTORIES/   dp332j/                       neha3.txt                     sync_query_results_6635.txt
20151008_OMSIssue_DeTitanize/ ActivateHSIA/                 OldProjectDirs/               igate_data/                   other_logs/                   text.txt
20151014_DetitanizeMobility/  AdHoc_Requests/               ProcessEDDFiles/              junk/                         replays/                      tmpUniqAsMob.out
20160314_FULCRUM/             AddOrder_data/                ProjectDirs/                  kr9850/                       sh2818/                       tmpUniqAsMob.sql
> # If the character corresponding to the field is not prsent in
>
.csv                          20160405_StarData/            BCSreplay/                    archive/                      neha.txt                      sync_query_results_26668.txt
20140327_ISP_AAID_migrations/ 25049_CT_DELIVERY2.txt        DSLOrderWTN/                  data_for_demos/               neha2.txt                     sync_query_results_29460.txt
20150818_AppleCreate/         32706_CT_DELIVERY2.txt        MIGRATION_DATA_DIRECTORIES/   dp332j/                       neha3.txt                     sync_query_results_6635.txt
20151008_OMSIssue_DeTitanize/ ActivateHSIA/                 OldProjectDirs/               igate_data/                   other_logs/                   text.txt
20151014_DetitanizeMobility/  AdHoc_Requests/               ProcessEDDFiles/              junk/                         replays/                      tmpUniqAsMob.out
20160314_FULCRUM/             AddOrder_data/                ProjectDirs/                  kr9850/                       sh2818/                       tmpUniqAsMob.sql
> # field #3 AND the field contains a non-zero value...
>
.csv                          20160405_StarData/            BCSreplay/                    archive/                      neha.txt                      sync_query_results_26668.txt
20140327_ISP_AAID_migrations/ 25049_CT_DELIVERY2.txt        DSLOrderWTN/                  data_for_demos/               neha2.txt                     sync_query_results_29460.txt
20150818_AppleCreate/         32706_CT_DELIVERY2.txt        MIGRATION_DATA_DIRECTORIES/   dp332j/                       neha3.txt                     sync_query_results_6635.txt
20151008_OMSIssue_DeTitanize/ ActivateHSIA/                 OldProjectDirs/               igate_data/                   other_logs/                   text.txt
20151014_DetitanizeMobility/  AdHoc_Requests/               ProcessEDDFiles/              junk/                         replays/                      tmpUniqAsMob.out
20160314_FULCRUM/             AddOrder_data/                ProjectDirs/                  kr9850/                       sh2818/                       tmpUniqAsMob.sql
> if($3 !~ S[i] && $i)
>

Pasting a script that contains tabs into a shell that uses tabs to trigger command completion is not going to work. As Ravinder suggested, copy my script into a file and execute the file.

You haven't told us what operating system or shell you're using, but I test it out with the following in a file named tester:
Code:
#!/bin/ksh
awk '
BEGIN {	# Initialize skip array: S[fn] = char
	# If field #3 contains the character specified by char, do NOT check
	# the contents of field #fn.
	S[7] = "S"
	S[8] = "E"
	S[9] = "A"
	S[10] = "M"
	S[11] = "P"
	S[12] = "N"
	S[13] = "C"
}
NR < 2 {# Add the requested heading field to the header line...
	print $0, "Newcolumn"
	# and skip to the next input line.
	next
}
{	# For all other input lines, check fields 7-13 inclusive:
	for(i = 7; i <= 13; i++)
		# If the character corresponding to the field is not prsent in
		# field #3 AND the field contains a non-zero value...
		if($3 !~ S[i] && $i) 
			# break out of the loop.
			break
	# Print the input line followed by "WARNING" if a non-zero value was
	# found in a field to be checked; otherwise, print the input line
	# followed by "GOOD".
	print $0, (i <= 13) ? "WARNING" : "GOOD"
}' "${1:-file}"

make it executable with:
Code:
chmod +x tester

and then executing it with:
Code:
./tester

produces the output I showed you in post #15 in this thread. If you invoke it with an operand that is the pathname of a different file to be processed:
Code:
./tester otherfile

it will process the data in otherfile instead of the data in a file named file.
This User Gave Thanks to Don Cragun For This Post:
# 21  
Old 09-30-2016
would it possible if code can add also "Bad" to value which is suppose to be zero but it is not.

Code:
 
 COLUMN1 COLUMN2 COLUMN3 COLUMN4 COLUMN5 COLUMN6 SMS Email AO Mail Post N Cell Newcolumn
VEGE Potato E W 396 12 0 384 0 0 0 0 0 GOOD
VEGE Onion S W 17 0 17 0 0 0 0 0 0 GOOD
FRUIT APPLE N W 549 61 0 0 0 0 0 488 0 GOOD
 FRUIT APPLE SE W 291 14 239 38 0 Bad10 0 0 0 WARNING
FRUIT APPLE EAMS W 397 32 309 56 309 309 0 0 0 GOOD
FRUIT APPLE SEA W 808 58 663 87 488 Bad20 0 0 0 WARNING

like I added bad in row number 5 and 7.
Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. UNIX for Beginners Questions & Answers

Compare values in multiple rows in one column using awk

I would like to compare values in column 8, and grep the ones where the different is > 1, columns 1 and 2 are the key for array. Every 4 rows the records values in columns 1 and 2 changed. Then, the comparison in the column 8 need to be done for the 4 rows everytime columns 1 and 2 changed ... (4 Replies)
Discussion started by: jiam912
4 Replies

2. Shell Programming and Scripting

Compare two files column values using awk

Judi # cat File1 judi /export/home 76 judi /usr 83 judi # judi # cat File2 judi /export/home 79 judi /usr 82 judi # if COLUMN3 of File2 is greater that COLUMN3 of File1, then print File2's lines juid /export/home 79 Code tags please (2 Replies)
Discussion started by: judi
2 Replies

3. Shell Programming and Scripting

How to compare the values of a column in a same file using awk?

Dear Unix experts, I have got a file where I would like to compare the values of second column if first column is same in such a way that the difference between the values is >50. If not, I would like to discard both values. For example, my input file looks like - comp275_c0_seq2 73... (7 Replies)
Discussion started by: utritala
7 Replies

4. UNIX for Dummies Questions & Answers

Compare values of fields from same column with awk

Hi all ! If there is only one single value in a column (e.g. column 1 below), then return this value in the same output column. If there are several values in the same column (e.g. column 2 below), then return the different values separated by "," in the output. pipe-separated input: ... (11 Replies)
Discussion started by: lucasvs
11 Replies

5. Shell Programming and Scripting

Compare values in two files. For matching rows print corresponding values from File 1 in File2.

- I have two files (File 1 and File 2) and the contents of the files are mentioned below. - I am trying to compare the values of Column1 of File1 with Column1 of File2. If a match is found, print the corresponding value from Column2 of File1 in Column5 of File2. - I tried to modify and use... (10 Replies)
Discussion started by: Santoshbn
10 Replies

6. Shell Programming and Scripting

Take values from a column and put it in a variable and compare

Hi, I have a table in unix from which i want to read the contents line by line, then filter out the values from 6th column one by one and compare it a fixed value. How to do this? (7 Replies)
Discussion started by: arijitsaha
7 Replies

7. Shell Programming and Scripting

How to compare the values of a column in awk in a same file and consecutive lines..

I would like to compare the values of 2nd column of consecutive lines of same file in such a way so that if the difference between first value and second value is more than 100 it should print complete line else ignore line. Input File ========== PDB 2500 RTDB 123 RTDB-EAGLE 122 VSCCP 2565... (4 Replies)
Discussion started by: manuswami
4 Replies

8. UNIX for Dummies Questions & Answers

Compare two files using awk or sed, add values in a column if their previous fields are same

Hi All, I have two files file1: abc,def,ghi,5,jkl,mno pqr,stu,ghi,10,vwx,xyz cba,ust,ihg,4,cdu,oqw file2: ravi,def,kishore ramu,ust,krishna joseph,stu,mike I need two output files as follows In my above example, each row in file1 has 6 fields and each row in file2 has 3... (1 Reply)
Discussion started by: yerruhari
1 Replies

9. UNIX for Advanced & Expert Users

Compare two files using awk or sed, add values in a column if their previous fields are same

Hi All, I have two files file1: abc,def,ghi,5,jkl,mno pqr,stu,ghi,10,vwx,xyz cba,ust,ihg,4,cdu,oqw file2: ravi,def,kishore ramu,ust,krishna joseph,stu,mike I need two output files as follows In my above example, each row in file1 has 6 fields and each row in file2 has 3... (1 Reply)
Discussion started by: yerruhari
1 Replies

10. Shell Programming and Scripting

I need to extract last column of a file and compare the values

Hi, I am new to unix and I need help in solving below mentioned issue, really appreciate ur help. I have a file sam, john, 2324, 07142007 tom, thomson, 2343, 07142007 john, scott, 2478, 07142007 its a comma delimited file, I need to extract the last column from each line and this... (4 Replies)
Discussion started by: vukkusila
4 Replies
Login or Register to Ask a Question