How to mark the row based on col value.?

Login or Register to Ask a Question and Join Our Community

How to mark the row based on col value.?

Tags

Login to Discuss or Reply to this Discussion in Our Community

Top Forums Shell Programming and Scripting How to mark the row based on col value.?

02-09-2015

Registered User

190, 4

Join Date: May 2013

Last Activity: 13 February 2017, 4:51 PM EST

Location: Toronto

Posts: 190

Thanks Given: 170

Thanked 4 Times in 4 Posts

How to mark the row based on col value.?

Hi Gurus,

I have requirement to identify the records based on one column value.
the sample file as below:
ID AMT, AMT1

Code:

for each ID, there 2 records, if any one record amt1 is 0, the in 4th col add "no", otherwise, 'yes'. the expected result as below:

Code:

 
ID, AMT, AMT1, ind
100,10, 2, yes
100,20, 3, yes
200,30, 0, no
200, 40, 0, no
300, 20, 2, yes
300, 50, 2, yes
400, 20, 1, no
400, 60, 0, no

can anybody give me some input for this?

thanks in advance.

ken6503

View Public Profile for ken6503

Find all posts by ken6503

02-09-2015

Moderator

12,296, 3,792

Join Date: Nov 2008

Last Activity: 1 January 2021, 1:47 AM EST

Location: Amsterdam

Posts: 12,296

Thanks Given: 679

Thanked 3,792 Times in 3,282 Posts

Try:

Code:

awk '{print $0, $3==0?"no":"yes"}' FS=, OFS=", " file

This User Gave Thanks to Scrutinizer For This Post:

Scrutinizer

View Public Profile for Scrutinizer

Find all posts by Scrutinizer

02-09-2015

Registered User

190, 4

Join Date: May 2013

Last Activity: 13 February 2017, 4:51 PM EST

Location: Toronto

Posts: 190

Thanks Given: 170

Thanked 4 Times in 4 Posts

Quote:

Originally Posted by Scrutinizer

Try:

Code:

awk '{print $0, $3==0?"no":"yes"}' FS=, OFS=", " file

Hi Scrutinizer,

thanks for your quick reply. the result doesn't look correct. please see below result:

Code:

 
100, 20, 1, yes
100, 30, 1, yes
200, 30, 2, yes
200, 30, 2, yes
300, 20, 0, no
300, 20, 0, no
400, 20, 0, no
400, 20, 1, yes

the business rule is "for each ID, there 2 records, if any one record amt1 is 0, the in 4th col add "no", for each ID, there 2 records, if any one record amt1 is 0, the in 4th col add "no""
I expected result is

Code:

 
100, 20, 1, yes
100, 30, 1, yes
200, 30, 2, yes
200, 30, 2, yes
300, 20, 0, no
300, 20, 0, no
400, 20, 0, no
400, 20, 1, no

ken6503

View Public Profile for ken6503

Find all posts by ken6503

02-09-2015

Registered User

503, 195

Join Date: Sep 2013

Last Activity: 22 January 2021, 1:52 PM EST

Location: France

Posts: 503

Thanks Given: 43

Thanked 195 Times in 176 Posts

Hi,
If your input file has just data lines (no title):

Code:

awk 'NR % 2 {A=$1;B=$2;C=$3;next}{($3==0||C==0)?D="no":D="yes";print A,B,C,D;print $1,$2,$3,D}' FS=, OFS=", " file 
100, 10,  2, yes
100, 20,  3, yes
200, 30,  0, no
200,  40,  0, no
300,  20,  2, yes
300,  50,  2, yes
400,  20,  1, no
400,  60,  0, no

Regards.

disedorgue

View Public Profile for disedorgue

Find all posts by disedorgue

02-10-2015

Moderator

3,105, 1,603

Join Date: May 2013

Last Activity: 31 August 2020, 1:46 AM EDT

Location: Chennai

Posts: 3,105

Thanks Given: 1,269

Thanked 1,603 Times in 1,369 Posts

Hello Ken6503,

Following may also help in same. This solution has been made according to input provided, like all row's (except first row) first column is coming continously 2 times and if any first column comes in odd number then it will only look for the last column and accordingly print.

1st: With your given example.

Code:

awk -F, '{if(NR==1){print $0 OFS "VALUE"} else {A=$NF;B=$0;$0="";getline;if($0 ~ /^$/){E=A!=0?B OFS "yes":"no";print E;E=""} else {Q=$NF!=0 && A!=0 ?B OFS "yes" ORS $0 OFS "yes":B OFS "no" ORS $0 OFS "no";print Q;Q=""}}}' OFS=,  Input_file

Output will be as follows.

Code:

ID AMT, AMT1,VALUE
100,10, 2,yes
100,20, 3,yes
200,30, 0,no
200, 40, 0,no
300, 20, 2,yes
300, 50, 2,yes
400, 20, 1,no
400, 60, 0,no

2nd: With my tried example, where I have added one extra row at last.

Code:

cat test5
ID AMT, AMT1
100,10, 2
100,20, 3
200,30, 0
200, 40, 0
300, 20, 2
300, 50, 2
400, 20, 1
400, 60, 0
700, 60, 3

Code is as follows.

Code:

 awk -F, '{if(NR==1){print $0 OFS "VALUE"} else {A=$NF;B=$0;$0="";getline;if($0 ~ /^$/){E=A!=0?B OFS "yes":"no";print E;E=""} else {Q=$NF!=0 && A!=0 ?B OFS "yes" ORS $0 OFS "yes":B OFS "no" ORS $0 OFS "no";print Q;Q=""}}}' OFS=, test5

Output is as follows.

Code:

ID AMT, AMT1,VALUE
100,10, 2,yes
100,20, 3,yes
200,30, 0,no
200, 40, 0,no
300, 20, 2,yes
300, 50, 2,yes
400, 20, 1,no
400, 60, 0,no
700, 60, 3,yes

Hope this helps.

Thanks,
R. Singh

RavinderSingh13

View Public Profile for RavinderSingh13

Find all posts by RavinderSingh13

02-10-2015

Registered User

12,315, 4,560

Join Date: Jul 2012

Last Activity: 22 November 2019, 4:29 PM EST

Location: San Jose, CA, USA

Posts: 12,315

Thanks Given: 952

Thanked 4,560 Times in 3,818 Posts

From post #1 in this thread it isn't clear whether your input file contains a header line or not and it appears that you want a two line header in your output file (with the first header line consisting of a single space character).

Rather than working on pairs of lines and starting counting lines on line 1 or line 2, the following code produces the desired header and then groups all sets of one or more adjacent lines that have three fields and contain the same numeric value in the first field adding a "yes" to each line in the set on output if none of the 3rd fields in the set on input is 0 or an empty string, and adding a "no" to each line in the set on output if one or more of the 3rd fields in the set on input is 0 or an empty string:

Code:

#!/bin/ksh
awk '
BEGIN {	printf(" \nID, AMT, AMT1, ind\n")
	FS = "[ ,]+"
	OFS = ", "
}
function pd(	i) {
	for(i = 1; i <= cnt; i++)
		print line[i], ind
	ind = "yes"
	cnt = 0
}
NF == 3 && ($1 + 0) == $1 {
	if($1 != last)
		pd()
	last = $1
	line[++cnt] = $0
	if($3 == 0)
		ind = "no"
}
END {	pd()
}' sample.txt

If sample.txt contains the text you showed in post #1 in this thread optionally containing one or more blank lines and any number of either, both, or neither of the following header lines:

Code:

ID, AMT, AMT1

or:

Code:

ID AMT, AMT1

and any number of comment lines (starting with a # or an alphabetic character); it produces the output:

Code:

 
ID, AMT, AMT1, ind
100,10, 2, yes
100,20, 3, yes
200,30, 0, no
200, 40, 0, no
300, 20, 2, yes
300, 50, 2, yes
400, 20, 1, no
400, 60, 0, no

as you requested.

If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk, /usr/xpg6bin/awk, or nawk.

This was tested using the Korn shell, but will work with any shell that uses basic Bourne shell syntax.

Don Cragun

View Public Profile for Don Cragun

Find all posts by Don Cragun

Login or Register to Ask a Question

Previous Thread | Next Thread

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Splitting single row into multiple rows based on for every 10 digits of last field of the row

Hi ALL, We have requirement in a file, i have multiple rows. Example below: Input file rows 01,1,102319,0,0,70,26,U,1,331,000000113200000011920000001212 01,1,102319,0,1,80,20,U,1,241,00000059420000006021 I need my output file should be as mentioned below. Last field should split for...

2. Emergency UNIX and Linux Support

Read values in each col starting 3rd row.Print occurrence value.

Hello Friends, Hope all are doing fine. Here is a tricky issue. my input file is like this 07 10 14 20 21 03 15 27 30 32 01 10 11 19 30 02 06 14 15 17 01 06 20 25 29 Logic: 1. Please print another column as "0-0-0-0-0" for the first and second rows. 2. Read the first column...

3. Shell Programming and Scripting

Field widths based on a row

I want to specify field width based on the row with FTR. I can acheive this if column width is constant with: awk 'BEGIN { FIELDWIDTHS = "20 7 14 30" }{print $1,$4}' file file:COL1 COL2 CL3 FTR AA8 S2 CAT2 your comments CC7 ...

4. Shell Programming and Scripting

Bin iteratively based on each row

Hi I have a file some thing like below. I want to bin the data. My Bin size is 100. items number HELIX1 75 HELIX6 160 HELIX2 88 HELIX19 114 HELIX5 61 HELIX4 167 it should consider each elemet under the number column and bin all the lines like below with 100...

5. Shell Programming and Scripting

Modifying col values based on another col

Hi, Please help with this. I have several excel files (with and .xlsx format) with 10-15 columns each. They all have the same type of data but the columns are not ordered in the same way. Here is a 3 column example. What I want to do add the alphabet from column 2 to column 3, provided...

6. Shell Programming and Scripting

Printing from col x to end of line, except last col

Hello, I have some tab delimited data and I need to move the last col. I could hard code it, awk '{ print $1,$NF,$2,$3,$4,etc }' infile > outfile but it would be nice to know the syntax to print a range cols. I know in cut you can do, cut -f 1,4-8,11- to print fields 1,...

7. Shell Programming and Scripting

Change col to row using shell script..Very Complex

Hi guys I have file A with Below Data ABC123 X1 X2 X3 ABC123 Y1 Y33 Y4 ABC123 Z1 ZS2 ZL3 ABC234 P1 PP3 PP9 ABC234 Q1 ABC234 R1 P09 PO332 PO331 OKI12 .. .. .. Now I want file B as below ABC123 X1 X2 X3;Y1 Y33 Y4;Z1 ZS2 ZL3 ABC234 P1 PP3 PP9;Q1;R1 P09 PO332 PO331 OKI12...

8. Shell Programming and Scripting

Find duplicate based on 'n' fields and mark the duplicate as 'D'

Hi, In a file, I have to mark duplicate records as 'D' and the latest record alone as 'C'. In the below file, I have to identify if duplicate records are there or not based on Man_ID, Man_DT, Ship_ID and I have to mark the record with latest Ship_DT as "C" and other as "D" (I have to create...

9. UNIX for Dummies Questions & Answers

how to join files with diff col # and row #?

I am a new user of Unix/Linux, so this question might be a bit simple! I am trying to join two (very large) files that both have different # of cols and rows in each file. I want to keep 'all' rows and 'all' cols from both files in the joint file, and the primary key variables are in the rows....

10. UNIX for Advanced & Expert Users

Print line based on highest value of col (B) and repetion of values in col (A)

Hello everyone, I am writing a script to process data from the ATP world tour. I have a file which contains: t=540 y=2011 r=1 p=N409 t=540 y=2011 r=2 p=N409 t=540 y=2011 r=3 p=N409 t=540 y=2011 r=4 p=N409 t=520 y=2011 r=1 p=N409 t=520 y=2011 r=2 p=N409 t=520 y=2011 r=3 p=N409 The...

Login or Register to Ask a Question