creating a new variable from existing data

12-01-2011

Hello,
I have the following data set:
TRAIT DOSE
40 0.4
30 0.3
95 1.2
120 1.7
85 1.4
136 1.8
134 1.8
40 0.4
30 0.3
95 1.2
120 1.7
85 1.4
136 1.8
134 1.8
40 0.4
30 0.3
95 1.2

I want to create a new variable, D_GROUP, from the DOSE column by grouping the dosages into 3 distinct groups. The dosage levels will be:
Group1: 0.0-0.49, including all values more than 0 but less than 0.5
Group2: 0.5-1.49, including all values equal to or more than 0.5 but less than 1.5
Group3: 1.5-2, including all values equal to or more than 1.5 but less than 2

Essentially, I would like my final file to look like this:
TRAIT DOSAGE GROUP
40 0.4 0.0-0.49
30 0.3 0.0-0.49
95 1.2 0.5-1.49
120 1.7 1.5-2
85 1.4 0.5-1.49
136 1.8 1.5-2
134 1.8 1.5-2
40 0.4 0.0-0.49
30 0.3 0.0-0.49
95 1.2 0.5-1.49
120 1.7 1.5-2
85 1.4 0.5-1.49
136 1.8 1.5-2
134 1.8 1.5-2
40 0.4 0.0-0.49
30 0.3 0.0-0.49
95 1.2 0.5-1.49

How do I do this? Please help!
Thank you!

wolf_blue

12-01-2011

You can do this with awk: store the values in an array on the first pass through the file, then read the file a second time and print them. The code below is only a sketch, but it should give you the idea.

Writing the lines of each range to a separate file and then joining the files will be more efficient.

Code:

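BEGIN {
  ARGV[ARGC] = ARGV[ARGC-1]    # Append the file name again so the file is read twice
  ARGC++
}

# First pass (FNR == NR): remember the group of each line, indexed by line number
FNR == NR {

  if ($2 > 0.0 && $2 < 0.5) {
    group[FNR] = "0.0-0.49"
  } else if ($2 >= 0.5 && $2 < 1.5) {
    group[FNR] = "0.5-1.49"
  } else if ($2 >= 1.5 && $2 < 2.0) {
    group[FNR] = "1.5-2"
  }

  next

}

# Second pass (FNR < NR): print each line followed by its stored group
FNR < NR {
  print $0, group[FNR]
}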
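
The "separate file per range" idea mentioned above could look roughly like the sketch below. The function name split_by_dose and the output file names group1.txt, group2.txt and group3.txt are made up for illustration, and the input is assumed to have no header line. Note that joining the files groups the lines by range, so the original line order is not preserved.

```shell
#!/bin/sh
# Sketch only: split_by_dose and the group*.txt file names are hypothetical.
# Usage: split_by_dose data
split_by_dose() {
  # Create (or empty) the output files so cat works even if a range has no lines
  : > group1.txt
  : > group2.txt
  : > group3.txt

  # Write each line, with its group label appended, to the file for its range
  awk '
    $2 >  0.0 && $2 < 0.5 { print $0, "0.0-0.49" > "group1.txt"; next }
    $2 >= 0.5 && $2 < 1.5 { print $0, "0.5-1.49" > "group2.txt"; next }
    $2 >= 1.5 && $2 < 2.0 { print $0, "1.5-2"    > "group3.txt" }
  ' "$1"

  # Join the per-range files into one listing, lowest range first
  cat group1.txt group2.txt group3.txt
}
```

If the lines have to stay in their original order, a single pass that prints each line with its label is the simpler choice.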

kristinu

12-01-2011

Code:

$ cat grp.awk
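BEGIN { split("0.0:0.5:1.5:2", A, ":"); }   # Group boundaries, in ascending order

{       # Walk the adjacent boundary pairs A[N], A[N+1] until one contains $2
        for(N=1; A[N+1]; N++)
        if(($2 >= A[N])&&($2 < A[N+1]))
        {
                MAX=A[N+1];
                # Every group but the last is labelled 0.01 below the next boundary
                if(A[N+2]) MAX-=0.01;
                $3=sprintf("%.2f-%.2f", A[N], MAX);
                break;
        }
} 1     # The trailing 1 prints every line, labelled or not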

$ awk -f grp.awk data

40 0.4 0.00-0.49
30 0.3 0.00-0.49
95 1.2 0.50-1.49
120 1.7 1.50-2.00
85 1.4 0.50-1.49
136 1.8 1.50-2.00
134 1.8 1.50-2.00
40 0.4 0.00-0.49
30 0.3 0.00-0.49
95 1.2 0.50-1.49
120 1.7 1.50-2.00
85 1.4 0.50-1.49
136 1.8 1.50-2.00
134 1.8 1.50-2.00
40 0.4 0.00-0.49
30 0.3 0.00-0.49
95 1.2 0.50-1.49

$
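
The labels above come out as 0.00-0.49 rather than the 0.0-0.49 asked for. If the labels have to match exactly, one option is to pass them in alongside the boundaries, as in this sketch. The function name label_dose and the variables bounds and labels are made up for illustration. It assumes there is no header line and that every dose is greater than 0; a dose of 2 or more is printed without a label.

```shell
#!/bin/sh
# Sketch only: label_dose, bounds and labels are hypothetical names.
# bounds holds the upper limit of each group, labels the text to print for it.
# Usage: label_dose data      (or pipe the data in on stdin)
label_dose() {
  awk -v bounds="0.5:1.5:2" -v labels="0.0-0.49:0.5-1.49:1.5-2" '
    BEGIN { n = split(bounds, B, ":"); split(labels, L, ":") }
    {
      # The first upper limit above the dose decides the group
      for (i = 1; i <= n; i++)
        if ($2 + 0 < B[i] + 0) { $3 = L[i]; break }
      print
    }
  ' "$@"
}
```

Changing the grouping then only means editing the two -v strings, as long as they keep the same number of entries.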

Corona688
