awk: syntax for "if (array doesn't contain a particular index)"

09-20-2012

You may refer to $1 inside a function without any problem. But remember that in awk, every variable used in a user-defined function is global, except for those named in the function's parameter list. So, if your function changes $1 in any way (assignment, sub/gsub, etc.), that change persists after the function returns, which might not be what you wanted.

So it's always better to pass data to functions through parameters: scalar arguments such as $1 are passed by value, so the function works on a copy and cannot change or destroy the original data.
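A minimal sketch of the difference, run from the shell. The function names upper_param and upper_field and the sample input are made up for illustration; they are not from the original script.

```sh
out=$(printf 'hello world\n' | awk '
# upper_param changes only its own copy of the argument.
function upper_param(s) { s = toupper(s); return s }

# upper_field assigns to $1 directly, so the input record changes.
function upper_field() { $1 = toupper($1); return $1 }

{
    r = upper_param($1); print r, $1    # $1 is untouched here
    r = upper_field();   print r, $1    # $1 has been overwritten
}')
printf '%s\n' "$out"
```

The first line printed still shows the original field next to the result; the second shows that the field itself was replaced.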

elixir_sinari

09-20-2012

Thanks elixir_sinari for this explanation!

However, I am a bit confused by "values" and "parameters".
If I take the example of the code I've written in the previous post (repeated at the end of this one), "w" is the parameter that stands for the value "$1".

Quote:

So, it's always better to pass data to functions using the parameters so that you are working on a copy of the data and in no way changing or destroying the original data.

How do you pass data to functions using parameters?
Do you need to assign the parameter ("w") to the field ("$1") before invoking the function? Tell me if I am wrong, but with the code below it would give:

Code:

...
{function}

{ for(i=1; i<=2; i++)
   	  $1 = w
          print xlate(w, i)        # and not print xlate($1, i)
}

{I can keep using $1 after}

Quote:

(except for those passed as parameters while invoking the function)

That's what I did before, but $1 is no longer recognised once I have invoked the function???

Code I am referring to:

Code:

BEGIN {
        xtab["11"]="a";
        xtab["12"]="b";
        xtab["13"]="c";
    }
  
    # accepts word and starting offset 
    function xlate( w, start)
    {
        xw = ""
        n = substr(w, start, 2)    # seed n with first pair 
        while( n != "" )
        {
            xw = xw (n in xtab ? xtab[n] : "X")
            start = start + 2;
            n = substr(w, start, 2)    # look ahead to next; exit loop if at end
        }
        return xw
    }

    { for(i=1; i<=2; i++)
   	  print xlate($1, i)
    }

    { for(i=1; i<=2; i++)
    		print $1 = xlate($1, i)
                if($1 !~ /X/){
            	    print $1
            	    }
     }

beca123456

09-20-2012

Quote:

Originally Posted by beca123456

I don't get it. What is the difference between the field value (e.g. $1) and the content of the field? Is it not the same at the end?

You are right, the contents of the field, and field value are the same. What I was suggesting was passing the value to the function rather than using $1 inside of the function. It's not always possible/easy to avoid using globals, but when you can avoid them it is best.

To pass $1 to a function you just put $1 in the call to the function. It is assigned to the variable in the matching position in the function definition. For instance, the function foo accepts two parameters:

Code:

function foo( p1, p2)
{
   print p1 + p2;
}

and if you want foo() to add fields 1 and 2 of the input record, you can call it like this:

Code:

foo( $1, $2);

When the function is invoked, the contents of field 1 are assigned to p1 and the contents of field 2 are assigned to p2.
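Put together as a complete program, the call might look like the sketch below. The two input lines are invented sample data.

```sh
out=$(printf '3 4\n10 20\n' | awk '
# p1 and p2 receive copies of whatever is passed in the call.
function foo(p1, p2) {
    print p1 + p2
}
{ foo($1, $2) }')
printf '%s\n' "$out"
```

Each input line produces one sum, and $1 and $2 themselves are left unchanged by the call.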

Quote:

I managed to go further with your code and added a for loop afterwards to increment the start position:

Code:

BEGIN {
        xtab["11"]="a";
        xtab["12"]="b";
        xtab["13"]="c";
    }
  
    # accepts word and starting offset 
    function xlate( w, start)
    {
        xw = ""
        n = substr(w, start, 2)    # seed n with first pair 
        while( n != "" )
        {
            xw = xw (n in xtab ? xtab[n] : "X")
            start = start + 2;
            n = substr(w, start, 2)    # look ahead to next; exit loop if at end
        }
        return xw
    }
     { for(i=1; i<=2; i++)  #you need to incr by 2 to skip both characters
         print xlate($1, i)
    }

it returns:

Code:

abc
aXX
abX
aXX

However, if I want to extend the code further and refer to this field again, for example to print only the strings without X (as a start), is there any way to use "$1" again in the rest of the script?
Even when I re-assign the value to $1 it doesn't work!

Yes, as long as you don't assign anything to $1, you can use it later in the script.
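As a rough sketch of that advice: keep the translation in a separate variable (here called t, a name chosen only for this example) instead of assigning it to $1. The extra parameters xw and n in the function definition are the usual awk idiom for local variables; that is an addition to the code in this thread, and the input 111213 is assumed sample data.

```sh
out=$(printf '111213\n' | awk '
BEGIN { xtab["11"] = "a"; xtab["12"] = "b"; xtab["13"] = "c" }

# Extra parameters (xw, n) act as local variables.
function xlate(w, start,    xw, n) {
    xw = ""
    n = substr(w, start, 2)
    while (n != "") {
        xw = xw ((n in xtab) ? xtab[n] : "X")
        start += 2
        n = substr(w, start, 2)
    }
    return xw
}

{
    for (i = 1; i <= 2; i++) {
        t = xlate($1, i)      # result kept in t; $1 is never assigned
        if (t !~ /X/)
            print $1, t       # $1 still holds the original field
    }
}')
printf '%s\n' "$out"
```

Only the translation starting at offset 1 contains no X, so a single line is printed, and it shows the original field alongside its translation.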

agama

09-21-2012

Thanks agama, it is clearer to me now!

But what should I do if I need to keep track of the output I get after converting the numbers to letters, and then continue my script using this saved output (the track file I have just produced) as input for the rest of the script?

I know how to write output from the script, but I don't know how to read a new file from within the script.
Is it doable with awk?

I tried several variants of this, but none of them worked:

Code:

BEGIN {
        xtab["11"]="a";
        xtab["12"]="b";
        xtab["13"]="c";
    }
  
    # accepts word and starting offset 
    function xlate( w, start)
    {
        xw = ""
        n = substr(w, start, 2) 
        while( n != "" )
        {
            xw = xw (n in xtab ? xtab[n] : "X")
            start = start + 2;
            n = substr(w, start, 2)
        }
        return xw
    }
     { 
         for(i=1; i<=2; i +=2)
             print xlate($1, i) > "track.txt"         # I can keep a track like that
     }
     {
         while(0< (getline < "track.test")          # I don't know how to input a different file from the one I used originally
             do the rest of my script
                 ...
     }

---------- Post updated 21-09-12 at 08:29 PM ---------- Previous update was 20-09-12 at 09:24 PM ----------

Help please !!!

beca123456

09-22-2012

Quote:

Originally Posted by beca123456

Thanks agama, it is more clear to me now!

It depends on what you need. Your example is a bit confusing to me. Remember that awk processes the input file one record (by default, one line) at a time. Your example tried to read the track.txt file before it was closed, that is, before you had finished writing to it.

You have two choices, depending on what you need to do. You can do the translation, save the result in a variable, and use that variable in the rest of your script; or you can create the file track.txt and then use a second script to process its contents. You've not stated what you are trying to do; knowing that would help us give a more specific answer to your question.
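The second choice could be sketched roughly as follows. The temporary directory, the file name input.txt and the two sample lines are assumptions made for this example; the second pass simply keeps the lines without an X, standing in for whatever the rest of the script needs to do.

```sh
tmp=$(mktemp -d)
printf '111213\n111299\n' > "$tmp/input.txt"

# Pass 1: translate each line at offsets 1 and 2 and save the results.
awk '
BEGIN { xtab["11"] = "a"; xtab["12"] = "b"; xtab["13"] = "c" }
function xlate(w, start,    xw, n) {
    xw = ""
    n = substr(w, start, 2)
    while (n != "") {
        xw = xw ((n in xtab) ? xtab[n] : "X")
        start += 2
        n = substr(w, start, 2)
    }
    return xw
}
{ for (i = 1; i <= 2; i++) print xlate($1, i) }
' "$tmp/input.txt" > "$tmp/track.txt"

# Pass 2: a separate script reads the finished file.
out=$(awk '$1 !~ /X/' "$tmp/track.txt")
printf '%s\n' "$out"
rm -rf "$tmp"
```

Because the first awk has exited before the second one starts, track.txt is guaranteed to be complete and closed when it is read.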

If it's possible to give an example of your input and desired output that'd be great. At the moment I'm not able to check very often, but there are lots of others on this site that can provide good direction.

agama

View Public Profile for agama

Find all posts by agama

09-22-2012

You are right, agama. It was not really clear.

Let me try with a precise example:

I need to convert every second line of the input, first starting from the first letter, then from the second letter, then the third, and so on, finishing with the letter before the last one (using the conversion table in the BEGIN section).

input:

Code:

%line1
abcdef
%line2
...

in order to get this output:

Code:

line1|135   # starting from the first letter (i.e. "abcdef")
line1|24X   # starting from the second letter (i.e. "bcdef")
line1|35    # starting from the third letter (i.e. "cdef")
line1|4X    # starting from the fourth letter (i.e. "def")
line1|5     # starting from the fifth letter (i.e. "ef")
line2|...

I wanted to do that with two different approaches, just to become familiar with awk user-defined functions and the getline function:

1) using an increment inside a function called "convert(letter)" with only one argument, and using the result as a variable in the rest of the script.
This is the topic I posted in a different thread: https://www.unix.com/unix-dummies-que...program-2.html

2) not using an increment in the function, called "convert(letter, start)", which takes two arguments (the string to convert and the position to start from), and redirecting the results from the script to an output file. I then wanted to use getline to read this file back into the script and keep processing it.
This is the topic of this thread.

But it appears that neither of these strategies works!

For the purpose of this thread I focused on the getline function (2nd strategy).
I tried:

Code:

BEGIN {RS="%"; ORS="\n"; OFS="|"; conv["ab"]="1"; conv["bc"]="2"; conv["cd"]="3"; conv["de"]="4"; conv["ef"]="5"}

     
    function convert(letter, start){
        number = ""
        ss = substr(letter, start, 2) 
        while( ss != "" ){
            number = number (ss in conv ? conv[ss] : "X")
            start = start + 2;
            ss = substr(letter, start, 2)
        }
        return number
    }


NR==1{next}
	
NR>1{
            sub("\n","|",$0)

           { 
                l = length($2)
                for(i=1; i<=(l-1);  i++){
                    print convert($2, i) > "track.txt"     # redirect output to a file here
                }
                close("track.txt")                              # according to your suggestion
           }
}

{
    RS="\n"
    while(0< (getline < "track.test")){
        print $2}
}

The output I obtain with this code is messy (a mix of the original input with no FS, RS...)
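One possible way to get the write-then-getline strategy working is sketched below. It is not the poster's script: it reads the input line by line instead of changing RS, takes the track file name from a variable called track (passed with -v), uses local variables in convert(), and only reads the file back in the END block after closing it. The input file contains just the line1 record from the example.

```sh
tmp=$(mktemp -d)
printf '%%line1\nabcdef\n' > "$tmp/input.txt"

out=$(awk -v track="$tmp/track.txt" '
BEGIN {
    conv["ab"] = "1"; conv["bc"] = "2"; conv["cd"] = "3"
    conv["de"] = "4"; conv["ef"] = "5"
}
function convert(letter, start,    number, ss) {
    number = ""
    ss = substr(letter, start, 2)
    while (ss != "") {
        number = number ((ss in conv) ? conv[ss] : "X")
        start += 2
        ss = substr(letter, start, 2)
    }
    return number
}
/^%/ { name = substr($0, 2); next }       # header line: remember the name
{
    for (i = 1; i < length($0); i++)      # every start position but the last
        print name "|" convert($0, i) > track
}
END {
    close(track)                          # finish writing before reading
    while ((getline line < track) > 0)    # read the saved results back
        print line
    close(track)
}' "$tmp/input.txt")
printf '%s\n' "$out"
rm -rf "$tmp"
```

Reading into a variable (getline line < track) leaves $0 and the fields of the main input alone, and the same file name is used for both writing and reading.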

Last edited by beca123456; 09-22-2012 at 10:18 PM..

beca123456
