awk: syntax for "if (array doesn't contain a particular index)"


 
# 1  
Old 09-18-2012

Hi!

Let's say I would like to convert "1", "2", "3" to "a", "b", "c" respectively. If a record contains any other number, it should return "X".

input:
Code:
1
2
3
4

output:
Code:
a
b
c
X

What is the syntax for:
Code:
if(array doesn't contain a particular index){
   then print the value "X" instead}

How do I distinguish between the end of the input string
Code:
array[number]==""

and a number that is not in the array
Code:
array[4]=="X"

?



In this way:
Code:
BEGIN'{FS=OFS=""; array["1"]=a; array["2"]=b; array["3"]=c}
letter=""
if (array[number]==""){
        letter=letter""
	}
else if(<number is not in array>){
        array[number]=="X"
	}
else{ 
        letter=letter array[number]""
}1

Thanks for your help !

Last edited by beca123456; 09-18-2012 at 10:40 PM..
# 2  
Old 09-18-2012
You've got some serious quoting issues.
This should provide you with an example. Prints 'x' if the value in column 1 of the input is not 1, 2 or 3, otherwise translates it.

Code:
awk '
    BEGIN {
        a["1"]="a";
        a["2"]="b";
        a["3"]="c";
    }
    { print !( $1 in a) ? "x" : a[$1]; }
'
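To make the difference between the two tests concrete, here is a minimal sketch (the shell function name membership_demo and the array name map are just placeholders for illustration). It shows that `(key in array)` checks for an index without creating it, while comparing `array[key]` to "" creates the element as a side effect, so the comparison cannot tell "missing" apart from "empty":

```shell
# Hypothetical helper name; the awk program inside is the point.
membership_demo() {
    awk '
    BEGIN {
        map["1"] = "a"; map["2"] = "b"; map["3"] = "c"
    }
    {
        # (key in array) tests for the index without creating it.
        before = ($1 in map) ? "yes" : "no"
        # Merely referencing map[$1] creates the element (empty) if it was absent.
        empty = (map[$1] == "") ? "yes" : "no"
        after = ($1 in map) ? "yes" : "no"
        # Columns: key, in-array before, compares equal to "", in-array after
        print $1, before, empty, after
    }'
}

printf '1\n4\n' | membership_demo
```

For the key 4 the last column turns to "yes" even though nothing was ever assigned to it, which is why the `in` test is the safer way to ask "is this index in the array?".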

# 3  
Old 09-18-2012
Thanks agama for your help !

Your code works if the record contains only one digit.
But what if it contains a string of digits instead?
How do I distinguish between the end of the string and an unknown number?

input:
Code:
1233
33
243
11792

output:
Code:
abcc
cc
bXc
aaXXb


Last edited by beca123456; 09-18-2012 at 11:04 PM..
# 4  
Old 09-18-2012
Another example you can tailor to your needs:

Code:
awk '
    BEGIN {
        xtab["1"]="a";
        xtab["2"]="b";
        xtab["3"]="c";
    }
    function xlate( w,  xw, i, n, a )   # translate all characters in w using xtab returning the translation
    {
        xw = "";
        n = split( w, a, "" );
        for( i = 1; i <= n; i++ )
            xw = xw (a[i] in xtab ? xtab[a[i]] : "x");
        return xw;
    }
    { print xlate( $1 ) }
'
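Splitting on an empty separator is a common extension rather than something every awk guarantees, so here is a sketch of the same idea that walks the string with substr() instead. The shell function name xlate_chars is hypothetical, and I have used "X" for unknown digits to match the requested output:

```shell
# Hypothetical wrapper so the program can be reused on any input.
xlate_chars() {
    awk '
    BEGIN { xtab["1"] = "a"; xtab["2"] = "b"; xtab["3"] = "c" }
    # Translate w one character at a time; xw, i and ch are locals.
    function xlate(w,    xw, i, ch) {
        xw = ""
        for (i = 1; i <= length(w); i++) {
            ch = substr(w, i, 1)
            xw = xw ((ch in xtab) ? xtab[ch] : "X")
        }
        return xw
    }
    { print xlate($1) }'
}

printf '1233\n33\n243\n11792\n' | xlate_chars
```

Because the loop is bounded by length(w), the end of the string never has to be detected by comparing against "".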

# 5  
Old 09-19-2012
Thanks again agama !

However, I cannot use the split function, because later on I will have to deal with substrings of length 2 starting at different positions.

input:
Code:
111213
111214

output:
Code:
abc
abX

I tried to modify your code, but I still have the same issue: it adds an extra "X" at the end of every record:
Code:
abcX
abXX

I cannot distinguish between the end of the string and an unknown index (see the code below):
Code:
BEGIN {
        xtab["11"]="a";
        xtab["12"]="b";
        xtab["13"]="c";
    }
    function xlate(i)   
    {
        xw = ""
        do{
            n = substr($1, i, 2)
            xw = xw (n in xtab ? xtab[n] : "X")
            i=i+2}
        while(n!="")
        return $1=xw
    }
    { print xlate(1) }

# 6  
Old 09-19-2012
You are getting the extra X at the end because you are executing your loop one time more than you have pairs in your string. This is the way I would have coded it:

Code:
awk '
    BEGIN {
        xtab["11"]="a";
        xtab["12"]="b";
        xtab["13"]="c";
    }
    function xlate( w,      xw, i, ss )
    {
        xw = "";
        for( i = 1; i < length( w ); i+= 2 )
        {
            ss = substr( w, i, 2 );
            xw = xw (ss in xtab ? xtab[ss] : "x");
        }

        return xw
    }
    { print xlate( $1 ); }
'

There isn't any bounds checking to verify that the word passed in has an even number of characters. You could also make the function accept both starting and ending positions in the word, in addition to the word itself.
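A rough sketch of what that could look like, assuming the positions are 1-based and inclusive. The parameter names first and last, the shell function name xlate_range, and the "?" marker for a leftover single character are all my own choices for illustration, not anything awk prescribes:

```shell
# Hypothetical wrapper; prints the translation from position 1 and from position 2.
xlate_range() {
    awk '
    BEGIN { xtab["11"] = "a"; xtab["12"] = "b"; xtab["13"] = "c" }
    # Translate the pairs of w between positions first and last (inclusive).
    # A trailing single character that cannot form a pair is reported as "?".
    function xlate(w, first, last,    xw, i, ss) {
        if (first < 1) first = 1                    # clamp to the start of the word
        if (last > length(w)) last = length(w)      # clamp to the end of the word
        xw = ""
        for (i = first; i < last; i += 2) {
            ss = substr(w, i, 2)
            xw = xw ((ss in xtab) ? xtab[ss] : "X")
        }
        if ((last - first + 1) % 2 == 1) xw = xw "?"   # odd span: one character left over
        return xw
    }
    { print xlate($1, 1, length($1)), xlate($1, 2, length($1)) }'
}

printf '111213\n111214\n' | xlate_range
```

Starting at position 2 leaves five characters, so the last one has no partner and shows up as "?" instead of being silently dropped or counted as an unknown pair.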

In my opinion, it's bad form to reference field values (e.g. $1) in a function. Better to pass in the contents of the field that you wish to operate on.
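A small sketch of why passing the contents in is more flexible (the shell function name xlate_fields and the two-field input are made up for the example). Since the function only sees the string it is given, the same code can translate any field, and the record itself is left alone:

```shell
# Hypothetical wrapper; the input is assumed to have two whitespace-separated fields.
xlate_fields() {
    awk '
    BEGIN { xtab["11"] = "a"; xtab["12"] = "b"; xtab["13"] = "c" }
    # Works on whatever string is passed in; it never reads or changes $1 itself.
    function xlate(w,    xw, i, ss) {
        xw = ""
        for (i = 1; i < length(w); i += 2) {
            ss = substr(w, i, 2)
            xw = xw ((ss in xtab) ? xtab[ss] : "X")
        }
        return xw
    }
    # The same function handles both fields, and the original record is still intact.
    { print xlate($1), xlate($2), $0 }'
}

printf '1112 1314\n' | xlate_fields
```

A function that hard-codes $1 would need to be rewritten (or the field overwritten) to do the same for $2.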

If you want to use a while loop (more cumbersome), then you'll need to 'look ahead' before looping. Something like this:

Code:
awk '
BEGIN {
        xtab["11"]="a";
        xtab["12"]="b";
        xtab["13"]="c";
    }
  
    # accepts word and starting offset 
    function xlate( w, start,       xw, n )
    {
        xw = ""
        n = substr(w, start, 2)    # seed n with first pair 
        while( n != "" )
        {
            xw = xw (n in xtab ? xtab[n] : "X")
            start = start + 2;
            n = substr(w, start, 2)    # look ahead to next; exit loop if at end
        }
        return xw
    }
    { print xlate($1, 1) }
'

Hope this gets you going again.
# 7  
Old 09-20-2012
Thanks agama for your explanations !

Quote:
In my opinion, it's bad form to reference field values (e.g. $1) in a function. Better to pass in the contents of the field that you wish to operate on.
I don't get it. What is the difference between the field value (e.g. $1) and the contents of the field? Aren't they the same in the end?

I managed to get further with your code and added a for loop afterwards to increment the start position:
Code:
BEGIN {
        xtab["11"]="a";
        xtab["12"]="b";
        xtab["13"]="c";
    }
  
    # accepts word and starting offset 
    function xlate( w, start)
    {
        xw = ""
        n = substr(w, start, 2)    # seed n with first pair 
        while( n != "" )
        {
            xw = xw (n in xtab ? xtab[n] : "X")
            start = start + 2;
            n = substr(w, start, 2)    # look ahead to next; exit loop if at end
        }
        return xw
    }
    { for(i=1; i<=2; i++)
   	  print xlate($1, i)
    }

it returns:
Code:
abc
aXX
abX
aXX

However, if I want to extend the code further and refer to this field again, for example to print only the strings without an X (as a start), is there any way to use "$1" again in the rest of the script?
Even when I re-assign the value to $1, it doesn't work!

Code:
BEGIN {
        xtab["11"]="a";
        xtab["12"]="b";
        xtab["13"]="c";
    }
  
    # accepts word and starting offset 
    function xlate( w, start)
    {
        xw = ""
        n = substr(w, start, 2)    # seed n with first pair 
        while( n != "" )
        {
            xw = xw (n in xtab ? xtab[n] : "X")
            start = start + 2;
            n = substr(w, start, 2)    # look ahead to next; exit loop if at end
        }
        return xw
    }

    { for(i=1; i<=2; i++)
   	  print xlate($1, i)
    }

    { for(i=1; i<=2; i++)
    		print $1 = xlate($1, i)
                if($1 !~ /X/){
            	    print $1
            	    }
     }


Last edited by beca123456; 09-20-2012 at 04:49 AM..
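Two things in the last script probably explain the behaviour. Without braces, only the first statement after for(...) belongs to the loop, so the if test runs once after the loop has finished. And assigning the result to $1 overwrites the input, so the second pass translates the already translated string. A minimal sketch of one way around both, keeping the result in an ordinary variable (res and the shell function name xlate_filter are hypothetical names):

```shell
# Hypothetical wrapper; prints only the translations that contain no "X".
xlate_filter() {
    awk '
    BEGIN { xtab["11"] = "a"; xtab["12"] = "b"; xtab["13"] = "c" }
    # accepts word and starting offset; xw and n are locals
    function xlate(w, start,    xw, n) {
        xw = ""
        n = substr(w, start, 2)
        while (n != "") {
            xw = xw ((n in xtab) ? xtab[n] : "X")
            start = start + 2
            n = substr(w, start, 2)
        }
        return xw
    }
    {
        for (i = 1; i <= 2; i++) {
            res = xlate($1, i)      # $1 is left untouched for the next pass
            if (res !~ /X/)
                print res
        }
    }'
}

printf '111213\n111214\n' | xlate_filter
```

With the sample input only "abc" survives the filter, and $1 still holds the original digits if later rules need them.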
 