Word boundary with awk in ksh


 
# 1  
Old 05-21-2013

Hi All,
I am searching for IP address patterns in some files and want to replace them with other characters. However, when I replace the digits of the IP address, the replacement also covers the characters that follow the IP address, as shown below:
125.29.234.18.23456->SSS.SS.SSS.SS.SSSSS

In the above example, 23456 should not be replaced with the new characters. The result should be:
125.29.234.18.23456->SSS.SS.SSS.SS.23456
I am doing this with awk in ksh. I don't know how to specify a word boundary for the IP address so that only the 4 numbers separated by 3 dots, which form the IP address, are replaced with the characters.

Thanks in advance for the reply.
# 2  
Old 05-21-2013
I do not understand your request. SSSSS is not a number, but you seem to want it replaced. You say you don't want 23456 replaced, but it is a number.

Please give a clear statement of what you are trying to do.
# 3  
Old 05-21-2013
Hi Don,
Yes, I don't want to replace the last number that comes after the 4 numbers forming the IP address. The replacement should cover only the first 4 numbers, which form the IP address, and not the number that follows them. So the replacement should look like this:
ip1 -> 10.145.234.176.45645-> SS.SSS.SSS.SSS.45645
ip2-> 125.15.234.9.3456->SSS.SS.SSS.S.3456

I hope you understand my point.
Thanks for your reply.
# 4  
Old 05-21-2013
No. I do not understand.

What does:
Code:
ip1 -> 10.145.234.176.45645-> SS.SSS.SSS.SSS.45645
ip2-> 125.15.234.9.3456->SSS.SS.SSS.S.3456

mean?

The string:
Code:
125.29.234.18.23456->SSS.SS.SSS.SS.SSSSS

is not an IP address. It might be two IP addresses separated by "->" although I have never seen IP addresses that contained letters instead of digits.

Show us your input (using CODE tags) and state what it is. (I.e., Is it stored in a shell variable? Is it a line in an input file? Is it something else?)

Then show us the output you are trying to produce using CODE tags. Make positive statements like: I want to change x to y. Stating a negative (I don't want to replace the last number.) doesn't really help if we don't know what you do want to do.

It may also help if you tell us what shell and OS you're using.
# 5  
Old 05-21-2013
Find all IPv4 addresses and replace their digits by S characters?
I suggest perl...
# 6  
Old 05-21-2013
Hi,
Say I have a file containing 'netstat -an' output like the one below:
Code:
TCP: IPv4
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q    State
-------------------- -------------------- ----- ------ ----- ------ -----------
10.108.231.140.33099 10.108.231.140.33122 49152      0 49152      0 ESTABLISHED
127.0.0.1.33123            *.*                0      0 49152      0 LISTEN
      *.*                  *.*                0      0 49152      0 IDLE
10.108.231.140.33124 10.108.231.140.33099 49152      0 49152      0 ESTABLISHED
10.108.231.140.33099 10.108.231.140.33124 49152      0 49152      0 ESTABLISHED
...

I want to replace only the IP address 10.108.231.140 with some letter/character. A pattern such as 10.108.231.140.33099 has the form some_ip.port_no. I want to replace only the IP address and leave the port number unchanged.

The sample code below replaces the digits in the whole ip.port field, port number included, instead of only the IP address part.
Code:
for str in "${ip_arr[@]}"
do
    # here str contains the exact IP, say 10.108.231.140
    # '"$str"' closes the awk program's single quotes so the shell can expand str
    /usr/xpg4/bin/awk '{ for (i=1; i<=NF; i++) if ($i ~ /'"$str"'/) gsub(/[[:digit:]]/, "S", $i) }1' netstat-an.out
done

So can you please tell me how to get output like the one below?
Code:
TCP: IPv4
   Local Address        Remote Address    Swind Send-Q Rwind Recv-Q    State
-------------------- -------------------- ----- ------ ----- ------ -----------
ff.fff.fff.fff.33099 ff.fff.fff.fff.33122 49152 0 49152 0 ESTABLISHED
ggg.g.g.g.33123 *.* 0 0 49152 0 LISTEN
      *.*                  *.*                0      0 49152      0 IDLE
ff.fff.fff.fff.33124 ff.fff.fff.fff.33099 49152 0 49152 0 ESTABLISHED
ff.fff.fff.fff.33099 ff.fff.fff.fff.33124 49152 0 49152 0 ESTABLISHED
...

As you can see, only the IP address is replaced with a letter. This is the result I want.
I am doing this on Solaris using ksh and awk, so I can't use perl; I want this in a Korn shell script only.
Thanks for your valuable replies.
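
The loop above runs gsub() over the whole field, which is why the port digits are replaced along with the address. A minimal sketch of one way to limit the replacement to a known IP is shown here; it is not taken from the thread. The helper name mask_exact_ip and its argument order are hypothetical. It compares the start of each field literally against the IP followed by a dot, which acts as the "word boundary", and it assumes a POSIX awk (on Solaris, /usr/xpg4/bin/awk or nawk rather than /usr/bin/awk).

```shell
#!/bin/sh
# mask_exact_ip IP LETTER [FILE...]
# Hypothetical helper: replaces the digits of IP with LETTER wherever a field
# is exactly IP or starts with IP followed by a dot (the ip.port form).
mask_exact_ip() {
    ip=$1
    letter=$2
    shift 2
    # Assumption: "awk" is a POSIX awk; substitute /usr/xpg4/bin/awk on Solaris.
    awk -v ip="$ip" -v letter="$letter" '
    BEGIN {
        masked = ip
        gsub(/[0-9]/, letter, masked)   # e.g. 10.108.231.140 -> ff.fff.fff.fff
        n = length(ip)
    }
    {
        for (i = 1; i <= NF; i++)
            # Appending "." to the field lets one literal comparison cover both
            # "ip" alone and "ip.port"; no regex is used, so the dots in the IP
            # cannot match arbitrary characters.
            if (substr($i ".", 1, n + 1) == (ip "."))
                $i = masked substr($i, n + 1)
        print
    }' "$@"
}
```

It could be called once per address, for example `mask_exact_ip 10.108.231.140 f netstat-an.out`. As with any assignment to a field in awk, a modified line is rebuilt with single spaces between fields, so the column alignment of that line is lost.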
# 7  
Old 05-22-2013
You could try something like the following:
Code:
#!/usr/xpg4/bin/sh
/usr/xpg4/bin/awk '
# repDigits(1, split1, n1, m) or
# repDigits(2, split2, n2, m) reconstitutes the field specified by the 1st
# argument with data from the array specified by the 2nd argument containing
# the # of elements specified by the 3rd argument.  The digits in the 1st 4
# elements of splitN will be replaced by the same number of alphabetic
# characters:  if both fields are being replaced, digits are replaced with "f";
# if only field 1 is being replaced, digits are replaced with "g"; and if only
# field 2 is being replaced, digits are replaced with "h".
function repDigits(field, fieldData, fieldCount, repIndex,      i) {
        $field = sprintf("%.*s.%.*s.%.*s.%.*s",
                        length(fieldData[1]), rep[repIndex],
                        length(fieldData[2]), rep[repIndex],
                        length(fieldData[3]), rep[repIndex],
                        length(fieldData[4]), rep[repIndex])
        for(i = 5; i <= fieldCount; i++)
                $field = $field "." fieldData[i]
}
BEGIN { # Initialize replacement strings for digits in IP addresses
        rep[1] = "gggggggggg"   # 1st field only
        rep[2] = "hhhhhhhhhh"   # 2nd field only
        rep[3] = "ffffffffff"   # both fields
}
NF == 7 {
        # Split the first two fields on this line into the arrays split1[] and
        # split2[] using a period as the field separator and then verify that
        # each array has at least four elements and that the 1st four elements
        # of each is numeric.  The variable m is then set as follows:
        #    0=>neither field meets the criteria
        #    1=>the 1st field meets the criteria, but the 2nd field does not
        #    2=>the 2nd field meets the criteria, but the 1st field does not
        #    3=>both fields meet the criteria
        m = 0
        n1 = split($1, split1, /[.]/)
        if(n1 >= 4 &&   split1[1] ~ /^[0-9]+$/ &&
                        split1[2] ~ /^[0-9]+$/ &&
                        split1[3] ~ /^[0-9]+$/ &&
                        split1[4] ~ /^[0-9]+$/) m = 1
        n2 = split($2, split2, /[.]/)
        if(n2 >= 4 &&   split2[1] ~ /^[0-9]+$/ &&
                        split2[2] ~ /^[0-9]+$/ &&
                        split2[3] ~ /^[0-9]+$/ &&
                        split2[4] ~ /^[0-9]+$/) m += 2
        if(m % 2) repDigits(1, split1, n1, m)
        if(m >= 2) repDigits(2, split2, n2, m)
}
1       # print the (possibly updated) line' input

Note that on Solaris systems, /usr/xpg[46]/bin/sh are Korn shells, but there is nothing in this script that requires Korn shell extensions. Any shell that accepts basic Bourne shell syntax will be fine for this script.
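
For comparison, a shorter sketch is given here. It is a different technique from the script above, not a rewrite of it: instead of splitting each field on dots, it anchors a regular expression at the start of every field and replaces digits only inside the matched four-number prefix. The function name mask_ips and the single replacement letter S are assumptions for illustration, and it masks every field on every line that starts with four dot-separated numbers rather than only the first two columns of 7-field lines. It relies on match(), RSTART/RLENGTH, and gsub(), so it assumes a POSIX awk (on Solaris, /usr/xpg4/bin/awk or nawk).

```shell
#!/bin/sh
# mask_ips [FILE...]
# Hypothetical helper: in every field that begins with four dot-separated
# numbers, replace the digits of those four numbers with "S" and leave the
# rest of the field (for example ".33099") untouched.
mask_ips() {
    # Assumption: "awk" is a POSIX awk; substitute /usr/xpg4/bin/awk on Solaris.
    awk '
    {
        for (i = 1; i <= NF; i++) {
            # ^ anchors the match at the start of the field, so the match ends
            # right after the 4th number; the port is never part of it.
            if (match($i, /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/)) {
                ip   = substr($i, 1, RLENGTH)     # the four numbers
                rest = substr($i, RLENGTH + 1)    # ".port", or empty
                gsub(/[0-9]/, "S", ip)
                $i = ip rest
            }
        }
        print
    }' "$@"
}
```

Usage would be `mask_ips netstat-an.out`. Lines without a matching field, such as the header and separator lines, are printed unchanged; modified lines are rebuilt with single spaces between fields.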