Convert ip ranges to CIDR netblock


 
# 22  
Old 12-21-2018
Congrats.
If you've been following the thread, I've been testing the latest version of the script with either this data file:
Code:
1.62.189.215 1.62.189.222
4.0.25.146 4.0.25.148
4.0.26.14 4.0.29.24
24.149.30.0 24.149.30.4

OR
with your downloaded file ydxerpxkpcfqjaybcssw.gz.
Also, your FS="-| - |:" is equivalent to my FS="[- :]".

I enhanced the script with an optional pretty-printing mode that makes the output easier to digest:
Code:
function range2cidr(ipStart, ipEnd, result, bits, mask, newip) {
    bits = 1
    mask = 1
    while (bits < 32) {
        newip = or(ipStart, mask)
        if ((newip>ipEnd) || ((lshift(rshift(ipStart,bits),bits)) != ipStart)) {
           bits--
           mask = rshift(mask,1)
           break
        }
        bits++
        mask = lshift(mask,1)+1
    }
    newip = or(ipStart, mask)
    bits = 32 - bits
    result = (result)?result ORS dec2ip(ipStart) "/" bits : dec2ip(ipStart) "/" bits
    if (newip < ipEnd) result = range2cidr(newip + 1, ipEnd,result)
    return result
}

# convert dotted quads to long decimal ip
#       int ip2dec("192.168.0.15")
#
function ip2dec(ip, slice) {
        split(ip, slice, ".")
        return (slice[1] * 2^24) + (slice[2] * 2^16) + (slice[3] * 2^8) + slice[4]
}

# convert decimal long ip to dotted quads
#       str dec2ip(1171259392)
#
function dec2ip(dec, ip, quad, i) {
        for (i=3; i>=1; i--) {
                quad = 256^i
                ip = ip int(dec/quad) "."
                dec = dec%quad
        }
        return ip dec
}

# strip leading zeros from each octet (dividing by 1 forces a number)
function sanitize(ip, slice) {
        split(ip, slice, ".")
        return slice[1]/1 "." slice[2]/1 "." slice[3]/1 "." slice[4]/1
}

BEGIN{
        #FS=" |-|:"
        # to match BOTH formats: 'ip ip' AND the one in ydxerpxkpcfqjaybcssw file
        FS="[- :]"

}

# sanitize ip's
!/^#/ && NF {
  f1= sanitize($(NF-1))
  f2= sanitize($NF)
  cidr=range2cidr(ip2dec(f1), ip2dec(f2))
  if (pretty) {
     gsub("^|"ORS, ORS "\t", cidr)
     print  f1 " -> " f2 " : " cidr
  }
  else
     print cidr
}

END {print ""}

Doing awk -v pretty=1 -f ipRange2cidrREAL.awk ipRange2cidrREAL.txt produces:
Code:
222.104.193.5 -> 222.104.193.5 :
        222.104.193.5/32
222.106.31.112 -> 222.106.31.112 :
        222.106.31.112/32
222.106.31.123 -> 222.106.31.123 :
        222.106.31.123/32
222.126.13.224 -> 222.126.13.231 :
        222.126.13.224/29
222.191.251.186 -> 222.191.251.186 :
        222.191.251.186/32
222.126.13.224 -> 222.126.13.251 :
        222.126.13.224/28
        222.126.13.240/29
        222.126.13.248/30
1.62.189.215 -> 1.62.189.222 :
        1.62.189.215/32
        1.62.189.216/30
        1.62.189.220/31
        1.62.189.222/32
4.0.25.146 -> 4.0.25.148 :
        4.0.25.146/31
        4.0.25.148/32
4.0.26.14 -> 4.0.29.24 :
        4.0.26.14/31
        4.0.26.16/28
        4.0.26.32/27
        4.0.26.64/26
        4.0.26.128/25
        4.0.27.0/24
        4.0.28.0/24
        4.0.29.0/28
        4.0.29.16/29
        4.0.29.24/32
24.149.30.0 -> 24.149.30.4 :
        24.149.30.0/30
        24.149.30.4/32

with the data file ipRange2cidrREAL.txt:
Code:
222.104.193.5 222.104.193.5
222.106.31.112 222.106.31.112
222.106.31.123 222.106.31.123
222.126.13.224 222.126.13.231
222.191.251.186 222.191.251.186
222.126.13.224 222.126.13.251
1.62.189.215 1.62.189.222
4.0.25.146 4.0.25.148
4.0.26.14 4.0.29.24
24.149.30.0 24.149.30.4

If you omit -v pretty=1 on the command line, you'll get the old default output, as you see it now.
Hope you find it useful.
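A quick way to sanity-check output like the listing above is to expand each CIDR block back into its first and last address and confirm that the blocks tile the original range. The following is only a sketch: cidr2range is a hypothetical helper (not part of the script in this thread), it uses plain POSIX awk arithmetic, and it assumes every input line is a well-formed, properly aligned a.b.c.d/n block.

```shell
#!/bin/sh
# cidr2range (hypothetical helper): read CIDR blocks, one per line,
# and print the first and last address that each block covers.
cidr2range() {
    awk -F/ '
    # dotted quad -> number
    function ip2dec(ip, q) {
        split(ip, q, /[.]/)
        return q[1] * 16777216 + q[2] * 65536 + q[3] * 256 + q[4]
    }
    # number -> dotted quad
    function dec2ip(dec, ip, quad, i) {
        for (i = 3; i >= 1; i--) {
            quad = 256 ^ i
            ip = ip int(dec / quad) "."
            dec = dec % quad
        }
        return ip dec
    }
    NF == 2 {
        first = ip2dec($1)
        size = 2 ^ (32 - $2)        # number of addresses in a /n block
        print dec2ip(first) " - " dec2ip(first + size - 1)
    }'
}

printf '%s\n' 222.126.13.224/28 222.126.13.240/29 222.126.13.248/30 | cidr2range
# 222.126.13.224 - 222.126.13.239
# 222.126.13.240 - 222.126.13.247
# 222.126.13.248 - 222.126.13.251
```

The three blocks run back to back from .224 to .251, which matches the range 222.126.13.224 -> 222.126.13.251 in the pretty-printed output.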
This User Gave Thanks to vgersh99 For This Post:
# 23  
Old 12-21-2018
Forgive my ignorance. I get wrong results when running the script with a file containing the following IP address ranges. Note that the IP address ranges are separated by a space hyphen space.

Example
Code:
1.62.189.215 - 1.62.189.222
4.0.25.146 - 4.0.25.148
4.0.26.14 - 4.0.29.24
24.149.30.0 - 24.149.30.4

When I changed the field separator to include space hyphen space FS="-| - |:" it correctly converts the IP ranges.

This field separator FS="[- :]" does not properly convert the example IP ranges in this post. Granted, the downloaded file that the script you posted parses has no space before and after the hyphens that separate the IP ranges. I think the script is more flexible with my separator, since it also properly converts the example I provided.

Can you test and confirm that the field separators that I posted and the one you posted are indeed the same? I get different results. Please test against the example that I have just posted.

Thank you for your time.
# 24  
Old 12-21-2018
You're right - the FS definitions are not the same - I overlooked something in your definition.
If you need to handle 2 possible formats:
  1. address ranges separated by " - " (a dash surrounded by spaces)
  2. your downloaded log file
use your FS definition (FS="-| - |:") and it will handle both formats.

I was using my list (with ip ranges separated by a single space with no dash) for my own debugging only.

Just remember that the script will only cover the previously described formats.
The basic assumption is that the script takes the next-to-last AND the last field of each record, however you define your input Field Separator (FS).
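The difference between the two FS definitions can be seen by printing the field count together with the next-to-last and last fields. This is a minimal sketch; count_fields is a hypothetical helper name used only for this illustration.

```shell
#!/bin/sh
# count_fields (hypothetical helper): show NF plus the next-to-last
# and last fields that awk sees for one line under a given FS.
count_fields() {
    printf '%s\n' "$2" |
        awk -v fs="$1" 'BEGIN { FS = fs } { print NF ": [" $(NF-1) "] [" $NF "]" }'
}

# "[- :]" matches ONE character at a time, so " - " is three separators
count_fields '[- :]'   '1.62.189.215 - 1.62.189.222'
# 4: [] [1.62.189.222]

# "-| - |:" can match the whole " - " as a single separator
count_fields '-| - |:' '1.62.189.215 - 1.62.189.222'
# 2: [1.62.189.215] [1.62.189.222]

# and it still handles a bare dash
count_fields '-| - |:' '1.62.189.215-1.62.189.222'
# 2: [1.62.189.215] [1.62.189.222]
```

With the bracket expression, the next-to-last field is empty, so the start address passed to range2cidr is wrong even though the last field is fine.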

Last edited by vgersh99; 12-21-2018 at 03:29 PM..
# 25  
Old 12-24-2018
I'm using OpenBSD, whose awk is built from source code dating back to 2011. gawk is not in the base install of OpenBSD because of the GNU licensing, so it has to be installed as a package. When I run the script iprange2cidr.awk using awk, I get a segmentation fault. I'm curious whether someone can change the script to avoid the fault. I've isolated the line in the script that causes the fault:

Code:
if (newip < ipEnd) result = range2cidr(newip + 1, ipEnd,result)

If I remove the line shown above, awk still processes a file containing 230k lines of data without a segmentation fault, but the output is obviously wrong. Any suggestions on how this line might be changed to accommodate OpenBSD's awk version?

At the bottom of the OpenBSD awk man page the bug section advises the following:

Quote:
There are no explicit conversions between numbers and strings. To force an expression to be treated as a number add 0 to it; to force it to be treated as a string concatenate "" to it.
The scope rules for variables in functions are a botch; the syntax is worse.
Any suggestions?
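To illustrate the man page advice about forcing a value to be treated as a number or as a string, here is a minimal sketch. compare is a hypothetical helper name; it only shows how the two kinds of comparison can disagree and is not claimed to be related to the segmentation fault.

```shell
#!/bin/sh
# compare (hypothetical helper): compare two values once as strings
# (by concatenating "") and once as numbers (by adding 0).
compare() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        s = ((a "") < (b "")) ? "less" : "not less"   # string comparison
        n = ((a + 0) < (b + 0)) ? "less" : "not less" # numeric comparison
        print "string: " s ", numeric: " n
    }'
}

compare 10 9
# string: less, numeric: not less
```

As strings, "10" sorts before "9" because the comparison goes character by character; as numbers, 10 is of course not less than 9.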
# 26  
Old 12-25-2018
It could be anything, including improper handling of the recursive call to range2cidr.

You can try this mod, but it's difficult to guess what the issue might be:
Code:
if (newip+0 < ipEnd+0) result = range2cidr(newip + 1, ipEnd,result)

Can you install gawk?
# 27  
Old 12-25-2018
That didn't work, vgersh99. Yes, I can install gawk; I just wanted it to work with the base awk. I'm using this on my OpenBSD firewall, and I prefer to use base-installed programs instead of having to install packages.
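The thread does not establish what causes the fault, so the following is only a sketch of one thing that could be tried with a base awk: a loop-based variant that avoids the recursive call and also avoids or(), lshift() and rshift(), which are gawk extensions that POSIX does not require. The wrapper name range2cidr_iter is hypothetical, and the sketch assumes simple "start end" lines with start <= end.

```shell
#!/bin/sh
# range2cidr_iter (hypothetical wrapper): convert "start end" lines to
# CIDR blocks with a loop and plain arithmetic only.
range2cidr_iter() {
    awk '
    function ip2dec(ip, q) {
        split(ip, q, /[.]/)
        return q[1] * 16777216 + q[2] * 65536 + q[3] * 256 + q[4]
    }
    function dec2ip(dec, ip, quad, i) {
        for (i = 3; i >= 1; i--) {
            quad = 256 ^ i
            ip = ip int(dec / quad) "."
            dec = dec % quad
        }
        return ip dec
    }
    function range2cidr(first, last, size, prefix, out) {
        while (first <= last) {
            size = 1
            prefix = 32
            # double the block while it stays aligned and inside the range
            while (prefix > 0 && first % (size * 2) == 0 && first + size * 2 - 1 <= last) {
                size *= 2
                prefix--
            }
            out = out (out == "" ? "" : "\n") dec2ip(first) "/" prefix
            first += size
        }
        return out
    }
    !/^#/ && NF >= 2 { print range2cidr(ip2dec($1), ip2dec($2)) }'
}

printf '%s\n' '1.62.189.215 1.62.189.222' '24.149.30.0 24.149.30.4' | range2cidr_iter
# 1.62.189.215/32
# 1.62.189.216/30
# 1.62.189.220/31
# 1.62.189.222/32
# 24.149.30.0/30
# 24.149.30.4/32
```

For these two ranges the result matches the pretty-printed output earlier in the thread. To accept the " - " or downloaded-file formats as well, FS and the $(NF-1)/$NF field selection from the original script would need to be carried over.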
# 28  
Old 12-25-2018
If we go back to the code you showed us in post #1 in this thread:
Code:
# Convert IP ranges to CIDR notation

function range2cidr(ipStart, ipEnd, result, bits, mask, newip) {
    bits = 1
    mask = 1
    while (bits < 32) {
        newip = or(ipStart, mask)
        if ((newip>ipEnd) || ((lshift(rshift(ipStart,bits),bits)) != ipStart)) {
           bits--
           mask = rshift(mask,1)
           break
        }
        bits++
        mask = lshift(mask,1)+1
    }
    newip = or(ipStart, mask)
    bits = 32 - bits
    result = (result)?result ORS dec2ip(ipStart) "/" bits : dec2ip(ipStart) "/" bits
    if (newip < ipEnd) result = range2cidr(newip + 1, ipEnd,result)
    return result
}

# convert dotted quads to long decimal ip
#       int ip2dec("192.168.0.15")
#
function ip2dec(ip, slice) {
        split(ip, slice, ".")
        return (slice[1] * 2^24) + (slice[2] * 2^16) + (slice[3] * 2^8) + slice[4]
}

# convert decimal long ip to dotted quads
#       str dec2ip(1171259392)
#
function dec2ip(dec, ip, quad) {
        for (i=3; i>=1; i--) {
                quad = 256^i
                ip = ip int(dec/quad) "."
                dec = dec%quad
        }
        return ip dec
}

function sanitize(ip) {
        split(ip, slice, ".")
        return slice[1]/1 "." slice[2]/1 "." slice[3]/1 "." slice[4]/1
}

BEGIN{
        FS=" - |-|:"
}

# sanitize ip's
!/^#/ && NF {
  f1= sanitize($(NF-1))
  f2= sanitize($NF)
  print range2cidr(ip2dec(f1), ip2dec(f2))
}

END {print ""}

you will see two lines that I have highlighted in red: the split(ip, slice, ".") calls in ip2dec() and sanitize(). According to the standards, the third argument to split() is an extended regular expression, and "." is a special character in an extended regular expression that matches any character. So these two lines tell the script to split the 1st argument given to those functions into an array named slice with every character in that argument treated as a field separator. If those functions are called with an argument like "192.168.0.15", you'll end up with the slice array having 13 elements, each with a value that is an empty string, not the 4 elements having the values 192, 168, 0, and 15, respectively, that I assume your code is expecting.

Please try rerunning your script after changing those two lines to any one of the following four possible replacements:
Code:
        split(ip, slice, /[.]/)
or:
        split(ip, slice, /\./)
or:
        split(ip, slice, "[.]")
or:
        split(ip, slice, "\\.")

and let us know what happens.
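One way to check how a particular awk handles a given separator is to print the element count that split() returns. This is a minimal sketch; split_count is a hypothetical helper name. The result for the plain "." form is deliberately not stated here, because it is exactly the behaviour in question and depends on the awk being used.

```shell
#!/bin/sh
# split_count (hypothetical helper): print how many elements split()
# produces for one address when given the separator string in $1.
split_count() {
    awk -v sep="$1" 'BEGIN { print split("192.168.0.15", part, sep) }'
}

# bracket expression: the dot is always literal here
split_count '[.]'
# 4

# plain dot: run this on the awk in question and compare with 4
split_count '.'
```

If the second call does not print 4 on a given system, the split() calls in ip2dec() and sanitize() need one of the replacements listed above.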