Convert ip ranges to CIDR netblock


 
# 8  
Old 12-13-2018
mawk is indeed designed for speed above all else. Occasionally its features are lacking but that can usually be worked around.
This User Gave Thanks to Corona688 For This Post:
# 9  
Old 12-13-2018
Are you sure you're getting the desired results?
I'm getting:
Code:
222.104.193.5/32
222.106.31.112/32
222.106.31.123/32
222.126.13.224/32
222.191.251.186/32

on your sample/modified file:
Code:
222.104.193.5  222.104.193.5
222.106.31.112  222.106.31.112
222.106.31.123  222.106.31.123
222.126.13.224  222.126.13.231
222.191.251.186  222.191.251.186

At the very least, 222.126.13.224 222.126.13.231 should convert to 222.126.13.224/29.
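The expected result can be cross-checked independently of awk. The following is a minimal Python sketch, not part of the poster's script, that relies on the standard library's ipaddress.summarize_address_range(); the helper name range_to_cidrs is made up for illustration.

```python
import ipaddress


def range_to_cidrs(first, last):
    """Return the CIDR blocks covering first..last (inclusive) as strings.

    range_to_cidrs is a hypothetical helper name used only for this check.
    """
    networks = ipaddress.summarize_address_range(
        ipaddress.IPv4Address(first), ipaddress.IPv4Address(last)
    )
    return [str(net) for net in networks]


# The range from the sample file that should collapse into a single block.
print(range_to_cidrs("222.126.13.224", "222.126.13.231"))
# A single-address range, which should come out as a /32.
print(range_to_cidrs("222.104.193.5", "222.104.193.5"))
```

If the awk output for a given range differs from what this prints, the field separator handling is the first thing to check.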

Last edited by vgersh99; 12-13-2018 at 12:44 PM..
This User Gave Thanks to vgersh99 For This Post:
# 10  
Old 12-13-2018
vgersh99, the sample I provided had a hyphen between the start and end of each IP address range, while the one you posted uses a space as the field separator. Thank you for raising the issue though, since I had also noticed that when a hyphen was used as the field separator, a single-address range was not getting /32 appended in the output. I've refined the code and removed the bit_or(), bit_lshift() and bit_rshift() functions, since they were much slower than gawk's built-in or(), lshift() and rshift() functions. The script no longer runs under mawk, but it is still faster under gawk. For some reason I get a segmentation fault with awk.

Code:
#!/usr/local/bin/gawk -f

# Library with various ip manipulation functions
# convert ip ranges to CIDR notation

function range2cidr(ipStart, ipEnd,    bits, mask, newip, result) {
    bits = 1
    mask = 1
    # grow the block while it stays aligned on ipStart and inside the range
    while (bits < 32) {
        newip = or(ipStart, mask)
        if ((newip>ipEnd) || ((lshift(rshift(ipStart,bits),bits)) != ipStart)) {
           bits--
           mask = rshift(mask,1)
           break
        }
        bits++
        mask = lshift(mask,1)+1
    }
    newip = or(ipStart, mask)
    bits = 32 - bits
    result = dec2ip(ipStart) "/" bits
    # append the blocks for the rest of the range instead of overwriting this one
    if (newip < ipEnd) result = result "\n" range2cidr(newip + 1, ipEnd)
    return result
}

# convert dotted quads to long decimal ip
#       int ip2dec("192.168.0.15")
#
function ip2dec(ip, slice) {
        split(ip, slice, ".")
        return (slice[1] * 2^24) + (slice[2] * 2^16) + (slice[3] * 2^8) + slice[4]
}

# convert decimal long ip to dotted quads
#       str dec2ip(1171259392)
#
function dec2ip(dec, ip, quad) {
        for (i=3; i>=1; i--) {
                quad = 256^i
                ip = ip int(dec/quad) "."
                dec = dec%quad
        }
        return ip dec
}

function sanitize(ip) {
        split(ip, slice, ".")
        return slice[1]/1 "." slice[2]/1 "." slice[3]/1 "." slice[4]/1
}

BEGIN{
        FS=" | - "
}

# sanitize ip's
{$1 = sanitize($1); $2 = sanitize($2)}

{print range2cidr(ip2dec($1), ip2dec($2))}

END {print ""}

Here are benchmarks processing a file containing ip address ranges. The output file after running the script contained 236,315 lines.
  • ipcalc 15 min
  • mawk does not work with this script
  • gawk 27 sec
  • awk segmentation fault
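For anyone who wants to check the script's output on a few ranges without gawk, the greedy splitting it performs can be sketched in Python. This is an illustrative port under my own reading of the algorithm, not the script itself: it takes the largest block that is aligned on the start address and still fits inside the range, then repeats from the next address. The function names mirror the awk ones for readability only.

```python
def ip2dec(ip):
    """Convert a dotted quad to an integer; int() also accepts '001'-style octets."""
    a, b, c, d = (int(part) for part in ip.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def dec2ip(dec):
    """Convert an integer back to a dotted quad."""
    return ".".join(str((dec >> shift) & 0xFF) for shift in (24, 16, 8, 0))


def range2cidr(start, end):
    """Split the inclusive range start..end into CIDR blocks, greedily."""
    blocks = []
    while start <= end:
        bits = 0  # number of host bits in the current block
        while bits < 32:
            size = 1 << (bits + 1)
            # stop growing once the block is misaligned or overshoots the end
            if start % size != 0 or start + size - 1 > end:
                break
            bits += 1
        blocks.append(f"{dec2ip(start)}/{32 - bits}")
        start += 1 << bits  # continue right after the block just emitted
    return blocks


print(range2cidr(ip2dec("222.126.13.224"), ip2dec("222.126.13.231")))
print(range2cidr(ip2dec("1.0.0.1"), ip2dec("1.0.0.6")))
```

A range that is not aligned on a power of two, such as the second call, produces several blocks, so the number of output lines can exceed the number of input ranges.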

Last edited by azdps; 12-13-2018 at 10:32 PM..
# 11  
Old 12-14-2018
Regarding the segmentation fault:
What's your OS?
or(), lshift() and rshift() are gawk-specific.
What does which awk come back with?
What does ls -l $(which awk) come back with?
# 12  
Old 12-14-2018
Quote:
What is your OS?
OpenBSD

Quote:
What does which awk come back with?
/usr/bin/awk

Quote:
What does ls -l $(which awk) come back with?
-r-xr-xr-x 1 root bin 180464 Oct 11 12:18 /usr/bin/awk


OpenBSD man page for awk: Here
The version of awk that OpenBSD ships supports or(), lshift() and rshift().

FYI, the list that I've been converting to CIDR notation is from iblock.com. Link

Last edited by azdps; 12-14-2018 at 12:41 PM..
# 13  
Old 12-14-2018
Looks like an old branch of gawk with non-POSIX extensions.
It's probably buggy and/or not as forgiving as gawk...
The BUGS section at the bottom of the man page might give some hints about what can be tweaked to make it run...
Stick with gawk if it's an option, or get a newer version of a native BSD awk.

P.S. I'm getting the following timing with gawk under Cygwin with the referenced file:
Code:
real    0m12.081s
user    0m10.202s
sys     0m0.967s

with the slightly modified awk code:
Code:
function range2cidr(ipStart, ipEnd,    bits, mask, newip, result) {
    bits = 1
    mask = 1
    # grow the block while it stays aligned on ipStart and inside the range
    while (bits < 32) {
        newip = or(ipStart, mask)
        if ((newip>ipEnd) || ((lshift(rshift(ipStart,bits),bits)) != ipStart)) {
           bits--
           mask = rshift(mask,1)
           break
        }
        bits++
        mask = lshift(mask,1)+1
    }
    newip = or(ipStart, mask)
    bits = 32 - bits
    result = dec2ip(ipStart) "/" bits
    # append the blocks for the rest of the range instead of overwriting this one
    if (newip < ipEnd) result = result "\n" range2cidr(newip + 1, ipEnd)
    return result
}

# convert dotted quads to long decimal ip
#       int ip2dec("192.168.0.15")
#
function ip2dec(ip, slice) {
        split(ip, slice, ".")
        return (slice[1] * 2^24) + (slice[2] * 2^16) + (slice[3] * 2^8) + slice[4]
}

# convert decimal long ip to dotted quads
#       str dec2ip(1171259392)
#
function dec2ip(dec, ip, quad) {
        for (i=3; i>=1; i--) {
                quad = 256^i
                ip = ip int(dec/quad) "."
                dec = dec%quad
        }
        return ip dec
}

function sanitize(ip) {
        split(ip, slice, ".")
        return slice[1]/1 "." slice[2]/1 "." slice[3]/1 "." slice[4]/1
}

BEGIN{
        FS="[- :]"
}

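# sanitize ip's and print the CIDR blocks; the range is always the last two fields
# (done in one rule so that assigning $1/$2 cannot clobber $(NF-1) when NF is 3)
!/^#/ && NF >= 2 {print range2cidr(ip2dec(sanitize($(NF-1))), ip2dec(sanitize($NF)))}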

END {print ""}


Last edited by vgersh99; 12-14-2018 at 01:18 PM..
This User Gave Thanks to vgersh99 For This Post:
# 14  
Old 12-14-2018
Thank you, vgersh99, for the modified script. I'm using an Intel Celeron N2930 on my OpenBSD box, so the timings are definitely different. I appreciate the time you've spent looking into this script. Hopefully others can find this beneficial in the future.

I wish I had posted how I downloaded and parsed the file, which was redirected to a temp file, before you refined the script. I'm posting it now to show what my downloaded file looks like before I process it with the script and gawk.

ftp -V -o - http://list.iblocklist.com/?list=ydx...chiveformat=gz | gunzip -c | grep -v '#' | sed '/^$/d' | sed 's/.*://' | sed 's/-/ - /' >> temp
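The text-filtering stages of that pipeline (everything after gunzip) can be mirrored in a few lines of Python, which makes it easier to see what each stage does to a line. This is only a sketch: the function name normalize and the sample input lines are hypothetical, and the name:start-end line layout is assumed from the sed expressions rather than taken from the real list.

```python
def normalize(lines):
    """Mimic: grep -v '#' | sed '/^$/d' | sed 's/.*://' | sed 's/-/ - /'."""
    out = []
    for line in lines:
        line = line.rstrip("\n")
        if "#" in line:            # grep -v '#': drop any line containing '#'
            continue
        if line == "":             # sed '/^$/d': drop empty lines
            continue
        # sed 's/.*://' is greedy, so everything through the LAST colon goes
        line = line.rsplit(":", 1)[-1]
        # sed 's/-/ - /' without the g flag touches only the first hyphen
        line = line.replace("-", " - ", 1)
        out.append(line)
    return out


# Hypothetical sample input, made up for illustration.
sample = [
    "# comment header",
    "",
    "Example Org:1.2.4.0-1.2.4.255",
    "Another: Name:222.126.13.224-222.126.13.231",
]
for row in normalize(sample):
    print(row)
```

The result is one "start - end" pair per line, which matches the " - " alternative in the FS of the first script.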

Also, the awk version that OpenBSD is currently using is:
awk version 20110810

The version was obtained using the command awk -V