Convert ip ranges to CIDR netblocks


 
# 1  
Old 08-23-2013

Hi,

Recently I had to convert 280K lines of IP ranges to CIDR notation and generate a file to be used by ipset (netfilter) for IP filtering.

Input file:
Code:
000.000.000.000 - 000.255.255.255 , 000 , invalid ip
001.000.064.000 - 001.000.127.255 , 000 , XXXXX
001.000.245.123 - 001.000.245.123 , 000 , YYYYYY YYYYY
001.002.002.000 - 001.002.002.255 , 000 , ZZZ ZZZ ZZ
001.002.004.000 - 001.002.004.255 , 000 , AAAA AA

Some of the ranges contain only a single IP.

Required output:
Code:
-N cidr nethash --maxelem 260000
-N single iphash --maxelem 60000
-A cidr 0.0.0.0/8
-A cidr 1.0.64.0/18
-A single 1.0.245.123
-A cidr 1.2.2.0/24
-A cidr 1.2.4.0/24
COMMIT

As I got nowhere with awk - the CIDR conversion being the culprit - I found a solution with Python and its netaddr module:
Code:
#!/usr/bin/python3

"""
Usage: ip2cidr.py input_file
"""

import sys, re, netaddr

def sanitize (ip):
	seg = ip.split('.')
	return '.'.join([ str(int(v)) for v in seg ])

ptrnSplit = re.compile(' - | , ')

# open the input file and the ipset output file; both are closed on exit
with open(sys.argv[1], "r") as fp_source, open('ip.ipset', "w") as fp_outfile:

	# Write ipset header to outfile
	fp_outfile.write('-N cidr nethash --maxelem 260000\n-N single iphash --maxelem 60000\n')

	for line in fp_source:

		# skip empty lines, they would break sanitize()
		if not line.strip():
			continue

		# parse on ' - ' and ' , '
		s = ptrnSplit.split(line)

		# sanitize ip: 001.004.000.107 --> 1.4.0.107 to avoid netaddr err.
		ip = [ sanitize(v) for v in s[:2] ]

		# conversion ip range to CIDR netblocks
		# single ip in range
		if ip[0] == ip[1]:
			fp_outfile.write('-A single %s\n' % ip[0])

		# multiple ip's in range
		else:
			ipCidr = netaddr.IPRange(ip[0], ip[1])
			for cidr in ipCidr.cidrs():
				fp_outfile.write('-A cidr %s\n' % cidr)

	# Write ipset footer to outfile
	fp_outfile.write('COMMIT\n')

Time to process the 280K ip ranges: 4 minutes.
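For comparison, the same range-to-CIDR step can be sketched with Python's standard ipaddress module instead of netaddr. This is a different technique from the script above, not a drop-in replacement, and it has not been timed against the 280K-line file. The helper name range_to_cidrs is made up for this illustration. Leading zeros are stripped first because recent Python versions reject them.

```python
import ipaddress


def range_to_cidrs(first, last):
    """Return the CIDR blocks covering first..last (inclusive) as strings.

    range_to_cidrs is a hypothetical helper name, not part of any library.
    """
    def strip_zeros(ip):
        # "001.000.064.000" -> "1.0.64.0"
        return ".".join(str(int(part)) for part in ip.split("."))

    start = ipaddress.IPv4Address(strip_zeros(first))
    end = ipaddress.IPv4Address(strip_zeros(last))
    # summarize_address_range() yields the minimal list of networks
    return [str(net) for net in ipaddress.summarize_address_range(start, end)]


if __name__ == "__main__":
    for first, last in [("001.000.064.000", "001.000.127.255"),
                        ("001.000.245.123", "001.000.245.123")]:
        print(first, "-", last, "=>", range_to_cidrs(first, last))
```

A single-IP range comes back as a /32 block, so the "single" versus "cidr" split used for ipset would still have to be made by comparing the two ends, as in the script above.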



As I found that time on the high side and had a couple of days off, I decided to give awk another try:
Code:
@include "lib_netaddr.awk"

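# strip leading zeros: 001.004.000.107 --> 1.4.0.107 (slice is a local variable)
function sanitize(ip,   slice) {
	split(ip, slice, ".")
	return slice[1]/1 "." slice[2]/1 "." slice[3]/1 "." slice[4]/1
}

BEGIN{
	FS=" , | - "
	# print already appends a newline, so no trailing \n here
	print "-N cidr nethash --maxelem 260000\n-N single iphash --maxelem 60000"
}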

# sanitize ip's
{$1 = sanitize($1); $2 = sanitize($2)}

# range with a single IP
$1==$2 {printf "-A single %s\n", $1} 

# ranges with multiple IP's
$1!=$2{print range2cidr(ip2dec($1), ip2dec($2))}

# footer
END {print "COMMIT"}

lib_netaddr.awk
Code:
#
#    Library with various ip manipulation functions
#

# convert ip ranges to CIDR notation
# str range2cidr(ip2dec("192.168.0.15"), ip2dec("192.168.5.115"))
#
# Credit to Chubler_XL for this brilliant function. (see his post below for non GNU awk)
#
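function range2cidr(ipStart, ipEnd,  bits, mask, newip, result) {
    bits = 0
    mask = 0
    result = "-A cidr "
    # Grow the block one bit at a time for as long as it stays aligned on
    # ipStart and does not run past ipEnd. Testing before growing keeps a
    # range that fills exactly half the address space at /1 instead of /0.
    while (bits < 32) {
        newip = or(ipStart, lshift(mask,1)+1)
        if ((newip>ipEnd) || ((lshift(rshift(ipStart,bits+1),bits+1)) != ipStart))
           break
        bits++
        mask = lshift(mask,1)+1
    }
    newip = or(ipStart, mask)    # last address of the block
    bits = 32 - bits             # host bits -> prefix length
    result = result dec2ip(ipStart) "/" bits
    # recurse on what is left of the range (result is local, so this is safe)
    if (newip < ipEnd) result = result "\n" range2cidr(newip + 1, ipEnd)
    return result
}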

# convert dotted quads to long decimal ip
#	int ip2dec("192.168.0.15")
#
function ip2dec(ip,   slice) {
	split(ip, slice, ".")
	return (slice[1] * 2^24) + (slice[2] * 2^16) + (slice[3] * 2^8) + slice[4]
}

# convert decimal long ip to dotted quads
#	str dec2ip(1171259392)
#
function dec2ip(dec,    ip, quad, i) {
	for (i=3; i>=1; i--) {
		quad = 256^i
		ip = ip int(dec/quad) "."
		dec = dec%quad
	}
	return ip dec
}


# convert decimal ip to binary
#	str dec2binary(1171259392)
#
function dec2binary(dec,    bin) {
	while (dec>0) {
		bin = dec%2 bin
		dec = int(dec/2)
	}
	return bin
}

# Convert binary ip to decimal
#	int binary2dec("1000101110100000000010011001000")
#
function binary2dec(bin,   slice, l, dec, i) {
	split(bin, slice, "")
	l = length(bin)
	for (i=l; i>0; i--) {
		dec += slice[i] * 2^(l-i)
	}
	return dec
}

# convert dotted quad ip to binary
#	str ip2binary("192.168.0.15")
#
function ip2binary(ip) {
	return dec2binary(ip2dec(ip))
}


# count the number of ip's in a dotted quad ip range
# (returns end - start, so add 1 to count both ends)
#	int countQuadIp("192.168.0.0", "192.168.1.255") + 1
#
function countQuadIp(ipStart, ipEnd) {
	return (ip2dec(ipEnd) - ip2dec(ipStart))
}


# count the number of ip's in a CIDR block
#	int countCidrIp ("192.168.0.0/12")
#
function countCidrIp (cidr) {
	sub(/.+\//, "", cidr)
	return 2^(32-cidr)
}

Time to process: 16 sec. A whopping 15 times faster! Not bad for a language that dates back to the 1970s! And it's even faster with mawk: 7 sec.
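For readers who want to follow what range2cidr() does without reading the bit shifts, here is a minimal Python sketch of the same idea: starting at the first address, take the largest block that is aligned on that address and still fits in the range, then repeat from the next free address. It is an iterative illustration written for this thread, not a line-by-line port of the awk function, and the helper names simply mirror the library's.

```python
def ip2dec(ip):
    """Dotted quad -> integer."""
    a, b, c, d = (int(part) for part in ip.split("."))
    return (a << 24) | (b << 16) | (c << 8) | d


def dec2ip(num):
    """Integer -> dotted quad."""
    return ".".join(str((num >> shift) & 255) for shift in (24, 16, 8, 0))


def range2cidr(start, end):
    """Return the CIDR blocks covering the integer range start..end."""
    blocks = []
    while start <= end:
        bits = 0  # number of host bits in the current block
        while bits < 32:
            size = 1 << (bits + 1)  # size of the next larger block
            # stop growing if start is not aligned on that size
            # or if the larger block would run past the end of the range
            if start % size != 0 or start + size - 1 > end:
                break
            bits += 1
        blocks.append("%s/%d" % (dec2ip(start), 32 - bits))
        start += 1 << bits  # first address after the block just emitted
    return blocks


if __name__ == "__main__":
    for block in range2cidr(ip2dec("192.168.0.15"), ip2dec("192.168.0.18")):
        print(block)
```

The sample range is deliberately unaligned: it starts on an odd address, so the first block cannot be larger than a single host.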

Please note that the @include only works with gawk. If you are using the original awk or the lightning fast mawk, you will have to copy/paste the functions library into your main script.

If you find this awk library useful or if it needs to be optimized, let me know before I submit it to the Tips & Tutorials section.

Last edited by ripat; 09-04-2013 at 07:46 AM.. Reason: Inclusion of Chuble_XL's range2cidr() function
# 2  
Old 08-25-2013
How about this for range2cidr()? Call it like this: range2cidr(ip2dec($1), ip2dec($2))

Code:
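function range2cidr(ipStart, ipEnd,  bits, mask, newip, result) {
    bits = 0
    mask = 0
    # Grow the block one bit at a time for as long as it stays aligned on
    # ipStart and does not run past ipEnd. Testing before growing keeps a
    # range that fills exactly half the address space at /1 instead of /0.
    while (bits < 32) {
        newip = or(ipStart, lshift(mask,1)+1)
        if ((newip>ipEnd) || ((lshift(rshift(ipStart,bits+1),bits+1)) != ipStart))
           break
        bits++
        mask = lshift(mask,1)+1
    }
    newip = or(ipStart, mask)    # last address of the block
    bits = 32 - bits             # host bits -> prefix length
    result = dec2ip(ipStart) "/" bits
    # recurse on what is left of the range (result is local, so this is safe)
    if (newip < ipEnd) result = result "\n" range2cidr(newip + 1, ipEnd)
    return result
}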


Of course this does require the following gawk bitwise functions: or(), lshift() and rshift().

We could replace these with local (bit_) variants for more portability.

Code:
# Bitwise OR of var1 and var2
function bit_or(a, b, r, i, c) {
    for (r=i=0;i<32;i++) {
        c = 2 ^ i
        if ((int(a/c) % 2) || (int(b/c) % 2)) r += c
    }
    return r
}


# Shift value left x times (multiply by 2^x)
function bit_lshift(var, x) {
  while(x--) var*=2;
  return var;
}

# Shift value right x times (integer-divide by 2^x)
function bit_rshift(var, x) {
  while(x--) var=int(var/2);
  return var;
}
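One way to gain confidence in arithmetic replacements like these is to compare them with native bitwise operators. The Python sketch below mirrors the three bit_ functions with plain arithmetic and checks them against |, << and >> on a few 32-bit values. It is only an illustration of the idea; the sample values are arbitrary and it does not exercise the awk code itself.

```python
def bit_or(a, b):
    """Bitwise OR of two 32-bit values using only arithmetic."""
    result = 0
    for i in range(32):
        weight = 2 ** i
        # bit i is set in the result if it is set in either operand
        if (a // weight) % 2 or (b // weight) % 2:
            result += weight
    return result


def bit_lshift(value, count):
    """Shift left: multiply by 2, count times."""
    for _ in range(count):
        value *= 2
    return value


def bit_rshift(value, count):
    """Shift right: integer-divide by 2, count times."""
    for _ in range(count):
        value //= 2
    return value


if __name__ == "__main__":
    samples = [0, 1, 255, 3232235535, 2 ** 32 - 1]
    for a in samples:
        for b in samples:
            assert bit_or(a, b) == a | b
        for count in (0, 1, 8, 31):
            assert bit_rshift(a, count) == a >> count
            assert bit_lshift(a, count) == a << count
    print("arithmetic variants agree with the native operators")
```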

These 5 Users Gave Thanks to Chubler_XL For This Post:
# 3  
Old 09-03-2013
Brilliant. Works much better than my original range2cidr() function. I just edited my post above to include your function.

Well done!
# 4  
Old 04-27-2017
Tools ip2cidr

Hi, I'm trying to convert a bulk list of IPs into the nearest CIDR blocks. I came across your script and was trying to run the awk script in Cygwin. Can you send me the syntax for running the script along with the library file? Thanks
# 5  
Old 04-27-2017
How to write the code depends entirely on what you want to do with it. Show the input you have and the output you want.
# 6  
Old 04-27-2017
Hi, I have a list of about 3K IPs, which come from subnets ranging from very small to large. I want the IPs grouped into subnets that make sense. The scenario: several groups get IPs based on availability, and no group should touch or scan another group's IPs. We get the list of IPs from a manual inventory by each group, and the key point is that the provider doesn't track which set of IPs belongs to which group.
So I manually collected all the IPs (around 3K) and want to turn them into subnets rounded to the nearest class. For example, a single IP address should round to /32, and 4 IPs should round to /29 or /30. I have CIDR tools for this task, but they need manual input each time.

I'm looking for a way to put the 3K IPs into Excel or any other format and have the script round them to the nearest subnet class.
# 7  
Old 04-27-2017
The trick to that is: where should it cut off? Hypothetically speaking, you can encompass 1.1.1.1 and 254.254.254.254 with the mask 0.0.0.0, but I doubt you want that. You could also make 100% perfect groups with no empty spaces, but I doubt you want that either.
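To make the two extremes concrete, here is a small Python sketch using the standard ipaddress module. exact_groups() builds "perfect" groups with no unused addresses, while spanning_block() returns the single smallest block that contains every address, however much empty space that drags in. Both function names and the sample addresses are invented for this illustration; a practical tool would need a cut-off rule somewhere between the two.

```python
import ipaddress


def exact_groups(ips):
    """CIDR blocks that cover exactly the given addresses, nothing more."""
    hosts = [ipaddress.ip_network(ip) for ip in ips]  # each one is a /32
    return [str(net) for net in ipaddress.collapse_addresses(hosts)]


def spanning_block(ips):
    """The single smallest CIDR block that contains every given address."""
    nums = [int(ipaddress.IPv4Address(ip)) for ip in ips]
    low, high = min(nums), max(nums)
    prefix = 32
    # shorten the prefix until the lowest and highest address share it
    while (low >> (32 - prefix)) != (high >> (32 - prefix)):
        prefix -= 1
    base = (low >> (32 - prefix)) << (32 - prefix)
    return "%s/%d" % (ipaddress.IPv4Address(base), prefix)


if __name__ == "__main__":
    sample = ["10.0.0.0", "10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.9"]
    print("exact groups  :", exact_groups(sample))
    print("spanning block:", spanning_block(sample))
    print("two far apart :", spanning_block(["1.1.1.1", "254.254.254.254"]))
```

With the sample list, the exact grouping keeps the lone 10.0.0.9 as its own /32, while the spanning block also takes in the unlisted addresses between 10.0.0.4 and 10.0.0.15.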