Post 302845779 by ripat on Friday 23rd of August 2013 07:48:50 AM
Convert ip ranges to CIDR netblocks

Hi,

Recently I had to convert a file of 280K lines of IP ranges to CIDR notation and generate a file to be used by ipset (netfilter) for IP filtering.

Input file:
Code:
000.000.000.000 - 000.255.255.255 , 000 , invalid ip
001.000.064.000 - 001.000.127.255 , 000 , XXXXX
001.000.245.123 - 001.000.245.123 , 000 , YYYYYY YYYYY
001.002.002.000 - 001.002.002.255 , 000 , ZZZ ZZZ ZZ
001.002.004.000 - 001.002.004.255 , 000 , AAAA AA

Some of the ranges contain only a single IP.

Required output:
Code:
-N cidr nethash --maxelem 260000
-N single iphash --maxelem 60000
-A cidr 0.0.0.0/8
-A cidr 1.0.64.0/18
-A single 1.0.245.123
-A cidr 1.2.2.0/24
-A cidr 1.2.4.0/24
COMMIT

As I got nowhere with awk (the CIDR conversion being the culprit), I found a solution with Python and its netaddr module:
Code:
#!/usr/bin/python3

"""
Usage: ip2cidr.py input_file
"""

import sys, re, netaddr

def sanitize (ip):
	seg = ip.split('.')
	return '.'.join([ str(int(v)) for v in seg ])

# pointer to input file
fp_source = open(sys.argv[1], "r")

# pointer to outfile
fp_outfile = open('ip.ipset', "w")

ptrnSplit = re.compile(' - | , ')

# Write ipset header to outfile
fp_outfile.write('-N cidr nethash --maxelem 260000\n-N single iphash --maxelem 60000\n')

for line in fp_source:
	
	# parse on ' - ' and ' , '
	s = re.split(ptrnSplit, line)
	
	# sanitize ip: 001.004.000.107 --> 1.4.0.107 to avoid netaddr err.
	ip = [ sanitize(v) for v in s[:2] ]
	
	# conversion ip range to CIDR netblocks
	# single ip in range
	if ip[0] == ip[1]:
		fp_outfile.write('-A single %s\n' % ip[0])
		
	# multiple ip's in range
	else:
		ipCidr = netaddr.IPRange(ip[0], ip[1])
		for cidr in ipCidr.cidrs():
			fp_outfile.write('-A cidr %s\n' % cidr)

fp_outfile.write('COMMIT\n')

# close both files so the output is flushed to disk
fp_source.close()
fp_outfile.close()

Time to process the 280K ip ranges: 4 minutes.
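
For readers who cannot install netaddr, the same conversion can probably be done with the standard library's ipaddress module and its summarize_address_range() function. What follows is only a minimal sketch, not the script above: the helper name range_to_ipset_lines is made up for illustration, and it assumes the same ' - ' and ' , ' separators as the input file. I have not timed it against the 280K line file.

```python
import ipaddress
import re

# same separators as the input file: ' - ' and ' , '
SPLIT = re.compile(r' - | , ')

def sanitize(ip):
    # 001.000.064.000 -> 1.0.64.0 (some Python versions reject leading zeros)
    return '.'.join(str(int(part)) for part in ip.split('.'))

def range_to_ipset_lines(line):
    """Hypothetical helper: turn one input line into ipset '-A' lines."""
    first, last = (sanitize(v) for v in SPLIT.split(line.strip())[:2])
    # range with a single IP
    if first == last:
        return ['-A single %s' % first]
    # range with multiple IPs: let the stdlib compute the covering netblocks
    nets = ipaddress.summarize_address_range(
        ipaddress.IPv4Address(first), ipaddress.IPv4Address(last))
    return ['-A cidr %s' % net for net in nets]

# two lines taken from the sample input above
for sample in ('001.000.064.000 - 001.000.127.255 , 000 , XXXXX',
               '001.000.245.123 - 001.000.245.123 , 000 , YYYYYY YYYYY'):
    print('\n'.join(range_to_ipset_lines(sample)))
```

The header lines and the final COMMIT would still have to be written around these lines, exactly as in the netaddr version.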



As I found that time to be on the high side, and having a couple of days off, I decided to give awk another try:
Code:
@include "lib_netaddr.awk"

function sanitize(ip,   slice) {
	split(ip, slice, ".")
	return slice[1]/1 "." slice[2]/1 "." slice[3]/1 "." slice[4]/1
}

BEGIN{
	FS=" , | - "
	# print already appends a newline, so no trailing \n here
	print "-N cidr nethash --maxelem 260000\n-N single iphash --maxelem 60000"
}

# sanitize ip's
{$1 = sanitize($1); $2 = sanitize($2)}

# range with a single IP
$1==$2 {printf "-A single %s\n", $1} 

# ranges with multiple IP's
$1!=$2{print range2cidr(ip2dec($1), ip2dec($2))}

# footer
END {print "COMMIT"}

lib_netaddr.awk
Code:
#
#    Library with various ip manipulation functions
#

# convert ip ranges to CIDR notation
# str range2cidr(ip2dec("192.168.0.15"), ip2dec("192.168.5.115"))
#
# Credit to Chubler_XL for this brilliant function. (see his post below for non GNU awk)
#
function range2cidr(ipStart, ipEnd,  bits, mask, newip, result) {
    bits = 1
    mask = 1
    result = "-A cidr "
    while (bits < 32) {
        newip = or(ipStart, mask)
        if ((newip>ipEnd) || ((lshift(rshift(ipStart,bits),bits)) != ipStart)) {
           bits--
           mask = rshift(mask,1)
           break
        }
        bits++
        mask = lshift(mask,1)+1
    }
    newip = or(ipStart, mask)
    bits = 32 - bits
    result = result dec2ip(ipStart) "/" bits
    if (newip < ipEnd) result = result "\n" range2cidr(newip + 1, ipEnd)
    return result
}

# convert dotted quads to long decimal ip
#	int ip2dec("192.168.0.15")
#
function ip2dec(ip,   slice) {
	split(ip, slice, ".")
	return (slice[1] * 2^24) + (slice[2] * 2^16) + (slice[3] * 2^8) + slice[4]
}

# convert decimal long ip to dotted quads
#	str dec2ip(1171259392)
#
function dec2ip(dec,    ip, quad, i) {
	for (i=3; i>=1; i--) {
		quad = 256^i
		ip = ip int(dec/quad) "."
		dec = dec%quad
	}
	return ip dec
}


# convert decimal ip to binary
#	str dec2binary(1171259392)
#
function dec2binary(dec,    bin) {
	while (dec>0) {
		bin = dec%2 bin
		dec = int(dec/2)
	}
	return bin
}

# Convert binary ip to decimal
#	int binary2dec("1000101110100000000010011001000")
#
function binary2dec(bin,   slice, l, dec, i) {
	split(bin, slice, "")
	l = length(bin)
	for (i=l; i>0; i--) {
		dec += slice[i] * 2^(l-i)
	}
	return dec
}

# convert dotted quad ip to binary
#	str ip2binary("192.168.0.15")
#
function ip2binary(ip) {
	return dec2binary(ip2dec(ip))
}


# count the number of ip's in a dotted quad ip range (both ends included)
#	int countQuadIp("192.168.0.0", "192.168.1.255")
#
function countQuadIp(ipStart, ipEnd) {
	return (ip2dec(ipEnd) - ip2dec(ipStart)) + 1
}


# count the number of ip's in a CIDR block
#	int countCidrIp ("192.168.0.0/12")
#
function countCidrIp (cidr) {
	sub(/.+\//, "", cidr)
	return 2^(32-cidr)
}

Time to process: 16 sec. A whopping 15 times faster! Not bad for a language that dates back to 1977! And it's even faster with mawk: 7 sec.

Please note that the @include only works with gawk. If you are using the original awk or the lightning fast mawk, you will have to copy/paste the functions library into your main script.
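
To make the idea behind range2cidr() easier to follow, here is a rough Python transcription of the same greedy approach (it is not Chubler_XL's code). At each step it takes the largest block that starts at the current address, is aligned on its own size and does not run past the end of the range, then carries on from the first address after that block. The names range_to_cidrs, ip2dec and dec2ip are illustrative only, and the standard ipaddress module is used purely as a cross-check.

```python
import ipaddress

def ip2dec(ip):
    # dotted quad -> 32-bit integer
    a, b, c, d = (int(x) for x in ip.split('.'))
    return (a << 24) | (b << 16) | (c << 8) | d

def dec2ip(dec):
    # 32-bit integer -> dotted quad
    return '.'.join(str((dec >> shift) & 255) for shift in (24, 16, 8, 0))

def range_to_cidrs(start, end):
    """Greedy split of the inclusive integer range [start, end] into CIDR blocks."""
    out = []
    while start <= end:
        bits = 0  # number of host bits in the candidate block
        # grow the block while it stays aligned on start and inside the range
        while bits < 32:
            size = 1 << (bits + 1)
            if start % size != 0 or start + size - 1 > end:
                break
            bits += 1
        out.append('%s/%d' % (dec2ip(start), 32 - bits))
        start += 1 << bits  # first address after the block just emitted
    return out

# cross-check against the standard library on the range used in the library comment
first, last = '192.168.0.15', '192.168.5.115'
mine = range_to_cidrs(ip2dec(first), ip2dec(last))
ref = [str(n) for n in ipaddress.summarize_address_range(
    ipaddress.IPv4Address(first), ipaddress.IPv4Address(last))]
print(mine == ref)
print('\n'.join(mine))
```

The awk version does the same thing recursively, with or(), lshift() and rshift() in place of the modulo and addition used here.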

If you find this awk library useful, or if you think it can be optimized, let me know before I submit it to the Tips & Tutorials section.

Last edited by ripat; 09-04-2013 at 07:46 AM.. Reason: Inclusion of Chuble_XL's range2cidr() function