OK, what I have is 8 separate files, split by how many IPs are associated with each domain. I want to limit duplication of IPs within each individual file (i.e. I don't care if the same IP is in both the 1 IP file and the 4 IP file). Can someone help me with how to go through the files and remove lines that have an IP already appearing on a previous line (within that same file)? In the 2 IP file, the What and Where lines should be removed, since each has an IP that is on the Who line. In the 4 IP file, once the Spock line is removed (it has an IP from Kirk), that file is then OK, since the McCoy line is no longer duplicating 10.100.200.200. Hopefully this is understandable (it makes sense to me).
I have been able to get rid of domains that have all the same IPs associated by using sort -k? -k? -u to ignore the first field, but I can't figure out how to take a single IP from a line and test it against the other lines.
I'm doing this on a CentOS 5.8 box in a bash script (there's a whole lot more processing going on around this portion).
EXAMPLES: colon-separated lines in each file, in the form
domain.com:ip:ip:
1 IP file
Any.com:192.168.10.100:
Where.edu:192.168.10.200:
2 IP file
Who.com:192.168.10.300:192.168.10.200:
What.gov:10.0.0.150:192.168.10.300:
Where.biz:192.168.10.200:10.10.0.10:
When.tv:192.168.10.10:192.168.10.11:
4 IP file
Kirk.ufp:10.0.100.100:10.0.200.100:10.0.200.200:10.0.100.200:
Spock.vsa:10.100.100.100:10.100.100.200:10.100.200.200:10.0.100.100
Mccoy.ama:10.100.200.200:192.168.200.200:192.168.100.200:192.168.100.201
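Since this is part of a bash script, here is one possible sketch using awk: it keeps a table of every IP already seen in the current file and drops any later line containing one of them. The function name dedup_ips and the file names in the usage comment are made up for illustration, so adjust them to your setup:

```shell
# dedup_ips: print only those lines of a file whose IP fields (2..NF)
# have not appeared on an earlier KEPT line of the same file
dedup_ips() {
    awk -F: '{
        for (i = 2; i <= NF; i++)
            if ($i != "" && $i in seen) next   # duplicate IP: skip this line
        for (i = 2; i <= NF; i++)
            if ($i != "") seen[$i] = 1         # remember the IPs on this line
        print
    }' "$1"
}

# example usage -- file names are placeholders:
# for f in 1ip.txt 2ip.txt 4ip.txt 8ip.txt; do
#     dedup_ips "$f" > "$f.new" && mv "$f.new" "$f"
# done
```

Note that `next` fires before the line's IPs are recorded, so a dropped line never blocks later lines; that is exactly why McCoy survives once Spock is removed. The `$i != ""` test skips the empty field produced by the trailing colon.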