Hello everybody
I have been trying to extract the domain name from the BIND query log with different options, but I always get stuck on domains that end in suffixes like .co.uk or .co.nz.
I tried the following, but it only provides the first level:
Current raw file:
Desired output file:
Is it possible to get the domain names through a command or must the list be compared to another file that contains a list of all domains on the internet?
Moderator's Comments:
Please do not use FONT and SIZE tags when posting to The UNIX & Linux Forums.
Please use CODE tags; not ICODE tags for multi-line sample input, output, and code.
Last edited by Don Cragun; 11-01-2015 at 06:13 PM..
Reason: Change ICODE tags to CODE tags, get rid of FONT and SIZE tags.
You have to compare to another list that defines the sub-domains.
Follow this link, if still active, for more information.
A compilation list can be found here.
Thank you for the response, Aia. However, that post is quite old, does not seem to be active anymore, and does not contain a solution as such. Also, thank you for the publicsuffix list; it is very helpful and has given me a new possible approach to the challenge.
Unfortunately I am very new to the shell scripting world and would appreciate assistance in this regard. Here is the idea:
The URL is longer than the items in the publicsuffix list, and its parts are separated by ".". Is there a possibility to grep or search the URL starting from the right-hand side and find the most accurate match? Let me provide an example:
Starting from the right-hand side, matching against the publicsuffix list:
publicsuffix list for uk:
URL lookup:
-Match
-Match
-No Match
When "No Match" is returned, take the last matching suffix (co.uk) and add one more segment of the URL to end up with bbci.co.uk.
Would it be possible to script this?
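The right-to-left matching described above could be sketched roughly as follows. This is only a minimal sketch, not the code posted in this thread. It assumes a suffix file with one entry per line (lines starting with "//" are treated as comments) and a host file with one name per line; the function name registrable_domain and both file names are made up for illustration. Instead of stopping at the first "No Match", it remembers the longest suffix that matched and then keeps one more segment. Wildcard ("*.") and exception ("!") entries in the real publicsuffix list are not handled.

```shell
#!/bin/sh
# Hypothetical helper: registrable_domain SUFFIX_FILE [HOST_FILE]
# Reads host names (one per line) from HOST_FILE, or from stdin if omitted,
# and prints the longest matching suffix plus one more label for each.
registrable_domain() {
    awk '
        # First file: load every suffix into a lookup table,
        # skipping blank lines and "//" comment lines.
        FILENAME == ARGV[1] {
            if (NF && $0 !~ /^\/\//) suffix[$1] = 1
            next
        }
        NF == 0 { next }
        {
            n = split(tolower($1), label, ".")
            candidate = ""
            longest = 0
            # Walk from the right-hand side, growing the candidate by one
            # label per step and remembering the longest one in the list.
            for (i = n; i >= 1; i--) {
                candidate = (candidate == "") ? label[i] : (label[i] "." candidate)
                if (candidate in suffix) longest = n - i + 1
            }
            # Keep the matched suffix plus one label. If nothing matched,
            # this falls back to the last label only.
            keep = longest + 1
            if (keep > n) keep = n
            out = label[n - keep + 1]
            for (i = n - keep + 2; i <= n; i++) out = out "." label[i]
            print out
        }
    ' "$1" "${2:--}"
}
```

With a suffix file containing uk and co.uk, running `echo www.bbci.co.uk | registrable_domain suffixes.txt` should print bbci.co.uk. Piping the result through `sort -u` would give each domain once.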
Thank you, RudiC. Could I kindly ask you to elaborate on the code? As mentioned before, I am very new to this. I have two files: one that contains the URLs and another that contains the publicsuffix list. Thank you.
RudiC, thank you very much for providing this solution; it is truly appreciated. I checked through the publicsuffix list and found that the longest entry has 4 levels, so I added this to the script you provided. Now it works and provides all the different domains. Here is the code I am now using: