Extracting domain names out of a text file
Shell Programming and Scripting, post 302251305 by totus on Sunday 26th of October 2008 02:28:04 PM

I need to extract and list domain names from a very large text file. The file contains names under the TLDs .com, .net, .org and others, as well as third-level domains such as host1.domain.com, and the names are embedded within paragraphs of text.

The domains do not have an http:// prefix, so I'm thinking the only thing to match on is the TLD: for example, match ".com" and extract everything before it, back to the preceding space character.

How would I go about doing this?

grep, sed and awk?

Thank you gurus!

Last edited by totus; 10-26-2008 at 03:45 PM.
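
One way to approach the question is sketched below: split the text into whitespace-separated tokens, strip surrounding punctuation, and keep only tokens that look like a host name ending in a known TLD. This is a minimal sketch, not a definitive answer; the function name extract_domains and the short TLDS list are assumptions made for illustration, and the sample text is invented.

```shell
#!/bin/sh
# Illustrative TLD list (an assumption); extend it to cover the TLDs in your data.
TLDS='com|net|org'

# extract_domains (hypothetical helper): read text on stdin and print the
# unique, lowercased host names ending in one of $TLDS, one per line.
extract_domains() {
    tr -s '[:space:]' '\n' |                        # one whitespace-separated token per line
    tr '[:upper:]' '[:lower:]' |                    # domain names are case-insensitive
    sed 's/^[^[:alnum:]]*//; s/[^[:alnum:]]*$//' |  # strip surrounding punctuation such as "(" or "."
    grep -E "^([a-z0-9-]+\.)+(${TLDS})$" |          # keep only label.label...tld tokens
    sort -u                                         # de-duplicate
}

# Example run on inline sample text (invented for illustration).
printf '%s\n' 'Visit host1.domain.com, or Example.NET today.' \
              'The example.community page and foo.org are unrelated.' |
    extract_domains
```

Anchoring the pattern to the whole token is what keeps a name like example.community from being reported as example.com. Tokens containing other characters, such as user@domain.com, are skipped by this sketch; to run it on a real file, use something like extract_domains < yourfile.txt.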
 

Net::Domain::TLD(3pm)        User Contributed Perl Documentation        Net::Domain::TLD(3pm)

NAME
       Net::Domain::TLD - Work with TLD names

SYNOPSIS
         use Net::Domain::TLD qw(tlds tld_exists);
         my @ccTLDs = tlds('cc');
         print "TLD ok " if tld_exists('ac','cc');

DESCRIPTION
       The purpose of this module is to provide user with current list of available top level domain names including new ICANN additions and ccTLDs. Currently TLD definitions have been acquired from the following sources:

         http://www.icann.org/tlds/
         http://www.dnso.org/constituency/gtld/gtld.html
         http://www.iana.org/cctld/cctld-whois.htm

PUBLIC METHODS
       Each public function/method is described here. These are how you should interact with this module.

   "tlds"
       This routine returns the tlds requested.

         my @all_tlds = tlds;        #array of tlds
         my $all_tlds = tlds;        #hashref of tlds and their descriptions
         my @cc_tlds = tlds('cc');   #array of just 'cc' type tlds
         my $cc_tlds = tlds('cc');   #hashref of just 'cc' type tlds and their descriptions

       Valid types are:

         cc               - country code domains
         gtld_open        - generic domains that anyone can register
         gtld_restricted  - generic restricted registration domains
         new_open         - recently added generic domains
         new_restricted   - new restricted registration domains

   "tld_exists"
       This routine returns true if the given domain exists and false otherwise.

         die "no such domain" unless tld_exists($tld);              #call without tld type
         die "no such domain" unless tld_exists($tld, 'new_open');  #call with tld type

COPYRIGHT
       Copyright (c) 2003-2005 Alex Pavlovic, all rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

AUTHORS
       Alexander Pavlovic <alex.pavlovic@taskforce-1.com>
       Ricardo SIGNES <rjbs@cpan.org>

perl v5.10.1                          2011-04-18                          Net::Domain::TLD(3pm)
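
The same kind of check that tld_exists performs in Perl can be approximated in shell against a plain text list of TLDs. The sketch below assumes a file with one TLD per line; that file, the function name filter_by_tld, and the sample names are all hypothetical and not part of the module.

```shell
#!/bin/sh
# filter_by_tld (hypothetical helper): read candidate names on stdin and
# print only those whose last dot-separated label appears in the TLD list.
#   $1  path to a file with one TLD per line (assumed format)
filter_by_tld() {
    awk '
        # First input (the TLD file): remember each non-blank TLD, lowercased.
        NR == FNR { if (NF) ok[tolower($1)] = 1; next }
        # Second input (stdin): split the name on dots and test its last label.
        {
            n = split(tolower($0), parts, ".")
            if (n > 1 && (parts[n] in ok)) print
        }
    ' "$1" -
}

# Example run with a throwaway TLD list (contents are illustrative only).
tld_file=$(mktemp)
printf '%s\n' com net org > "$tld_file"
printf '%s\n' host1.domain.com readme.txt example.net | filter_by_tld "$tld_file"
rm -f "$tld_file"
```

Matching is case-insensitive, but names are printed as they were read. The sketch expects a non-empty TLD file, since the NR == FNR idiom cannot tell the two inputs apart when the first one is empty.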