Post 302641875 by chamb1 on Wednesday 16th of May 2012 03:23:52 PM
Trying to extract domain and tld from list of urls.

I have searched the threads quite a bit, but I have not been able to cobble together a solution to my challenge. I want to edit a file line by line so that each entry in a long list of URLs is reduced to just its domain and TLD. The list looks something like this:
Code:
www.google.com
ja.wikipedia.org
bbc.co.uk
fr-fr.facebook.com

and I would like to end up with:
Code:
google.com
wikipedia.org
bbc.co.uk
facebook.com

I prefer bash, but I am also learning Ruby and Perl, though I am not very good at them yet. I used Ruby's URI library to extract the input hostnames above. Is there another Ruby function I am overlooking that returns just domain.tld?

Thanks!
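
One possible approach, sketched in Ruby: since Ruby's standard URI library only gives the full host, the split into domain.tld has to come from knowing which suffixes span two labels (such as co.uk). The sketch below assumes a small hand-written suffix list; `MULTI_PART_SUFFIXES` and `registrable_domain` are hypothetical names used for illustration, and the list is deliberately incomplete. A robust solution would consult the full Public Suffix List (the public_suffix gem is one option) rather than a hard-coded array.

```ruby
# Illustrative, incomplete list of two-label suffixes (an assumption).
# A real script would load the complete Public Suffix List instead.
MULTI_PART_SUFFIXES = %w[co.uk org.uk ac.uk com.au co.jp].freeze

# Hypothetical helper: reduce a hostname to domain.tld.
def registrable_domain(host)
  labels = host.strip.downcase.split(".")
  # Nothing to strip when the host is already two labels or fewer.
  return labels.join(".") if labels.length <= 2

  # Keep three labels when the last two form a known multi-part suffix,
  # otherwise keep only the last two.
  keep = MULTI_PART_SUFFIXES.include?(labels.last(2).join(".")) ? 3 : 2
  labels.last(keep).join(".")
end

# Sample hostnames taken from the list in the post.
hosts = %w[www.google.com ja.wikipedia.org bbc.co.uk fr-fr.facebook.com]
hosts.each { |h| puts registrable_domain(h) }
```

For the four sample hostnames this prints google.com, wikipedia.org, bbc.co.uk and facebook.com. Any suffix missing from the list (for example a country-specific one) would be cut to two labels, which is the main limitation of the hard-coded approach.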
 
