UNIX for Dummies Questions & Answers: Post 302910037 by csim_mohan on Tuesday 22nd of July 2014 06:36:57 AM
Extracting URL with domain

I have a file like this:
Code:
http://article.wn.com/view/2010/11/26/IV_drug_policy_feels_HIV_patients_Red_Cross/      http://aidsjournal.com/,www.cfpa.org.cn/page1/page2 , www.youtube.com

http://seattletimes.nwsource.com/html/jerrybrewer/2013517803_brewer25.html
http://www.moortowntoday.co.uk/your-moortown/Yorkshire-Evening-Post-First-for.6038672.jp        www.yorkshireeveningpost.co.uk/business/1/

I want to keep only the scheme and domain of each URL and drop the path, so that the output looks like this:
Code:
http://article.wn.com        http://aidsjournal.com,www.cfpa.org.cn,www.youtube.com
http://seattletimes.nwsource.com      http://www.moortowntoday.co.uk ,www.yorkshireeveningpost.co.uk

Any suggestions on how to achieve this?
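
One possible approach, sketched in Python rather than a shell one-liner: replace every URL-like token with its optional scheme plus host name and discard everything from the first slash after the host. The names `URL_RE` and `strip_paths` are made up for this illustration, and the pattern is an assumption about what counts as a URL here (an optional `http://` or `https://`, then a dotted host name ending in at least two letters). It will also match other dotted words such as `file.txt`, and it leaves the spaces and commas between URLs exactly as they are in the input.

```python
import re

# Optional scheme, then a dotted host name, then an optional path.
# Only the scheme and host (group 1) are kept in the output.
URL_RE = re.compile(r"((?:https?://)?[A-Za-z0-9.-]+\.[A-Za-z]{2,})(?:/[^\s,]*)?")


def strip_paths(line: str) -> str:
    """Replace every URL-like token in `line` with its scheme and host only."""
    return URL_RE.sub(r"\1", line)


# Lines taken from the sample file in the post (whitespace simplified).
sample = [
    "http://article.wn.com/view/2010/11/26/IV_drug_policy_feels_HIV_patients_Red_Cross/ http://aidsjournal.com/,www.cfpa.org.cn/page1/page2 , www.youtube.com",
    "http://seattletimes.nwsource.com/html/jerrybrewer/2013517803_brewer25.html",
    "http://www.moortowntoday.co.uk/your-moortown/Yorkshire-Evening-Post-First-for.6038672.jp www.yorkshireeveningpost.co.uk/business/1/",
]

for line in sample:
    print(strip_paths(line))
```

To run this against a real file, the loop could iterate over the lines of the file instead of the `sample` list. Tidying the separators (for example removing the spaces around commas) would need an extra step that is not shown here.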

Last edited by csim_mohan; 07-22-2014 at 07:50 AM.
 
