Post 302357507 by solitare123 on Wednesday 30th of September 2009 03:37:45 AM
Parsing a file which contains urls from different sites

Hi

I have a file which contains millions of URLs from different sites. The file has 4,000,000 lines. A sample:
Code:
http://www.chipchick.com/2009/09/usb_hand_grenade.html
http://www.engadget.com/page/5
http://www.mp3raid.com/search/download-mp3/20173/michael_jackson_fall_again_instrumental.html
http://www.myacrobatpdf.com/8713/canon-speedlite-430ex-manual.html
http://www.mobileheart.com/cell-phone-screensavers/1167-Sony-Ericsson-W200-Screensavers.aspx
http://www.india-forums.com/forum_posts.asp?TID=1256207&TPN=2
http://gallery.mobile9.com/f/923680
http://www.phoronix.com/scan.php?page=article&item=xorg_vdpau_vaapi&num=1
http://www.experts-exchange.com/Software/Photos_Graphics
http://www.jigzone.com/mpc/expired.php
http://ultimatetop200.com/
http://www.mp3raid.com/search/for/the_maine/4.html
http://gallery.mobile9.com/f/907594?view=download
http://gallery.mobile9.com/f/907594
http://www.imdb.com/title/tt0813715/board/thread/147969365
http://www.imdb.com/name/nm0002028

I want a command or some code which can give me the count of URLs for each individual site, e.g. imdb, experts-exchange, gallery.mobile9.
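One possible way to get such per-site counts is sketched below in Python. It assumes one URL per line, that "site" means the host name with a leading `www.` removed, and that lines without a scheme such as `http://` can be skipped. The function names `site_of` and `count_sites` are made up for this illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def site_of(url):
    """Return the lowercased host of a URL without a leading 'www.'.

    Returns an empty string when no host can be extracted.
    """
    try:
        host = urlsplit(url.strip()).hostname or ""
    except ValueError:
        # urlsplit rejects some malformed URLs; treat them as having no host.
        return ""
    return host[4:] if host.startswith("www.") else host


def count_sites(lines):
    """Count URLs per site from any iterable of lines (a list or an open file)."""
    counts = Counter()
    for line in lines:
        site = site_of(line)
        if site:  # skip blank lines and lines with no recognisable host
            counts[site] += 1
    return counts


# A few URLs taken from the sample above.
sample = [
    "http://www.imdb.com/title/tt0813715/board/thread/147969365",
    "http://www.imdb.com/name/nm0002028",
    "http://gallery.mobile9.com/f/923680",
    "http://gallery.mobile9.com/f/907594?view=download",
    "http://gallery.mobile9.com/f/907594",
    "http://www.experts-exchange.com/Software/Photos_Graphics",
]

# Print the sites from most to least frequent.
for site, n in count_sites(sample).most_common():
    print(n, site)
```

For the real file, an open file handle can be passed in place of the list, e.g. `count_sites(open("urls.txt"))`, where `urls.txt` is a placeholder name. The file is then read line by line, so the 4,000,000 lines never have to be held in memory at once; only one counter per distinct site is kept.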

Last edited by radoulov; 09-30-2009 at 07:33 AM. Reason: please use code tags
 
