Parsing a file which contains urls from different sites
Hi
I have a file that contains millions of URLs from different sites. The file has 4,000,000 lines.
I want a command or some code that can give me the count of URLs for each individual site, e.g. imdb, experts-exchange, gallery.mobile9
Last edited by radoulov; 09-30-2009 at 07:33 AM..
Reason: please use code tags
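One way to approach the per-site count is sketched below in Python. It assumes one URL per line, treats the host name as the "site", and strips a leading `www.`; the function names `site_of` and `count_sites` are made up for illustration and are not from the thread.

```python
from collections import Counter
from urllib.parse import urlsplit


def site_of(url):
    """Return the lower-cased host of a URL, without a leading 'www.'."""
    url = url.strip()
    # Assumption: lines without a scheme (e.g. "www.imdb.com/x") are still URLs.
    if "://" not in url:
        url = "//" + url
    try:
        host = urlsplit(url).hostname or ""
    except ValueError:
        # Malformed input (e.g. broken IPv6 brackets) is counted under "".
        host = ""
    return host[4:] if host.startswith("www.") else host


def count_sites(lines):
    """Count URLs per site; works on any iterable of lines, including an open file."""
    counts = Counter()
    for line in lines:
        if line.strip():
            counts[site_of(line)] += 1
    return counts
```

Because `count_sites` reads its input line by line, passing an open file object keeps memory use proportional to the number of distinct sites rather than to the 4,000,000 lines.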
I need to archive a large website onto a DVD. Many of the links and image srcs are absolute URLs. As I don't want to alter them all manually, I'm looking for a perl or unix command that would remove:
http://www.mydomain.com/mysubfolder/
and replace with:
./
Can anyone help me with this... (3 Replies)
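A minimal Python sketch of the substitution described above: a plain string replacement of the absolute prefix with `./`. The function name `relativize` is hypothetical, and the sketch assumes every page sits directly in `mysubfolder`; pages in deeper folders would need `../` prefixes instead.

```python
def relativize(text,
               prefix="http://www.mydomain.com/mysubfolder/",
               replacement="./"):
    """Replace every occurrence of the absolute prefix with a relative one."""
    # Plain string replacement: no regex escaping of dots or slashes is needed.
    return text.replace(prefix, replacement)
```

For example, `relativize('<img src="http://www.mydomain.com/mysubfolder/img/a.png">')` gives `<img src="./img/a.png">`.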
Hey guys,
I have a file that I generated myself, and I want to create some HTML output from it.
The problem is that I am really confused about how to go about reading the file.
The file is in the following format:
TID1 Name1 ATime=xx AResult=yyy AExpected=yyy BTime=xx BResult=yyy... (8 Replies)
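A rough Python sketch of reading such a line and turning it into an HTML table row. It assumes whitespace-separated fields, an ID and a name first, then `key=value` pairs, with no spaces inside names or values; the rest of the format is cut off in the post, so this is only a guess at the visible part. The function names are invented for illustration.

```python
from html import escape


def parse_record(line):
    """Split one record into (id, name, {key: value}) under the stated assumptions."""
    tid, name, *pairs = line.split()
    fields = dict(p.split("=", 1) for p in pairs if "=" in p)
    return tid, name, fields


def to_html_row(line):
    """Render one record as a <tr> with one cell per field."""
    tid, name, fields = parse_record(line)
    cells = [tid, name] + [f"{key}: {value}" for key, value in fields.items()]
    return "<tr>" + "".join(f"<td>{escape(c)}</td>" for c in cells) + "</tr>"
```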
Hi everyone. I have an html file with lines like so:
<link href="localFolder/...">
<link href="http://...">
<img src="localFolder/...">
<img src="http://...">
I want to remove the links with http in their href and the imgs with http in their src. I'm having trouble removing them because there... (4 Replies)
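A hedged Python sketch of one way to drop such tags with a regular expression. It assumes well-formed `<link ...>` and `<img ...>` tags with double-quoted attributes that fit on one line; for arbitrary HTML a real parser would be safer. The names `REMOTE_TAG` and `drop_remote_tags` are hypothetical.

```python
import re

# Matches a <link> or <img> tag whose href/src value starts with http:// or https://.
REMOTE_TAG = re.compile(
    r'<(?:link|img)\b[^>]*\b(?:href|src)\s*=\s*"https?://[^"]*"[^>]*>',
    re.IGNORECASE,
)


def drop_remote_tags(html):
    """Remove link/img tags that point at an http(s) URL; local ones are kept."""
    return REMOTE_TAG.sub("", html)
```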
So, I am writing a script that will read output from Bulk Extractor (which gathers data based on regular expressions). My script then reads the column that contains the URL found, hashes it with MD5, then outputs the URL and hash to a file.
Where I am stuck is that I want to read the bulk... (7 Replies)
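The hashing step could look something like the Python sketch below. It assumes tab-separated input with the URL in the second column and `#` comment lines to skip; the column index and the function name `hash_urls` are assumptions, not something the post states.

```python
import hashlib


def hash_urls(lines, column=1):
    """Yield (url, md5_hex) for each data line; 'column' is the 0-based URL column."""
    for line in lines:
        if line.startswith("#"):
            continue  # skip comment/header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= column:
            continue  # skip lines that lack the URL column
        url = fields[column]
        yield url, hashlib.md5(url.encode("utf-8")).hexdigest()
```

Each pair can then be written out as `url<TAB>hash`, for example with `out.write(f"{url}\t{digest}\n")`.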
I am a total newbie to Apache. I need to do this only for this weekend, during an upgrade from the old system to the new system.
We have different URLs http://domain.name/xxx (xxx varies in length and wording - it can be /home, /login, /home/daily, /daily/report, etc).
How do I redirect all those to... (0 Replies)
Discussion started by: GosarJunk
0 Replies
6. Post Here to Contact Site Administrators and Moderators
Hi,
I tried to post some Perl code for discussion (wrapped in code tags). However, a regex has an escaped backslash, so the forum parser sees it as a URL?
Had the same experience with the sample data that I tried to provide for the same discussion. It contains email addresses,... (1 Reply)
Hi ALL,
I have a file A which contains
A=www.google.com
B=www.abcd.com
C=www.nick.com
D=567
File B contains
A=www.google1234.com
B=www.bacd.com
C=www.mick.com
D=789
I want a script which can replace the contents of file A with the contents of file B (5 Replies)
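The request can be read more than one way. The Python sketch below takes it to mean: for every `KEY=value` line in file A, use the value that file B has for the same key, and leave other lines alone. That reading and the function name `apply_overrides` are assumptions.

```python
def apply_overrides(a_lines, b_lines):
    """Return A's lines with each value replaced by B's value for the same key."""
    overrides = {}
    for line in b_lines:
        key, sep, value = line.rstrip("\n").partition("=")
        if sep:
            overrides[key] = value

    result = []
    for line in a_lines:
        text = line.rstrip("\n")
        key, sep, _ = text.partition("=")
        if sep and key in overrides:
            result.append(f"{key}={overrides[key]}")
        else:
            result.append(text)  # keys missing from B are left unchanged
    return result
```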
Hi,
I am looking for a regex that will validate a URL and files accessed in a browser.
For example:
http://www.google.co.uk
http://www.google.com
https://www.google.co.uk
https://www.google.com
ftp://
file:///somefile/on/a/server/accessed/from/browser/file.txt
So far I have:
... (4 Replies)
Discussion started by: muay_tb
4 Replies
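The poster's own regex is cut off above, so the following is an independent Python sketch, not a repair of it. It accepts `http`, `https` and `ftp` URLs with a host name, plus `file:///` paths. It deliberately rejects a bare `ftp://` with no host, which is an assumption about what "valid" should mean here; `URL_RE` and `looks_like_url` are hypothetical names.

```python
import re

URL_RE = re.compile(
    r"(?:"
    r"(?:https?|ftp)://[A-Za-z0-9.-]+(?::\d+)?(?:/\S*)?"  # scheme, host, optional port and path
    r"|file:///\S+"                                        # local file URL with a non-empty path
    r")"
)


def looks_like_url(text):
    """Return True if the whole string matches one of the accepted URL shapes."""
    return URL_RE.fullmatch(text) is not None
```

This is a shape check only: it does not verify that the host resolves or that the file exists.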
LEARN ABOUT DEBIAN
gallery-uploader
GALLERY-UPLOADER(1) gallery-uploader User Manual GALLERY-UPLOADER(1)
NAME
gallery-uploader - program to upload pictures and video to Gallery
SYNOPSIS
gallery-uploader [FILES]
DESCRIPTION
This manual page documents briefly the gallery-uploader command.
gallery-uploader is a program for easily uploading pictures and video to Gallery installations; Gallery
(http://gallery.menalto.com) is an advanced web photo album organizer.
If FILES are given, they are the files which must be uploaded. If no argument is given instead, a small browser window will open where the
user can select items to upload.
HOW TO ENABLE AS NAUTILUS SCRIPT
Nautilus, GNOME's default file manager, supports scripts, which can be activated on any selected file(s) from the right-button menu;
gallery-uploader plays perfectly the role of a Nautilus script.
To enable it, the user can run in a terminal the command:
ln -s "/usr/share/nautilus-scripts/Gallery Uploader.py" ~/.gnome2/nautilus-scripts/"Gallery Uploader"
or use a program such as nautilus-scripts-manager; see: http://www.pietrobattiston.it/nautilus-scripts-manager.
AUTHOR
Pietro Battiston <me@pietrobattiston.it>
developed the program and wrote this manpage.
COPYRIGHT
Copyright (C) 2010 Pietro Battiston
Permission is granted to copy, distribute and/or modify this document under the terms of the GNU General Public License, Version 2 or (at
your option) any later version published by the Free Software Foundation.
On Debian systems, the complete text of the GNU General Public License can be found in /usr/share/common-licenses/GPL-2.
gallery-uploader 03/29/2011 GALLERY-UPLOADER(1)