How to remove urls from html files
Post 302630879 by Corona688 on Thursday 26th of April 2012 12:05:54 PM
Code:
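# Delete every double-quoted href="..." attribute; quotes need no backslash inside a single-quoted sed script
sed 's/href="[^"]*"//g' inputfile > outputfile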
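
For anyone who wants to try this before running it on real files, here is a minimal sketch of the same substitution wrapped in two small functions. The names strip_hrefs and blank_hrefs and the example.com sample line are made up for illustration; they are not from the original post. It assumes the attribute is double-quoted and sits on one line; single-quoted or multi-line href values are left alone.

```shell
#!/bin/sh
# strip_hrefs is a hypothetical helper name, not from the original post.
# It reads HTML on stdin and deletes every double-quoted href="..." attribute.
strip_hrefs() {
    sed 's/href="[^"]*"//g'
}

# blank_hrefs (also hypothetical) keeps the attribute but empties the URL.
blank_hrefs() {
    sed 's/href="[^"]*"/href=""/g'
}

# Made-up sample line to show the difference between the two variants.
printf '%s\n' '<a href="http://example.com/page">Example</a>' | strip_hrefs
# prints: <a >Example</a>
printf '%s\n' '<a href="http://example.com/page">Example</a>' | blank_hrefs
# prints: <a href="">Example</a>
```

With the functions defined, the file form would be `strip_hrefs < inputfile > outputfile`. The blank_hrefs variant may be the safer choice if something downstream expects every anchor tag to still carry an href attribute.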

 

Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.