Full Discussion: Replacing HTML tags with sed
Post 302852867 by Corona688 on Thursday 12th of September 2013 02:08:25 PM
sed uses regexes, not globs, which explains part of your difficulty.

Code:
files*

In shell globbing, * means "any string of characters". In a regex, it means "zero or more of the previous item". So the regex files* would match file, files, filess, filesssssssssssssssss, but the * would never cover the backslash in files\
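
If you want to see the difference for yourself, here is a rough sketch. The helper names glob_match and regex_match and the sample names are made up for this illustration. case patterns use shell globbing, while grep -x applies a regex that has to cover the whole line.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Glob match: case patterns use shell globbing, so * means "any string".
glob_match() {
    case "$2" in
        $1) echo yes ;;
        *)  echo no ;;
    esac
}

# Regex match: grep -x requires the regex to match the entire line,
# so * means "zero or more of the previous character".
regex_match() {
    if printf '%s\n' "$2" | grep -qx -- "$1"; then
        echo yes
    else
        echo no
    fi
}

# Try the same pattern text both ways against a few made-up names.
for name in file files filesss 'files\photo.jpg'; do
    printf '%-16s glob=%s regex=%s\n' "$name" \
        "$(glob_match 'files*' "$name")" \
        "$(regex_match 'files*' "$name")"
done
```

The two disagree at both ends: file satisfies the regex but not the glob, and files\photo.jpg satisfies the glob but not the regex.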

The regex equivalent would be .*, where . is a special character meaning "match any single character". But I'd try something a little trickier and match only up to >, so that part of the regex doesn't scan outside the tag it started in. [] lets you specify a set or range of characters to include or exclude. [A-Z] would match a single letter in the A-Z range. [^A-Z] would match a single character not in the A-Z range. [^>] would match any single character that's not an end-of-tag character.

So, [^>]* would match zero or more non-> characters, swallowing up the rest of the tag and stopping right before >.
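
To show why that matters, here is a small sketch comparing the two on a made-up line with two tags on it. The input line and the [A] placeholder are assumptions for the demo, not taken from the original HTML.

```shell
#!/bin/sh
# Made-up input line with more than one tag on it.
line='<a href="x"><b>text</b>'

# .* is greedy: it runs to the LAST > on the line, eating everything.
greedy=$(printf '%s\n' "$line" | sed 's#<a .*>#[A]#')

# [^>]* cannot cross a >, so the match ends with the tag it began in.
bounded=$(printf '%s\n' "$line" | sed 's#<a [^>]*>#[A]#')

printf 'greedy:  %s\n' "$greedy"
printf 'bounded: %s\n' "$bounded"
```

The greedy version replaces the whole line, while the bounded version replaces only the opening <a ...> tag and leaves <b>text</b> alone.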

This works on the HTML you posted:
Code:
sed 's#<a href="files\\[^>]*><img src="thumbnails\\[^>]*></a>#<img src=redact.png>#g'
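
A quick way to try it, as a sketch: the sample line below is hypothetical (the file name cat.jpg and the surrounding <p> are invented), since the HTML from the original question is not reproduced here. It only assumes the links use backslashes after files and thumbnails, as the command above does.

```shell
#!/bin/sh
# Hypothetical sample line; not the HTML from the original question.
html='<p><a href="files\cat.jpg"><img src="thumbnails\cat.jpg"></a></p>'

# Same sed command as above, fed the sample line on standard input.
# Inside single quotes, \\ reaches sed as an escaped, literal backslash.
result=$(printf '%s\n' "$html" |
    sed 's#<a href="files\\[^>]*><img src="thumbnails\\[^>]*></a>#<img src=redact.png>#g')

printf '%s\n' "$result"
```

Using # as the delimiter instead of / avoids having to escape the slash in </a>.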

 

Text::Glob(3)          User Contributed Perl Documentation          Text::Glob(3)

NAME
    Text::Glob - match globbing patterns against text

SYNOPSIS
    use Text::Glob qw( match_glob glob_to_regex );

    print "matched\n" if match_glob( "foo.*", "foo.bar" );

    # prints foo.bar and foo.baz
    my $regex = glob_to_regex( "foo.*" );
    for ( qw( foo.bar foo.baz foo bar ) ) {
        print "matched: $_\n" if /$regex/;
    }

DESCRIPTION
    Text::Glob implements glob(3) style matching that can be used to match
    against text, rather than fetching names from a filesystem.

    If you want to do full file globbing use the File::Glob module instead.

    Routines

    match_glob( $glob, @things_to_test )
        Returns the list of things which match the glob from the source list.

    glob_to_regex( $glob )
        Returns a compiled regex which is the equivalent of the globbing
        pattern.

    glob_to_regex_string( $glob )
        Returns a regex string which is the equivalent of the globbing
        pattern.

SYNTAX
    The following metacharacters and rules are respected.

    "*" - match zero or more characters
        "a*" matches "a", "aa", "aaaa" and many many more.

    "?" - match exactly one character
        "a?" matches "aa", but not "a", or "aaa"

    Character sets/ranges
        "example.[ch]" matches "example.c" and "example.h"
        "demo.[a-c]" matches "demo.a", "demo.b", and "demo.c"

    alternation
        "example.{foo,bar,baz}" matches "example.foo", "example.bar", and
        "example.baz"

    leading . must be explicitly matched
        "*.foo" does not match ".bar.foo". For this you must either specify
        the leading . in the glob pattern (".*.foo"), or set
        $Text::Glob::strict_leading_dot to a false value while compiling the
        regex.

    "*" and "?" do not match /
        "*.foo" does not match "bar/baz.foo". For this you must either
        explicitly match the / in the glob ("*/*.foo"), or set
        $Text::Glob::strict_wildcard_slash to a false value while compiling
        the regex.

BUGS
    The code uses qr// to produce compiled regexes, therefore this module
    requires perl version 5.005_03 or newer.

AUTHOR
    Richard Clamp <richardc@unixbeard.net>

COPYRIGHT
    Copyright (C) 2002, 2003, 2006, 2007 Richard Clamp. All Rights Reserved.

    This module is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

SEE ALSO
    File::Glob, glob(3)

perl v5.16.2                      2013-08-25                      Text::Glob(3)