11-03-2010
Selectively Find/Replace in a file?
I have a file that contains HTML markup. Each line looks something like this:
<a href=http://link.com/username.aspx>username </a> more info.. <a href=http://link.com/info1.aspx>info1</a> more code... <a href=http://link.com/info2.aspx>info2</a>
I really have one goal: to clean up the file so that I can more easily parse this info into a PHP application. I'm more familiar with PHP programming than with grep, sed and the like, but I thought I would try to clean it up using a bash script.
So I would like to get rid of the HTML tags and replace them with more meaningful, cleaner info. Basically I want each line to look like this:
USERNAME-username INFO-info1, info2
This would make it easy for me to import those values into variables and arrays in PHP. I've tried messing around with grep and sed, but I can't come up with anything that works. Any ideas?
Thanks a lot for your help!
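One possible approach, as a rough sketch only: it uses awk rather than sed, because pulling out several link texts per line is easier with awk's match() loop. It assumes that the first link on each line is always the username, that every later link on the same line is an info item, and that the link text itself contains no nested tags. The function name clean_links is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: reduce each line of link-laden HTML to
#   USERNAME-<first link text> INFO-<second link text>, <third link text>, ...
# Assumption: the first link on a line is the username, the rest are info items.
clean_links() {
  awk '{
    n = 0; out = ""; line = $0
    # Find each ">text</a>" run, left to right.
    while (match(line, />[^<]*<\/a>/)) {
      # Drop the leading ">" and the trailing "</a>" (5 characters in total).
      text = substr(line, RSTART + 1, RLENGTH - 5)
      gsub(/^[ \t]+|[ \t]+$/, "", text)      # trim surrounding blanks
      line = substr(line, RSTART + RLENGTH)  # continue after this link
      n++
      if (n == 1)      out = "USERNAME-" text
      else if (n == 2) out = out " INFO-" text
      else             out = out ", " text
    }
    if (n > 0) print out   # lines without any link are skipped
  }' "$@"
}

# Example call (file names are placeholders):
#   clean_links input.html > cleaned.txt
```

With no file argument the function reads standard input, so it can also sit at the end of a pipeline. If the real file has lines where the username is not the first link, the n == 1 rule would need to change.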