Selectively Find/Replace in a file?
Post 302468712 by dragin33 on Wednesday 3rd of November 2010 03:24:21 PM

I have a file that is HTML encoded. Each line looks something like this:
<href=http://link.com/username.aspx>username </a> more info.. <a href=http://link.com/info1.aspx>info1</a> more code... <a href=http://link.com/info2.aspx>info2</a>

I have one goal really: to clean up the file so that I can more easily parse this info into a PHP application. I'm more familiar with PHP programming than with grep/sed and such, but I thought I would try to clean it up using a bash script.

So I would like to get rid of the HTML tags and replace them with more meaningful, cleaner info. Basically I want it to look like this:

USERNAME-username INFO-info1, info2

This would make it easy for me in PHP to import those values into variables and arrays. I've tried messing around with grep and sed but I can't come up with anything. Any ideas?

Thanks a lot for your help!
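
For reference, here is a minimal sketch of one way this could be done with awk called from a bash function. It is not a tested solution for the real file. It assumes that every value of interest is the text sitting directly before a closing </a>, that the first such value on a line is the username, and that all later ones are info fields. The function name clean_line is made up for illustration.

```shell
#!/usr/bin/env bash

# clean_line (hypothetical name): reads HTML-ish lines on stdin and prints
# one "USERNAME-<name> INFO-<a>, <b>, ..." line per input line.
# Assumption: the first link text on a line is the username and every
# following link text is an info value.
clean_line() {
  awk '{
    out = ""; n = 0; rest = $0
    # Repeatedly find ">text</a>" and keep only the text part.
    while (match(rest, />[^<]*<\/a>/)) {
      # Skip the leading ">" (1 char) and the trailing "</a>" (4 chars).
      text = substr(rest, RSTART + 1, RLENGTH - 5)
      # Trim surrounding blanks, e.g. "username " becomes "username".
      gsub(/^[ \t]+|[ \t]+$/, "", text)
      # Continue scanning after the current match.
      rest = substr(rest, RSTART + RLENGTH)
      n++
      if (n == 1)      out = "USERNAME-" text
      else if (n == 2) out = out " INFO-" text
      else             out = out ", " text
    }
    # Lines without any link text are skipped.
    if (n > 0) print out
  }'
}
```

It would be used as a filter, for example `clean_line < input.html > cleaned.txt` (both file names are placeholders). Matching on the closing </a> instead of the opening tag means the malformed first tag in the sample line (<href=... without the "a") does not matter.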
 
