Script: Removing HTML tags and duplicate lines
1. The problem statement, all variables and given/known data:
You will write a script that will remove all HTML tags from an HTML document and remove any consecutive duplicate lines, and save it as a text document. The user should have the option of including the name of an html file as an argument for the script, but if none is provided, then the script should prompt the user for the file name.
Something is wrong with this script, but I can't figure out what it is. It says it can't find the file even though the file exists. I also have no idea how to redirect the cleaned output to a new text document properly.
2. Relevant commands, code, scripts, algorithms:
3. The attempts at a solution (include all code and scripts):
#!/bin/bash
read -p "What file would you like to clean? " html
if [ -f "$html" ]
then
    sed 's/<[^>]*>//g' "$html" | uniq > syllabus.html.txt
else
    echo "File does not exist."
fi
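For comparison, here is a minimal sketch of how the pieces of the assignment could fit together: take the file name from the first argument, fall back to a prompt when none is given, strip the tags, drop consecutive duplicate lines, and redirect the result to a text file. The function name clean_html, the choice of naming the output after the input plus ".txt", and the demo file contents are all assumptions made for illustration, not something the assignment specifies. The sed expression only removes tags that open and close on the same line.

```shell
#!/bin/bash
# Sketch only: clean_html is a hypothetical helper name.

clean_html() {
    local html=${1:-}

    # No argument given: fall back to prompting for the file name.
    if [ -z "$html" ]; then
        read -r -p "What file would you like to clean? " html
    fi

    # Quoting "$html" keeps the test meaningful even if the name is empty.
    if [ ! -f "$html" ]; then
        echo "File does not exist: $html" >&2
        return 1
    fi

    # Strip tags that open and close on one line, drop consecutive
    # duplicate lines, and write the result next to the input file.
    sed 's/<[^>]*>//g' "$html" | uniq > "$html.txt"
    echo "Cleaned text saved to $html.txt"
}

# Small demonstration on a throwaway file (the sample content is made up).
demo_dir=$(mktemp -d)
printf '<h1>Syllabus</h1>\n<p>Week 1</p>\n<b>Week 1</b>\nWeek 2\n' > "$demo_dir/demo.html"
clean_html "$demo_dir/demo.html"
cat "$demo_dir/demo.html.txt"
rm -rf "$demo_dir"

# In a real script the demonstration would be replaced by:  clean_html "$@"
```

With the demonstration swapped for the final call, running the script with a file name cleans that file directly, and running it with no arguments asks for the name first.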
4. Complete Name of School (University), City (State), Country, Name of Professor, and Course Number (Link to Course):
Jackson Community College, Jackson MI, USA, M. Brinkman, CIS106
Last edited by tburns517; 03-15-2013 at 03:22 AM..
When I run the script and type in the name of the file to be cleaned (in this case syllabus.html), I press Enter and the cursor just moves to the next blank line, while "uniq" appears in the window's title bar. The new file syllabus.html.txt does get created, but nothing is written to it.
Last edited by tburns517; 03-15-2013 at 02:25 PM..
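One possible explanation for the hang, assuming the variable ended up empty when the test ran (the post does not confirm this): with an unquoted empty variable, [ -f $html ] collapses to [ -f ], which only checks that the string "-f" is non-empty and therefore succeeds. sed is then started with no file name, so it waits on standard input, and uniq sits in the pipeline behind it writing nothing to the output file. The short check below shows the difference quoting makes; the empty value is set by hand purely for illustration.

```shell
#!/bin/bash
# Assumption for illustration: the variable ended up empty.
html=""

# Unquoted: the shell drops the empty word, leaving [ -f ], which only
# checks that the string "-f" is non-empty, so the test succeeds.
if [ -f $html ]; then
    unquoted="true"
else
    unquoted="false"
fi

# Quoted: [ -f "" ] really asks whether a file named "" exists.
if [ -f "$html" ]; then
    quoted="true"
else
    quoted="false"
fi

echo "unquoted test: $unquoted"
echo "quoted test:   $quoted"
```

If that is what is happening, quoting the variable everywhere it is used should make the script print the "File does not exist." message instead of hanging.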