sed script to generate hyperlinks refuses to work


 
# 1  
Old 11-13-2008

Hi All,

I'm new to the forum and not a programmer, but I'm writing a bash script to preprocess definitions of technical terms before the pages are posted to our server. It inserts hyperlinks pointing to other pages in the glossary, using a standard naming convention for the pages. The script searches through a set of text (all the definitions) and inserts a hyperlink whenever it finds one of a list of specific terms (the terms). The script also generates a cleaned-up version of each term for the hyperlink, getting rid of uppercase, characters that aren't valid in filenames, etc. So the idea is to replace each term in the original file with the term wrapped in a hyperlink to the page for that term.

The individual commands used to work (I swear) when I first wrote it, but in refining it I've busted it, and I've struggled for two days now trying to get it to work, both step by step and as a script. I can't figure out how I've broken it. I'm hoping someone will spot the error. It doesn't have to be efficient, it just has to work.

The symptom is that only the hyperlink for the last term is included, instead of all of them.

Here's the (non-working) script as is. It's mainly just search and replace.

Code:
#!/bin/bash
sh --version
debug=":"
debug="echo"
SEQ=/usr/bin/seq
#tr -s '\n' < definitions.txt > temp1.txt
#tr -s ' ' < temp1.txt > definitions.txt
sed -i "s/\./yyyyy/g" definitions.txt
sed -i "s/(/ ( /g" definitions.txt
sed -i "s/\//zzzzz/g" definitions.txt
#sed -i "s/(/( /g" definitions.txt
sed -i "s/,/ , /g" definitions.txt
sed -i "s/)/ ) /g" definitions.txt
tr ' ' '_' < definitions.txt > definitions_underscore
tr -s '\n' < terms > temp2
tr -s ' ' < temp2 > terms
tr A-Z a-z < terms > terms_lowercase
tr -d ' =;:`"<>,./?!@#$%^&(){}[]+~-' < terms_lowercase > terms_url
sed -i "s/\./yyyyy/g" terms
sed -i "s/\//zzzzz/g" terms
tr ' ' '_' < terms > terms_search
a=( $( cat terms_search ) )
b=( $( cat terms_url ) )
$debug " Number of elements in array is $(( ${#a[@]} ))"
for i in $($SEQ 0 $((${#a[@]} - 1)))
do
echo ${a[$i]}
echo ${b[$i]}
sed -i "s/_"${a[$i]}"_/\_\<a\_href="\"${b[$i]}"\.php\"\>"${a[$i]}"\<\/a\> /g" definitions_underscore
done
sed -i "s/__(__/(/g" definitions_underscore.txt
sed -i "s/__,/,/g" definitions_underscore.txt
sed -i "s/__)/)/g" definitions_underscore.txt
sed -i "s/yyyyy/\./g" definitions_underscore.txt
sed -i "s/zzzzz/\//g" definitions_underscore.txt
sed -i "s/yyyyy/\./g" terms.txt
sed -i "s/_\./\./g" terms.txt
sed -i "s/ \./\./g" terms.txt
tr '_' ' ' < definitions_underscore > definitions_linked.html
# 2  
Old 11-14-2008
It's difficult to debug without seeing some sample input data and the expected output data for that input.

I know you're not worried about efficiency, but you could buy a lot of speed improvement and brevity by simply combining your sed commands into one script, e.g. replace:

Code:
sed -i "s/__(__/(/g" definitions_underscore.txt
sed -i "s/__,/,/g" definitions_underscore.txt
sed -i "s/__)/)/g" definitions_underscore.txt
sed -i "s/yyyyy/\./g" definitions_underscore.txt
sed -i "s/zzzzz/\//g" definitions_underscore.txt

with this:

Code:
sed -i "
  s/__(__/(/g
  s/__,/,/g
  s/__)/)/g
  s/yyyyy/\./g
  s/zzzzz/\//g
" definitions_underscore.txt

Are definitions_underscore and definitions_underscore.txt really supposed to be two different files?
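
As a quick way to check that the combined form behaves like the separate commands, here is a minimal sketch that runs both on one line of text. The function names `restore_separate` and `restore_combined` and the sample line are made up for illustration; only the five substitutions come from the script above.

```shell
#!/bin/bash
# Hypothetical sample line containing the placeholders used in the script.
sample='RFCzzzzz1234__(__draft__)__,_pageyyyyyphp'

# Five sed processes, one per substitution (as in the original script).
restore_separate() {
  printf '%s\n' "$1" |
    sed "s/__(__/(/g" |
    sed "s/__,/,/g" |
    sed "s/__)/)/g" |
    sed "s/yyyyy/\./g" |
    sed "s/zzzzz/\//g"
}

# One sed process running the same five substitutions in the same order.
restore_combined() {
  printf '%s\n' "$1" | sed "
    s/__(__/(/g
    s/__,/,/g
    s/__)/)/g
    s/yyyyy/\./g
    s/zzzzz/\//g
  "
}

restore_separate "$sample"
restore_combined "$sample"
```

Both calls should print the same line. The sketch reads from a pipe rather than using `-i`, so it does not touch any of the real files.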
# 3  
Old 11-14-2008
stripped down version with working and real (not-working) data

Hey, thanks for the reply. I've cut this script down to basics (so I'm not worrying for the minute about tidying up the output file, or about missing occasional terms with special characters), but it still doesn't work. The terms and definitions are taken from an Excel spreadsheet for processing, just cut and pasted into text files and stored as unformatted text.

I've written some test data for the script, which is processed OK, but the real data isn't: the hyperlinks are just not added to the output file.

I have attached the stripped-down script, the test data files and their output (working), and a chunk of the real data (not working). Grateful for any insight, I just cannot see where it's going wrong!

Here's the same script, cut and pasted:

Code:
#!/bin/bash
sh --version
debug=":"
debug="echo"
SEQ=/usr/bin/seq
tr ' ' '_' < terms > terms_search
tr A-Z a-z < terms_search > temp1
tr -d ' _=;:`"<>,./?!@#$%^&(){}[]+~-' < temp1 > terms_url
tr ' ' '_' < definitions > definitions_underscore
a=( $( cat terms_search ) )
b=( $( cat terms_url ) )
$debug " Number of elements in array is $(( ${#a[@]} ))"
for i in $($SEQ 0 $((${#a[@]} - 1)))
do
  echo ${a[$i]}
  echo ${b[$i]}
  sed -i "s/_"${a[$i]}"_/\_\<a\_href="\"${b[$i]}"\.php\"\>"${a[$i]}"\<\/a\> /g" definitions_underscore
done

# 4  
Old 11-16-2008
I found when testing with your live data that it encountered a sed script error when it attempted this substitution:

Code:
EAP/AKA_Authentication eapakaauthentication
sed: Function s/_EAP/AKA_Authentication_/\_\<a\_href="eapakaauthentication\.php"\>EAP/AKA_Authentication\<\/a\> /g cannot be parsed.

I fixed that by changing these lines:

Code:
tr ' ' '_' < terms | sed 's?/?\\/?g' > terms_search
tr A-Z a-z < terms_search | tr -d ' _=;:`"<>,./\\?!@#$%^&(){}[]£+~-' > terms_url
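
To illustrate why the escaping matters, here is a minimal sketch that applies one substitution of the same shape to a single line. The helper names `escape_slashes` and `link_term` and the sample text are hypothetical, and the sketch ends the replacement with an underscore instead of the space the script uses. It only handles `/`; other characters that are special to sed (such as `.`, `[` or `&`) would need similar treatment.

```shell
#!/bin/bash
# escape_slashes TERM: hypothetical helper, backslash-escapes every "/"
# so the term cannot end the s/// pattern early.
escape_slashes() {
  printf '%s\n' "$1" | sed 's?/?\\/?g'
}

# link_term TEXT TERM URL: hypothetical helper, wraps _TERM_ in a link
# to URL.php, using the escaped form of the term inside the sed command.
link_term() {
  local text=$1 term url=$3
  term=$(escape_slashes "$2")
  printf '%s\n' "$text" |
    sed "s/_${term}_/_<a_href=\"${url}.php\">${term}<\/a>_/g"
}

# Sample input is made up; the term is the one from the error message.
link_term '_see_EAP/AKA_Authentication_for_details_' \
          'EAP/AKA_Authentication' 'eapakaauthentication'
```

Without the call to `escape_slashes`, sed would see the `/` inside the term as the end of the pattern and reject the command, which is the error shown above.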

Consider adding set -o errexit at the beginning of the script so that it stops if such errors are encountered, otherwise you can easily miss them.
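
A small sketch of the difference errexit makes, assuming a deliberately malformed sed command similar to the one that failed. The variable names and the tiny inline script are made up for the demonstration; it runs in child shells and reads `/dev/null`, so no real files are involved.

```shell
#!/bin/bash
# A deliberately malformed substitution: the unescaped "/" in the middle
# leaves stray text after the replacement, so sed refuses to run it.
bad_script='sed "s/_EAP/AKA_/x/g" /dev/null 2>/dev/null; echo "still running"'

# Without errexit, the child shell ignores the sed failure and carries on.
without=$(bash -c "$bad_script")

# With errexit, the child shell stops at the failing sed command.
status=0
with=$(bash -c "set -o errexit; $bad_script") || status=$?

echo "without errexit: '${without}'"
echo "with errexit:    '${with}' (exit status ${status})"
```

In the first case the failure is easy to miss because the script keeps going. In the second the script stops with a non-zero exit status at the point where sed fails.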
