The UNIX and Linux Forums  

  #1  
Old 07-03-2008
Using sed to remove paragraphs with variables

Hi everyone,

I have a file with multiple entries and I would like to remove the ones that contain either /A"> or /A/, where A can be any letter of the alphabet. Here's an example of the entries:
<Topic r:id="Top/World/Fran">
<catid>476</catid>
<link r:resource="http://fr.news.yahoo.com/"/>
<link r:resource="http://news.google.fr/"/>
<link r:resource="http://actualite.free.fr"/>
</Topic>
<Topic r:id="Top/World/Fran/Act/A_la_Une">
<catid>32293</catid>
<link r:resource="http://www.pluralworld.com/"/>
<link r:resource="http://www.webdopresse.ch/"/>
</Topic>
<Topic r:id="Top/World/Fran/A">
<catid>32069</catid>
</Topic>
<Topic r:id="Top/World/Fran/B/Stuff">
<catid>32069</catid>
</Topic>
So, in this case, I want a new file without the last two entries: the first of them contains /A"> and the second contains /B/.

I have tried with the following code, but it removes everything!

#!/bin/sh
# Removes topics that begin with a certain value
inputFile=$1

tempFile=$inputFile.tmp

# A number of indexing categories exist that have to be removed
ALPHABET="ABCDEFGHIJKLMNOPQRSTUVWXYZ"
n=0

while [ $n -lt ${#ALPHABET} ]
do
sed -n '/\/${ALPHABET:n:1}\">/,/<\/Topic>/!p' $tempFile.start > $tempFile.end
mv $tempFile.end $tempFile.start

sed -n '/\/${ALPHABET:n:1}\//,/<\/Topic>/!p' $tempFile.start > $tempFile.end
mv $tempFile.end $tempFile.start

n=$(( $n + 1 ))
done


Can anyone out there please help?
  #2  
Old 07-03-2008
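The following works in ksh93. Note that it filters line by line, so it drops only the matching <Topic> lines; the <catid> and </Topic> lines of those entries are still printed, as the output shows.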
Code:
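#!/bin/ksh93

# Print every line except those containing /X/ or ending in /X">,
# where X is a single letter.
while IFS= read -r line
do
   [[ "$line" = ~(E)(\/[[:alpha:]]\/|\/[[:alpha:]]\"\>$) ]] || print -r -- "$line"
done < file

exit 0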
Output is
Code:
<Topic r:id="Top/World/Fran">
<catid>476</catid>
<link r:resource="http://fr.news.yahoo.com/"/>
<link r:resource="http://news.google.fr/"/>
<link r:resource="http://actualite.free.fr"/>
</Topic>
<Topic r:id="Top/World/Fran/Act/A_la_Une">
<catid>32293</catid>
<link r:resource="http://www.pluralworld.com/"/>
<link r:resource="http://www.webdopresse.ch/"/>
</Topic>
<catid>32069</catid>
</Topic>
<catid>32069</catid>
</Topic>
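
To drop the whole entry rather than just its first line, a sed address range may be closer to what was asked. The original script most likely removes everything because $tempFile.start is never created from $inputFile, so the first sed has nothing to read and leaves an empty file behind; the single quotes also stop the shell from expanding ${ALPHABET:n:1}, which is in any case not plain sh syntax. Below is a minimal sketch, assuming each entry starts on a <Topic ...> line and ends on its own </Topic> line as in the sample, and that only capital letters mark the entries to remove. The function name remove_indexed_topics is made up for illustration.

```shell
#!/bin/sh
# Sketch: delete whole <Topic>...</Topic> entries whose id ends in /X">
# or contains /X/, where X is a single capital letter.
# remove_indexed_topics is a hypothetical helper name, not from the thread.
remove_indexed_topics() {
    # Each -e is an address range: starting at a matching <Topic ...> line,
    # delete every line up to and including the next </Topic> line.
    # The [A-Z] bracket expression replaces the loop over ALPHABET.
    sed -e '/<Topic .*\/[A-Z]">/,/<\/Topic>/d' \
        -e '/<Topic .*\/[A-Z]\//,/<\/Topic>/d' "$1"
}

# Usage (file names are placeholders):
#   remove_indexed_topics input.xml > output.xml
```

Anchoring both patterns on <Topic keeps a <link> URL that happens to contain /A/ from starting a deletion by accident. If lowercase letters should count too, [A-Z] could be swapped for [[:alpha:]].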