Post 302270363 by parshant_bvcoe on Sunday 21st of December 2008 10:49:18 AM
Extract value inside <text> tag for a particular condition.

Hi All!

I have obtained the following output from the tool "pdftohtml", and it is the input for my script:

<text top="246" left="160" width="84" height="16" font="3">Business purpose</text>
<text top="260" left="506" width="220" height="16" font="3">giving the right information and new insights </text>
<text top="296" left="160" width="67" height="16" font="3">Characteristic</text>
<text top="296" left="278" width="111" height="16" font="3">Operational processing</text>
<text top="296" left="506" width="120" height="16" font="3">Informational processing</text>
<text top="318" left="160" width="55" height="16" font="3">Orientation</text>
<text top="318" left="278" width="56" height="16" font="3">Transaction</text>
<text top="318" left="506" width="42" height="16" font="3">Analysis</text>
<text top="340" left="160" width="43" height="16" font="3">Function</text>
------
----

Now, I want to write a shell script that checks the value of the "left" attribute in each <text> tag and, if this value is equal to 160, saves the content enclosed in that <text> tag to an arbitrary file, wrapped in a <p> tag.

So, I want the output as follows:

<p>Business purpose</p>
<p>Characteristic</p>
<p>Orientation</p>
<p>Function</p>
------
-----

Any help will be truly appreciated. Thanks in advance!
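
One possible approach, as a minimal sketch rather than a definitive answer: a single sed substitution that only prints lines whose left attribute matches. It assumes each <text> element sits on one line with double-quoted attributes, exactly as in the sample above. The function name extract_left and the file names input.xml and output.html are placeholders made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not from the original post).
# Reads pdftohtml-style lines on stdin and, for every <text> element whose
# left attribute equals $1, prints its content wrapped in <p>...</p>.
# Assumes one complete <text>...</text> element per line.
extract_left() {
    # -n suppresses default output; the trailing p prints only the lines
    # where the substitution succeeded, i.e. where left="$1" matched.
    sed -n "s|^<text [^>]*left=\"$1\"[^>]*>\(.*\)</text>\$|<p>\1</p>|p"
}

# Usage (placeholder file names):
#   extract_left 160 < input.xml > output.html
```

Lines with any other left value are dropped because the pattern does not match them. If the real file has <text> elements that span several lines, or trailing carriage returns, the pattern would need adjusting, and a proper XML parser is likely the safer choice.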
 

All times are GMT -4.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.