Extract value inside <text> tag for a particular condition.
Post 302270363 by parshant_bvcoe on Sunday 21st of December 2008 10:49:18 AM (Shell Programming and Scripting)

Hi All!

I have obtained the following output from the tool "pdftohtml", so my input is as follows:

<text top="246" left="160" width="84" height="16" font="3">Business purpose</text>
<text top="260" left="506" width="220" height="16" font="3">giving the right information and new insights </text>
<text top="296" left="160" width="67" height="16" font="3">Characteristic</text>
<text top="296" left="278" width="111" height="16" font="3">Operational processing</text>
<text top="296" left="506" width="120" height="16" font="3">Informational processing</text>
<text top="318" left="160" width="55" height="16" font="3">Orientation</text>
<text top="318" left="278" width="56" height="16" font="3">Transaction</text>
<text top="318" left="506" width="42" height="16" font="3">Analysis</text>
<text top="340" left="160" width="43" height="16" font="3">Function</text>
------
----

Now, I want to write a shell script that checks the value of the "left" attribute in each <text> tag and, if this value is equal to 160, saves the content enclosed in that <text> tag to an arbitrary file, wrapped in a <p> tag.

So, I want the output as follows:

<p>Business purpose</p>
<p>Characteristic</p>
<p>Orientation</p>
<p>Function</p>
------
-----

Any help will be truly appreciated. Thanks in advance!
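
One possible approach, offered as a minimal sketch rather than a definitive solution: a single sed substitution can select the lines whose "left" attribute has the wanted value and rewrite them as <p> elements. It assumes each <text> element sits on one line, as in the sample above, and that the attribute is always written with double quotes. The function name extract_left is a hypothetical helper, not part of pdftohtml.

```shell
#!/bin/sh
# Sketch: print the content of every <text> element whose "left" attribute
# equals $1, wrapped in <p>...</p>. Input is read from stdin.
# Assumption: one <text> element per line, attributes in double quotes.
extract_left() {
    # -n suppresses default output; the trailing p prints only lines
    # on which the substitution succeeded.
    sed -n "s|^[[:space:]]*<text[^>]* left=\"$1\"[^>]*>\(.*\)</text>[[:space:]]*\$|<p>\1</p>|p"
}

# Demo on three of the sample lines; only the left="160" lines are printed.
extract_left 160 <<'EOF'
<text top="246" left="160" width="84" height="16" font="3">Business purpose</text>
<text top="260" left="506" width="220" height="16" font="3">giving the right information and new insights </text>
<text top="296" left="160" width="67" height="16" font="3">Characteristic</text>
EOF
```

To save the result in a file, redirect both ends, for example extract_left 160 < input.xml > output.html (both file names are placeholders). The leading space before left= in the pattern keeps it from matching an attribute whose name merely ends in "left".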
 

ns_adp_registertag(3aolserver)				    AOLserver Built-In Commands 			    ns_adp_registertag(3aolserver)

__________________________________________________________________________________________________________________________________________________

NAME
ns_adp_registeradp, ns_adp_registerproc, ns_adp_registerscript, ns_adp_registertag, ns_register_adptag - ADP registered tags

SYNOPSIS
ns_adp_registeradp tag ?endtag? adp
ns_adp_registerproc tag ?endtag? proc
ns_adp_registerscript tag ?endtag? script
ns_adp_registertag tag ?endtag? adp
ns_register_adptag tag ?endtag? script
_________________________________________________________________

DESCRIPTION
These commands enable definition of HTML tags within an ADP file which are expanded and evaluated by the server before returning output to the client. Tags are defined as either a single tag with options, e.g., <mytag a=b c=d>, or as an opening/closing tag pair, e.g., <mytag> text </mytag>. This approach is an alternative to direct calls via the <% script %> syntax as described in the ns_adp man page.

ns_adp_registeradp tag ?endtag? adp
ns_adp_registertag tag ?endtag? adp
    These commands are identical and register an ADP code fragment to be invoked when the specified tag is encountered while parsing an ADP. The tag argument specifies the tag that will trigger invocation of the ADP fragment, which is specified by the adp argument.

    If the endtag argument is specified, then the ADP fragment will be invoked with two arguments: the first will be the enclosed content, and the second will be the name of an ns_set with any attributes specified in the tag. If no endtag argument is specified, the ADP fragment will only be passed the name of the ns_set. The arguments may be retrieved using ns_adp_bindargs or ns_adp_argc and ns_adp_argv.

    When the ADP fragment is invoked, its result is inserted in the output instead of the tag (or, if the endtag was specified, in place of the tag, end tag, and the enclosed content).

    Note: Care must be taken when using this function from inside an ADP, because the adp string is likely to contain script delimiters (<% ... %>) which will prematurely terminate script fragments. It is probably easier to restrict use of this function to .tcl files.

ns_adp_registerproc tag ?endtag? proc
    This command registers a Tcl procedure to be evaluated when the given tag is encountered. The tag argument specifies the tag that will trigger a call to the procedure specified by the proc argument.

    The procedure will be called with a variable number of arguments, one for each of the attributes provided in the tag. If the endtag argument is specified, the procedure will also receive a final argument with the contents of the text enclosed between the tags. No evaluation of the content will be performed; it will be passed as a single text block.

    When the procedure is invoked, its result is inserted in the output instead of the tag (or, if the endtag was specified, in place of the tag, end tag, and the enclosed content).

ns_adp_registerscript tag ?endtag? script
ns_register_adptag tag ?endtag? script
    These commands are identical and register a Tcl script to be evaluated when the given tag is encountered. The tag argument specifies the tag that will trigger evaluation of the script specified by the script argument.

    If the endtag argument is specified, then the script will be modified with two arguments appended: the first will be the enclosed content, and the second will be the name of an ns_set with any attributes specified in the tag. If no endtag argument is specified, the script will be modified with just the name of the ns_set appended.

    When the script is evaluated, its result is inserted in the output instead of the tag (or, if the endtag was specified, in place of the tag, end tag, and the enclosed content).

EXAMPLES
The following is a simple way of handling conditional content in ADPs:

    proc remember {input tagset} {
        global _adp_memory
        set tagname [ns_set iget $tagset name]
        # Store the enclosed content under the tag's name attribute;
        # content without a name is passed through unchanged.
        if {![string match "" $tagname]} {
            set _adp_memory($tagname) $input
            return ""
        } else {
            return $input
        }
    }

    proc recall {name} {
        global _adp_memory
        if {[info exists _adp_memory($name)]} {
            # Parse the remembered ADP text in the caller's scope.
            set parsecommand [list ns_adp_parse -string]
            lappend parsecommand $_adp_memory($name)
            ns_puts -nonewline [uplevel $parsecommand]
        } else {
            ns_log Error "[ns_adp_argv 0]: Unable to recall"
        }
    }

If the preceding Tcl has been executed (perhaps during server startup), then the following ADP fragment displays the results of a database query in a table, or shows "No rows in result." if there are no rows:

    <%
    set rows {}
    set db [ns_db gethandle]
    ns_db exec $db "select somecolumn from sometable"
    set row [ns_db bindrow $db]
    while {[ns_db getrow $db $row] > 0} {
        lappend rows [ns_set get $row "somecolumn"]
    }
    ns_db releasehandle $db
    %>

    <remember name="hasrows_header"> <table> </remember>
    <remember name="hasrows_rows"> <tr> <td><%=$column%></td> </tr> </remember>
    <remember name="hasrows_footer"> </table> </remember>
    <remember name="norows"> No rows in result. </remember>

    <%
    if {[llength $rows] > 0} {
        recall "hasrows_header"
        foreach row $rows {
            set column $row
            recall "hasrows_rows"
        }
        recall "hasrows_footer"
    } else {
        recall "norows"
    }
    %>

The following example demonstrates the use of ns_adp_registertag:

    ns_adp_registertag printdate {
        The current date is: <%=[ns_httptime [ns_time]]%>
    }

Once defined, typically in a startup script, you could simply include the "<printdate>" tag to append the text with the current date to the output buffer.

SEE ALSO
ns_adp(1), ns_adp_eval(n), ns_adp_safeeval(n), ns_adp_include(n)

KEYWORDS
ADP, dynamic pages, registered tag

AOLserver 4.0							    ns_adp_registertag(3aolserver)