12-21-2008
Extract value inside <text> tag for a particular condition.
Hi All!
I obtained the following output from the tool "pdftohtml", and it is the input for my task:
<text top="246" left="160" width="84" height="16" font="3">Business purpose</text>
<text top="260" left="506" width="220" height="16" font="3">giving the right information and new insights </text>
<text top="296" left="160" width="67" height="16" font="3">Characteristic</text>
<text top="296" left="278" width="111" height="16" font="3">Operational processing</text>
<text top="296" left="506" width="120" height="16" font="3">Informational processing</text>
<text top="318" left="160" width="55" height="16" font="3">Orientation</text>
<text top="318" left="278" width="56" height="16" font="3">Transaction</text>
<text top="318" left="506" width="42" height="16" font="3">Analysis</text>
<text top="340" left="160" width="43" height="16" font="3">Function</text>
------
----
Now I want to write a shell script that checks the value of the "left" attribute in each <text> tag and, if this value is equal to 160, saves the content enclosed in that <text> tag to an arbitrary file, wrapped in a <p> tag.
So, I want the output to be as follows:
<p>Business purpose</p>
<p>Characteristic</p>
<p>Orientation</p>
<p>Function</p>
------
-----
Any help will be truly appreciated. Thanks in advance!
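One possible approach to the task described above, as a minimal sketch only. It assumes each <text> element sits on a single line, as in the sample, and that the content itself contains no nested markup that needs special handling. The function name extract_left and the file names input.xml and output.html are hypothetical, chosen for illustration.

```shell
#!/bin/sh
# Sketch: read pdftohtml-style lines on stdin and print the content of every
# <text> element whose "left" attribute equals the given value, wrapped in <p>.
# Assumption: one <text>...</text> element per line.

extract_left() {
    # $1 = value of the left attribute to match (for example 160)
    sed -n 's|^[[:space:]]*<text[^>]* left="'"$1"'"[^>]*>\(.*\)</text>[[:space:]]*$|<p>\1</p>|p'
}

# Small demonstration using three lines from the sample input.
extract_left 160 <<'EOF'
<text top="246" left="160" width="84" height="16" font="3">Business purpose</text>
<text top="296" left="278" width="111" height="16" font="3">Operational processing</text>
<text top="318" left="160" width="55" height="16" font="3">Orientation</text>
EOF
# prints:
# <p>Business purpose</p>
# <p>Orientation</p>
```

To save the result to a file instead of printing it, the function could be called as `extract_left 160 < input.xml > output.html`. For input where a <text> element spans several lines, or where attributes are quoted differently, a real XML parser would be a safer choice than sed.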