Multiline html tag parse shell script Post: 303044115

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

How do I extract text only from html file without HTML tag

I have a html file called myfile. If I simply put "cat myfile.html" in UNIX, it shows all the html tags like <a href=r/26><img src="http://www>. But I want to extract only text part. Same problem happens in "type" command in MS-DOS. I know you can do it by opening it in Internet Explorer,...

2. Shell Programming and Scripting

how to use html tag in shell scripting

Hi friends, I have a small doubt: how can we use HTML tags in shell scripting? Code: echo "<html>" echo "<body>" echo " welcome to peace world " echo "</body>" echo "</html>" The output is displayed like this: <html> <body> welcome to peace world </body> </html>

3. UNIX for Advanced & Expert Users

shell script to parse html file

hi all, i have a html file something similar to this. <tr class="evenrow"> <td class="data">added</td><td class="data">xyz@abc.com</td> <td class="data">filename.sql</td><td class="modifications-data">08/25/2009 07:58:40</td><td class="data">Added TK prof script</td> </tr> <tr...

4. Shell Programming and Scripting

Parse HTML tag parameters and text

Hi! I have a bunch of HTML files, which I want to parse to CSV files. Every page has a table in it, and I need to parse each row into a csv record. With awk and sed, I managed to put every table row in separate lines. So my file looks like this: <TR> .... </TR> <TR> .... </TR> ...One...

5. Shell Programming and Scripting

Script to delete HTML tag

Guys, I have a little script that I got off the internet and that I use in Squid to block ads. I used that script with Linux, but now I have moved my servers to FreeBSD. I have a steep learning curve there, but it is fun. Back to the script issue: the script used to work with Linux but...

6. Shell Programming and Scripting

awk Script to parse a XML tag

I have an XML tag like this: <property name="agent" value="/var/tmp/root/eclipse" /> Is there a way, using awk, to get the value from the above tag? The output should be: /var/tmp/root/eclipse Help will be appreciated. Regards, Adi

7. Shell Programming and Scripting

Search for a html tag and print the entire tag

I want to print everything from the <fruits> tag to the </fruits> tag for blocks whose <fruit> is mango. I also want both <fruits> and </fruits> in the output. Please help. E.g. <fruits> <fruit id="111">mango<fruit> . another 20 lines . </fruits>

8. Shell Programming and Scripting

Using shell command need to parse multiple nested tag value of a XML file

I have this XML file - <gp> <mms>1110012</mms> <tg>988</tg> <mm>LongTime</mm> <lv> <lkid>StartEle=ONE, Desti = Motion</lkid> <kk>12</kk> </lv> <lv> <lkid>StartEle=ONE, Source = Velocity</lkid> <kk>2</kk> </lv> <lv> ...

9. Shell Programming and Scripting

XML Parse between to tag with upper tag

Hi Guys Here is my Input : <?xml version="1.0" encoding="UTF-8"?> <xn:MeContext id="01736"> <xn:VsDataContainer id="01736"> <xn:attributes> <xn:vsDataType>vsDataMeContext</xn:vsDataType> ...

10. Shell Programming and Scripting

How to remove html tag which has multiple lines in SHELL?

I want to clean an HTML file. I am trying to remove the script part of the HTML, then remove the remaining tags and empty lines. The code I am using is the following: sed '/<script/,/<\/script>/d' webpage.html | sed -e 's/<*>//g' | sed '/^\s*$/d' > output.txt However, with this method, I cannot...
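
For the last question in the list (removing tags that span several lines), a rough sketch in POSIX shell might look like the following. It is regex-based rather than a real HTML parser, so it assumes that lines containing `<script` and `</script>` can be discarded whole and that no literal `<` appears in the visible text. The function name `strip_html` is made up for illustration. Note the tag pattern `<[^>]*>`: the `<*>` in the excerpt matches only a `>` (optionally preceded by `<` characters), so it does not remove whole tags.

```shell
# strip_html: hypothetical helper; reads HTML from the given files (or stdin)
# and writes the remaining text to stdout.
#
# Stage 1: delete every line from one containing "<script" through the next
#          line containing "</script>".
# Stage 2: while the pattern space holds a "<" with no ">" after it, append
#          the next input line (N), so a tag split across lines is joined;
#          then remove every complete tag.
# Stage 3: delete lines that are empty or contain only whitespace.
strip_html() {
    sed -e '/<script/,/<\/script>/d' "$@" |
    sed -e ':a' -e '/<[^>]*$/{' -e 'N' -e 'ba' -e '}' -e 's/<[^>]*>//g' |
    sed -e '/^[[:space:]]*$/d'
}
```

With the file names from the excerpt, this could be called as `strip_html webpage.html > output.txt`.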

LEARN ABOUT DEBIAN

basex

basex(1)							 The XML Database							  basex(1)

NAME

       basex - XML database system and XPath/XQuery processor (command line mode)

SYNOPSIS

       basex [-bcdiLosuvVwxz] [query]

DESCRIPTION

       basex is a fast and powerful, yet light-weight and platform independent XML database system and XPath/XQuery processor.

OPTIONS

       A short description of the options can be obtained from

	   $ basex -h

       or by browsing http://docs.basex.org/wiki/Startup_Options#BaseX_Standalone

DATABASE COMMANDS

       A list of supported Database commands can be obtained from

	   $ basex -c help

       or by browsing http://docs.basex.org/wiki/Commands

EXAMPLES

       o  XQuery evaluation (no database, no interaction, script mode):

	  $ basex -Lq 19+23
	  42
	  $ basex -Lq "<answer>{ 23+19 }</answer>"
	  <answer>42</answer>

       o  Import an XML file into a database, output its content (query its root) and be verbose:

	   $ basex -Vc "CREATE DB input /usr/share/doc/basex/examples/input.xml; XQUERY /"
	   Database 'input' created in 136.84 ms.
	   <html>
	     <!-- Header -->
	     <head id="0">
	       <title>XML</title>
	     </head>
	     <!-- Body -->
	     <body id="1" bgcolor="#FFFFFF" text="#000000" link="#0000CC">
	       <h1>Databases &amp; XML</h1>
	       <div align="right">
		 <b>Assignments</b>
		 <ul>
		   <li>Exercise 1</li>
		   <li>Exercise 2</li>
		 </ul>
	       </div>
	     </body>
	     <?pi bogus?>
	   </html>

	   Query: /

	   Compiling:

	   Result: root()

	   Parsing: 5.08 ms
	   Compiling: 27.2 ms
	   Evaluating: 0.87 ms
	   Printing: 13.7 ms
	   Total Time: 46.86 ms

	   Hit(s): 1 Item
	   Updated: 0 Items
	   Printed: 358 Bytes

	   Query executed in 42.52 ms.

       o  XPath evaluation (with existing database):

	   $ basex -Lc "OPEN input; XQUERY //li[1]"
	   <li>Exercise 1</li>

       o  Retrieve XML from the web and perform XPath query:

	   $ basex -Lq "doc('http://files.basex.org/examples/input.xml')//li"
	   <li>Exercise 1</li>
	   <li>Exercise 2</li>

       o  W3C XQuery Full-Text (make use of full-text index and perform fuzzy query with a typing error):

	   $ basex
	   BaseX 7.1 [Standalone]
	   Try "help" to get more information.

	   > SET FTINDEX on
	   Full-Text Index: ON
	   > CREATE DB input /usr/share/doc/basex/examples/input.xml
	   Database 'input' created in 94.42 ms.
	   > XQUERY //b[text() contains text 'Asisgnment' using fuzzy]
	   <b>Assignments</b>
	   Query executed in 8.37 ms.

       o  Update the database and show result:

	   > XQUERY delete node //ul
	   Query executed in 2.79 ms.
	   > XQUERY replace value of node //b with 'Debian rules'
	   Query executed in 2.94 ms.
	   > XQUERY //div
	   <div align="right">
	     <b>Debian rules</b>
	   </div>
	   Query executed in 1.01 ms.

       o  Open an input XML file, execute a query and write the result to a file:

	   $ basex -Li /usr/share/doc/basex/examples/input.xml -q //div -o out.xml
	   $ cat out.xml
	   <div align="right">
	     <b>Assignments</b>
	     <ul>
	       <li>Exercise 1</li>
	       <li>Exercise 2</li>
	     </ul>
	   </div>

       o  Query an existing database called 'input'. If a file named 'input' exists in the current working directory, it takes precedence:

	   $ basex -Li input -q //div
	   <div align="right">
	     <b>Assignments</b>
	     <ul>
	       <li>Exercise 1</li>
	       <li>Exercise 2</li>
	     </ul>
	   </div>

       o  Let basex process query input from standard input:

	  $ echo '19+23' | basex -Lq-
	  42

       o  Execute commands from a script file:

	  $ cat commands.txt
	  create db debian <debian_db/>
	  xquery /
	  list
	  $ basex -LC commands.txt | grep debian
	  <debian_db/>
	  debian	      1 	 4639	    debian.xml

       o  Parse non well-formed HTML (needs libtagsoup-java installed):

	  $ cat bad.html
	  <html>
	    <ul>
	      <li>A
	      <li>B
	    </ul>
	  </html>

	  $ basex -c 'set parser html; set htmlopt method=html,nons=true; create db htmldb bad.html'
	  $ basex -q "doc('htmldb')"
	  <html>
	    <body>
	      <ul>
		<li>A</li>
		<li>B</li>
	      </ul>
	    </body>
	  </html>

	  For further documentation on how to configure the HTML Parser refer to
	  http://docs.basex.org/wiki/Parsers#HTML_Parser
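
When the same import-and-query steps have to be repeated for several HTML files, the two commands above can be wrapped in a small shell function. This is only a sketch built from the options shown in this example; the function name `html_query` and its three arguments are hypothetical, and it assumes that basex and libtagsoup-java are installed.

```shell
# html_query DB FILE XPATH
# Hypothetical wrapper around the two basex calls shown above:
#   1. create database DB from FILE using the HTML parser settings
#      from this man page example,
#   2. evaluate XPATH against the document stored in DB.
# The query only runs if the import command succeeded.
html_query() {
    db=$1 file=$2 xpath=$3
    basex -c "set parser html; set htmlopt method=html,nons=true; create db $db $file" &&
    basex -q "doc('$db')$xpath"
}
```

For the bad.html file above, a call such as `html_query htmldb bad.html '//li'` would ask for the list items of the repaired document.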

SEE ALSO

       basexgui(1), basexserver(1), basexclient(1)

       ~/.basex
	      BaseX (standalone and server) properties

       ~/.basexgui
	      BaseX additional GUI properties

       ~/.basexperm
	      User names, passwords, and permissions

       ~/.basexevents
	      Contains all existing events

       ~/BaseXData
	      Default database directory

       ~/BaseXData/.logs
	      Server logs

       ~/BaseXRepo
	      Package repository

       BaseX Documentation Wiki: http://docs.basex.org

HISTORY

       BaseX started as a research project of the Database and Information Systems Group (DBIS) at the University of Konstanz in 2005 and soon
       turned into a feature-rich open source XML database and XPath/XQuery processor.

LICENSE

       New (3-clause) BSD License

AUTHOR

       BaseX is developed by a bunch of people called 'The BaseX Team' <http://basex.org/about-us/> led by Christian Gruen <cg@basex.org>.

       The man page was written by Alexander Holupirek <alex@holupirek.de> while packaging BaseX for Debian GNU/Linux.

								   26 June 2012 							  basex(1)
