A simpler XML tool


 
# 8  
Old 10-24-2011
Quote:
Originally Posted by Corona688
Perhaps. Xpath looks verbose, ugly, complicated, and redundant though. It's not really meant for streams.
Basic XPath syntax for specifying nodes is pretty simple.

^html.head.title - /html/head/title
table - //table
tag:key=value - tag[@key='value']

Not much difference. But it was just a thought - seems easier to re-use an existing syntax than invent a new one...
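For anyone who wants to try those three XPath forms, here is a minimal sketch using Python's standard xml.etree.ElementTree, which supports only a limited XPath subset. The sample document and variable names are made up for illustration, and the third query assumes that key in tag:key=value refers to an attribute.

```python
import xml.etree.ElementTree as ET

# Made-up sample document, only for illustration.
doc = ET.fromstring(
    "<html><head><title>Demo</title></head>"
    "<body><table id='a'/><div><table id='b'/></div>"
    "<a href='x.html'>x</a><a href='y.html'>y</a></body></html>"
)

# /html/head/title: ElementTree paths are relative to the element they are
# called on, so starting at the <html> root stands in for the leading /html.
title = doc.find("head/title").text

# //table: every <table> at any depth below the root.
tables = [t.get("id") for t in doc.findall(".//table")]

# tag[@key='value']: elements selected by an attribute value.
links = [a.text for a in doc.findall(".//a[@href='y.html']")]

print(title, tables, links)
```

Note that this loads the whole document into memory first, so it says nothing about how well the syntax suits streams.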

Last edited by CarloM; 10-24-2011 at 06:34 AM..
# 9  
Old 10-24-2011
Quote:
Originally Posted by Corona688
We've been getting a lot of XML questions lately, and I suspect it's only going to get worse better ...
Very strange... I did not edit the blue word in there, and it doesn't tell me which moderator did this or why.

I meant 'worse' in that if you know anything about awk and xml, you'll know they aren't suitable for each other, but we keep getting asked anyway.

---------- Post updated at 10:40 AM ---------- Previous update was at 10:35 AM ----------

Quote:
Originally Posted by figaro
Great initiative. Let us know how far you are getting and whether you are prepared to release in the public domain.
Right now it can do this:
Code:
$ ./mox < cnn_topstories.rss /item/title 2> /dev/null

<title>Satellite expected to hit Earth this weekend</title>

<title>Libyan elections 'coming within months'</title>

<title>NATO to end mission by Oct. 31</title>

<title>What next?</title>

<title>Gadhafi's demise and the Arab Spring</title>

<title>Ahmadinejad: U.S. hated around world</title>

<title>Heir to Saudi throne dies in New York</title>

<title>At least 9 killed in Yemen clashes</title>

<title>China in mourning after toddler's death</title>

<title>Tunisia set for first Arab Spring election</title>

$

IOW, pretty much the same as my OP. But the parser is much better now.
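For comparison, the same kind of /item/title query over a stream can be sketched with Python's standard iterparse. This is not how mox works internally; item_titles and the sample feed are hypothetical names invented for this sketch, and it yields only the title text rather than the whole element.

```python
import io
import xml.etree.ElementTree as ET

def item_titles(stream):
    """Yield the text of every <title> that sits directly inside an <item>."""
    path = []  # tag names from the root down to the current element
    for event, elem in ET.iterparse(stream, events=("start", "end")):
        if event == "start":
            path.append(elem.tag)
            continue
        # "end" event: the element's text is complete at this point.
        if path[-2:] == ["item", "title"]:
            yield elem.text
        path.pop()
        if elem.tag == "item":
            elem.clear()  # release the children of finished items

# Made-up feed, only for illustration.
SAMPLE = b"""<rss><channel><title>Feed name</title>
<item><title>First story</title><link>http://example.invalid/1</link></item>
<item><title>Second story</title></item>
</channel></rss>"""

titles = list(item_titles(io.BytesIO(SAMPLE)))
print(titles)
```

The channel-level title is skipped because its parent is channel, not item, which mirrors matching on the tail of the path instead of the full path from the root.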

I intend to release it into the public domain should I come up with anything useful.

I'm adopting some parts of the xpath syntax but don't think I'll implement the whole thing the same way. A lot of it does not apply.

Though I should check out this xmlgawk and see if it will do what I want.

---------- Post updated at 12:01 PM ---------- Previous update was at 10:40 AM ----------

xmlgawk refused to run properly unless actually installed; there was no way to test it without that.

It needed library files teased into random places by hand, and make install didn't bother installing extensions -- like the xml one. The INSTALL and README files were still the useless generic ones.

Its XML parser is pointlessly strict. Feeding it a webpage inevitably causes it to die with "mismatched tag" before it bothers processing any data at all.
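The same failure mode is easy to reproduce outside xmlgawk. The sketch below uses Python's standard library, not xmlgawk itself: a strict XML parser rejects a typical tag-soup page outright, while a lenient HTML parser still gets the text out. PAGE and TextCollector are made-up names for this example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Typical web-page markup: unclosed <p> and <br>, so not well-formed XML.
PAGE = "<html><body><p>one<br><p>two</body></html>"

# Strict: a well-formedness error aborts the parse before any data comes out.
try:
    ET.fromstring(PAGE)
    strict_error = None
except ET.ParseError as exc:
    strict_error = str(exc)

class TextCollector(HTMLParser):
    """Lenient: record text chunks and ignore unbalanced tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        if data.strip():
            self.chunks.append(data.strip())

collector = TextCollector()
collector.feed(PAGE)
collector.close()

print(strict_error)      # a "mismatched tag" message with a position
print(collector.chunks)
```

Here the strict parser stops at the first unbalanced close tag, while the lenient one never checks tag balance at all.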

Their first trivial example didn't work at all; I had to modify it into something that would.

Not impressed.
# 10  
Old 10-24-2011
Quote:
Originally Posted by Corona688
[...]
xmlgawk refused to run properly unless actually installed; there was no way to test it without that.

It needed library files teased into random places by hand, and make install didn't bother installing extensions -- like the xml one. The INSTALL and README files were still the useless generic ones.

Its XML parser is pointlessly strict. Feeding it a webpage inevitably causes it to die with "mismatched tag" before it bothers processing any data at all.

Their first trivial example didn't work at all; I had to modify it into something that would.

Not impressed.
It worked for me, when configured with:

Code:
./configure --disable-shared --enable-static-extensions

I don't know if it's supposed to work for html parsing though ...
# 11  
Old 10-24-2011
HTML is close to XML with a few weird bits (strictly speaking, only XHTML is well-formed XML). If you can't parse HTML, you're ignoring most of the XML-like markup in the universe. The xgawk documentation claims it's supposed to be nonvalidating for the purpose of parsing less-than-ideal XML...

Building in the extensions statically is a very good idea.