I'm afraid it's not a one-liner anymore but it is the shortest even marginally-compliant parser I've written:
It processes tag-by-tag instead of line-by-line and keeps a list of the tags it has seen: "<html><body><h1>", for example, would put "H1%BODY%HTML" in TAGS. Then you can check which tags you're inside and print accordingly.
How can I read all the unique words in a file? I used -
cat comment_file.txt | /usr/xpg6/bin/tr -sc 'A-Za-z' '/012'
and
cat comment_file.txt | /usr/xpg6/bin/tr -sdc 'A-Za-z' '/012'
but they didn't work. (5 Replies)
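In both attempts the replacement string '/012' is a literal slash followed by digits; the octal escape for a newline needs a backslash ('\012', or simply '\n'). A minimal working sketch, assuming the same comment_file.txt:

```shell
# Turn every run of non-letters into a newline, then deduplicate the words.
tr -cs 'A-Za-z' '\n' < comment_file.txt | sort -u
```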
I have a file with 14 million lines and I would like to extract all the unique lines from the file into another text file.
For example:
Contents of file1
happy
sad
smile
happy
funny
sad
I want to run a command against file1 that only returns the unique lines (i.e., 1 line for happy... (3 Replies)
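One way to sketch this is with awk, which preserves the original line order (sort -u would also work if order doesn't matter); the file names follow the example above:

```shell
# Keep only the first occurrence of each line, preserving input order.
awk '!seen[$0]++' file1 > file1.unique
```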
Hi all,
I have got a problem comparing 2 text files; the result should contain only the unique (non-repeated) values.
For eg:
file1.txt
1
2
3
4
file2.txt
2
3
So after comparing the above 2 files I should get only 1 and 4 as the output. Please help me out. (7 Replies)
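One possible approach, assuming neither file contains duplicates within itself: concatenate both files, sort, and keep only the lines that occur exactly once.

```shell
# Lines present in exactly one of the two files (symmetric difference).
sort file1.txt file2.txt | uniq -u
```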
Dear Colleagues:
I have .rtf files of a collection of newspaper articles. Each newspaper article starts with a variation of the phrase "Document * of 20" and is separated from the next article with the character string "==================="
I would like to be able to take the text composing... (3 Replies)
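A sketch with awk, assuming the articles have been saved as a plain-text export called articles.txt (an assumption; .rtf would first need converting) and the separator is a line of '=' characters; each article goes to its own numbered file:

```shell
# Start a new numbered output file each time a separator line of '=' is seen.
awk '/^===+$/ {n++; next} {print > ("article_" n+0 ".txt")}' articles.txt
```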
Hi all! I'm trying to extract a portion of text from a file and put it into a new file. I need all the lines between <Placement> and </Placement>, including the Placemark lines themselves. Is there a way to extract all instances of these and not just the first one found? I've tried using sed and... (4 Replies)
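A sed range address re-matches every time its start pattern reappears, so it already prints every instance, not just the first. A sketch, with input.kml as a hypothetical input file name:

```shell
# Print every <Placement>...</Placement> block, opening/closing lines included.
sed -n '/<Placement>/,/<\/Placement>/p' input.kml > extracted.txt
```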
I'm attempting to write a script to identify users who have sudo access on a server. I only want to extract the IDs of the sudo users after a unique line of text. The list of sudo users runs to the EOF, so I only need the script to start after the unique line of text. I already have a script to... (1 Reply)
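The usual awk idiom for this sets a flag when the marker line is seen and prints everything after it. The marker text and file name below are placeholders for whatever the real report uses:

```shell
# Print every line after (not including) the unique marker line.
awk 'found; /^Sudo users:$/ {found=1}' sudo_report.txt
```

Because the pattern `found` is tested before the flag is set, the marker line itself is excluded from the output.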
Hi Gurus,
I have 100 tab-delimited text files, each with 21 columns. I want to extract only the 2nd and 5th columns from each text file. However, the values in both the 2nd and 5th columns contain duplicates, but the combination of these values in a row is not duplicated. I want to extract only those... (3 Replies)
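One way to sketch this for a single file (wrap it in a shell loop over the 100 files as needed); the (column 2, column 5) pair serves as the deduplication key:

```shell
# Print columns 2 and 5, keeping only the first row for each (col2, col5) pair.
awk -F'\t' '!seen[$2 FS $5]++ {print $2 "\t" $5}' input.txt
```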
I am trying to use awk to print the unique entries in $2
So in the example below there are 3 lines but 2 of the lines match in $2 so only one is used in the output.
File.txt
chr17:29667512-29667673 NF1:exon.1;NF1:exon.2;NF1:exon.38;NF1:exon.4;NF1:exon.46;NF1:exon.47 703.807... (5 Replies)
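The standard awk idiom for this keeps only the first line seen for each distinct value of $2:

```shell
# Print a line only the first time its second field is seen.
awk '!seen[$2]++' File.txt
```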
Trying to print the unique values in $2 before the -, currently the count is displayed. Hopefully, the below is close. Thank you :).
file
chr2:46603668-46603902 EPAS1-902|gc=54.3 253.1
chr2:211471445-211471675 CPS1-1205|gc=48.3 264.7
chr19:15291762-15291983 NOTCH3-1003|gc=68.8 195.8... (3 Replies)
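A sketch that takes the part of $2 before the first '-' and prints each distinct value once (rather than a count), using the sample file above:

```shell
# Split field 2 on '-' and print each name the first time it appears.
awk '{split($2, a, "-"); if (!seen[a[1]]++) print a[1]}' file
```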
I can find and replace text when the delimiters are unique. What I cannot do is replace text using two NON-unique delimiters:
Ex.,
"This html code <text blah >contains <garbage blah blah >. All tags must go,<text > but some must be replaced with <garbage blah blah > without erasing other... (5 Replies)
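With non-unique delimiters the trick is to anchor on the opening word and forbid '>' inside the match, so each substitution stops at the first closing bracket. A sketch (REPLACEMENT and the file name are placeholders for the real values):

```shell
# Delete <text ...> tags entirely; swap <garbage ...> tags for a fixed string.
sed -e 's/<text[^>]*>//g' -e 's/<garbage[^>]*>/REPLACEMENT/g' input.html
```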
Discussion started by: bedtime
LEARN ABOUT DEBIAN
xmlparsing
xmlparsing(3)							Coin						     xmlparsing(3)

NAME
       xmlparsing - XML Parsing with Coin

       For Coin 3.0, we added an XML parser to Coin. This document describes how it can be used for generic purposes.
Why another XML parser, you might ask? First of all, the XML parser is actually a third-party parser, Expat. Coin needed one, and many
Coin-dependent projects needed one as well, so we needed to expose an API for it. However, when integrating a third-party parser into Coin,
we cannot expose its API directly, or other projects also using Expat would get conflicts. We therefore needed to expose the XML API as
a unique API, hence the API you see here. It is based on an XML DOM API we use(d) in a couple of other projects, but it has been tweaked to
fit into Coin and to be wrapped over Expat (the original implementation just used flex).
The XML parser is both a streaming parser and a DOM parser. Being a streaming parser means that documents can be read in without having to
be fully contained in memory. When used as a DOM parser, the whole document is fully parsed in first, and then inspected by client code by
traversing the DOM. The two modes can actually be mixed arbitrarily if ending up with a partial DOM sounds useful.
The XML parser has both a C API and a C++ API. The C++ API is just a wrapper around the C API, and only serves as convenience if you prefer
to read/write C++ code (which is tighter) over more verbose C code.
The C API naming convention may look a bit strange unless you have written libraries to be wrapped for scheme/lisp-like languages before.
Then you might be familiar with the convention of suffixing functions based on their behaviour/usage. Mutating functions are suffixed
with '!' in scheme, which becomes '_x' in C (for eXclamation point), and predicates are suffixed with '?', which becomes '_p' in C.
The simplest way to use the XML parser is to just call cc_xml_read_file(filename) and then traverse the DOM model through using
cc_xml_doc_get_root(), cc_xml_elt_get_child(), and cc_xml_elt_get_attr().
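The paragraph above can be sketched in C using just the calls it names. This is illustrative only: the header paths, the index-based signature of cc_xml_elt_get_child(), the "name" attribute, and the input file name are all assumptions, and error handling is minimal.

```c
/* Minimal DOM-style traversal sketch using the calls named above.
   Header locations are an assumption about the Coin installation. */
#include <Inventor/C/XML/document.h>
#include <Inventor/C/XML/element.h>
#include <Inventor/C/XML/attribute.h>

int main(void)
{
  cc_xml_doc * doc = cc_xml_read_file("scene.xml");        /* hypothetical input */
  if (!doc) return 1;
  cc_xml_elt * root = cc_xml_doc_get_root(doc);
  cc_xml_elt * child = cc_xml_elt_get_child(root, 0);      /* assumed: child by index */
  cc_xml_attr * attr = cc_xml_elt_get_attr(child, "name"); /* hypothetical attribute */
  (void)attr;
  return 0;
}
```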
See also:
XML related functions and objects, cc_xml_doc, cc_xml_elt, cc_xml_attr
Version 3.1.3 Wed May 23 2012 xmlparsing(3)