reformatting xml file, sed or awk I think (possibly perl)
I have some XML files that cannot be read using a standard parser, or I am using the wrong parser. The issue seems to be spaces in some of the tags.
Here is a sample,
The parser isn't able to find the number 2, so that information is lost, etc. It seems as if it would like,
It seems like it would be pretty simple to script up something to convert between these two formats.
I believe that you can also do,
Which would be much easier to manage in sed, since it only involves one line.
I don't know much of anything about xml, so suggestions would be appreciated.
xml_pp
XML_PP(1p) User Contributed Perl Documentation XML_PP(1p)
NAME
xml_pp - xml pretty-printer
SYNOPSIS
xml_pp [options] [<files>]
DESCRIPTION
XML pretty printer using XML::Twig
OPTIONS
-i[<extension>]
edits the file(s) in place. If an extension is provided (no space between "-i" and the extension), the original file is backed up
with that extension.
The rules for the extension are the same as Perl's (see perldoc perlrun): if the extension includes no "*", it is appended to the
original file name; if the extension contains one or more "*" characters, each "*" is replaced with the current filename.
-s <style>
the style to use for pretty printing: none, nsgmls, nice, indented, record, or record_c (see XML::Twig docs for the exact description
of those styles), 'indented' by default
-p <tag(s)>
preserves white spaces in tags. You can use several "-p" options or quote the tags if you need more than one
-e <encoding>
use XML::Twig output_encoding (based on Text::Iconv or Unicode::Map8 and Unicode::String) to set the output encoding. By default the
original encoding is preserved.
If this option is used the XML declaration is updated (and created if there was none).
Make sure that the encoding is supported by the parser you use if you want to be able to process the pretty_printed file (XML::Parser
does not support 'latin1' for example, you have to use 'iso-8859-1')
-l loads the documents in memory instead of outputting them as they are being parsed.
This prevents a bug (see BUGS) but uses more memory.
-f <file>
read the list of files to process from <file>, one per line
-v verbose (list the current file being processed)
-- stop argument processing (to process files that start with -)
-h display help
EXAMPLES
xml_pp foo.xml > foo_pp.xml # pretty print foo.xml
xml_pp < foo.xml > foo_pp.xml # pretty print from standard input
xml_pp -v -i.bak *.xml # pretty print .xml files, with backups
xml_pp -v -i'orig_*' *.xml # backups are named orig_<filename>
xml_pp -i -p pre foo.xhtml # preserve spaces in pre tags
xml_pp -i.bak -p 'pre code' foo.xml # preserve spaces in pre and code tags
xml_pp -i.bak -p pre -p code foo.xml # same
xml_pp -i -s record mydb_export.xml # pretty print using the record style
xml_pp -e utf8 -i foo.xml # output will be in utf8
xml_pp -e iso-8859-1 -i foo.xml # output will be in iso-8859-1
xml_pp -v -i.bak -f lof # pretty print in place files from lof
xml_pp -- -i.xml # pretty print the -i.xml file
xml_pp -l foo.xml # loads the entire file in memory before pretty printing it
xml_pp -h # display help
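The backup-naming rule for "-i" described under OPTIONS can be illustrated with a small sketch. This is not code from xml_pp or XML::Twig; the function name backup_name is hypothetical, and Python is used only for illustration. It assumes nothing beyond the rule as stated: an extension without "*" is appended to the file name, and otherwise each "*" is replaced with the file name.

```python
def backup_name(filename: str, extension: str) -> str:
    """Return the backup file name implied by -i<extension> (hypothetical helper)."""
    if "*" in extension:
        # Each "*" in the extension is replaced with the current filename.
        return extension.replace("*", filename)
    # No "*": the extension is appended to the original file name.
    return filename + extension


if __name__ == "__main__":
    # Mirrors the two -i forms shown in EXAMPLES.
    for ext in (".bak", "orig_*"):
        print(f"-i{ext}: foo.xml is backed up as {backup_name('foo.xml', ext)}")
```

With the extensions used in EXAMPLES, ".bak" gives foo.xml.bak and 'orig_*' gives orig_foo.xml, which matches the "backups are named orig_<filename>" comment above.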
BUGS
Elements with mixed content that start with an embedded element get an extra newline:
<elt><b>b</b>toto<b>bold</b></elt>
will be output as
<elt>
<b>b</b>toto<b>bold</b></elt>
Using the "-l" option solves this bug (but uses more memory)
TODO
update XML::Twig to use Encode with perl 5.8.0
AUTHOR
Michel Rodriguez <mirod@xmltwig.com>
perl v5.12.4 2011-05-18 XML_PP(1p)