Need an efficient way to search for a tag in an xml file having millions of rows
Post 302603496 by Sheel on Thursday 1st of March 2012 06:38:39 AM
I have tried all the options (grep, sed and awk), but none of them performs well when the file has 1 billion rows in it. There is one catch, though: the input XML file has all the tags in a single row, i.e. this single row gets divided into 1 billion rows after indentation.
At the moment this indentation is done manually. Can you guys help me with a command that indents the file first, so that the search command might then return results faster?

E.g. right now the input file is:

Quote:
<?xml version="1.0" encoding="UTF-8"?><Root><Person><Name>John</Name></Person></Root>
I need a command to convert this file into the format below:

Quote:
<?xml version="1.0" encoding="UTF-8"?>
<Root>
<Person>
<Name>John</Name>
</Person>
</Root>
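
One possible way to get from the single-row input to the format above is to let awk treat `>` as the record separator, so the file is read one tag at a time instead of as one enormous line. This is only a sketch under some assumptions: the function name `split_tags` is made up for illustration, there is no whitespace between adjacent tags, and a line break is wanted only where a `>` is directly followed by a `<`. It has not been timed against a file of the size described in the post.

```shell
# split_tags: hypothetical helper, not from the original thread.
# Reads XML from the named files (or from stdin) and writes it back with a
# line break inserted wherever one tag is immediately followed by another.
split_tags() {
    awk '
        BEGIN { RS = ">"; ORS = "" }    # each record ends at a ">"
        {
            gsub(/[\r\n]/, "")          # drop line breaks already in the input
            if ($0 == "") next          # skip the empty piece after the last ">"
            # a piece that begins with "<" followed a ">" directly: start a new line
            if (seen && substr($0, 1, 1) == "<") print "\n"
            print $0 ">"                # put back the ">" that awk consumed
            seen = 1
        }
        END { if (seen) print "\n" }    # finish the last line
    ' "$@"
}

# Possible usage (input.xml is a placeholder file name):
#   split_tags input.xml > indented.xml
#   split_tags input.xml | grep -F "<Name>"
```

Because only short records are held in memory at any time, the search can be piped straight from the splitter without writing the indented copy to disk first. Text such as `John` stays on the same line as its opening and closing tags, which matches the format quoted above.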
 

Font::TTF::XMLparse(3pm)				User Contributed Perl Documentation				  Font::TTF::XMLparse(3pm)

NAME
    Font::TTF::XMLparse - provides support for XML parsing. Requires Expat module XML::Parser::Expat

SYNOPSIS
    use Font::TTF::Font;
    use Font::TTF::XMLparse;

    $f = Font::TTF::Font->new;
    read_xml($f, $ARGV[0]);
    $f->out($ARGV[1]);

DESCRIPTION
    This module contains the support routines for parsing XML and generating the Truetype font structures as a result. The module has been separated from the rest of the package in order to reduce the dependency that this would bring, of the whole package on XML::Parser. This way, people without the XML::Parser can still use the rest of the package.

    The package interacts with another package through the use of a context containing an element 'receiver' which is an object which can possibly receive one of the following messages:

    XML_start
        This message is called when an open tag occurs. It is called with the context, tag name and the attributes. The return value has no meaning.

    XML_end
        This message is called when a close tag occurs. It is called with the context, tag name and attributes (held over from when the tag was opened). There are 3 possible return values from such a message:

        undef
            This is the default return value indicating that default processing should occur in which either the current element on the tree, or the text of this element should be stored in the parent object.

        $context
            This magic value marks that the element should be deleted from the parent. Nothing is stored in the parent. (This rather than '' is used to allow 0 returns.)

        anything
            Anything else is taken as the element content to be stored in the parent.

    In addition, the context hash passed to these messages contains the following keys:

    xml
        This is the expat xml object. The context is also available as $context->{'xml'}{' mycontext'}. But that is a long winded way of not saying much!

    font
        This is the base object that was passed in for XML parsing.

    receiver
        This holds the current receiver of parsing events. It may be set in associated application to adjust which objects should receive messages when. It is also stored in the parsing stack to ensure that where an object changes it during XML_start, the same object that received XML_start will receive the corresponding XML_end.

    stack
        This is the parsing stack, used internally to hold the current receiver and attributes for each element open, as a complete hierarchy back to the root element.

    tree
        This element contains the storage tree corresponding to the parent of each element in the stack. The default action is to push undef onto this stack during XML_start and then to resolve this, either in the associated application (by changing $context->{'tree'}[-1]) or during XML_end of a child element, by which time we know whether we are dealing with an array or a hash or what.

    text
        Character processing is to insert all the characters into the text element of the context for available use later.

METHODS
perl v5.10.1 2011-02-25 Font::TTF::XMLparse(3pm)
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.