The standards say that grep and other text processing utilities produce unspecified behavior when an input file is not a text file. By definition, a text file cannot contain any line longer than the LINE_MAX limit on your system, counting the <newline> line terminator. (On most systems, LINE_MAX is set to 2048 bytes.) Your sample file includes lines that are more than 6950 bytes long. Unless the grep man page on your system indicates that it can process text files with unlimited line lengths (or at least lines longer than the longest line in your files), any results you get from a script that runs grep on those files cannot be trusted.
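To see whether a given file is affected, you can compare its longest line against the limit. The following is only a sketch: the function names longest_line and exceeds_line_max are made up for illustration, and it relies on awk, which is itself a text utility, so it only helps on systems where awk copes with long lines.

```shell
# Hypothetical helpers, not from the original post.

# Print the length in bytes of the longest line in file $1,
# counting one byte for the <newline> terminator.
longest_line() {
    # LC_ALL=C makes length() count bytes rather than characters.
    LC_ALL=C awk '{ n = length($0) + 1; if (n > max) max = n }
                  END { print max + 0 }' "$1"
}

# Succeed (exit status 0) if file $1 has a line longer than LINE_MAX.
exceeds_line_max() {
    limit=$(getconf LINE_MAX)
    [ "$(longest_line "$1")" -gt "$limit" ]
}
```

For example, `exceeds_line_max customers.xml && echo "not a text file"` (customers.xml being a placeholder name) would flag a file like the one described above.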
If you are trying to count unique customer numbers, you'll need something more powerful than grep. If we make the very wild assumption that the <customer customer-no="xxxxxxxxxxxxxxxxx"> tag is the first tag on any line in which it appears and that awk on your system (another text processing utility) supports line lengths at least as long as the longest lines in your XML files, you could try:
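A minimal sketch of that idea, under the same assumptions (the customer tag is the first tag on its line, no double quote appears before it, and awk handles the line lengths involved). The function name count_customers is hypothetical.

```shell
# Hypothetical helper: count distinct customer-no values in file $1.
count_customers() {
    # With '"' as the field separator, $2 is the first quoted value on
    # the line, which is the customer number when the customer tag is
    # the first tag on that line.
    awk -F'"' '
        /<customer customer-no="/ { seen[$2] = 1 }
        END {
            n = 0
            for (k in seen) n++    # one array entry per distinct number
            print n
        }' "$1"
}
```

You would call it as `count_customers yourfile.xml`, where yourfile.xml stands for your own file.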
If awk can't handle lines that long on your system and the <customer customer-no="xxxxxxxxxxxxxxxxx"> tag is the first tag on any line in which it appears and appears at the start of each of those lines, you could try the following:
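One possible way around the line length problem, offered as a sketch rather than as the only approach: tr accepts any kind of input, so it can break the file at every '>' before the line-oriented tools see it. This assumes no single tag or run of text between tags is itself longer than LINE_MAX. The function name count_customers_tr is hypothetical.

```shell
# Hypothetical helper: count distinct customer-no values in file $1
# without passing the original long lines to line-oriented utilities.
count_customers_tr() {
    # Turn every '>' into a newline so each tag ends its own line.
    tr '>' '\n' < "$1" |
        # Keep only the customer number from customer tags.
        sed -n 's/.*<customer customer-no="\([^"]*\)".*/\1/p' |
        sort -u |
        # Count the remaining lines (avoids the padding some wc
        # implementations add to their output).
        awk 'END { print NR }'
}
```

Because the lines are split at tag boundaries, this version does not depend on where in the line the customer tag appears.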
Using other standard utilities, you could try something like this:
It might be pretty heavy on processing, but it seems to work for me. If egrep is being unpredictable, try putting the cut first instead. That would give cut more lines to process, but I suppose egrep then has shorter lines to consider. I'm not sure which will perform better.
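To make the two orderings concrete, here is a rough sketch. It assumes the customer tag starts its line (after optional white space) and that the whole tag fits in the first 80 characters; the width 80 and both function names are assumptions for illustration. It uses grep -E, the standard spelling of egrep.

```shell
# Hypothetical helpers comparing the two orderings.

# grep sees the full-length lines, cut only sees the matches.
count_grep_first() {
    grep -E '^[[:space:]]*<customer customer-no="' "$1" |
        cut -d'"' -f2 | sort -u | awk 'END { print NR }'
}

# cut truncates every line first, so grep only sees short lines.
count_cut_first() {
    cut -c1-80 "$1" |
        grep -E '^[[:space:]]*<customer customer-no="' |
        cut -d'"' -f2 | sort -u | awk 'END { print NR }'
}
```

Both should report the same count when the assumptions hold. Note that cut is also a text utility, so the second form still hands the long lines to one line-oriented tool.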
I'm searching for the names of a TV show in the XML file I've attached at the end of this post. What I'm trying to do now is pull out/list the data from each of the <SeriesName> tags throughout the document. Currently, I'm only able to get data from the first instance of that XML field using the... (9 Replies)
I want to write a one line script that outputs the result of multiple xml tags from a XML file. For example I have a XML file which has below XML tags in the file:
<EMAIL>***</EMAIL>
<CUSTOMER_ID>****</CUSTOMER_ID>
<BRANDID>***</BRANDID>
Now I want to grep the values of all these specified... (1 Reply)
Hi,
I have an XML file with multiple XML headers, so I want to split the file into multiple files.
Sample.xml contains multiple headers, so how can we split it at these headers into multiple files in Unix?
eg :
<?xml version="1.0" encoding="UTF-8"?>
<ml:individual... (3 Replies)
Hi All,
We need to split a large xml into multiple valid xml with same header(2lines) and footer(last line) for N number of letterId.
In the example below we have first 2 lines as header and last line as footer.(They need to be in each split xml file)
Header:
<?xml version="1.0"... (5 Replies)
Hi Everyone,
I'm new here and I was checking this old post:
/shell-programming-and-scripting/180669-splitting-file-into-several-smaller-files-using-perl.html
(cannot paste link because of lack of points)
I need to do something like this but understand very little of perl.
I also check... (4 Replies)
Hi All,
I have two xml files.
One is having below input
<NameValuePair>
<name>Daemon</name>
<value>tcp:7474</value>
</NameValuePair>
<NameValuePair>
<name>Network</name>
<value></value>
</NameValuePair>
... (2 Replies)
I am trying to parse the XML Google contact file using tools like xmllint and I even dived into the XSL Style Sheets using xsltproc but I get nowhere.
I can not supply any sample file as it contains private data but you can download your own contacts using this script:
#!/bin/sh
# imports... (9 Replies)
HI All,
I have to split an XML file into multiple XML files and append them to another .xml file. For example, below is a sample XML file; using a shell script, I have to split it into three XML files and append all three to a .xml file. Can someone help, please?
eg:
<?xml version="1.0"?>... (4 Replies)
Hi All,
I'm stuck with adding multiple lines(irrespective of line number) to a file before a particular xml tag. Please help me.
<A>testing_Location</A>
<value>LA</value>
<zone>US</zone>
<B>Region</B>
<value>Russia</value>
<zone>Washington</zone>
<C>Country</C>... (0 Replies)
I have an xml file:
<AutoData xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<Table1>
<Data1 10 </Data1>
<Data2 20 </Data2>
<Data3 40 </Data3>
<Table1>
</AutoData>
and I have to remove the portion xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" only.
I tried using sed... (10 Replies)