On OS X El Capitan (version 10.11.5) with a 2.8 GHz Intel Core i7 (4 core) processor and a 1TB SSD holding my data and code, the following bash script:
Code:
#!/bin/bash
printf 'perl results:\n'
time perl -nle '/<factories.*baseQueueName/ and @r=/(?:jndiN|baseQueueN|n)ame="([^"]+)/g and print join ",",@r[1,2,0]' resources.xml
printf '\nawk results:\n'
time awk '
BEGIN { # Define ERE to match double-quoted string.
    dqsERE = "\"[^\"]*\""
    # Construct array of extended regular expressions to match attributes...
    # First, the attribute name...
    EREs[++nEREs] = " jndiName="
    EREs[++nEREs] = " baseQueueName="
    EREs[++nEREs] = " name="
    # Save the lengths of the attribute names and add an ERE to match the
    # double-quoted string following the attribute name.
    for(i = 1; i <= nEREs; i++) {
        offset[i] = length(EREs[i]) + 1
        EREs[i] = EREs[i] dqsERE
    }
}
/<factories.*baseQueueName/ {
    # We have an XML line to process.
    # Clear the output string.
    out = ""
    for(i = 1; i <= nEREs; i++) {
        # For each desired attribute, look for a match...
        if(match($0, EREs[i]))
            # A match was found for this attribute, add the data
            # from the double-quoted string to the output string.
            out = out substr($0, RSTART + offset[i],
                RLENGTH - offset[i] - 1)
        # Whether or not a match was found, add a field separator to
        # the output string (except after the last attribute).
        out = out ((i < nEREs) ? "," : "")
    }
    # Print the accumulated output string.
    print out
}' resources.xml
printf '\nOriginal script results:\n'
time { grep '<factories.*baseQueueName' resources.xml | while read line; do
    QUEUE_JNDI_NAME=$( echo $line | grep -o 'jndiName="[^"]*"' | cut -d'"' -f2 )
    BASE_QUEUE_NAME=$( echo $line | grep -o 'baseQueueName="[^"]*"' | cut -d'"' -f2 )
    QUEUE_NAME=$( echo $line | grep -o 'name="[^"]*"' | cut -d'"' -f2 )
    echo "$QUEUE_JNDI_NAME,$BASE_QUEUE_NAME,$QUEUE_NAME"
done
}
produces the following output, with the times averaged over 10 runs:
Code:
perl results:
jms/Queue1,TEST.QUEUE1,Queue1
jms/Queue2,TEST.QUEUE2,Queue2
real 0m0.007s
user 0m0.002s
sys 0m0.003s
awk results:
jms/Queue1,TEST.QUEUE1,Queue1
jms/Queue2,TEST.QUEUE2,Queue2
real 0m0.002s
user 0m0.001s
sys 0m0.001s
Original script results:
jms/Queue1,TEST.QUEUE1,Queue1
jms/Queue2,TEST.QUEUE2,Queue2
real 0m0.017s
user 0m0.011s
sys 0m0.016s
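For anyone who wants to repeat this kind of measurement, here is a minimal sketch of one way to average the real time over several runs. It is not necessarily how the numbers above were collected, and the function name avg_real_time is made up for illustration. It relies on bash's time keyword and its TIMEFORMAT variable, so it needs bash rather than a plain POSIX sh:

```bash
#!/bin/bash
# Hypothetical helper (not part of the script above): run a command
# several times and print the mean wall-clock ("real") time in seconds.
# Usage: avg_real_time RUNS COMMAND [ARGS...]
avg_real_time() {
    local runs=$1; shift
    local TIMEFORMAT='%R'   # make the bash time keyword print only real seconds
    local i
    for ((i = 0; i < runs; i++)); do
        # The command's own output is discarded. The timing report is
        # written to the brace group's stderr, which is sent down the pipe.
        { time "$@" >/dev/null 2>&1; } 2>&1
    done | awk '{ sum += $1 } END { if (NR) printf "%.3f\n", sum / NR }'
}
```

With that function defined, something like avg_real_time 10 grep -c '<factories.*baseQueueName' resources.xml prints a single averaged figure. Timing a whole pipeline this way would need the pipeline wrapped in bash -c '...', since the function only receives one simple command.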
Note that grep on OS X does not have a -P option, so I had to modify your script to use basic REs instead of perl REs.
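If a script has to run both on systems where grep has -P (GNU grep built with PCRE support) and on systems where it does not, it can probe for the option before relying on it. This is only a sketch; the function name grep_supports_P is invented for illustration:

```bash
#!/bin/bash
# Hypothetical helper: succeeds only if the local grep accepts -P
# and can actually match with it.
grep_supports_P() {
    printf 'a\n' | grep -P 'a' >/dev/null 2>&1
}

if grep_supports_P; then
    echo "grep -P is available"
else
    echo "grep -P is not available; use basic or extended REs"
fi
```

A grep that rejects -P exits with a non-zero status, so the function simply passes that status along.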
Note that even with the comments left in the awk code, my awk script runs in about 1/3 of the time needed by Aia's perl script (with the grep folded into the perl script).
Can we assume that you timed only the perl script (which did not restrict itself to lines matching the pattern <factories.*baseQueueName) rather than the whole grep | perl pipeline? Or is awk really that much slower than perl on your system?