How to extract xml attribute values using awk inline.? Post 302974861 by Don Cragun on Sunday 5th of June 2016 01:12:59 AM
On OS X El Capitan (version 10.11.5) with a 2.8 GHz Intel Core i7 (4 core) processor and a 1TB SSD holding my data and code, the following bash script:
Code:
#!/bin/bash
printf 'perl results:\n'
time perl -nle '/<factories.*baseQueueName/ and @r=/(?:jndiN|baseQueueN|n)ame="([^"]+)/g and print join ",",@r[1,2,0]' resources.xml

printf '\nawk results:\n'
time awk '
BEGIN {	# Define ERE to match double-quoted string.
	dqsERE = "\"[^\"]*\""

	# Construct array of extended regular expressions to match attributes...
	# First, the attribute name...
	EREs[++nEREs] = " jndiName="
	EREs[++nEREs] = " baseQueueName="
	EREs[++nEREs] = " name="

	# Save the lengths of the attribute names and add an ERE to match the
	# double-quoted string following the attribute name.
	for(i = 1; i <= nEREs; i++) {
		offset[i] = length(EREs[i]) + 1
		EREs[i] = EREs[i] dqsERE
	}
}
/<factories.*baseQueueName/ {
	# We have an XML line to process.
	# Clear the output string.
	out = ""
	for(i = 1; i <= nEREs; i++) {
		# For each desired attribute, look for a match...
		if(match($0, EREs[i]))
			# A match was found for this attribute, add the data
			# from the double-quoted string to the output string.
			out = out substr($0, RSTART + offset[i],
				RLENGTH - offset[i] - 1)
		# Whether or not a match was found, add a field separator to
		# the output string.
		out = out ((i < nEREs) ? "," : "")
	}
	# Print the accumulated output string.
	print out
}' resources.xml

printf '\nOriginal script results:\n'
time {	grep '<factories.*baseQueueName' resources.xml | while read line; do
	QUEUE_JNDI_NAME=$( echo $line | grep -o 'jndiName="[^"]*"'      | cut -d'"' -f2 )
	BASE_QUEUE_NAME=$( echo $line | grep -o 'baseQueueName="[^"]*"' | cut -d'"' -f2 )
	QUEUE_NAME=$(      echo $line | grep -o 'name="[^"]*"'          | cut -d'"' -f2 )
	echo "$QUEUE_JNDI_NAME,$BASE_QUEUE_NAME,$QUEUE_NAME"
	done
}

produces output with the average times (from 10 runs):
Code:
perl results:
jms/Queue1,TEST.QUEUE1,Queue1
jms/Queue2,TEST.QUEUE2,Queue2

real	0m0.007s
user	0m0.002s
sys	0m0.003s

awk results:
jms/Queue1,TEST.QUEUE1,Queue1
jms/Queue2,TEST.QUEUE2,Queue2

real	0m0.002s
user	0m0.001s
sys	0m0.001s

Original script results:
jms/Queue1,TEST.QUEUE1,Queue1
jms/Queue2,TEST.QUEUE2,Queue2

real	0m0.017s
user	0m0.011s
sys	0m0.016s

Note that grep on OS X does not have a -P option, so I had to modify your script to use basic REs instead of perl REs.
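
For anyone else on a system whose grep lacks -P, the basic RE substitute can be tried on a single line as sketched below. The sample line and the helper name get_attr are assumptions made for illustration; the real resources.xml is not shown in this post, so the attribute order and the element contents here are only guesses that agree with the output above.

```shell
#!/bin/bash
# Hypothetical sample line; the real resources.xml is not shown in the post.
line='<factories name="Queue1" jndiName="jms/Queue1" baseQueueName="TEST.QUEUE1"/>'

# get_attr NAME: print the value of attribute NAME found in $line.
# Only a basic RE is used, so grep -P is not needed.
get_attr() {
	printf '%s\n' "$line" | grep -o "$1=\"[^\"]*\"" | cut -d'"' -f2
}

get_attr jndiName
get_attr baseQueueName
get_attr name
```

Because the pattern is case sensitive, name= does not match inside jndiName= or baseQueueName=, which is why the third call returns only the value of the name attribute.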

Note that even with commented awk code, my awk script runs in less than 1/3 of the time needed for Aia's perl script with the grep folded into the perl script (0.002s versus 0.007s real).
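
The offset arithmetic in the awk script is easier to follow when it is reduced to a single attribute. The following is a minimal sketch, assuming a made-up input line (stored in the hypothetical variable sample) and a hypothetical wrapper function attr_value; neither comes from the original thread.

```shell
#!/bin/bash
# Hypothetical sample line; attribute order and content are assumed.
sample='<factories name="Queue1" jndiName="jms/Queue1" baseQueueName="TEST.QUEUE1"/>'

# attr_value NAME: read lines from stdin and print the value of attribute NAME.
attr_value() {
	awk -v attr=" $1=" '
	{
		# Skip the leading space, the attribute name, "=" and the opening quote.
		offset = length(attr) + 1
		if (match($0, attr "\"[^\"]*\""))
			# RSTART points at the leading space; RLENGTH covers the
			# name, both quotes and the value. Dropping offset characters
			# from the front and 1 (the closing quote) from the end
			# leaves only the value.
			print substr($0, RSTART + offset, RLENGTH - offset - 1)
	}'
}

printf '%s\n' "$sample" | attr_value jndiName
```

The leading space in the pattern is what keeps name= from matching in the middle of a longer attribute name if the input ever contains one that ends in lowercase "name".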

Could we assume that you didn't time the grep | perl pipeline, but instead just timed the perl script, which by itself does not select only the lines matching the pattern <factories.*baseQueueName? Or is awk really that much slower than perl on your system?
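
To compare like with like, each candidate command can be timed over several runs and averaged. The post does not say how the 10-run averages were collected, so the helper below (the hypothetical function avg_real) is only one possible way to do it in bash, using the shell's time keyword and TIMEFORMAT variable.

```shell
#!/bin/bash
# avg_real RUNS CMD [ARGS...]: run CMD RUNS times and print the mean
# wall-clock ("real") time in seconds. Hypothetical helper, not from the post.
avg_real() {
	local runs=$1 i t total=0
	shift
	# %R makes the time keyword print only elapsed seconds (e.g. 0.007);
	# the C locale keeps "." as the decimal point.
	local TIMEFORMAT=%R
	local -x LC_ALL=C
	for ((i = 0; i < runs; i++)); do
		# The timing report is written to the stderr of the brace group;
		# the output of the command itself is discarded.
		t=$( { time "$@" >/dev/null 2>&1; } 2>&1 )
		total=$(awk -v a="$total" -v b="$t" 'BEGIN { print a + b }')
	done
	awk -v s="$total" -v n="$runs" 'BEGIN { printf "%.4f\n", s / n }'
}

# Example call with a trivial command; substitute the perl or awk command
# line being compared, e.g.: avg_real 10 awk -f script.awk resources.xml
avg_real 3 true
```

Timing a pipeline such as grep | perl this way would need it wrapped in something like bash -c '...', which adds the startup cost of an extra shell to every run.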
 
