Extract XML block when value is matched (Shell script)


# 1  

Hi everyone,


I'm struggling with an XML log file that holds information about some devices. The log file is filled with multiple "blocks" like the one below.

Based on the <devId> I want to extract that part of the XML file. If possible I'd like to have a script for this, because we'll use this function quite a lot.


I already tried grep, but I only get the line with the devId, which isn't the result I want.
I also tried to fiddle with xmllint, but my knowledge isn't advanced enough to get anywhere with it.




Code:
2019-06-16 20:20:11,695 | INFO  | CHEDULER_TOVSDC] | vsdc                             | 94 - org.apache.camel.camel-core - 2.17.3 | There is no task ack to send
2019-06-16 20:20:11,901 | INFO  | r[VSDCFE_ALARMS] | vsdc                             | 94 - org.apache.camel.camel-core - 2.17.3 | VSDCFE_ALARMS: Device alarm isssue message received contains {"time":"2019-06-16T18:20:10","deviceId":"number","alarms":[{"obis":"0;0;97;98;20;255","attributeId":"2","classId":"1","value":"0600000000"},{"obis":"0;0;97;98;21;255","attributeId":"2","classId":"1","value":"0600000000"},{"obis":"0;0;97;98;22;255","attributeId":"2","classId":"1","value":"0600800000"}]}
2019-06-16 20:20:11,914 | INFO  | r[VSDCFE_ALARMS] | AlarmTaskProcessor               | 237 - vsdc-alarm-manager - 2.0.83.43 | Alarm descriptor value is <= 0
2019-06-16 20:20:11,914 | INFO  | r[VSDCFE_ALARMS] | AlarmTaskProcessor               | 237 - vsdc-alarm-manager - 2.0.83.43 | Alarm descriptor value is <= 0
2019-06-16 20:20:11,914 | INFO  | r[VSDCFE_ALARMS] | vsdc                             | 94 - org.apache.camel.camel-core - 2.17.3 | Sending alarm device task request xml to task manager:  <?xml version="1.0" encoding="UTF-8"?>
<taskReq
    xmlns=""
       taskId="ALARM_number_1560709211907" taskType="DLMS" version="4" isActivation="false"
        execPriority="3">
    <targets>
        <devID>number</devID>
    </targets>
    <schedule>
        <start>2019-06-16T20:23:11.907+02:00</start>
        <stop>2019-06-16T21:20:11.908+02:00</stop>
    </schedule>
    <dlmsParams mode="unicast">
        <unicast timeout="45" maxTry="0" />
    </dlmsParams>
    <resultParams>
        <priority>urgent</priority>
        <mode>all</mode>
        <useCache>none</useCache>
    </resultParams>
    <transactions count="1">
            <transaction id="1">
                <dlms operation="SETM" association="1" >
                    <setm order="1" obis="0;0;97;98;22;255" attribute="2" classId="1">
                        <xdr>0600800000</xdr>
                    </setm>
                    <setm order="2" obis="0;0;97;98;2;255" attribute="2" classId="1">
                        <xdr>0600000000</xdr>
                       </setm>
                </dlms>
            </transaction>
    </transactions>
</taskReq>

2019-06-16 20:20:11,938 | INFO  | ActiveMQ Task-1  | FailoverTransport                | 77 - org.apache.activemq.activemq-osgi - 5.12.1 | Successfully connected to 
2019-06-16 20:20:11,950 | INFO  | CHEDULER_TOVSDC] | TaskProcessor                    | 246 - vsdc-task-manager - 2.0.83.43 | Task Scheduling validation
2019-06-16 20:20:11,952 | INFO  | CHEDULER_TOVSDC] | TaskProcessor                    | 246 - vsdc-task-manager - 2.0.83.43 | save task taskid ALARM_number_1560709211907
2019-06-16 20:20:11,956 | INFO  | r[VSDCFE_ALARMS] | DeviceIssueAlarmProcessor        | 237 - vsdc-alarm-manager - 2.0.83.43 | Starting device issue alarm process
2019-06-16 20:20:11,961 | INFO  | CHEDULER_TOVSDC] | SchedulerProcessor               | 246 - vsdc-task-manager - 2.0.83.43 | schedule creator for Task taskId: ALARM_number_1560709211907
2019-06-16 20:20:11,961 | INFO  | r[VSDCFE_ALARMS] | ldnFromDinAdapterRouteAlarm      | 94 - org.apache.camel.camel-core - 2.17.3 | message send from vsdc : <?xml version="1.0" encoding="UTF-8"?><alarms xmlns="">
    <deviceAlarms>
        <alarm devId="number" obis="0;0;97;98;20;255" classId="1" attribute="2" time="2019-06-16T20:20:10.000+02:00">0600000000</alarm>
        <alarm devId="number" obis="0;0;97;98;21;255" classId="1" attribute="2" time="2019-06-16T20:20:10.000+02:00">0600000000</alarm>
        <alarm devId="number" obis="0;0;97;98;22;255" classId="1" attribute="2" time="2019-06-16T20:20:10.000+02:00">0600800000</alarm>
    </deviceAlarms>
</alarms> 
2019-06-16 20:20:11,961 | INFO  | CHEDULER_TOVSDC] | SchedulerProcessor               | 246 - vsdc-task-manager - 2.0.83.43 | NON PERIODIC
2019-06-16 20:20:11,962 | INFO  | CHEDULER_TOVSDC] | SchedulerProcessor               | 246 - vsdc-task-manager - 2.0.83.43 | prepare Scheduling taskId [ALARM_number_1560709211907]
2019-06-16 20:20:11,962 | INFO  | CHEDULER_TOVSDC] | JobsDatePlannerServiceImpl       | 246 - vsdc-task-manager - 2.0.83.43 | First creator execution Date: 2019-06-16T20:22:11.907+02:00
2019-06-16 20:20:11,962 | INFO  | CHEDULER_TOVSDC] | SchedulerProcessor               | 246 - vsdc-task-manager - 2.0.83.43 | STOP DATE = 2019-06-16T21:20:11.908+02:00
2019-06-16 20:20:11,963 | INFO  | ActiveMQ Task-1  | FailoverTransport                | 77 - org.apache.activemq.activemq-osgi - 5.12.1 | Successfully connected to 
2019-06-16 20:20:11,974 | INFO  | CHEDULER_TOVSDC] | SchedulerProcessor               | 246 - vsdc-task-manager - 2.0.83.43 | schedule finalizer for Task taskId: ALARM_number_1560709211907
2019-06-16 20:20:11,975 | INFO  | CHEDULER_TOVSDC] | SchedulerProcessor               | 246 - vsdc-task-manager - 2.0.83.43 | NON PERIODIC
2019-06-16 20:20:11,975 | INFO  | r[VSDCFE_ALARMS] | vsdmc                             | 94 - org.apache.camel.camel-core -  2.17.3 | Device alarm message is sent to M2M containing <?xml  version="1.0" encoding="UTF-8"?><alarms  xmlns="">
    <deviceAlarms>
        <alarm devId="number" obis="0;0;97;98;20;255" classId="1" attribute="2" time="2019-06-16T20:20:10.000+02:00">0600000000</alarm>
        <alarm devId="number" obis="0;0;97;98;21;255" classId="1" attribute="2" time="2019-06-16T20:20:10.000+02:00">0600000000</alarm>
        <alarm devId="number" obis="0;0;97;98;22;255" classId="1" attribute="2" time="2019-06-16T20:20:10.000+02:00">0600800000</alarm>
    </deviceAlarms>
</alarms>

# 2  
Can you post expected output?
# 3  
Hi anbu,


The code block that I posted is the result that I want to achieve. We have multiple blocks like that (we're close to 1m lines) and I just want to extract one block out of it.
I don't know if it would help to give you a bigger sample of the log.
# 4  
I think this is what you want.
It will extract all lines between successive "<devID>number</devID>" lines.


Code:
#!/bin/bash
# Reads the log on stdin, e.g.: ./extract.sh <logfile
write="N"
count=0
# read with the default IFS strips leading whitespace, so indented tags still match
while read -r line
do
if [ "$write" = "N" ] && [ "${line:0:7}" = "<devID>" ]
then
    write="Y"
    count=0
    #cat /dev/null >output.log
    #uncomment the above line to create a log of only the last block,
    #otherwise all blocks will be extracted
fi
if [ "$write" = "Y" ]
then
    echo "$line" >>output.log
    count=$((count + 1))
fi
if [ "$write" = "Y" ] && [ "$count" -ne 1 ]
then
    #a second <devID> line closes the block
    if [ "${line:0:7}" = "<devID>" ]
    then
        write="N"
        count=0
    fi
fi
done

# 5  
First of all, thank you for your response, jgt.
The answer you gave isn't actually the result that I want to achieve. (My explanation wasn't clear either, so the fault lies with me.)
I'll try to explain it a little better.


We have log files on our VMs, and we get information from some devices through SIM cards. Due to bad practices of our subcontractors, we have to fetch some information from the logs.
Because we have more than 4k devices that send logs daily, we have to find out, within those 400k+ line log files, what kind of information each device sent.


The snippet that I put in my initial post is actually the final result that I want from the log files. Above and below my snippet there is a multitude of information and XML that I don't really need.


Unfortunately there isn't really one specific line that I can look for, such as:
Code:
if [ "${line:0:7}" = "<devID>" ]

The only thing I noticed is that my block could also start from:

Code:
2019-06-16 20:20:11,901 | INFO  | r[VSDCFE_ALARMS] | vsdc                             | 94 - org.apache.camel.camel-core - 2.17.3 | VSDCFE_ALARMS: Device alarm isssue message received contains {"time":"2019-06-16T18:20:10","deviceId":"number","alarms":[{"obis":"0;0;97;98;20;255","attributeId":"2","classId":"1","value":"0600000000"},{"obis":"0;0;97;98;21;255","attributeId":"2","classId":"1","value":"0600000000"},{"obis":"0;0;97;98;22;255","attributeId":"2","classId":"1","value":"0600800000"}]}

and end with

Code:
</alarms>

so everything in between has to be included in the output. Note also that not all "blocks" are the same. Depending on the information, a block could contain 10 or more lines for each device.
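
The start and end markers described above can be sketched as a range extraction. This is only a sketch under assumptions: extract_block is a made-up helper name, the device ID is assumed to appear in the start line exactly as "deviceId":"<id>", and the block is assumed to end at the first line containing </alarms> after that.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): print every block that starts
# at a log line containing "deviceId":"<id>" and ends at the next line
# containing </alarms>.
extract_block() {
    dev_id="$1"
    logfile="$2"
    awk -v start="\"deviceId\":\"${dev_id}\"" '
        index($0, start) { printing = 1 }      # literal match, no regex escaping needed
        printing         { print }             # copy every line while inside a block
        printing && index($0, "</alarms>") { printing = 0 }   # closing tag ends the block
    ' "$logfile"
}
```

Usage would look like `extract_block number logfile > device_number.log`. If a block contains two </alarms> sections, as in the sample from the first post, this stops after the first one.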
# 6  
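If you want to print each block that contains the pattern devId="number", split the log into records at </alarms> and print the records that match. A multi-character RS needs an awk that supports it, such as gawk or mawk.

Code:
awk -v RS="</alarms>" -v ORS="</alarms>" ' /devId="number"/ ' logfile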
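
Since the original poster asked for a reusable script, here is a sketch of the same record-based idea that takes the device ID as a parameter and avoids the multi-character RS. It is a swapped-in variant, not the command above: print_device_records is a made-up name, records are assumed to end at any line containing </alarms>, and lines after the last </alarms> are dropped.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): collect lines into a buffer
# until a line containing </alarms>, then print the buffer only if it
# mentions devId="<id>".
print_device_records() {
    dev_id="$1"
    logfile="$2"
    awk -v needle="devId=\"${dev_id}\"" '
        { buf = buf $0 "\n" }                  # collect lines of the current record
        index($0, "</alarms>") {               # record ends at the closing tag
            if (index(buf, needle)) printf "%s", buf
            buf = ""                           # start the next record
        }
    ' "$logfile"
}
```

Usage would look like `print_device_records number logfile > device_number.log`. The match is a literal substring test, so the ID needs no regex escaping.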
