Sponsored Content
Top Forums Shell Programming and Scripting sed filtering lines by range fails 1-line-ranges Post 302724111 by bakunin on Wednesday 31st of October 2012 08:56:10 AM
Old 10-31-2012
Many thanks for your helpful suggestions.

I modified the function a bit and noticed, that i don't need the last step "pStripTags" if i modify the sed-script to strip the tags immediately. Here is the revised function. I have added "tee -a <tracefile>" commands to control the various steps of the recursion. For production they can safely be removed as they only serve debugging purposes:

Code:
# ------------------------------------------------------------------------------
# pGetXML                        extract certain values from a layered XML code
# ------------------------------------------------------------------------------
# Author.....: bakunin, with help of various unix.com members
# last update: 2012 08 23    by: bakunin
# ------------------------------------------------------------------------------
# Revision Log:
#
# ------------------------------------------------------------------------------
# Usage:
#     pGetXML tag1[/option1] [tag2[/option2] ..]
#
#
#     Example:
#          cat file | pGetXML foo/opt1 bar/opt2
#          will search for a range of "<foo ...opt1..> ... </foo>" and in the
#          resulting stream search for a range of "<bar ..opt2..> ... </bar>
#          The result will be reformatted to a single line and the enclosing
#          tags will be removed. This text:
#
#          <foo type=opt2>
#               <sometag>
#          </foo>
#          <foo type=opt1>
#               <bar>
#                    somevalue
#               </bar
#               <bar type=opt2>searched_for</bar>
#          </foo>
#
#          will result only in "searched_for", because in the first foo-tag the
#          option doesn't match, the same goes for the first bar-tag 
#
# Prerequisites:
# - none
# ------------------------------------------------------------------------------
# Documentation:
# Extracts values from an XML file of nested tags presented at <stdin>.
# The given list of tags is searched recursively. Only the tag name has to
# be given, so
#
#             pGetXML foo
#
# will return the content of "<foo> .. </foo>". It is possible to refine tags
# by using "options", which will be searched for in the tag definition (see
#  below).
#
# Output goes to <stdout>.
#
#     Parameters: tag1[/opt1] [tag2[/opt2] ..tagN[/optN]] 
#     returns: void
# ------------------------------------------------------------------------------
# known bugs:
#
#     none
# ------------------------------------------------------------------------------
# ..........................(C) 2012 bakunin ..................................
# ------------------------------------------------------------------------------

function pGetXML
{
typeset chTag="$1"
typeset chOpt="$1"
typeset chLine=""

if [ "${chOpt#*/}" = "${chOpt}" ] ; then
     chOpt=""
else
     chOpt="${chOpt#*/}"
     chTag="${chTag%/*}"
fi

# DEBUG start
#      print -u2 - "inside pGetXML...."
#      print -u2 - "chTag=${chTag}"
#      print -u2 - "chOpt=${chOpt}"
#      print -u2 - "Args=$*\n"
# DEBUG end

if [ -n "$chTag" ] ; then
     shift
     sed -n '/<'"$chTag"'[^>]*'"$chOpt"'[^>]*>/ {
               :next
               /<\/'"$chTag"'[^>]*>/! {
                    N
                    b next
               }
             }
             /<\/'"$chTag"'[^>]*>/ {
               s/\n//g
               s/^.*<'"$chTag"'[^>]*'"$chOpt"'[^>]*>//
               s/<\/'"$chTag"'[^>]*>.*$//p
             }' |\
     tee -a xxx.$(date +'%H%M%N').out |\
     pGetXML $*
else
     tee -a xxx.last.out |\
     while read chLine ; do
          print - "$chLine"
     done
fi

return 0
}

bakunin
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Remove a range of lines from a file using sed

Hi I am having some issue editing a file in sed. What I want to do is, in a loop pass a variable to a sed command. Sed should then search a file for a line that matches that variable, then remove all lines below until it reaches a line starting with a constant. I have managed to write a... (14 Replies)
Discussion started by: Andy82
14 Replies

2. Shell Programming and Scripting

Sed print range of lines between line number and pattern

Hi, I have a file as below This is the line one This is the line two <\XMLTAG> This is the line three This is the line four <\XMLTAG> Output of the SED command need to be as below. This is the line one This is the line two <\XMLTAG> Please do the need to needful to... (4 Replies)
Discussion started by: RMN
4 Replies

3. Shell Programming and Scripting

Generate Regex numeric range with specific sub-ranges

hi all, Say i have a range like 0 - 1000 and i need to split into diffrent files the lines which are within a specific fixed sub-range. I can achieve this manually but is not scalable if the range increase. E.g cat file1.txt Response time 2 ms Response time 15 ms Response time 101... (12 Replies)
Discussion started by: varu0612
12 Replies

4. Shell Programming and Scripting

Grep range of lines to print a line number on match

Hi Guru's, I am trying to grep a range of line numbers (based on match) and then look for another match which starts with a special character '$' and print the line number. I have the below code but it is actually printing the line number counting starting from the first line of the range i am... (15 Replies)
Discussion started by: Kevin Tivoli
15 Replies

5. Shell Programming and Scripting

Awk/sed : help on:Filtering multiple lines to one:

Experts Good day, I want to filter multiple lines of same error of same day , to only 1 error of each day, the first line from the log. Here is the file: May 26 11:29:19 cmihpx02 vmunix: NFS write failed for server cmiauxe1: error 5 (RPC: Timed out) May 26 11:29:19 cmihpx02 vmunix: NFS... (4 Replies)
Discussion started by: rveri
4 Replies

6. Shell Programming and Scripting

sed pattern fails to delete line of numbers

We are using Red Hat Linux. I have a flat file with among other things, the following lines, which appear occasionally throughout the file: Using sed, I delete this line: L;L;L;L;R;R;R;L;R;L;R;R;R;L;L;L With: /^;;;;;*/d Works fine every time. However, I cannot delete... (6 Replies)
Discussion started by: bloomlock
6 Replies

7. Shell Programming and Scripting

sed replace range of characters in each line

Hi, I'm trying to replace a range of characters by their position in each line by spaces. I need to replace characters 95 to 145 by spaces in each line. i tried below but it doesn't work sed -r "s/^(.{94})(.{51})/\ /" inputfile.txt > outputfile.txt can someone please help me... (3 Replies)
Discussion started by: Kevin Tivoli
3 Replies

8. Shell Programming and Scripting

sed variable expansion fails for substitution in range

I'm trying to change "F" to "G" in lines after the first one: 'FUE.SER' 5 1 1 F0501 F0401 F0502 2 1 F0301 E0501 F0201 E0502 F0302 3 1 F0503 E0503 E0301 E0201 E0302 E0504 F0504 4 1 F0402 F0202 E0202 F0101 E0203 F0203 F0403 5 1 F0505 E0505 E0303 E0204 E0304 E0506... (10 Replies)
Discussion started by: larrl
10 Replies

9. UNIX for Beginners Questions & Answers

Sed/awk to delete a regex between range of lines

Hi Guys I am looking for a solution to one problem to remove parentheses in a range of lines. Input file module bist_logic_inst(a, ab , dhd, dhdh , djdj, hdh, djjd, jdj, dhd, dhp, dk ); input a; input ab; input dhd; input djdj; input dhd; output hdh; output djjd; output jdj;... (5 Replies)
Discussion started by: kshitij
5 Replies

10. UNIX for Beginners Questions & Answers

Cannot subset ranges from another range set

Ca21chr2_C_albicans_SC5314 2159343 2228327 Ca21chr2_C_albicans_SC5314 636587 638608 Ca21chr2_C_albicans_SC5314 5286 50509 Ca21chr2_C_albicans_SC5314 634021 636276 Ca21chr2_C_albicans_SC5314 1886545 1900975 Ca21chr2_C_albicans_SC5314 610758 613544... (9 Replies)
Discussion started by: cryptodice
9 Replies
XMLSTARLET(1)                                                    xmlstarlet Manual                                                   XMLSTARLET(1)

NAME
xmlstarlet - command line XML/XSLT toolkit SYNOPSIS
xmlstarlet [<options>] [<command>] [<cmd-options>] INTRODUCTION
XMLStarlet is a set of command line utilities (tools) which can be used to transform, query, validate, and edit XML documents and files us- ing simple set of shell commands in similar way it is done for plain text files using UNIX grep, sed, awk, diff, patch, join, etc commands. This set of command line utilities can be used by those who deal with many XML documents on UNIX shell command prompt as well as for auto- mated XML processing with shell scripts. OPTIONS
--version Display the version of xmlstarlet. --help Display help. COMMANDS
Type: xmlstarlet <command> --help <ENTER> for command help Available commands include: ed (or edit) Edit/update XML document(s). sel (or select) Select data or query XML document(s) (XPATH, etc). tr (or transform) Transform XML documents(s) using XSLT. val (or validate) Validate XML document(s) (well-formed/DTD/XSD/RelaxNG). fo (or format) Format XML document(s). el (or elements) Display element structure of XML document. c14n (or canonic) XML canonicalization. ls (or list) List directory as XML. esc (or escape) Escape special XML characters. unesc (or unescape) Unescape special XML characters. pyx (or xmln) Convert XML into PYX format (based on ESIS - ISO 8879). p2x (or depyx) Convert PYX into XML. REFERENCES
XMLStarlet is a command line toolkit to query/edit/check/transform XML documents (for more information see http://xmlstar.source- forge.net/). AUTHOR
Mikhail Grushinskiy. XMLSTARLET(1)
All times are GMT -4. The time now is 03:31 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy