sed filtering lines by range fails 1-line-ranges
Post 302724111 by bakunin on Wednesday 31st of October 2012 08:56:10 AM
Many thanks for your helpful suggestions.

I modified the function a bit and noticed that I don't need the last step, "pStripTags", if I modify the sed script to strip the tags immediately. Here is the revised function. I have added "tee -a <tracefile>" commands to trace the individual steps of the recursion. They serve only debugging purposes and can safely be removed for production:

Code:
# ------------------------------------------------------------------------------
# pGetXML                        extract certain values from a layered XML code
# ------------------------------------------------------------------------------
# Author.....: bakunin, with help of various unix.com members
# last update: 2012 08 23    by: bakunin
# ------------------------------------------------------------------------------
# Revision Log:
#
# ------------------------------------------------------------------------------
# Usage:
#     pGetXML tag1[/option1] [tag2[/option2] ..]
#
#
#     Example:
#          cat file | pGetXML foo/opt1 bar/opt2
#          will search for a range of "<foo ...opt1..> ... </foo>" and in the
#          resulting stream search for a range of "<bar ..opt2..> ... </bar>".
#          The result will be reformatted to a single line and the enclosing
#          tags will be removed. This text:
#
#          <foo type=opt2>
#               <sometag>
#          </foo>
#          <foo type=opt1>
#               <bar>
#                    somevalue
#               </bar>
#               <bar type=opt2>searched_for</bar>
#          </foo>
#
#          will result only in "searched_for", because the option of the first
#          foo-tag doesn't match and the first bar-tag has no option at all.
#
# Prerequisites:
# - none
# ------------------------------------------------------------------------------
# Documentation:
# Extracts values from an XML file of nested tags presented at <stdin>.
# The given list of tags is searched recursively. Only the tag name has to
# be given, so
#
#             pGetXML foo
#
# will return the content of "<foo> .. </foo>". It is possible to refine tags
# by using "options", which will be searched for in the tag definition (see
# the example above).
#
# Output goes to <stdout>.
#
#     Parameters: tag1[/opt1] [tag2[/opt2] ..tagN[/optN]] 
#     returns: void
# ------------------------------------------------------------------------------
# known bugs:
#
#     none
# ------------------------------------------------------------------------------
# ..........................(C) 2012 bakunin ..................................
# ------------------------------------------------------------------------------

function pGetXML
{
typeset chTag="$1"
typeset chOpt="$1"
typeset chLine=""

if [ "${chOpt#*/}" = "${chOpt}" ] ; then
     chOpt=""
else
     chOpt="${chOpt#*/}"
     chTag="${chTag%/*}"
fi

# DEBUG start
#      print -u2 - "inside pGetXML...."
#      print -u2 - "chTag=${chTag}"
#      print -u2 - "chOpt=${chOpt}"
#      print -u2 - "Args=$*\n"
# DEBUG end

if [ -n "$chTag" ] ; then
     shift
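     # Collect the lines from the opening tag up to the closing tag in the
     # pattern space, join them and strip the enclosing tags. The substitutions
     # sit inside the range block, so a closing tag that belongs to a
     # non-matching opening tag prints nothing.
     sed -n '/<'"$chTag"'[^>]*'"$chOpt"'[^>]*>/ {
               :next
               /<\/'"$chTag"'[^>]*>/! {
                    N
                    b next
               }
               s/\n//g
               s/^.*<'"$chTag"'[^>]*'"$chOpt"'[^>]*>//
               s/<\/'"$chTag"'[^>]*>.*$//p
             }' |\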
     tee -a xxx.$(date +'%H%M%N').out |\
     pGetXML "$@"
else
     tee -a xxx.last.out |\
     while read chLine ; do
          print - "$chLine"
     done
fi

return 0
}
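
To try the sed part in isolation, here is a minimal sketch of a single level of the recursion. It assumes a plain POSIX shell rather than ksh, so it uses neither "typeset" nor "print", and it leaves out the "tee" trace files. The names "extract_tag" and "sample" are hypothetical and only serve this illustration. The substitutions are placed inside the range block here, and the recursion is imitated by chaining two calls in a pipeline.

```shell
#!/bin/sh
# Hypothetical helper (not part of pGetXML): handles exactly one
# "tag[/option]" argument and reads the XML from stdin.
extract_tag() {
    spec=$1
    case $spec in
        */*) tag=${spec%/*}    # part before the "/"
             opt=${spec#*/} ;; # part after the "/"
        *)   tag=$spec         # no option given
             opt= ;;
    esac

    # Same idea as in pGetXML: gather everything from the opening tag
    # to the closing tag in the pattern space, join it into one line
    # and strip the enclosing tags before printing.
    sed -n '/<'"$tag"'[^>]*'"$opt"'[^>]*>/ {
        :next
        /<\/'"$tag"'[^>]*>/! {
            N
            b next
        }
        s/\n//g
        s/^.*<'"$tag"'[^>]*'"$opt"'[^>]*>//
        s/<\/'"$tag"'[^>]*>.*$//p
    }'
}

# Sample input taken from the usage example in the function header.
sample() {
    cat <<'EOF'
<foo type=opt2>
     <sometag>
</foo>
<foo type=opt1>
     <bar>
          somevalue
     </bar>
     <bar type=opt2>searched_for</bar>
</foo>
EOF
}

# Two chained calls stand in for "pGetXML foo/opt1 bar/opt2".
sample | extract_tag foo/opt1 | extract_tag bar/opt2
```

The pipeline should print the single line "searched_for". The first call reduces the second foo-range to one line, which is why the second call finds both its opening and closing tag without reading any further input.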

bakunin
 
