Post 302724013 by bakunin on Wednesday 31st of October 2012 06:27:08 AM
sed filtering lines by range fails 1-line-ranges

The following is part of a larger project, and sed is (right now) a given. I am working on a recursive Korn shell function to "peel off" XML tags from a larger text. Just for context, I will show the complete function (not working right now) here:

Code:
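function pGetXML
{
# $1 is either "tag" or "tag/option"; the remaining arguments name the inner layers
typeset chTag="$1"
typeset chOpt="$1"
typeset chLine=""

# split "tag/option" into its two parts; without a "/" there is no option
if [ "${chOpt#*/}" = "${chOpt}" ] ; then
     chOpt=""
else
     chOpt="${chOpt#*/}"
     chTag="${chTag%/*}"
fi

# debugging output, sent to stderr
print -u2 - "inside pGetXML...."
print -u2 - "chTag=${chTag}"
print -u2 - "chOpt=${chOpt}"
print -u2 - "Args=$*\n"

if [ -n "$chTag" ] ; then
     shift
     # keep the lines from the opening to the closing tag, then recurse into the next layer
     sed -n '/<'"$chTag"'[^>]*'"$chOpt"'[^>]*>/,/<\/'"$chTag"'[^>]*>/p' |
     pGetXML "$@"
else
     # no tag left: strip the remaining tags from every line
     while read chLine ; do
          pStripTags "$chLine"
     done
fi

return 0
}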

The function will be called like this:

Code:
pGetXML "arg1/type=opt1" "arg2/type=opt2" "Value"...

and is intended to "peel off" layers of XML tags from a file organized like this:

Code:
<arg1 type=opt1>
     <arg2 type=opt2>
          <Value>blabla</Value>
     </arg2>
     <othertag>
          <Value>foo bar</Value>
     </othertag>
</arg1>

The function should first print everything from "<arg1>" to "</arg1>" (the "option" is used because there could be other tags with the same name that I am not interested in, like "<arg1 type=else>"). The second pass should keep from that output only the lines "<arg2>...</arg2>", and the third pass only the lines "<Value>...</Value>". The function "pStripTags" simply strips off the tags, leaving the text inside.
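To make the intended layering concrete, here is a minimal sketch of the first two passes written out by hand as plain sed stages, without the recursion. The variable names `xml` and `stage2` are only for this illustration and are not part of the function.

```shell
# Sample input, copied from the example file above.
xml='<arg1 type=opt1>
     <arg2 type=opt2>
          <Value>blabla</Value>
     </arg2>
     <othertag>
          <Value>foo bar</Value>
     </othertag>
</arg1>'

# Pass 1 keeps <arg1 type=opt1> ... </arg1>,
# pass 2 keeps <arg2 type=opt2> ... </arg2> from the output of pass 1.
stage2=$(printf '%s\n' "$xml" |
    sed -n '/<arg1[^>]*type=opt1[^>]*>/,/<\/arg1[^>]*>/p' |
    sed -n '/<arg2[^>]*type=opt2[^>]*>/,/<\/arg2[^>]*>/p')

printf '%s\n' "$stage2"
```

These two stages leave the three lines from "<arg2 type=opt2>" to "</arg2>", and the "<othertag>" block is gone.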

Well, this is what was intended, and it kind of works, but in the last step "sed" fails to do as expected when the opening and closing tags of the range are on the same line. At this stage I am down to this portion of the text (this is verified):

Code:
     <arg2 type=opt2>
          <Value>blabla</Value>
     </arg2>

and the sed command (verified with "set -xv") is this:

Code:
sed -n '/<Value[^>]*[^>]*>/,/<\/Value[^>]*>/p'

I would have expected it to print only line 2, but it doesn't. Instead it prints lines 2 and 3.
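For anyone who wants to reproduce this outside the function, a minimal sketch follows. The variable names `fragment` and `got` are only for this illustration. A likely explanation, going by how two-address ranges are specified for sed: once the first address has matched, the second address is only tested from the next line onward. A closing tag on the same line as the opening tag is therefore never seen as the end of the range, and the range runs on to the end of the input.

```shell
# The three lines that reach the last sed stage.
fragment='     <arg2 type=opt2>
          <Value>blabla</Value>
     </arg2>'

# Same sed command as above. The range opens on the <Value> line, and the
# search for </Value> only starts on the following line, where it never matches.
got=$(printf '%s\n' "$fragment" |
    sed -n '/<Value[^>]*[^>]*>/,/<\/Value[^>]*>/p')

printf '%s\n' "$got"
```

This prints the "<Value>" line and the "</arg2>" line, matching the behaviour described above.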

The objective is to create a sed script that will fit into the recursive function. Any pointers will be welcome.
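One direction that might fit is sketched below. It is not a verified solution and has not been tried inside the real function. It treats a line holding both the opening and the closing tag as a complete one-line range: the line is printed and the cycle ended with "d" before the two-address command is reached, so no range is opened for it. The helper name `pRange` and its two-argument interface are assumptions made for this illustration.

```shell
# Hypothetical helper (not part of the original function).
# $1 = tag name, $2 = optional attribute text, may be empty.
pRange ()
{
    open="<$1[^>]*$2[^>]*>"
    close="<\/$1[^>]*>"

    # First rule: opening and closing tag on one line -> print it and start
    # the next cycle, so the range below is never opened for this line.
    # Second rule: the usual multi-line range.
    sed -n \
        -e "/${open}.*${close}/{" -e 'p' -e 'd' -e '}' \
        -e "/${open}/,/${close}/p"
}

# Usage: the one-line case from above ...
printf '%s\n' '     <arg2 type=opt2>' \
              '          <Value>blabla</Value>' \
              '     </arg2>' | pRange Value ""

# ... and an ordinary multi-line range.
printf '%s\n' '<arg1 type=opt1>' '<arg2 type=opt2>' 'x' '</arg2>' '</arg1>' |
    pRange arg2 'type=opt2'
```

The first call should print only the "<Value>" line, the second the three lines from "<arg2 type=opt2>" to "</arg2>". Nested tags of the same name, or several elements on one line, are not handled by this sketch.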

bakunin
 
