Grabbing text between two lines with shell variables.


 
# 1  
Old 02-21-2018

I would like to grab HTML text that sits between two marker lines, with the markers held in shell variables. I am running Debian and using the mksh shell.

Here is the part of the HTML that I want to extract from. I would like to extract the words 'to love', using the lines above and below them as reference points.
Code:
                <span class="lemma_definition">
                 to love
         </span>

Working script that does not use variables:
Code:
#!/bin/sh

URL="perseus.tufts.edu/hopper/morph?l=amo&la=la"

# Working: prints top definition:
wget -q -O- "$URL" | awk '/<span class="lemma_definition">/,/<\/span>/ {{ if (!/>/) {{$1=$1}1; print $0}} }'

NOTES:

(!/>/) = Skip any line that contains a '>' (the tag lines).
{$1=$1}1; = Makes awk rebuild the line from its fields, which strips the leading spaces; otherwise the result comes out as: ' <several spaces are here> to love'

Displays the proper text:
Code:
to love
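
The whitespace note above can be checked on its own. This is only a small sketch: the sample line and the variable names (line, trimmed) are made up for illustration and are not part of the original script.

```shell
#!/bin/sh
# Sample line with leading and trailing spaces (made up for illustration).
line='                 to love   '

# Assigning $1 to itself makes awk rebuild $0 from its fields,
# joined by single spaces, so the surrounding whitespace disappears.
trimmed=$(printf '%s\n' "$line" | awk '{ $1 = $1; print }')

# Brackets make any leftover whitespace visible.
printf '[%s]\n' "$trimmed"
```

Runs of spaces between words are also collapsed to one space, which may or may not be wanted for other input.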


Faulty code attempting to use variables:
Code:
#!/bin/sh

URL="perseus.tufts.edu/hopper/morph?l=amo&la=la"

wIn='<span class="lemma_definition">'
wOut='</span>'

# Faulty code with variables:
wget -q -O- "$URL" | awk -v vIn="$wIn" -v vOut="$wOut" '/vIn/,/vOut/ {{ if (!/>/) {{$1=$1}1; print $0}} }'

Prints nothing.

How can I properly use the variables to make it work like the non-variable code? I've been reading tutorials but have not come across this situation yet.
# 2  
Old 02-21-2018
Code:
wget -q -O- "$URL" | awk -v vIn="$wIn" -v vOut="$wOut" '$0 ~ vIn,$0 ~ vOut {{ if (!/>/) {{$1=$1}1; print $0}} }'
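
A rough illustration of why the change above helps, as far as I can tell: inside slashes awk treats vIn as the literal text "vIn", not as the variable, while $0 ~ vIn uses the variable's contents as the regular expression. The sample input below is a stand-in for the downloaded page (an assumption, not the real HTML), and literal_hits / dynamic_hits are names invented for this sketch.

```shell
#!/bin/sh
# Stand-in for the downloaded page (assumption, not the real HTML).
sample='<span class="lemma_definition">
   to love
</span>'

wIn='<span class="lemma_definition">'

# /vIn/ is a regex literal: it searches each line for the characters "vIn".
literal_hits=$(printf '%s\n' "$sample" |
  awk -v vIn="$wIn" '/vIn/ { n++ } END { print n + 0 }')

# $0 ~ vIn uses the contents of the awk variable as the regular expression.
dynamic_hits=$(printf '%s\n' "$sample" |
  awk -v vIn="$wIn" '$0 ~ vIn { n++ } END { print n + 0 }')

echo "literal: $literal_hits, dynamic: $dynamic_hits"
```

Here the literal form counts no lines and the dynamic form counts the opening span line.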

This User Gave Thanks to vgersh99 For This Post:
# 3  
Old 02-21-2018
Or use a control variable (p).
Code:
... | awk -v vIn="$wIn" -v vOut="$wOut" '($0 ~ vOut) {p=0} p {$1=$1; print} ($0 ~ vIn) {p=1}'

The order of the three statements determines whether the boundary lines are included. Here both are excluded.
This is a regular expression match: special characters in $wIn and $wOut need to be escaped.
The following variant works with plain strings:
Code:
... | awk -v vIn="$wIn" -v vOut="$wOut" 'index($0,vOut) {p=0} p  {$1=$1; print} index($0,vIn) {p=1}'
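
To see both points in one place, here is a small sketch on made-up input. The bracketed markers are assumptions, chosen only because "[" and "]" are special in regular expressions; regex_out, exclusive_out and inclusive_out are names invented for this example.

```shell
#!/bin/sh
# Made-up input; the markers contain regex metacharacters on purpose.
sample='header
[start]
   to love
[end]
footer'

wIn='[start]'
wOut='[end]'

# Regex matching: "[start]" is read as a bracket expression (any one of
# s, t, a, r), so the flag flips on the wrong lines.
regex_out=$(printf '%s\n' "$sample" |
  awk -v vIn="$wIn" -v vOut="$wOut" '($0 ~ vOut) {p=0} p {$1=$1; print} ($0 ~ vIn) {p=1}')

# index() compares plain strings, so the markers match only themselves.
# Stop check first, start check last: both boundary lines are excluded.
exclusive_out=$(printf '%s\n' "$sample" |
  awk -v vIn="$wIn" -v vOut="$wOut" 'index($0,vOut) {p=0} p {$1=$1; print} index($0,vIn) {p=1}')

# Start check first, stop check last: both boundary lines are included.
inclusive_out=$(printf '%s\n' "$sample" |
  awk -v vIn="$wIn" -v vOut="$wOut" 'index($0,vIn) {p=1} p {$1=$1; print} index($0,vOut) {p=0}')

printf 'regex:\n%s\n\nexclusive:\n%s\n\ninclusive:\n%s\n' \
  "$regex_out" "$exclusive_out" "$inclusive_out"
```

With this input, only the index() version with the stop check first prints just the line between the markers; the regex version goes wrong because the unescaped brackets change what the patterns match.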


Last edited by MadeInGermany; 02-21-2018 at 05:05 PM..
# 4  
Old 02-21-2018
Code:
wget -q -O- "$URL" | awk -v vIn="$wIn" -v vOut="$wOut" '$0 ~ vIn,$0 ~ vOut {{ if (!/>/) {{$1=$1}1; print $0}} }'

This worked perfectly!

...so close, yet so far!

Quote:
Originally Posted by MadeInGermany
Or use a control variable (p).
Code:
... | awk -v vIn="$wIn" -v vOut="$wOut" '($0 ~ vOut) {p=0} p {$1=$1; print} ($0 ~ vIn) {p=1}'

The order of the three statements determines whether the boundary lines are included. Here both are excluded.
This is a regular expression match: special characters in $wIn and $wOut need to be escaped.
The following variant works with plain strings:
Code:
... | awk -v vIn="$wIn" -v vOut="$wOut" 'index($0,vOut) {p=0} p  {$1=$1; print} index($0,vIn) {p=1}'

Thank you. I will try this later tonight or tomorrow when my mind is fresher (I've been at this for too long and wouldn't do it justice if I tried now).