Noob trying to improve Post: 302990657


Operating Systems OS X (Apple) Noob trying to improve Post 302990657 by Ardzii on Monday 30th of January 2017 12:08:47 PM


My god Bakunin, you're a master in sed!

Thank you so much for taking the time to write these lines! Smilie

OK then, let me try on your sed to see if I understood:

Code:

sed -n '/href.*view more/ s/.*href="\([^"]*\).*/\1/p'

I'll stay with my understanding of the first part of the command. You're actually not passing any command yet to sed. So what you're looking for in:

Code:

'/href.*view more/

is the line that matches "href [any kind of character in between]view more" or to put it another way:

Code:

SED, find me the line that has "href" + some string and "view more" in it.

you get that line:

Code:

<a href="/listing/magnet/ge/ramp-shim/2322185"> view more </a>
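To see the address part on its own, here is a small sketch that runs only the /href.*view more/ filter and prints whatever it selects. The two-line input is made up for illustration; only its second line comes from the thread.

Code:

```shell
# Hypothetical two-line input; only the second line contains "view more".
input='<a href="/about"> about us </a>
<a href="/listing/magnet/ge/ramp-shim/2322185"> view more </a>'

# -n suppresses the default output; the address selects lines, p prints them.
selected=$(printf '%s\n' "$input" | sed -n '/href.*view more/p')
printf '%s\n' "$selected"
```

Only the "view more" line should come back, because the first line never matches the address.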

Now comes the good part:

Code:

s/.*href="\([^"]*\).*/\1/p'

Within that line, substitute "[any characters before] href=" [the string that follows, up to but not including the next " character]" by [that same string you just found, which contains no "] and print.

But how come the "> view more </a>" portion of the line was left out of the output? From what I understand you're including .* which should still match all the characters at the end of the line, shouldn't it?

Thanks as usual!

Best!

EDIT-----

I just tried:

Code:

sed -n '/href.*view more/ s/.*href="\(.*\)/\1/p'

and it gave me:

Code:

/listing/magnet/ge/ramp-shim/2322185"> view more </a>

So I guess that what's happening with your code is that when you tell sed to exclude the ", the captured part simply stops at it and does not go on with the rest of the line.
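The two variants can be put side by side as a quick sketch. The variable names (line, short, long) are only for illustration; the input line and both sed commands are the ones from this post.

Code:

```shell
line='<a href="/listing/magnet/ge/ramp-shim/2322185"> view more </a>'

# The group [^"]* stops at the first double quote. The trailing .* still
# matches the rest of the line, but it sits outside \( \), so it is not
# part of \1 and is dropped by the substitution.
short=$(printf '%s\n' "$line" | sed -n '/href.*view more/ s/.*href="\([^"]*\).*/\1/p')

# Here the group itself is .* and runs to the end of the line,
# so everything after href=" is kept.
long=$(printf '%s\n' "$line" | sed -n '/href.*view more/ s/.*href="\(.*\)/\1/p')

printf 'short: %s\nlong:  %s\n' "$short" "$long"
```

So the rest of the line is not skipped: it is matched by the trailing .* and thrown away because only \1 is written back.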

---------- Post updated at 06:08 PM ---------- Previous update was at 04:43 PM ----------

Quote:

Originally Posted by MadeInGermany

Exactly. The first character that matches in the trailing .* is a quote.
As I said, the leading and trailing .* are needed to "match away" the entire line. Otherwise only the matching portion would be substituted.
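A short sketch of what "match away" means, using the line from earlier in the thread. The variable names are illustrative only.

Code:

```shell
line='<a href="/listing/magnet/ge/ramp-shim/2322185"> view more </a>'

# Without leading/trailing .* only the text from href=" up to the closing
# quote is replaced; everything around it stays in place.
partial=$(printf '%s\n' "$line" | sed 's/href="\([^"]*\)/\1/')

# With both .* the whole line is matched and replaced by the captured group.
whole=$(printf '%s\n' "$line" | sed 's/.*href="\([^"]*\).*/\1/')

printf 'partial: %s\nwhole:   %s\n' "$partial" "$whole"
```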

---------- Post updated at 12:15 ---------- Previous update was at 11:44 ----------

Now to your second requirement. It can give a headache even to experienced guys.
In your example the ' is a problem for the shell, in which you call

Code:

sed -n '...'

There is no problem if you save the sed code in a separate file and run it with

Code:

sed -n -f sed-script result2.txt

And the contents of the sed-script

Code:

/itemprop='brand'/ s/.*'brand'>\([^<]*\).*/\1/p

You can add another match in a second line

Code:

/itemprop='name'/ s/.*'name'>\([^<]*\).*/\1/p

but it won't match if the first match was successful and the input line was substituted.
It is necessary to save and restore the line.

Code:

h
/itemprop='brand'/ s/.*'brand'>\([^<]*\).*/\1/p
g
/itemprop='name'/ s/.*'name'>\([^<]*\).*/\1/p
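The save/restore idea can be tried out with the sketch below. The input line is hypothetical (the real result2.txt is not shown in the thread), and the files are written to a temporary directory so nothing existing is touched.

Code:

```shell
tmp=$(mktemp -d)

# Hypothetical input line carrying both attributes.
cat > "$tmp/result2.txt" <<'EOF'
<span itemprop='brand'>GE</span><span itemprop='name'>Ramp Shim</span>
EOF

# The sed-script from above: h copies the line to the hold space,
# g copies it back after the first substitution has overwritten it.
cat > "$tmp/sed-script" <<'EOF'
h
/itemprop='brand'/ s/.*'brand'>\([^<]*\).*/\1/p
g
/itemprop='name'/ s/.*'name'>\([^<]*\).*/\1/p
EOF

both=$(sed -n -f "$tmp/sed-script" "$tmp/result2.txt")
printf '%s\n' "$both"

# Same script with the h and g lines removed, for comparison.
grep -v '^[hg]$' "$tmp/sed-script" > "$tmp/sed-script-nohold"
first_only=$(sed -n -f "$tmp/sed-script-nohold" "$tmp/result2.txt")
printf '%s\n' "$first_only"

rm -rf "$tmp"
```

Without h and g the second address no longer finds itemprop='name', because by then the line has already been replaced by the brand.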

Another aspect is greediness. The * wants to match as much as it can. A leftmost * is most greedy.
That means /.*'branch'/ matches the rightmost 'branch'.
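A minimal sketch of that greediness, on a made-up line where 'name' occurs twice (the line and the variable names are assumptions, not taken from the real page):

Code:

```shell
# Hypothetical line in which 'name' appears twice.
line="<meta itemprop='name'>first</meta><span itemprop='name'>second</span>"

# The leading .* grabs as much as it can, so 'name'> is matched at its
# last occurrence and the group captures the text that follows it.
greedy=$(printf '%s\n' "$line" | sed -n "s/.*'name'>\([^<]*\).*/\1/p")
printf '%s\n' "$greedy"
```

The sed script is put in double quotes here only to avoid the single-quote issue; it contains nothing the shell would expand.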
--
Last but not least, the shell method to print a ' within a ' ' string goes like this

Code:

 echo 'left'\''right'

Actually it is a concatenation of 'left' and 'right' with a \' in between.
For an embedded sed script it is enough to remember to exchange each literal ' by '\''.
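The sketch below shows what the shell actually passes on in both cases, and then the embedded sed command with every literal ' exchanged by '\''. The input line is hypothetical.

Code:

```shell
# Nested naively, the inner quotes just end and reopen the string,
# so they never reach the command.
naive=$(printf '%s' '/itemprop='brand'/')

# With each literal ' written as '\'' the quotes survive.
escaped=$(printf '%s' '/itemprop='\''brand'\''/')

printf 'naive:   %s\nescaped: %s\n' "$naive" "$escaped"

# The full embedded command, run on a hypothetical input line.
line="<span itemprop='brand'>GE</span>"
brand=$(printf '%s\n' "$line" |
  sed -n '/itemprop='\''brand'\''/ s/.*'\''brand'\''>\([^<]*\).*/\1/p')
printf '%s\n' "$brand"
```

In the naive form sed ends up looking for itemprop=brand without quotes, which is why a line containing itemprop='brand' is never selected.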

Hey MadeinGermany!

Bakunin's explanation helped me a lot in going through your answer, but I still have a few questions:

Quote:

In your example the ' is a problem for the shell, in which you call

Code:

sed -n '...'

There is no problem if you save the sed code in a separate file and run it with

Code:

sed -n -f sed-script result2.txt

Why is it a problem for the shell?
When I paste the previous "PHP" (I assume it's PHP) code into a text file (examplesed.txt) to test the command:

Code:

sed -n '/itemprop='brand'/ s/.*'brand'>\([^<]*\).*/\1/p' examplesed.txt

it yields nothing. There's no output whatsoever...

Quote:

Code:

/itemprop='brand'/ s/.*'brand'>\([^<]*\).*/\1/p

You can add another match in a second line

Code:

/itemprop='name'/ s/.*'name'>\([^<]*\).*/\1/p

but it won't match if the first match was successful and the input line was substituted.
It is necessary to save and restore the line.

Code:

h
/itemprop='brand'/ s/.*'brand'>\([^<]*\).*/\1/p
g
/itemprop='name'/ s/.*'name'>\([^<]*\).*/\1/p

On that one, I'm not sure I follow either... My objective is to integrate these commands within a loop. The first iteration will run the first command and write its output to a file, then the second command (with 'name' for instance) will echo its output to a file, then on to the third iteration, etc...
Wouldn't that work under that setting?

Quote:

Another aspect is greediness. The * wants to match as much as it can. A leftmost * is most greedy.
That means /.*'branch'/ matches the rightmost 'branch'.

I guess that in my case, my problem child would be 'name' that has a first appearance at the beginning of the line.
But in that case, couldn't I use a 2 right before the 'p' (print command)?
I learnt on the web that putting a 1 or a 2 before the p would yield the first or second appearance of the term I'm looking for... wouldn't that work?

All the best!

Last edited by Ardzii; 01-30-2017 at 11:54 AM.. Reason: More discoveries!!


