Noob trying to improve


 
# 22  
Old 01-13-2017
local means: the variable exists only within the current scope.
In Bash: within the current function.
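A minimal sketch of that scoping rule, assuming Bash. The function name show_scope and the variable name counter are made up for illustration:

```shell
counter=1                       # global variable

show_scope() {                  # hypothetical function, for illustration only
    local counter=99            # this copy exists only while show_scope runs
    echo "inside:  $counter"
}

show_scope                      # prints "inside:  99"
echo "outside: $counter"        # prints "outside: 1"; the global was not touched
```

Without the local keyword, the assignment inside the function would overwrite the global counter.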
Bash does not require variables to be declared, and it is loosely typed.
For example you can add a 1 to a string "5" and get "51" or 6, depending on the operator.
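A small sketch of the "51 or 6" case, assuming Bash. The variable names are chosen here for illustration:

```shell
n="5"                        # stored as the string "5"
as_text="${n}1"              # string concatenation
as_number=$(( n + 1 ))       # arithmetic expansion treats n as a number
echo "$as_text"              # prints 51
echo "$as_number"            # prints 6
```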
With declare -i you can limit the misuse of variables. For example
Code:
i="a"; [ $i -lt 5 ] && echo "$i is less than 5"

gives an error ("integer expression expected"). With declare -i i, the "a" is evaluated as an arithmetic expression and becomes 0, so there is no error (but the following code might still malfunction).
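The two behaviours can be compared side by side. This is only a sketch, assuming Bash and assuming no variable named a is set; the names plain_result and typed_result are made up for illustration:

```shell
unset a                      # make sure no variable named a interferes

i="a"                        # plain (untyped) variable
if [ "$i" -lt 5 ] 2>/dev/null; then
    plain_result="less than 5"
else
    plain_result="test failed"   # [ rejects "a": integer expression expected
fi

declare -i j                 # j now only holds integers
j="a"                        # evaluated arithmetically; the unset variable a counts as 0
if [ "$j" -lt 5 ]; then
    typed_result="$j is less than 5"
fi

echo "$plain_result"         # prints: test failed
echo "$typed_result"         # prints: 0 is less than 5
```

The second result shows the risk mentioned above: the test passes silently even though "a" was never a sensible number.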
declare makes most sense for special variables like arrays.
# 23  
Old 01-16-2017
Hey again guys!

Thanks so much for your time and the detailed explanations! Great material right there (especially for a beginner!).

@Bakunin: Your comments were heard! Hahahaha! The next version of the script I post will, hopefully, be well organized (if not, correct me!).

@MadeInGermany: The difference between the 2 is clear! Thanks a lot for that.

I'd like to come back to an earlier comment from Bakunin though:
Quote:
As a rule of thumb: grep/sed/awk | grep/sed/awk is always wrong because it can be done in the respective tool chosen.
I've been working exclusively with grep lately, trying to find the right syntax to extract the info directly, but I can't seem to find a way (other than using sed) to extract a specific portion of a line.
Take as an example, your correction of the link extraction:
Quote:
You can do all in sed without an additional egrep:


Code:
link=$( curl "your-link-here" |
        sed -n '/href.*view more/ s/.*href="\([^"]*\).*/\1/p'
      )
Could you have done that with grep instead of sed? I'm trying to determine the limitations of each of these 3 commands (don't worry, you don't have to explain that to me here, I'll figure it out on my own!).

Thanks as usual!

Best!
# 24  
Old 01-16-2017
I don't think so, generally. grep is neither intended nor designed to replace or remove patterns or partial strings. Be aware that sed's pattern matching is as powerful as grep's.
In your special case, a (deprecated) pipe of three greps could do the job:
Code:
echo "<a href=\"/listing/bone-densitometer/osteosys/dexxum-t/2299556\"> view more </a>" | grep '.*href="[^"]*"> view more.*$' | grep -o '"[^"]*"' | grep -o '[^"]*'
/listing/bone-densitometer/osteosys/dexxum-t/2299556

# 25  
Old 01-16-2017
Hey RudiC!

Nice to read you again! I hope you're doing great.
OK then, gotcha! Obviously, I'm not so much trying to force the use of grep rather than understand the limitations between the 3 commands (grep | sed | awk).

I'm starting to understand how sed works; however, I'm still looking at baby versions of what you guys can do!
It might seem trivial to you but it's kind of a "bitchy" command. The syntax is not (in my opinion) easy to get at all... I'm actually struggling with this one a little.

Let me see what I can get and do from there, and I'll come back to you soon with an improved version of this damn script!

Thanks again!
# 26  
Old 01-16-2017
grep stands for 'g/re/p', a command of the ed editor [where g is Global, re is RegularExpression and p is Print]
# 27  
Old 01-16-2017
Quote:
Originally Posted by Ardzii
rather than understand the limitations between the 3 commands (grep | sed | awk).
As vgersh99 already noted, "grep" comes from "g/re/p", which is a (schematised) ed command; sed inherited much of the same command syntax. Let us see if i can help to improve your understanding:

grep is basically a line filter: you feed it a stream of input (or files) and it displays all the lines matching a certain pattern. Options can, e.g., reverse this matching (so effectively all lines NOT matching the pattern are displayed) or count the matching lines ("-c"), but basically that is it: filtering lines that contain some text pattern out of text.

grep is a good tool to find out about the existence of certain text and all that is related to this:

grep -c "pattern" /some/file - count the number of lines containing "pattern"
grep -v "pattern" /some/file - display all lines NOT containing "pattern"
if grep -q "pattern" /some/file ; then - grep -q makes grep not display any lines, matched or otherwise. But since grep will exit with "0" if it has found anything and with "1" if it hasn't this will execute the following code if "pattern" was anywhere in the file.
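The three uses listed above can be tried without a real file. This is a sketch only: the sample text and variable names are invented for illustration, and the input is piped in instead of being read from /some/file:

```shell
# Three made-up lines; two of them contain the word "pattern".
sample_text=$(printf '%s\n' 'first pattern line' 'nothing here' 'second pattern line')

match_count=$(printf '%s\n' "$sample_text" | grep -c "pattern")   # counts matching lines
other_lines=$(printf '%s\n' "$sample_text" | grep -v "pattern")   # lines NOT matching

# grep -q prints nothing; only its exit status is used by the if.
if printf '%s\n' "$sample_text" | grep -q "pattern"; then
    found="yes"
else
    found="no"
fi

echo "$match_count"   # prints 2
echo "$other_lines"   # prints: nothing here
echo "$found"         # prints yes
```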


Now sed: sed is for "stream editor" and this is exactly what it is: a highly programmable text editor. You feed it some input text (from a file or a stream of data) but instead of just searching it (like grep) you can also manipulate and change it. If you want to filter out certain lines AND at the same time change the text found by some rules (like cutting out a certain part of the line, but also more complicated things) sed is the tool to turn to.
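As a sketch of "filter AND change in one go", here is the link extraction from earlier in this thread applied to two sample lines. The /about line is made up to show that non-matching lines are dropped:

```shell
# Two sample lines; only the second contains "view more".
page=$(printf '%s\n' \
    '<a href="/about"> about us </a>' \
    '<a href="/listing/bone-densitometer/osteosys/dexxum-t/2299556"> view more </a>')

# /href.*view more/ selects the line; s/.../\1/p keeps only the href value and prints it.
link=$(printf '%s\n' "$page" |
    sed -n '/href.*view more/ s/.*href="\([^"]*\).*/\1/p')

echo "$link"   # prints /listing/bone-densitometer/osteosys/dexxum-t/2299556
```

The -n option suppresses the default output, so only lines on which the substitution succeeded are printed.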

I am halfway through mustering the energy to write a sed-introduction, so watch out for "the most incomplete introduction to sed" (i only write "most incomplete" articles). To write it in a few sentences is just too complicated, i am sorry. What makes sed similar to grep is that it uses the same "regular expressions" to describe text patterns. So once you learn how to use grep and its powerful pattern-matching engine you can use this knowledge in sed too.

Lastly awk: awk stands for "Aho, Weinberger, Kernighan", its three primary authors. It is a regular programming language with some resemblance to C, and its regular expression engine is very similar to sed's (the sed variant is called UNIX BRE - basic regular expressions, the awk variant UNIX ERE - extended regular expressions). Again, its scope overlaps with sed's, and for many problems here you will find a sed-solution along with an awk-solution.
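To make the BRE/ERE difference concrete, here is a small sketch that performs the same replacement with both tools. The input string and variable names are invented for illustration:

```shell
input="ababab-c"

# BRE (sed): grouping is written \( \); "one or more" is spelled as a group followed by group*
bre_result=$(printf '%s\n' "$input" | sed 's/\(ab\)\(ab\)*/X/')

# ERE (awk): grouping is plain ( ) and + means "one or more"
ere_result=$(printf '%s\n' "$input" | awk '{ sub(/(ab)+/, "X"); print }')

echo "$bre_result"   # prints X-c
echo "$ere_result"   # prints X-c
```

The patterns describe the same text; only the spelling of the operators differs.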

awk has a built-in structure for the evaluation of data files: each awk-program consists of three parts: one that is executed before any input is read, one that is executed for every line of input, and one that is executed after all input is processed. If you want to set up the program and draw some header, then process the input line by line, and finally do some end-processing like drawing footers, sums and the like, this is ideal.
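The three parts described above map onto awk's BEGIN block, its main block, and its END block. A minimal sketch, with made-up input data and a made-up variable name report:

```shell
report=$(printf '%s\n' 'apples 3' 'pears 5' 'plums 4' |
awk '
    BEGIN { print "item qty" }            # runs once, before any input is read
          { print $1, $2; total += $2 }   # runs for every line of input
    END   { print "total", total }        # runs once, after all input is processed
')

echo "$report"
# prints:
# item qty
# apples 3
# pears 5
# plums 4
# total 12
```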

OK, this is a quick and very incomplete overview of what the tools do. You will find that all three of them have their purpose. If you picture the UNIX toolbox as an orchestra waiting for a gifted conductor to make them sound great (you), especially sed and awk are the exceptionally gifted soloists. You will find that they have their quirks (as all the great artists do) but work with them can be immensely rewarding and five minutes of truly guiding them to their limit will compensate for the weeks of hard work in rehearsal.

I hope this helps.

bakunin
# 28  
Old 01-26-2017
Hey Bakunin!

Very nice intro to the tools! Thanks!
In the end, by reading you guys, I determined that grep (even though it's still a cool tool) won't help me much with what I'm looking for.
Since there are quite a few examples and tutorials on using sed on the web, I started with that.
I now understand very basic concepts of the tool, such as the "-n" and "-i"/"-i.bak" options or the p/s/d commands.
From that (very) basic understanding I tried to get a better grasp of the cool use you made of it earlier:

Code:
sed -n '/href.*view more/ s/.*href="\([^"]*\).*/\1/p'

What I got from your command is that you first specify the line (since sed works with lines exclusively, right?) to which you will be applying the substitution:

Code:
'/href.*view more/

And you then substitute the portion
Code:
*href="\

and
Code:
"

at the end with nothing (I'm not sure where that "nothing" is in the code you're substituting the text with).
Then through the print command you keep the only portion that matters:
Code:
([^"]

with "^" meaning everything from the beginning to the
Code:
"

I'm guessing that the escape characters you use are meant to specify that the
Code:
"

is part of the expression to be considered, for instance in:
Code:
*href="\

.
If so why does the end of the expression
Code:
[^"]

have no escape character?

Finally the -n and the p-command are used to exclusively print the portion that has been edited.

I tried to "re-use" your command based on my understanding but it doesn't print anything...
I actually get a ">" on the next line, as if one of my
Code:
"

wasn't closed...

The txt I'm working on is a curl from the same website:
PHP Code:
<div id="category_listing" itemscope itemtype="http://data-vocabulary.org/Product">

        <div id="category_bg">
        <div class="title">
            <h1 itemprop='name'>For Sale <span itemprop='brand'>HITACHI </span> <span itemprop='name'>AIRIS 1  Magnet</span></h1></div>
            <meta itemprop="category" content="Business &amp; Industrial>Medical Medical Equipment" />
        <!-- end div title -->
                <div class="listing_num">LISTING #2229540</div>

</div
        <div style='border-bottom: dotted 1px #666' class="clr"></div>
        <div id="category_listing_body">

<div id="list_detail"
and the command goes like:

Code:
sed -n '/*itemprop='\brand'\*span>/ s/.*brand'\([^<]*\).*/\1/p' result2.tx

The objective for now, would be to extract from the text above:
- the brand
- the name

Thanks as usual for your enlightenments!

Best!
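One possible direction for the extraction described above, as a sketch only. The ">" prompt is what the shell shows when a quote is left open: the single quotes inside the sed script end the shell's quoting early, and the command as posted contains an odd number of them. The version below wraps the sed script in double quotes instead. It assumes the brand and name sit on one line, as in the sample, and it reads that line from a variable (a choice made here to keep the sketch self-contained) rather than from the result file:

```shell
# One line from the sample page above, stored in a variable for illustration.
line="<h1 itemprop='name'>For Sale <span itemprop='brand'>HITACHI </span> <span itemprop='name'>AIRIS 1  Magnet</span></h1></div>"

# Double quotes around the sed script let the single quotes from the HTML appear literally.
# \([^<]*\) captures everything up to the next "<"; \1 puts it back as the whole line.
brand=$(printf '%s\n' "$line" | sed -n "s/.*itemprop='brand'>\([^<]*\)<.*/\1/p")

# The leading .* is greedy, so the LAST itemprop='name' on the line is the one captured.
name=$(printf '%s\n' "$line" | sed -n "s/.*itemprop='name'>\([^<]*\)<.*/\1/p")

echo "brand: [$brand]"   # prints brand: [HITACHI ]  (the trailing space is in the page)
echo "name:  [$name]"    # prints name:  [AIRIS 1  Magnet]
```

To run it against the saved page, the printf part could be dropped and the file name given as the last argument to sed.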