Help with exit, grep, temporary files, awk


 
# 1  
Old 08-01-2012

1. The problem statement, all variables and given/known data:
I do not understand how/why the following code is used. Please do not simply refer me to the man pages since I have already reviewed them extensively. Thank you.

exit 2 , exit 3, exit 0
I understand the basics of why the exit command is used, but I still don't understand what the significance/meaning of exit 2, exit 3, and exit 0 are.
The creation (and later deletion) of a temporary file
I understand that creation of the temporary file is used to print the contents of the following commands within that file so the grep command can be used to calculate our echoed results that we want, BUT, isn't there a better way to do it so we don't have to create/delete a file?
ls -log "$1" | awk '{print $1}' | grep -v total > $TF
I understand how the first part is piped to the awk, but I do not understand how the awk command is printing only the first column of information.
As far as the "grep -v total" goes, am I correct in that that command basically says to print to screen only those lines within that first column that do NOT contain the word "total"?
f_count=$(grep -c "-........." $TF)
I do not understand why this will not work. It also doesn't seem like the best way to accomplish counting the number of files within the specified directory.

2. Relevant commands, code, scripts, algorithms:
Code:
#!/bin/sh

# Check correct number of arguments (one)
# "$#" Stores the # of command-line arguments passed to shell program
if [ $# -ne 1 ]
then
  echo "Usage: homework4.sh <directory_name>"
  exit 2
# Check for directory existence
else
  if [ ! -d "$1" ]
  then
    echo "$1: No such directory"
    exit 3
  fi
fi

# Set up a temporary file name
TF=/tmp/$0.tmp

# Prints only the permissions of all files and directories to $TF
ls -log "$1" | awk '{print $1}' | grep -v total > $TF
# Count (-c) the number of lines where the character ("d", "r", etc.) exists
d_count=$(grep -c "d" $TF)
f_count=$(grep -c "-........." $TF)
r_count=$(grep -c "r" $TF)
w_count=$(grep -c "w" $TF)
x_count=$(grep -c "x" $TF)
# Print the results to screen
echo -n "In the directory "; pwd
echo "  Number of directories     : $d_count"
echo "  Number of files           : $f_count"
echo "  Number of readable items  : $r_count"
echo "  Number of writable items  : $w_count"
echo "  Number of executable items: $x_count"

# Remove temporary file
rm $TF
exit


3. The attempts at a solution (include all code and scripts):
The above code has been modified extensively by me and shows the final attempts of my efforts.

4. Complete Name of School (University), City (State), Country, Name of Professor, and Course Number (Link to Course):
The Florida State University, Tallahassee Florida, USA, William Street, COP3353 (ww2.cs.fsu.edu/~street/cop3353/hw/hw04.html)
# 2  
Old 08-01-2012
Quote:
I do not understand how/why the following code is used. Please do not simply refer me to the man pages since I have already reviewed them extensively. Thank you.

exit 2 , exit 3, exit 0
I understand the basics of why the exit command is used, but I still don't understand what the significance/meaning of exit 2, exit 3, and exit 0 are.
The exit code is used to give the calling script or process a simple reply about the result of executing the script. Code zero means it worked. Other codes are chosen by the script writer.
A calling script will find the value of the exit code in the special shell parameter $? .
Code:
# Illustration of using the exit code by storing it in a named variable
# We store it in a variable because the next command will change the value!
./scriptname ; RESULT=$?
if [ ${RESULT} -gt 0 ]
then
       echo "Script failed with exit code: ${RESULT}"
fi
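
To make the meaning of exit 2 and exit 3 more concrete, here is a minimal sketch of a caller that reacts differently to each code. The names check_dir and describe are made up for this illustration; check_dir only mimics the two checks at the top of the homework script and uses return where the script uses exit, so that everything fits into one file.

```shell
# Hypothetical stand-in for the argument checks in homework4.sh.
# Inside a function "return N" plays the role that "exit N" plays in a script.
check_dir() {
    [ $# -eq 1 ] || return 2      # wrong number of arguments
    [ -d "$1" ]  || return 3      # argument is not a directory
    return 0                      # success
}

# Hypothetical caller: branch on the code that came back.
describe() {
    rc=0
    check_dir "$@" || rc=$?       # save $? at once, the next command overwrites it
    case $rc in
        0) echo "ok" ;;
        2) echo "usage error" ;;
        3) echo "no such directory" ;;
        *) echo "unexpected exit code $rc" ;;
    esac
}

describe /                        # prints: ok
describe                          # prints: usage error
describe /no/such/dir/here        # prints: no such directory
```

The numbers 2 and 3 have no built-in meaning here; they are simply what the script author picked, which is why documenting them matters.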


Quote:
The creation (and later deletion) of a temporary file
I understand that creation of the temporary file is used to print the contents of the following commands within that file so the grep command can be used to calculate our echoed results that we want, BUT, isn't there a better way to do it so we don't have to create/delete a file?
ls -log "$1" | awk '{print $1}' | grep -v total > $TF
I understand how the first part is piped to the awk, but I do not understand how the awk command is printing only the first column of information.
As far as the "grep -v total" goes, am I correct in that that command basically says to print to screen only those lines within that first column that do NOT contain the word "total"?
If you want to make multiple enquiries on a data selection, a temporary file is the best and most efficient solution. The awk looks for any white space between fields and outputs the first field ($1). The grep -v excludes the Total line from ls. It can be improved with -iv as some ls commands output in upper and lower case.
Code:
ls -log "$1" | awk '{print $1}' | grep -iv "total" > $TF
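
For a small directory listing, one possible way to avoid the temporary file is to keep the mode strings in a shell variable and feed that to each grep. This is only a sketch: the variable name modes is made up, and the three sample lines stand in for the output of the ls/awk/grep pipeline so that the example runs anywhere.

```shell
# Sample mode strings, standing in for:
#   modes=$(ls -log "$1" | awk '{print $1}' | grep -iv "total")
modes='drwxr-xr-x
-rw-r--r--
-rwxr-xr-x'

# Each count reads the variable instead of a file.
d_count=$(printf '%s\n' "$modes" | grep -c '^d')
f_count=$(printf '%s\n' "$modes" | grep -c '^-')
x_count=$(printf '%s\n' "$modes" | grep -c 'x')

echo "$d_count $f_count $x_count"   # prints: 1 2 2
```

Whether this is better than a temporary file is a judgment call: nothing has to be cleaned up afterwards, but the whole listing is held in memory.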


Code:
f_count=$(grep -c "-........." $TF)
I do not understand why this will not work, however, it also doesn't seem like the best way to accomplish counting the number of files within the specified directory.

Sorry, I don't understand this command. Other posters will!
I'd have done it with:
Code:
f_count=$(grep -c \^\- $TF)


Last edited by methyl; 08-01-2012 at 06:53 PM.. Reason: formatting fun; correct ls -1 line
# 3  
Old 08-01-2012
Quote:
The exit code is used to give the calling script or process a simple reply about the result of executing the script. Code zero means it worked. Other codes are chosen by the script writer.
A calling script will find the value of the exit code in the special shell parameter $?
So would it be better practice to end my script with exit 0 instead of just exit?

Quote:
The awk looks for any white space between fields and outputs the first field ($1).
So the $1 in the first part of that line of code ls -log "$1" is referring to the parameter of the script (in this case, a directory name), but the $1 in the second part of that line of code awk '{print $1}' is referring to the first field (in this case, the permissions), right?

Quote:
Sorry, I don't understand this command.
By using f_count=$(grep -c "-........." $TF), I was trying to say count the number of lines that start with a - (which would signify a file, as opposed to a d, for directory, of course) and are followed by any variation of rwx permissions (this is why I tried using the "dot" wildcard . nine times). Should I have instead written f_count=$(grep -c "-.*" $TF)? My thinking is still probably wrong...

Quote:
f_count=$(grep -c \^\- $TF)
Could you please explain what the \^\- does?
# 4  
Old 08-02-2012
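Quote:
Could you please explain what the \^\- does?
It matches every line that starts with - (a hyphen). The backslashes only protect the characters from the shell; grep itself receives the pattern ^-.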

Code:
 
$ cat d.txt
-ab
bc
-ef
gh
ij
-kl
$ grep -c \^\- d.txt
3

$ grep \^\- d.txt
-ab
-ef
-kl
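
As a small addition to the session above, this sketch compares the backslash form with a single-quoted pattern, and also shows what to do when the pattern itself begins with a hyphen. The sample text is made up; -e is the standard grep option that marks the next argument as the pattern.

```shell
sample='-ab
bc
-ef'

# The shell removes the backslashes, so grep receives ^- in both cases.
a=$(printf '%s\n' "$sample" | grep -c \^\-)
b=$(printf '%s\n' "$sample" | grep -c '^-')

# A pattern that starts with "-" would be parsed as an option;
# -e tells grep that the next argument is the pattern.
c=$(printf '%s\n' "$sample" | grep -c -e '-.')

echo "$a $b $c"   # prints: 2 2 2
```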

# 5  
Old 08-04-2012
Quote:
Originally Posted by BartleDoo
So would it be better practice to end my script with exit 0 instead of just exit?
In one word: yes! Seen from the outside a script is just a command. It uses other commands (system commands and other scripts) and might be used by other commands as well.

If you have a look in the man pages you will notice that almost every command gives a return code. Not giving back a meaningful return code deprives the calling process of the means to determine whether the called process succeeded.

For instance: "ls" usually returns "0", regardless of what was found in the directory. But it will give back a non-zero code ("2" with GNU ls, as shown here) if the directory doesn't exist. Even if you redirect all output to "/dev/null" you can still use this return code:

Code:
$ ls > /dev/null 2> /dev/null ; echo $?
0
$ ls /some/where/does/not/exist > /dev/null 2> /dev/null ; echo $?
2

You could now base some program logic on this:

Code:
#! /bin/sh
DIR=""
RESULT=0

echo "Enter a directory name: "
read DIR

ls "$DIR" >/dev/null 2>/dev/null
RESULT=$?

if [ $RESULT -eq 0 ] ; then
     echo "Directory $DIR exists"
elif [ $RESULT -eq 2 ] ; then
     echo "Directory $DIR does not exist."
else
     echo "Some unknown error occurred."
fi

exit 0

This is done quite often in production code. For instance, suppose a script copies a file. Using the return code of "cp" the script could find out whether everything went well (RC=0) or whether the copy failed, and where a command documents distinct codes it could even tell apart causes such as not enough disk space or a destination that is not writable for the process. By analysing the RC, the script's author can base a meaningful error message on it instead of just saying "failed" and aborting.
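
A rough sketch of that idea follows. The function name copy_or_report is made up for this example, and the paths are chosen so that the copy is certain to fail. Note that cp is only guaranteed to return zero on success and something greater than zero on failure, so the sketch reports the code without trying to interpret it.

```shell
# Hypothetical helper: copy a file and pass cp's exit code on to the caller.
copy_or_report() {
    if cp "$1" "$2" 2>/dev/null; then
        echo "copied"
    else
        rc=$?                     # in the else branch $? is still cp's exit code
        echo "cp failed with exit code $rc"
        return "$rc"
    fi
}

# The source does not exist, so this takes the failure branch.
copy_or_report /no/such/source/file /no/such/dest/file || echo "caller saw a failure"
```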

Now, what you use from other programs you should also provide because there might be a script calling your script which might want to base its own decisions on what you report back to it.

Btw. "exit" by default means "exit 0", so you don't have to write that. But you should make your exit codes to be meaningful and documented, for instance:

0: success (always! - that is standard)
1: file not found
2: no space left on device
3: ....

Quote:
So the $1 in the first part of that line of code ls -log "$1" is referring to the parameter of the script (in this case, a directory name), but the $1 in the second part of that line of code awk '{print $1}' is referring to the first field (in this case, the permissions), right?
Yes. "$1" in script language (that is: inside the script's program text) is not the same as in awk language. These are different programming languages and therefore the same text doesn't have to have the same meaning. It is like, say, Javascript inside an HTML page: the same text might have different meanings inside the JS part and outside, in the HTML part.

In a script "$1" is the first positional parameter. If you call a script like this:

Code:
$ script.sh first second third

then inside the script the first parameter ("first") would automatically be available as "$1", the second ("second") as "$2", etc.

In awk this is completely different. awk is a language to work through text files and manipulate them. Basically a text file is constructed of lines and awk is presented one line after the other. So, an awk program gets the first line, works through the program until its end, then starts over with the next line, and so forth.

Every "line" in a text file is interpreted as a series of fields, separated by field separators (by default the separators are runs of blanks, so the "fields" are "words").

When an awk program starts with a new line some variables are automatically filled with values: "$0" is the content of the whole line, "$1" the content of the first field, "$2" that of the second field, etc. Your awk program has only one command: "print $1", which simply means "print the content of the first field" or, in other words, "everything from the beginning of the line to the first field separator", which is still the default "blank". Therefore the first word is printed to output.
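
The two meanings of "$1" can be put side by side in one line. This is a small sketch with a made-up function name, first_field; the sample line is in the style of an ls -log listing.

```shell
first_field() {
    # "$1" in double quotes is expanded by the shell: the function's first argument.
    # The $1 inside single quotes is left alone by the shell and read by awk,
    # where it means "first field of the current line".
    printf '%s\n' "$1" | awk '{print $1}'
}

first_field '-rw-r--r--   1     2 Jun 25 15:26 some.file'   # prints: -rw-r--r--
```

The single quotes around the awk program are what keep the shell from replacing awk's $1 with its own.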

Quote:
By using f_count=$(grep -c "-........." $TF), I was trying to say count the number of lines that start with a "-".[...] Should I have instead written f_count=$(grep -c "-.*" $TF)
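No, your thinking is alright. Both regexps describe what you want to match. (The reason the command fails as written is a different one: the pattern begins with "-", so grep takes it for an option. Putting -e in front of the pattern tells grep that the next argument is the pattern.) Why methyl's regexp is still better is not easy to see:

You want to match the part with the file mode at the beginning of the line. Your regexp does match this. But you should ask yourself not only whether it catches everything you intend to catch, but also whether it could catch things you didn't want to catch: false positives.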

You search for a "-" followed by nine characters - any characters. That means that a string like "-abcdefghi" would match this definition too. Look at the following listing and ask yourself if your regexp would get confused or not:

Code:
# ls -log
total 132
drwxr-xr-x. 11  4096 Aug  1 17:34 my-downloads
-rw-r--r--   1     2 Jun 25 15:26 some.file

methyl's regexp avoids this because he includes the beginning of the line in the pattern, which prevents "-downloads" from being matched as it is by your regexp. (Notice that "^" at the beginning of a regexp means "beginning of line". "a" matches any "a", "^a" only an "a" as the first character of a line.)

A second thing is: you should not search for "any character", because the file modes can only be "r", "w" and "x" or a "-" respectively (I intentionally leave out sticky bits to keep this simple). So your regexp would be a bit more robust if you narrowed it down this way:

Code:
# -e keeps grep from reading the leading "-" of the pattern as an option
f_count=$(grep -c -e "-[r-][w-][x-][r-][w-][x-][r-][w-][x-]" "$TF")

Now it is not "a dash followed by any nine characters", but "a dash, followed by either a 'r' or a '-', followed by either a 'w' or a '-', followed by ....".

But even this tighter match might be fooled, whereas methyl's regexp is foolproof because of its "anchoring" at the line start.
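
To see the three variants next to each other, here is a sketch that runs them over the sample listing from above. The variable names are made up. In the homework script the file only holds the mode column, so the false positive could not actually occur there; the full listing is used here to show the general point.

```shell
listing='total 132
drwxr-xr-x. 11  4096 Aug  1 17:34 my-downloads
-rw-r--r--   1     2 Jun 25 15:26 some.file'

# "a dash followed by any nine characters": also hits the directory line
loose=$(printf '%s\n' "$listing" | grep -c -e '-.........')

# "a dash followed by nine mode characters": only the regular file
tight=$(printf '%s\n' "$listing" | grep -c -e '-[r-][w-][x-][r-][w-][x-][r-][w-][x-]')

# "a dash at the start of the line": only the regular file
anchored=$(printf '%s\n' "$listing" | grep -c '^-')

echo "$loose $tight $anchored"   # prints: 2 1 1
```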

And there is yet another point to it: when will the grep program be able to determine whether the line is a match or not? Let's look at your case: it will have to look at every character in a line save for the last n characters, where "n" is 10 - the length of your regexp - in the best case (if it hasn't found a match so far, nothing beyond the tenth-to-last character could provide a full match any more).

In methyl's case it has to check only one character: the first in the line. It doesn't have to look beyond that to determine whether the line is a match or not. This is why his regexp will be a lot faster than yours. You will not notice this if you check 10 lines, but if you have thousands of lines to check it will make a difference.

I hope this helps.

bakunin
# 6  
Old 08-04-2012
Quote:
Originally Posted by bakunin
... ... ...
Btw. "exit" by default means "exit 0", so you don't have to write that. But you should make your exit codes to be meaningful and documented, for instance:

0: success (always! - that is standard)
1: file not found
2: no space left on device
3: ....

... ... ...

bakunin
Actually, in any POSIX/UNIX conforming shell,
Code:
exit 0

is not always the same as
Code:
exit

When exit is called with no operands, the rules are:
Quote:
The value shall be the exit value of the last command executed,
or zero if no command was executed. When exit is executed in a
trap action, the last command is considered to be the command
that executed immediately preceding the trap action.
Thus, if the success or failure of a shell script is determined by the last command the script executes, calling exit without an exit status returns the exit status of that command. So, the code sequences:
Code:
awk '<some awk program>'
exit

Code:
awk '<some awk program>'
exit $?

Code:
awk '<some awk program>'
status=$?
if [ "$status" -ne 0 ]
then
    printf '%s: awk failed\nTerminating unsuccessfully at %s\n' "$0" "$(date)"
    exit "$status"
fi
printf '%s: Terminating successfully at %s\n' "$0" "$(date)"
exit "$status"

in a script all provide the same exit status to the invoking shell.
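
The difference between a bare exit and exit 0 can be checked directly. In this sketch, (exit 7) is just a convenient way to produce a command that fails with a known status; the variable names are made up.

```shell
# Bare exit: the status of the last command (7) is passed on.
rc_bare=0
sh -c '(exit 7); exit' || rc_bare=$?

# exit 0: the earlier failure is hidden from the caller.
rc_zero=0
sh -c '(exit 7); exit 0' || rc_zero=$?

echo "$rc_bare $rc_zero"   # prints: 7 0
```

For the homework script this means that the final exit reports whatever rm returned.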

---

This has been edited, I originally screwed up the test for success.

Last edited by Don Cragun; 08-04-2012 at 02:23 PM..