Time command issuing all zeroes (is now considered homework help)
Use and complete the template provided. The entire template must be completed. If you don't, your post may be deleted!
1. The problem statement, all variables and given/known data:
A common problem arising in games and simulations is to generate a random arrangements of integers from 1 to N. For example, if you were programming a card game, you might represent the deck of cards as an array of N cards. Then, if you could generate a random permutation of the integers 1 through N, you could use that random list as the order in which to draw cards from the "shuffled" deck.
There is a very good algorithm for generating random permutations:
// Generate a random permutation of the integers from
// 0 .. n-1, storing the results in array a.
//
void permute (int a[], int n)
{
// Fill the array
for (int i = 0; i < n; i++)
{
a[i] = i;
}
for (int i = 0; i < n; i++)
{
k = rnd(n); // random int from 0 .. n-1
swap(a[i], a[k]);
}
}
A little thought will show that the above algorithm is O(n), and we could hardly hope for anything faster, since it takes O(n) effort just to fill up the array.
Many people, however, find the above algorithm counter-inutitive. It may appear that elements near the end of the array appear to be likely to be swapped more often, so some people wonder if this algorithm is "uniformly" random - does each integer have an equal probability of winding up in any slot of the array?
In fact the above algorithm is uniformly random, but it takes a careful look at the underlying mathematics to prove this is true. So many programmers prefer to attack this problem by using other algorithms that are more obviously uniformly random.
In this assignment, you will analyze one such alternative algorithm for generating random permutations. You will then check your analysis by timing the algorithm on a variety of input sets to experimentally determine the average-case behavior. You can then check to see if your theoretical analysis is borne out by the actual timings.
The Algorithm
The algorithm in permute.cpp generates permutations by generating a random integer for each position in the array, checking to be sure that the generated number has not already been used. It does this check by keeping an array of boolean flags, used, such that used[i] is set to true if the value i has already been put somewhere in the array.
The Assignment
Analysis
Use the copy-and-paste method to give an average-case analysis for this algorithm. Show all work!
Experiment
Check your analysis by running the algorithm on a variety of input sizes an measuring the time it takes.
The program pdriver.cpp can be used as a ``driver'' to execute the permute algorithm. This program expects a pair of command line arguments when run. The first argument is the number of items to permute. The second is the number of times you want to repeat the permutation process.
For example, if you compile this program to produce an executable named "p2driver", then
pdriver 50 10
will generate 10 permutations of 50 elements each.
Because the purpose of this exercise is to generate timing data raher than to actually work with the algorithm, the generated permutations are not printed. Feel free to add output statements if you want to see the algorithms in action, but remember to remove those statements before proceeding on to the timing steps.
Compile the program.
Run the program for different sizes of N and time the algorithm.
On Unix systems, you can measure the run time of the programs (or of any Unix program/command) by placing the command "time" in front of the program invocation. For example,
time pdriver 50 10
will determine the time required to generate 10 permutations of 50 elements each.
The time will appear in a format similar to this:
3.36u 0.07s 0:05.43 63.2%
The first two numbers are of the most interest to us. Their sum is the number of seconds the CPU devoted to execution of this program. This is often much less than the "clock time", because the CPU is usually being shared by several different programs.
The "u" stands for "user" and the "s" for "system". The first number is the amount of time spent in "user" code, the second is the amount of time spent in "system" calls. Together, these consititute the "CPU time" used by the command. The third number is actually the clock time, which you can see in this example is much longer than the CPU time. The fourth is the percentage of time that the CPU was working on your code as opposed to someone else's.
Another important factor to keep in mind is that we are looking for average times. If we ran the algorithm only a single time for a particular value of N, depending upon how the random numbers came out, we might get an unusually fast or an unusually slow run. We need to do multiple repetitions for each value of N so that we can get the average time of a single run of size N. That's why the main driver function of this program is designed to allow you to request multiple repeitions of the algorithm for a fixed N.
Another reason for doing averages over many repetitions is to reduce the effect of measurement errors - always a possibility in any experiment. Even computer clocks have a finite resolution - for small values of N this algorithm might run in a fraction of one clock "tick". Another source of mearurement error in this kind of experiment is the fact that we can't run just the permute algorithm in isolation, we have to run a whole program that launches the permute algorithm, so some of our measured time will be due to code other than the algorithm we are interested in.
Timings of less than 10 seconds are likely to be dominated by the overhead of starting and stopping the program, so you should adjust the repetition count (the second argument of the permute program) to make sure that all your timed runs take that many seconds, at the least. A minimum of 60 seconds is probably a good target.
In a non-Unix environment, getting the timing information is harder. You can, of course, time the program with a good old-fashioned stopwatch, but you'll need to take special care to be sure that your own physical reaction time doesn't affect the results.
Another possibility is to create a batch file like this one:
date
./pdriver 50 10
date
to print the clock time just before and after running the program.
One problem with both this and the stopwatch technique is that these measure ``clock time''. As noted above, this is generally less trustworthy than CPU time.
Evaluation
Present your data in a table like the following one (prepared for a different algorithm):
Image is attached
The first columns show the problem size, the number of repetitions, and the total time observed. The fourth column divides the total observed time by the number of repetitions to get the average time T to generate a single permutation of N integers.
The remaining columns come in pairs. Each pair shows a possible big-O f(N) function. In this example, I used N2, N2 log N, and N3, respectively. These have been chosen so that the middle possibility is the one predicted by the prior big-O analysis, and the other two are the next lower and the next higher power of N.
For each pair, the first column in the pair shows the value of that f(N), for the values of N in each row. The second column in each pair divides f(N) by T (the average time to generate one permutation). In ideal circumstances, if an algorithm has average time O(f(N)), then f(N)/T should be constant across all values of N.
Of course, we almost never get ideal circumstances. We know that big-O results are approximations that omit lower-order terms, and physical experiments always introduce some variation of their own. But in this case, we can see that
N3/T gets bigger as N increases, suggesting that the function N3 is too big to describe this timing data.
N2/T gets smaller as N increases, suggesting that the function N2 is too small to describe this timing data.
The predicted function is N2 log N, and (N2 log N)/T does get larger as N increases, but the change is much smaller than in the other two functions, especially for the larger values of N. So it appears to be the best candidate function.
Of course, in your own table you may use different f(N) functions (depending upon the results of your prior analysis). Try to follow the guideline of having the function predicted by your analysis, in between a slightly smaller and a slightly larger function.
You will also probably need to use different values of N and R. In order to detect an overall trend, make sure that your values of N range over many orders of magnitude. If your values of N are clustered too tightly together, you may not be able to see the overall trend for the algorithm. Also keep in mind that big-O is always "for sufficiently large N". If many of your N values are small, then lower-order terms (the ones we drop because they are dominated by thalrgest term) may still be playing a substantial role in the timing results.
2. Relevant commands, code, scripts, algorithms:
3. The attempts at a solution (include all code and scripts):
time pdriver arg1 arg2
4. Old Dominion University, Norfolk, VA, USA, Zeil, CS 361
Note: Without school/professor/course information, you will be banned if you post here! You must complete the entire template (not just parts of it).
I already posted about this HERE because it was more of a general question but since I was asked to post code I decided that it should be a homework question. I basically cannot get my time function to execute properly because I am getting 0's for run time. Can anyone assist me please? I'm executing the program in a unix terminal.
They didn't literally mean you to give it the arguments 'arg1' and 'arg2'. You're supposed to give it numbers -- they determine the data size and number of loops. Giving it the strings 'arg1' and 'arg2' cause them both to be zero, which makes it do no work -- zero loops over zero elements.
Last edited by Corona688; 07-18-2013 at 12:47 PM..
It is running now. I used my installed copy of ubuntu to use the time command. However I get 3 different lines of output:
I do believe my assignment instructions say to sum the first two numbers but my output is different than the example in the first post. I should sum the user and sys values. So what is the number in the "real" field for? Should I ignore it for this assignment?
Last edited by Scott; 07-18-2013 at 09:07 PM..
Reason: Code tags
The number following "real" is the time that elapsed from when the command started running until it finished (this is sometimes referred to as wall clock time); the number following "user" is the time that was attributed to your user code while the command was running; and the number following "sys" is the time that was attributed to OS (kernel) code while the command was running. On a multi-processor system, user time + sys time could be larger than real time if multiple cores were running different parts of your code at the same time. Real time could easily be much more than the sum of user time + sys time if your code was waiting for I/O or was delayed while other code was being run on your system by other processes.
PS Note that when the shell you're using prints time output in the format:
the 1st number (ending with 'u') is user time, the 2nd number (ending with 's') is sys time, and the 3rd part is wall clock time. The percentage at the end is how much of the system the command was using during the life of the command: ((user + sys) / real) * 100.
Last edited by Don Cragun; 07-18-2013 at 09:58 PM..
Reason: Add PS comparing different time output formats.
Hi,
Using ksh, I have set up aliases (which work fine) in my user's .profile like this:
alias ll = 'ls -lrt'
alias cls = 'clear'
How do I call these same aliases so that they will work after issuing this: ? su - user
I do not want to add aliases to the su shell's .profile or .kshrc. (7 Replies)
Greetings all,
On a RedHat System - I am issuing a command from script.sh that will add a file extension to a listing of files in a directory. It works, but I need to script from having an extension added as well.
Here is what I have tried to no luck:
for file in `ls * | awk ' /\./{print... (6 Replies)
I am trying to issue the time command on a program so I can see execution times but it is returning all zeroes. Like this:
time pdriver arg1 arg2
0.000u 0.000s 0:00.00 0.0% 0+0k 0+0io 0pf+0w
"0+0k 0+0io 0pf+0w" --> The "0+0io" may change sometimes to a different number.
How can I run the... (2 Replies)
Ive a problem that I'm reaching out for help.
Ive written (With bits and pieces) of script that is not running as expected or is having an issue causing processes to spiral out of control.
The script does this:
Unloads a UV database server
Tars up a few folders
Transfers the file to... (11 Replies)
Hi friends,
I hope everyone is doing well and fine. I have always been hearing that C/C++ are relatively low-level as compared to Java/C# etc. Could you please tell me some low-level qualities of C/C++? And I think disk deframenters are written in C/C++, please correct me if I am wrong. And please... (5 Replies)
How can I run a here document on just one line? I ask, because I need to issue it from C++ as a system() or similar command and for security reasons I don't want to write out a shell script file from the program and run it.
For example, how could I write:
passwd test <<EOF
n3wp3ss... (3 Replies)
I have noticed a few posts asking questions about c shell scripting these past few days. This a good read for those that currently or are thinking about writing a csh script:
Csh Programming Considered Harmful (9 Replies)
Our Aix Unix has one issue.
If I type xedit &
after 30 minutes, xedit auto shut down.
If I type xedit
xedit will run forward.
I feel this & doesn't perform as it should be.
When I use &, system consider this process as idle.
How to fix this issue?
Does this mean unix env... (12 Replies)
I wondered if someone could point out the differences between the time commmand and usr/bin/time and the accuracy one might have over another.
Also, is there a website or two a person could maybe link for me to describe the differences?
Thank you for your time. (2 Replies)