The UNIX and Linux Forums  
Post #5, 12-23-2007
drl (Forum Advisor, Registered User)
Join Date: Apr 2007
Location: Saint Paul, MN USA / BSD, CentOS, Debian, OS X, Solaris
Posts: 704
Hi.

My favorite quote in this area is:
Quote:
"... premature optimization is the root of all evil." (Knuth, Donald. Structured Programming with go to Statements, ACM Journal Computing Surveys, Vol. 6, No. 4, Dec. 1974, p. 268.)
-- wikipedia article, see below
I have been fortunate enough to work on big iron for much of my professional life:
  • Control Data (CDC): 160, 1604, 6600 and follow-ons; 203, 205, ETA-10
  • Cray Research (CRI): CRAY-1, CRAY-2, CRAY X-MP
  • IBM: 3090 (AIX)
  • Thinking Machines (TMC): CM-2 (& 200), CM-5
You may have done your homework on performance issues, but if not, I suggest you look at -- a quick-and-dirty-off-the-top-of-my-head-list:

An older book that you might be able to find used is:
Quote:
Title: High Performance Computing
Subtitle: RISC Architectures, Optimization & Benchmarks
Author: Charles Severance, Kevin Dowd
Edition: 2
Date: July 2, 1998
Publisher: O'Reilly
ISBN: 156592312X
Pages: 460
Categories: high performance, optimization, programming, software design
Comments: 5 stars (4 reviews, Amazon, 2007.12)
Comments: ( I have 1st edition, 1993 )
Most of the suggestions listed above by posters are appropriate at some time in the optimization process. I have a few principles that I advise folks to think about:
-1) Does this program / process / code absolutely, positively need to be faster?

0) Make it run right before you make it faster.

1) Spend most of your personal time finding the best algorithm. There is a story in Programming Pearls (J. Bentley) comparing an algorithm implemented in compiled Fortran on a Cray-1 with a better algorithm implemented in interpreted Basic on a Radio Shack TRS-80. As you might guess, the Cray-1 crushed the TRS-80 -- at least at small problem sizes. As the size went up, the TRS-80 eventually overtook the mighty Cray-1, and for the largest size listed, the Cray would have taken 95 years, the TRS-80 5.4 hours.

Another story about algorithms has to do with advances in hardware. There are many algorithms that were discarded because they were too slow -- at least on scalar machines. When parallel processing became a reality, some of those really inefficient algorithms turned out to be spectacularly useful on parallel boxes. The CM-2 (200) above had 32,000 processors, but they were bit-slice computers. Most people used the mode where they ganged them by 32s to get a 1,000 processor box -- quite respectable for that time in computing history. If you used the right algorithm applied to the right problem, that machine really cranked out results. (That was a "half-gallon" machine; the "one gallon" had 64K processors.)
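
To put numbers on the Cray-1 versus TRS-80 story, here is a rough cost-model sketch in Python. The two constants are assumptions: they are what I recall Bentley quoting (about 3.0*n^3 nanoseconds for the cubic algorithm on the Cray-1, about 19.5*n milliseconds for the linear one on the TRS-80), and they do reproduce the 95 years and 5.4 hours mentioned above at n = 1,000,000. Treat it as an illustration of growth rates, not a benchmark.

```python
import math

# Assumed cost models (recalled from Programming Pearls, not measured here):
#   Cray-1, cubic algorithm in Fortran:  3.0 * n**3 nanoseconds
#   TRS-80, linear algorithm in Basic:   19,500,000 * n nanoseconds
CRAY_NS_PER_N3 = 3.0
TRS80_NS_PER_N = 19_500_000.0
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def cray_seconds(n):
    """Estimated run time of the cubic algorithm on the fast machine."""
    return CRAY_NS_PER_N3 * n ** 3 * 1e-9


def trs80_seconds(n):
    """Estimated run time of the linear algorithm on the slow machine."""
    return TRS80_NS_PER_N * n * 1e-9


def crossover_n():
    """Problem size at which both estimates are equal: 3*n^3 = 19.5e6*n."""
    return math.sqrt(TRS80_NS_PER_N / CRAY_NS_PER_N3)


if __name__ == "__main__":
    print(f"{'n':>10} {'Cray-1 (s)':>16} {'TRS-80 (s)':>16}")
    for n in (10, 100, 1_000, 10_000, 100_000, 1_000_000):
        print(f"{n:>10} {cray_seconds(n):>16.6g} {trs80_seconds(n):>16.6g}")
    print(f"crossover near n = {crossover_n():.0f}")
    print(f"Cray-1 at n = 1e6: {cray_seconds(10**6) / SECONDS_PER_YEAR:.1f} years")
    print(f"TRS-80 at n = 1e6: {trs80_seconds(10**6) / 3600:.1f} hours")
```

Under these assumed constants the slow machine with the better algorithm pulls ahead somewhere past n = 2,500, which is the whole point: the constant factor buys you a head start, the exponent decides the race.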

2) Profile / instrument your code; obtain measurements to see where it is spending its time, then spend your precious time in those areas. A few years back, I did the opposite of what I had usually done. A client asked me to take a code that previously ran on a Cray and port it to run on a PC. It was far too complex a code to consider an algorithm change (although I suggested that their domain experts look at that). I profiled it and saw that it spent a lot of time doing I/O. The best approach at that point was to allocate as much memory as feasible to a RAMdisk. For the models that I was running, that decreased the real (wall-clock) time by 30% (we might have expected more, but this was all done with filesystem drivers, so the code itself did not need to be modified). If more had needed to be done, a RAID-0 across several disks would have been next.
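
As a minimal sketch of the "measure first" step, here is how it might look with Python's standard cProfile and pstats modules. The workload is entirely made up for illustration (compute, write_results and workload are hypothetical names, not the client's code); the point is only that the report shows which function owns the time before you decide what to tune.

```python
import cProfile
import io
import os
import pstats
import tempfile


def compute(n):
    """Hypothetical CPU-bound step."""
    return sum(i * i for i in range(n))


def write_results(path, n):
    """Hypothetical I/O-bound step: many small, flushed writes."""
    with open(path, "w") as f:
        for i in range(n):
            f.write(f"{i}\n")
            f.flush()


def workload(path):
    """Stand-in for the program being investigated."""
    compute(20_000)
    write_results(path, 2_000)


def profile_workload():
    """Run the workload under the profiler and return the text report."""
    profiler = cProfile.Profile()
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "out.txt")
        profiler.enable()
        workload(path)
        profiler.disable()
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    # Sort by cumulative time so the expensive call chains come first.
    stats.sort_stats("cumulative").print_stats()
    return buf.getvalue()


if __name__ == "__main__":
    print(profile_workload())
```

If the I/O routine dominates the cumulative column, that argues for the kind of fix described above (faster storage, fewer or larger writes) rather than for hand-tuning the arithmetic.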

If you have some money, perhaps all you need is more memory, or a box that has two or more CPUs, an account at a computing service bureau, etc. However, I suggest that you take a step back and consider all your options and possibilities, to avoid the premature optimization trap.

Best wishes ... cheers, drl