12-19-2007
Using other computers for processing


I've written a C++ program which does some mathematical calculations, but the problem is that it takes way too long to finish on any computer.

Is there any way to make more than one computer do the processing so it finishes faster?
12-19-2007

Option a. Run it on a faster computer.

After that the options get a bit harder....

Can you split the problem up so that different computers can solve different parts of the problem independently?

Can you split it up so that parts can be done in parallel?

Have you got a really crap algorithm that may be mathematically correct but is really inefficient?

Can you solve the problem at different resolutions/accuracies so you apply varying amounts of horse power to different parts of the problem?

If you are curious, one of the recent proofs of the minimum number of moves needed to solve the Rubik's Cube used the last of these options....
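The "split it up" questions above can be sketched roughly as follows. This is only a sketch under assumptions: `expensive_term`, `sum_range` and `parallel_sum` are made-up names, the real calculation is unknown, and it runs the chunks on threads of one machine using `std::async`. The point is the decomposition: if each chunk depends only on its own range, the same split is what you would hand to separate machines (for example with MPI), which is not shown here.

```cpp
#include <cassert>
#include <cstdint>
#include <future>
#include <vector>

// Hypothetical stand-in for the expensive per-item calculation.
std::uint64_t expensive_term(std::uint64_t i) {
    return i * i;
}

// Sum expensive_term(i) for i in [begin, end). A chunk depends only on its
// own range, so chunks can be computed independently of each other.
std::uint64_t sum_range(std::uint64_t begin, std::uint64_t end) {
    std::uint64_t total = 0;
    for (std::uint64_t i = begin; i < end; ++i) {
        total += expensive_term(i);
    }
    return total;
}

// Split [0, n) into `workers` contiguous chunks, run each chunk on its own
// thread, then combine the partial results.
std::uint64_t parallel_sum(std::uint64_t n, unsigned workers) {
    assert(workers > 0);
    std::vector<std::future<std::uint64_t>> parts;
    const std::uint64_t chunk = n / workers;
    for (unsigned w = 0; w < workers; ++w) {
        const std::uint64_t begin = w * chunk;
        // The last worker also takes whatever is left over after division.
        const std::uint64_t end = (w + 1 == workers) ? n : begin + chunk;
        parts.push_back(std::async(std::launch::async, sum_range, begin, end));
    }
    std::uint64_t total = 0;
    for (auto& part : parts) {
        total += part.get();  // blocks until that chunk has finished
    }
    return total;
}
```

If the chunks need each other's intermediate results, this simple split no longer applies and the communication cost between workers starts to matter.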
12-22-2007
Originally Posted by porter
Have you got a really crap algorithm that may be mathematically correct but is really inefficient?
It won't hurt to check that with the "bible of programming", old and new testament, so to say ;-)) :

- Donald Knuth, The Art of Computer Programming
Depending on your problem there is Vol.1 (Fundamental Algorithms), Vol.2 (Seminumerical Algorithms) and Vol.3 (Sorting and Searching)

- Robert Sedgewick, Algorithms in C
It covers only C, but for purely mathematical problems the algorithms should carry over to C++ more or less unchanged.

Here is another way: switch to a language more suited for achieving calculation power than C - use FORTRAN! I don't think that the mathlib of FORTRAN 77 has ever been beaten for speed.

12-23-2007
Does this help?
Frequently Asked Questions
12-23-2007

My favorite quote in this area is:
"... premature optimization is the root of all evil." (Knuth, Donald. Structured Programming with go to Statements, ACM Journal Computing Surveys, Vol 6, No. 4, Dec. 1974. p.268.)
-- wikipedia article, see below
I have been fortunate enough to work on big iron for much of my professional life:
  • Control Data (CDC): 160, 1604, 6600 and follow-ons; 203, 205, ETA-10
  • Cray Research (CRI): CRAY-1, CRAY-2, CRAY X-MP
  • IBM: 3090 (AIX)
  • Thinking Machines (TMC): CM-2 (& 200), CM-5
You may have done your homework on performance issues, but if not, I suggest you look at -- a quick-and-dirty-off-the-top-of-my-head-list:
An older book that you might be able to find used is:
Title: High Performance Computing
Subtitle: RISC Architectures, Optimization & Benchmarks
Author: Charles Severance, Kevin Dowd
Edition: 2
Date: July 2, 1998
Publisher: O'Reilly
ISBN: 156592312X
Pages: 460
Categories: high performance, optimization, programming, software design
Comments: 5 stars (4 reviews, Amazon, 2007.12)
Comments: ( I have 1st edition, 1993 )
Most of the suggestions listed above by posters are appropriate at some time in the optimization process. I have a few principles that I advise folks to think about:
-1) Does this program / process / code absolutely, positively need to be faster?

0) Make it run right before you make it faster

1) Spend most of your personal time finding the best algorithm. There is a story in Programming Pearls, J Bentley, about the comparison between an algorithm implemented in compiled Fortran on a Cray-1 versus a better algorithm in interpreted Basic on a Radio Shack TRS-80. As you might guess, the Cray-1 crushed the TRS-80 -- at least at a small problem size. As the size went up, the TRS-80 eventually overcame the mighty Cray-1, and for the largest size listed, the Cray would have taken 95 years, the TRS-80 5.4 hours.
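The arithmetic behind that story can be sketched like this. The cost models are an assumption, not something stated in the post: a cubic algorithm costing about 3.0 n^3 nanoseconds on the Cray-1 and a linear algorithm costing about 19.5 n milliseconds on the TRS-80. I picked them because they reproduce the 95 years and 5.4 hours quoted above at n = 1,000,000. The function names are made up for illustration.

```cpp
#include <cassert>
#include <cmath>

// Assumed cost model: cubic algorithm on the fast machine, in seconds.
double cubic_on_cray_seconds(double n) {
    return 3.0e-9 * n * n * n;
}

// Assumed cost model: linear algorithm on the slow machine, in seconds.
double linear_on_trs80_seconds(double n) {
    return 19.5e-3 * n;
}

// Problem size at which both take the same time:
//   3.0e-9 * n^3 = 19.5e-3 * n   =>   n = sqrt(19.5e-3 / 3.0e-9)
// Below this size the fast machine wins, above it the better algorithm wins.
double crossover_size() {
    return std::sqrt(19.5e-3 / 3.0e-9);
}

// Convert seconds to years, using 365.25 days per year.
double seconds_to_years(double seconds) {
    assert(seconds >= 0.0);
    return seconds / (365.25 * 24.0 * 3600.0);
}
```

Under these assumed models the crossover sits at a problem size of roughly 2,500, so the hardware advantage is gone long before the problem gets anywhere near a million items.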

Another story about algorithms has to do with advances in hardware. There are many algorithms that have been discarded because they were too slow -- at least on scalar machines. When parallel processing became a reality, some of those really inefficient algorithms turned out to be spectacularly useful on parallel boxes. The CM-2 (200) above had 32,000 processors, but they were single-bit (bit-serial) processors. Most people used the mode where they ganged them by 32s to get a 1,000 processor box -- quite respectable for that time in computing history. If you used the right algorithm applied to the right problem, that machine really cranked out results. (That was a "half-gallon" machine, the "one gallon" had 64K processors.)

2) Profile / instrument your code; obtain measurements to see where it is spending its time, then spend your precious time in those areas. A few years back, I did the opposite of what I had usually done. A client asked me to take a code that previously ran on a Cray and port it to run on a PC. It was far too complex a code to consider an algorithm change (although I suggested that their domain experts look at that). I profiled it and saw that it spent a lot of time doing IO. The best approach at that point was to allocate as much memory as feasible to a RAMdisk. For the models that I was running, that decreased the real (wall-clock) time by 30% (we might have expected more, but this was all done with filesystem drivers, so the code did not need to be modified). Had there been more to be done, a RAID-0 across several disks would have been next.
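If no profiler is handy, a crude form of the instrumentation described above can be sketched with `std::chrono`. This is a minimal sketch, not a replacement for a real profiler such as gprof; `SectionTimes` and `ScopedTimer` are hypothetical helper names, and the labels you pass in are up to you.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Accumulates wall-clock seconds per labeled section of the program.
class SectionTimes {
public:
    void add(const std::string& label, double seconds) {
        assert(seconds >= 0.0);
        totals_[label] += seconds;
    }
    // Total seconds recorded for a label; 0.0 if the label was never used.
    double total(const std::string& label) const {
        auto it = totals_.find(label);
        return it == totals_.end() ? 0.0 : it->second;
    }
private:
    std::map<std::string, double> totals_;
};

// Measures from construction to destruction and records the elapsed time
// under its label, so wrapping a block in braces is enough to time it.
class ScopedTimer {
public:
    ScopedTimer(SectionTimes& sink, std::string label)
        : sink_(sink),
          label_(std::move(label)),
          start_(std::chrono::steady_clock::now()) {}
    ~ScopedTimer() {
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start_;
        sink_.add(label_, elapsed.count());
    }
    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;
private:
    SectionTimes& sink_;
    std::string label_;
    std::chrono::steady_clock::time_point start_;
};
```

Put one `ScopedTimer` around the file reads and writes and another around the number crunching, then compare the totals at the end of the run. If the IO total dominates, as it did in the port described above, faster storage will help more than tuning the arithmetic.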

If you have some money, perhaps all you need is more memory, or a box that has two or more CPUs, an account at a computing service bureau, etc. However, I suggest that you take a step back and consider all your options and possibilities, to avoid the premature optimization trap.

Best wishes ... cheers, drl
12-24-2007
How about using Hadoop?

I have not looked into it in depth yet.
