High Performance Computing

# 1  
Old 11-11-2008
High Performance Computing

I am interested in setting up some High Performance Computing clusters and would like to get people's views and experiences on this.

I have 2 requirements:

1. Compute clusters for fast, CPU-intensive computations
2. Storage clusters running parallel, extensible filesystems spread across many nodes

Both of these should run across multiple commodity hardware nodes and ideally be Linux/Unix based and open source.

Any feedback welcome.
# 2  
Old 11-12-2008
# 3  
Old 11-12-2008
Thanks for the response. I did start reading those 2 sites.

I was also interested in people's opinions and experiences of any of the technologies surrounding Linux High Performance Clusters and parallel filesystems.

Is anybody out there using these technologies in production? If so, what kinds of things are you doing with them, and how?
# 4  
Old 11-12-2008
Think carefully about what sort of problems you want to solve, e.g.
parallel computation or task farming?
If the former, then are the communications latency-bound or bandwidth-bound? Are collective communications important? Will you need full switching for remote comms, or just nearest-neighbour?
CPU-bound or memory-bound or IO-bound?

These factors are not necessarily mutually exclusive, and inevitably there are trade-offs, but one size does not fit all.
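
One rough way to tell latency-bound from bandwidth-bound traffic is the common "latency plus size over bandwidth" cost model for a single message. The sketch below assumes that simple model; the function names are made up for illustration, and the latency and bandwidth figures have to come from measurements on your own interconnect.

```python
def transfer_time(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Estimated time for one message: fixed latency plus size / bandwidth."""
    return latency_s + message_bytes / bandwidth_bytes_per_s


def crossover_bytes(latency_s, bandwidth_bytes_per_s):
    """Message size at which the latency term equals the bandwidth term."""
    return latency_s * bandwidth_bytes_per_s


def dominant_cost(message_bytes, latency_s, bandwidth_bytes_per_s):
    """Return 'latency' or 'bandwidth', whichever term dominates the transfer."""
    if message_bytes < crossover_bytes(latency_s, bandwidth_bytes_per_s):
        return "latency"
    return "bandwidth"
```

If most of your messages are well below the crossover size, a lower-latency interconnect helps more than a faster one; if they are well above it, raw bandwidth matters more.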
# 5  
Old 11-12-2008
1. CPU intensive computation of a single task
2. Parallel computation of a task broken down into pieces
3. Storage across many commodity nodes with scalability and i/o performance
4. The solutions do not need to be geographically dispersed, same server room is fine.
# 6  
Old 01-21-2009
So this problem used to be rather "simple". Just use some CPU metric (MIPS, FLOPS, SPECint, SPECfloat, whatever) and divide it by the cost of a computer. Then we had a clear choice: 2 CPUs per "1U" system. Now the choice has expanded to cores per chip and we have 2, 4, and even 8-way systems (AMD). You could build a rack of Sun x6400, each containing 64 cores. But we also shouldn't forget Sun's T1 processor line, with 128 "virtual" cores.

Further complicating the issue: cost is no longer just for the compute node. Now you have to consider the networking costs between them. 100 Mbit Ethernet switches are cheap, but may not be suitable for a cluster of very fast machines. Infiniband gives you great performance, but scaling is very expensive -- just the cabling alone can cost as much as your CPUs!
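
The "metric divided by cost" comparison, extended with a per-node share of the interconnect cost, might be sketched like this. The function names and the tuple layout are illustrative assumptions; the metric (MIPS, FLOPS, SPECint, ...) and the prices are whatever you measure or get quoted.

```python
def perf_per_cost(node_perf, node_cost, network_cost_per_node=0.0):
    """Performance metric per unit of money, counting the node's share of
    switch ports, adapters and cabling as part of its cost."""
    return node_perf / (node_cost + network_cost_per_node)


def rank_configs(configs):
    """Sort (name, node_perf, node_cost, network_cost_per_node) tuples,
    best performance per cost first."""
    return sorted(
        configs,
        key=lambda c: perf_per_cost(c[1], c[2], c[3]),
        reverse=True,
    )
```

A node that looks cheapest on its own can drop down the ranking once an expensive interconnect is charged against it.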

Further complications: the operating costs of cooling and electricity are not insignificant. For every watt used by the CPU, you can count on needing 2 watts to cool it (depends on the climate you're in). Thus, if every compute node draws 1.5 A and you have 256 compute nodes, you will need 1.5 * 256 * 3 = 1152 A of supply current (one part for the nodes plus two parts for cooling), and maybe two 30-ton chillers.
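
The current estimate above can be reproduced with a few lines. This is only a sketch of that rule of thumb: it assumes the cooling plant draws from a supply at the same voltage as the nodes, and the function name is made up for illustration.

```python
def supply_current_amps(nodes, amps_per_node, cooling_watts_per_it_watt=2.0):
    """Return (node_amps, cooling_amps, total_amps) for a cluster.

    Uses the rule of thumb that each watt consumed by the nodes needs
    `cooling_watts_per_it_watt` watts of cooling, and assumes both loads
    sit on a supply of the same voltage.
    """
    node_amps = nodes * amps_per_node
    cooling_amps = node_amps * cooling_watts_per_it_watt
    return node_amps, cooling_amps, node_amps + cooling_amps


# The figures from the post: 256 nodes at 1.5 A each, 2 W of cooling per watt.
node_amps, cooling_amps, total_amps = supply_current_amps(256, 1.5)
print(node_amps, cooling_amps, total_amps)  # 384.0 768.0 1152.0
```

Changing the cooling ratio for a cooler or hotter climate changes the total proportionally.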
# 7  
Old 01-21-2009
Originally Posted by humbletech99
1. CPU intensive computation of a single task
Q1: What percentage of the operations are floating point? Do you need double-precision? (Usually the answer is yes).

2. Parallel computation of a task broken down into pieces
What's the expected ratio between computation time and communication time between the pieces? Medium ratio: do some computation, then send intermediate results to all neighbors, then do some more computation. Low ratio: compute, send a result, wait for a message, compute, send a result, and so on. High ratio: the CPUs crunch, crunch, crunch, then finally send results to a central task which does a final computation.

This is important in deciding what kind of network capacity you will need.
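
To put a number on that ratio, you can time the two phases of one iteration and compare them. The sketch below assumes a deliberately simple model in which computation and communication do not overlap; the function names are illustrative only.

```python
def comp_comm_ratio(compute_s, communicate_s):
    """Ratio of computation time to communication time for one iteration."""
    if communicate_s == 0:
        return float("inf")
    return compute_s / communicate_s


def useful_fraction(compute_s, communicate_s):
    """Fraction of wall-clock time spent computing, assuming the two
    phases never overlap."""
    return compute_s / (compute_s + communicate_s)
```

A job that computes for 9 seconds and communicates for 1 spends 90% of its time on useful work, so a faster network can gain at most 10%. A job with a ratio near 1 spends half its time waiting, and there the interconnect is where the money should go.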

3. Storage across many commodity nodes with scalability and i/o performance
How about reliability? Commodity nodes mean a high rate of disk failures and/or node failures. Can you tolerate frequent filesystem downtime? Or will you need high availability on this filesystem?

4. The solutions do not need to be geographically dispersed, same server room is fine.
Does your budget include lifetime operating costs? Does your server room have a floor-loading specification in lb/ft^2? One institution I worked at discovered that the building was designed for a certain weight density, even in the server room. It turned out that putting more than about 8 computer racks in the room exceeded this density! So we had the space, but adding more racks might have made the floor unstable, especially given that this building was in a seismically active area (about one magnitude 4+ quake every 2 to 3 years).
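
A first-pass floor-loading check can be done with simple division. This sketch assumes the load is averaged evenly over the whole floor area, which is a simplification: it ignores point loads under rack feet and any seismic margin, so it is no substitute for a structural engineer. All names are illustrative.

```python
def floor_load_lb_per_ft2(rack_count, rack_weight_lb, floor_area_ft2):
    """Average floor loading if the rack weight were spread over the room."""
    return rack_count * rack_weight_lb / floor_area_ft2


def max_racks(rack_weight_lb, floor_area_ft2, rated_lb_per_ft2):
    """Largest whole number of racks that keeps the average load at or
    below the floor's rated lb/ft^2."""
    return int(rated_lb_per_ft2 * floor_area_ft2 // rack_weight_lb)
```

Compare the result of `max_racks` with the number of racks the cluster design calls for before committing to a room.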