Guides for new HPC admins

06-28-2011

At my company, it has fallen to me to serve as the admin of our new HPC cluster, a task that's very new to me. It's very important to me to lay a solid foundation and avoid any unnecessary pitfalls. So, can anyone recommend a succinct guide or a list of dos and don'ts for administering an HPC cluster? The cluster runs the latest CentOS, PGI compilers, MPICH2, WRF, etc.

Thanks!

MUNGE(7)                    MUNGE Uid 'N' Gid Emporium                    MUNGE(7)

NAME
munge - MUNGE overview

INTRODUCTION
MUNGE (MUNGE Uid 'N' Gid Emporium) is an authentication service for creating and validating credentials. It is designed to be highly scalable for use in an HPC cluster environment. It allows a process to authenticate the UID and GID of another local or remote process within a group of hosts having common users and groups. These hosts form a security realm that is defined by a shared cryptographic key. Clients within this security realm can create and validate credentials without the use of root privileges, reserved ports, or platform-specific methods.

RATIONALE
The need for MUNGE arose out of the HPC cluster environment. Consider the scenario in which a local daemon running on a login node receives a client request and forwards it on to remote daemons running on compute nodes within the cluster. Since the user has already logged on to the login node, the local daemon just needs a reliable means of ascertaining the UID and GID of the client process. Furthermore, the remote daemons need a mechanism to ensure the forwarded authentication data has not been subsequently altered.

A common solution to this problem is to use Unix domain sockets to determine the identity of the local client, and then forward this information on to remote hosts via trusted rsh connections. But this presents several new problems. First, there is no portable API for determining the identity of a client over a Unix domain socket. Second, rsh connections must originate from a reserved port; the limited number of reserved ports available on a given host directly limits scalability. Third, root privileges are required in order to bind to a reserved port. Finally, the remote daemons have no means of determining whether the client identity is authentic.

USAGE
A process creates a credential by requesting one from the local MUNGE service, either via the munge_encode() C library call or the munge executable. The encoded credential contains the UID and GID of the originating process. This process sends the credential to another process within the security realm as a means of proving its identity. The receiving process validates the credential with the use of its local MUNGE service, either via the munge_decode() C library call or the unmunge executable. The decoded credential provides the receiving process with a reliable means of ascertaining the UID and GID of the originating process. This information can be used for accounting or access control decisions.

DETAILS
The contents of the credential (including any optional payload data) are encrypted with a key shared by all munged daemons within the security realm. The integrity of the credential is ensured by a message authentication code (MAC). The credential is valid for a limited time defined by its time-to-live (TTL); this presumes clocks within a security realm are in sync. Unexpired credentials are tracked by the local munged daemon in order to prevent replay attacks on a given host. Decoding of a credential can be restricted to a particular user and/or group ID. The payload data can be used for purposes such as embedding the destination's address to ensure the credential is only valid on a specific host. The internal format of the credential is encoded in a platform-independent manner. And the credential itself is base64 encoded to allow it to be transmitted over virtually any transport.

AUTHOR
Chris Dunlap <cdunlap@llnl.gov>

COPYRIGHT
Copyright (C) 2007-2011 Lawrence Livermore National Security, LLC.
Copyright (C) 2002-2007 The Regents of the University of California.

MUNGE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

Additionally for the MUNGE library (libmunge), you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

SEE ALSO
munge(1), remunge(1), unmunge(1), munge(3), munge_ctx(3), munge_enum(3), munged(8).

http://munge.googlecode.com/

munge-0.5.10                       2011-02-25                        MUNGE(7)
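
To make the USAGE and DETAILS sections more concrete, here is a minimal Python sketch of the same ideas: a credential carrying a UID and GID, protected by a MAC under a shared key, limited by a TTL, tracked per host to reject replays, and base64 encoded for transport. This is only an illustration and not MUNGE's actual credential format or API. The names ToyMunged and CredentialError, the JSON layout, and the field names are all hypothetical, and unlike real MUNGE the sketch does not encrypt the contents and trusts the caller's stated UID and GID. On a real cluster you would use munge_encode()/munge_decode() or the munge/unmunge executables instead.

```python
import base64
import hashlib
import hmac
import json
import os
import time


class CredentialError(Exception):
    """Raised when a toy credential fails validation."""


class ToyMunged:
    """Toy stand-in for one host's munged daemon (hypothetical, not MUNGE's format)."""

    MAC_LEN = hashlib.sha256().digest_size  # 32 bytes

    def __init__(self, key, ttl=300):
        self.key = key      # shared by every daemon in the security realm
        self.ttl = ttl      # seconds a credential stays valid
        self._seen = {}     # nonce -> expiry time, for replay detection

    def encode(self, uid, gid, payload="", now=None):
        """Return a base64 credential; real MUNGE takes uid/gid from the caller's process."""
        now = time.time() if now is None else now
        body = json.dumps({
            "uid": uid, "gid": gid, "payload": payload,
            "issued": now, "ttl": self.ttl,
            "nonce": os.urandom(16).hex(),
        }, sort_keys=True).encode()
        mac = hmac.new(self.key, body, hashlib.sha256).digest()
        return base64.b64encode(mac + body).decode("ascii")

    def decode(self, credential, now=None):
        """Validate a credential and return its fields, or raise CredentialError."""
        now = time.time() if now is None else now
        try:
            raw = base64.b64decode(credential, validate=True)
        except ValueError as exc:
            raise CredentialError("not valid base64") from exc
        mac, body = raw[:self.MAC_LEN], raw[self.MAC_LEN:]
        expected = hmac.new(self.key, body, hashlib.sha256).digest()
        if not hmac.compare_digest(mac, expected):
            raise CredentialError("MAC mismatch")
        fields = json.loads(body)
        expiry = fields["issued"] + fields["ttl"]
        if now > expiry:  # only meaningful if clocks in the realm are in sync
            raise CredentialError("expired")
        # Drop nonces of credentials that have expired; they are rejected anyway.
        self._seen = {n: e for n, e in self._seen.items() if e >= now}
        if fields["nonce"] in self._seen:
            raise CredentialError("replayed")
        self._seen[fields["nonce"]] = expiry
        return fields


if __name__ == "__main__":
    key = os.urandom(32)                       # the realm's shared key
    login_node, compute_node = ToyMunged(key), ToyMunged(key)
    cred = login_node.encode(uid=1000, gid=1000, payload="compute01")
    print(compute_node.decode(cred))           # first use on this host is accepted
    try:
        compute_node.decode(cred)              # second use on the same host
    except CredentialError as err:
        print("rejected:", err)
```

Each ToyMunged instance keeps its own replay cache, which mirrors the man page's statement that replays are prevented "on a given host". Embedding the destination in the payload, as in the usage above, is one way a receiver could additionally refuse credentials meant for another host.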