Rocks cluster 6.1 and MPICH2 problem??????


 
# 1  
Old 01-15-2013

Hey friends,
I am trying to run a simple MPI hello world program with MPICH2 on a Rocks cluster. Here is the C source code.

Code:
 
#include <mpi.h>
#include <stdio.h>

int main( int argc, char ** argv )
{
 /* Start the MPI runtime, handing it the command-line arguments. */
 MPI_Init( &argc, &argv );

 /* Total number of processes in the job. */
 int world_size;
 MPI_Comm_size( MPI_COMM_WORLD, &world_size );

 /* Rank of this process within MPI_COMM_WORLD. */
 int world_rank;
 MPI_Comm_rank( MPI_COMM_WORLD, &world_rank );

 /* Name of the node this rank is running on. */
 char processor_name[MPI_MAX_PROCESSOR_NAME];
 int name_len;
 MPI_Get_processor_name( processor_name, &name_len );

 printf( "Hello world from processor %s, rank %d out of %d processors\n", processor_name, world_rank, world_size );

 MPI_Finalize();
 return 0;
}


And I compile it like this.

Code:
/opt/mpich2/gnu/bin/mpicc ./hello.c -o hello

I have the following entries in the machine file.

Code:
compute-0-0
compute-0-1

Here is how I run the hello program:

Code:
/opt/mpich2/gnu/bin/mpirun -np 2 -machinefile machines ./hello

which gives me the following error.

Code:
 
[user1@cluster ~]$ /opt/mpich2/gnu/bin/mpirun -np 2 -machinefile machines ./mpi_hello_world
Could not chdir to home directory /export/home/user1: No such file or directory
Could not chdir to home directory /export/home/user1: No such file or directory
[proxy:0:0@compute-0-0.local] launch_procs (./pm/pmiserv/pmip_cb.c:687): unable to change wdir to /export/home/user1 (No such file or directory)
[proxy:0:0@compute-0-0.local] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:935): launch_procs returned error
[proxy:0:0@compute-0-0.local] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:0@compute-0-0.local] [mpiexec@cluster.hpc.org] control_cb (./pm/pmiserv/pmiserv_cb.c:215): assert (!closed) failed
[mpiexec@cluster.hpc.org] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[mpiexec@cluster.hpc.org] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:181): error waiting for event
[mpiexec@cluster.hpc.org] main (./ui/mpich/mpiexec.c:405): process manager error waiting for completion
[user1@cluster ~]$

Please help me.
# 2  
Old 06-26-2013
Hi,
Does anyone want to answer this question?
I have this question too!
# 3  
Old 06-26-2013
not having access to MPI and just looking at your posted errors, you need to address the directory issue first before you start rummaging somewhere else ...
Code:
Could not chdir to home directory /export/home/user1: No such file or directory

# 4  
Old 06-26-2013
Thanks, but
we use a Rocks cluster, which has MPICH2 pre-installed in /opt/mpich2/.
Do I need to make any other modifications to the libraries?
Do you have any other recommendations for us?
# 5  
Old 06-26-2013
afaik the error has nothing to do with where the app is installed but rather where it expects to dump/find some files ...

i suggest you read the MPICH2 documentation further to see what you need to do to fix your issue ... the fix might be as simple as creating/fixing permissions on a directory or as complicated as a full recompile and reinstall but you need to confirm that within the documentation ...