coddy wrote: ↑Sun Oct 01, 2023 1:36 am
I am trying to use a MacBook M1 as the master in a cluster of four Raspberry Pis (the workers, connected to each other through a switch that is plugged into my home router; the Mac connects to the same router over Wi-Fi). I built OpenMPI (4.1.5) from source for both the Raspberry Pi 4 and the MacBook, configured everything with hosts and hostnames, and saved the master's public key on each Pi for passwordless login.
However, when I run
Code: Select all
mpiexec -machinefile machinefile -n 5 python mpi_run.py
machinefile
Code: Select all
MacBook-Air.attlocal.net
rpi1
rpi2
rpi3
rpi4
mpi_run.py file
Code: Select all
from mpi4py import MPI
import sys
size = MPI.COMM_WORLD.Get_size()
rank = MPI.COMM_WORLD.Get_rank()
name = MPI.Get_processor_name()
sys.stdout.write(
    "Hello, World! I am process %d of %d on %s.\n"
    % (rank, size, name))
With this test example it just doesn't output anything. The working animation in the top right of the terminal runs for a few seconds and then nothing happens: no output and no error. mpiexec does run fine individually on each machine.
Is the working directory shared between all the machines at the same path? Also, are you trying to launch the ranks on each node using public-key ssh?
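One quick way to check both of those things is to take MPI (and Python) out of the loop and launch a trivial command instead. A sketch, assuming your machinefile and hostnames from above:

Code: Select all

```shell
# Confirm passwordless ssh works non-interactively from the master to each worker
for host in rpi1 rpi2 rpi3 rpi4; do
    ssh -o BatchMode=yes "$host" hostname || echo "ssh to $host failed"
done

# Launch a non-MPI command through mpiexec; if this also hangs,
# the problem is in process launch (ssh/network), not in mpi4py
mpiexec -machinefile machinefile -n 5 hostname

# Check the script exists at the same absolute path on every node
for host in rpi1 rpi2 rpi3 rpi4; do
    ssh -o BatchMode=yes "$host" ls "$PWD/mpi_run.py"
done
```

If the `mpiexec ... hostname` line prints five hostnames, launching works and the problem is further up the stack; if it hangs the same way, it's ssh or name resolution.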
To make it easier to launch an MPI task across the nodes in your cluster, I would suggest installing the Slurm workload manager.
https://slurm.schedmd.com/overview.html
Once Slurm is able to schedule non-parallel tasks on each of the nodes, it should also be able to launch an MPI job. Some notes about how I set up a super-cheap cluster of Pi Zero computers are available at
viewtopic.php?t=199994
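For reference, a Slurm batch script for a job like this might look roughly like the following (the job name and output path are just placeholders; your partition names will depend on how you configure Slurm):

Code: Select all

```shell
#!/bin/bash
#SBATCH --job-name=mpi_hello
#SBATCH --nodes=4            # one rank per Pi
#SBATCH --ntasks-per-node=1
#SBATCH --output=mpi_hello_%j.out

# srun launches the ranks itself, so no machinefile is needed
srun python3 mpi_run.py
```

You'd submit it with `sbatch`, and Slurm takes care of picking the nodes and starting the processes.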
If you are still having trouble, I'd suggest getting it working first with only the Pi computers. This is because heterogeneous clusters are difficult to set up and use, especially when the nodes run different operating systems. Note that even when all the nodes are the same, simply having a mix of big performance cores and little efficiency cores causes problems.
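Concretely, to take the Mac out of the picture you could log in to one of the Pis and launch from there with a machinefile that lists only the workers (this assumes OpenMPI and mpi4py are installed identically on each Pi, and uses a hypothetical file name machinefile_pi):

Code: Select all

```shell
# machinefile_pi contains only:
#   rpi1
#   rpi2
#   rpi3
#   rpi4
mpiexec -machinefile machinefile_pi -n 4 python3 mpi_run.py
```

If the four Pis print their "Hello, World!" lines, the homogeneous part of the cluster is fine and the issue is specific to mixing in the Mac.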