---
layout: static
title: Current Unknowns and Testing
---

Current Unknowns and Testing

We need to test this system before using it in a workshop with learners. For initial workshops, it may also be sensible to give learners accounts on an existing HPC system as a backup in case of failures. Our current questions include:
  1. How reliable is a Pi HPC? Workshops commonly include between 20 and 40 learners. Raspberry Pis are relatively low-spec machines, especially the first-, second- and third-generation models compared with the newer Raspberry Pi 4. It is an open question how many simultaneous users a Pi-based login node can support.
  2. Can our DHCP server support that many users? The default /24 CIDR block, 192.168.1.0/24, has 254 usable host addresses (192.168.1.1 to 192.168.1.254), one of which the server itself typically occupies. That is more than enough for a workshop on paper, but in practice client limits and the memory requirements of managing leases (especially on a Pi) can lead to address assignment failures.
  3. Can the Pi's WiFi interface handle the required traffic? Related to the above, it is unclear whether the Pi can handle 40 or more clients connected through its WiFi interface.
  4. How many Pi nodes does a cluster need to handle all the Slurm jobs that will be queued and launched? This question also matters for anyone using the Pi HPC in a Carpentries Offline lesson: to keep the setup cost-effective, we should provide an estimate relating the number of nodes to the number of learners.
  5. Will we need multiple login nodes? If a single login node slows down too much with too many learners connected, it may be necessary to create another login node.
  6. Will the shared USB storage be a bottleneck? Shared USB storage can become a bottleneck under heavy use. Models earlier than the Raspberry Pi 4 are limited by USB 2.0's 480 Mbit/s signalling rate. Throughput should be tested. An external SSD could potentially be faster than a USB flash drive.
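
Some of the numbers in questions 2 and 6 can be sanity-checked with quick arithmetic before any hardware testing. The sketch below is only a back-of-the-envelope estimate, not a measurement: the subnet `192.168.1.0/24`, the figure of 40 learners, the single reserved address for the DHCP server, and the helper names `usable_client_addresses` and `per_learner_mbyte_s` are all assumptions chosen for illustration. Real USB 2.0 throughput is lower than the 480 Mbit/s signalling rate because of protocol overhead, so the per-learner figure is an upper bound.

```python
import ipaddress

# Assumed values for illustration; adjust to the actual workshop setup.
SUBNET = "192.168.1.0/24"      # the /24 block from question 2
LEARNERS = 40                  # upper end of the workshop size in question 1
USB2_SIGNALLING_MBIT_S = 480   # USB 2.0 signalling rate from question 6


def usable_client_addresses(cidr: str, reserved: int = 1) -> int:
    """Count host addresses left over for DHCP clients.

    `reserved` covers addresses that cannot be leased, such as the one
    held by the DHCP server itself (an assumption; raise it if compute
    nodes are given static addresses in the same subnet).
    """
    network = ipaddress.ip_network(cidr)
    # hosts() skips the network and broadcast addresses.
    return sum(1 for _ in network.hosts()) - reserved


def per_learner_mbyte_s(link_mbit_s: float, learners: int) -> float:
    """Upper bound on each learner's share of a link, in megabytes per second,
    if all learners use it at the same time."""
    return link_mbit_s / 8 / learners


clients = usable_client_addresses(SUBNET)
share = per_learner_mbyte_s(USB2_SIGNALLING_MBIT_S, LEARNERS)
print(f"{SUBNET}: {clients} addresses available for clients")
print(f"USB 2.0 shared by {LEARNERS} learners: at most {share:.1f} MB/s each")
```

Under these assumptions the address space is not the limiting factor for a 40-learner workshop, which suggests that any DHCP failures would come from resource limits on the Pi rather than from running out of addresses. The storage estimate, on the other hand, shows how little bandwidth each learner gets if everyone reads or writes at once.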
{% include sidebar.md %}