Energy-efficient job-scheduling for HPC systems

Design and implement a scheduler for the SLURM Workload Manager that can take, among other constraints, energy usage into account.

During this thesis, the student will design and implement a scheduler for the SLURM Workload Manager that can take energy usage into account alongside other constraints. The current SLURM setup of Simula's eX3 cluster relies on built-in scheduling algorithms that do not consider energy constraints. This thesis will therefore develop and investigate a more advanced scheduling approach, with two aims: first, to identify how a new scheduling algorithm can be integrated into SLURM in practice, and second, to quantify the potential of an energy-based scheduling approach.

Goal

The primary goal of this thesis is to design and implement a scheduler for the SLURM Workload Manager that can take, among other constraints, energy usage into account. This will be achieved through the following objectives:

  • Conduct a comprehensive literature review on the current scheduling techniques available and implemented for SLURM.
  • Select or adapt a scheduling technique to account for energy constraints.
  • Deploy the solution in a local test environment.
  • Thoroughly characterize the performance of the new scheduling approach.
  • Analyze the impact of this scheduling approach for current and future applications.
  • Discuss the findings, identify the strengths and limitations of the approach, and identify future developments needed.

Specific goals include:

  1. Develop a SLURM scheduling approach: Set up and configure a SLURM scheduling system and integrate a new scheduling strategy.
  2. Design and implement a new scheduling strategy: Propose, implement, and analyse a scheduling strategy suited to improve the energy efficiency of a cluster.

To achieve the thesis goals, the research will delve into the following key areas:

  • Energy-efficient Cluster Usage
  • Thorough understanding of the energy-related scheduling constraint
  • Investigation of available approaches and analysis of potential outcomes of various scheduling approaches on a theoretical basis
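
As a rough illustration of what an energy-related scheduling constraint could look like, the following minimal sketch greedily starts queued jobs only while both the free nodes and an assumed cluster-wide power budget allow it. It is a toy model, not SLURM code: the `Job` fields, the `select_jobs` function, and the per-job power estimates are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    nodes: int          # number of nodes requested
    est_power_w: float  # assumed estimate of the job's total power draw (watts)


def select_jobs(queue, free_nodes, power_budget_w):
    """Walk the queue in order and start every job that fits both constraints.

    Jobs that do not fit are skipped, so later, smaller jobs may start first
    (a very simplified form of backfilling).
    """
    started = []
    for job in queue:
        if job.nodes <= free_nodes and job.est_power_w <= power_budget_w:
            started.append(job)
            free_nodes -= job.nodes
            power_budget_w -= job.est_power_w
    return started


# Hypothetical queue: job "b" fits the node count but not the remaining power.
queue = [
    Job("a", nodes=4, est_power_w=2000.0),
    Job("b", nodes=2, est_power_w=1500.0),
    Job("c", nodes=1, est_power_w=400.0),
]
started = select_jobs(queue, free_nodes=6, power_budget_w=2500.0)
print([job.name for job in started])  # prints ['a', 'c']
```

With a node-only constraint, jobs "a" and "b" would start; adding the power budget changes the selection to "a" and "c". Quantifying such differences on realistic workloads is the kind of analysis the thesis is expected to carry out.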

Learning outcome

The overarching goal of this thesis is to demonstrate the ability to provide a thorough analysis of the problem, identify suitable approaches with respect to the state of the art, and provide a practical implementation of a selected approach.

Upon successful completion of this thesis, the student will have gained:

  • Advanced Knowledge in Scheduling: Deep theoretical and practical understanding of scheduling approaches.
  • Expertise in SLURM: Hands-on experience in designing, implementing, and optimizing a solution for the popular SLURM Workload Manager.
  • Strong Research and Analytical Skills: Ability to conduct independent research, critically evaluate scientific literature, design and execute complex experiments, analyze data, and present findings in a clear and concise manner.
  • Problem-Solving: Experience in tackling open research problems at the intersection of AI, computer architecture, and high-performance computing, preparing for future roles in academia or industry.
  • Software Development: Practical experience by going from conceptualization to the realization of an idea.

Qualifications

This thesis is highly challenging and requires a strong foundation in several technical areas. Ideal candidates should possess:

Required:

  • BSc or equivalent in Computer Science, Electrical Engineering, or a related field
  • Basic understanding of scheduling approaches
  • Proficiency in C++ or Python programming
  • Strong analytical and problem-solving skills
  • High motivation for hands-on experimental work with hardware

Highly desired (but can be learned during the thesis):

  • Prior exposure to a SLURM-based system
  • Experience with Linux command-line environments and system administration
  • Proficiency in Bash scripting or Python

Supervisors

  • Thomas Roehr
  • Håkon Kvale Stensland
