Project Description

ESPLAG – Enabling SParse training of LLMs on GPUs

Period: 2024-02-01 – 2025-01-31

Funded by: European Commission – HORIZON IA

The Innovative Algorithms for Applications on European Exascale Supercomputers (Inno4scale) project is funded by the European Union within the Horizon Europe program, through the European High-Performance Computing Joint Undertaking (JU), with Grant Agreement 01118139. This project is coordinated by the Barcelona Supercomputing Center – Centro Nacional de Supercomputación (BSC-CNS).

This project launched a call for sub-projects, the 2023 Call for Inno4scale Innovation Studies (Inno4scale Call-2023). The Enabling SParse training of LLMs on GPUs (ESPLAG) project was selected for funding under this call as a third party, together with the University of A Coruña.

The sparse compression format VENOM (or V:N:M) enables the use of Sparse Tensor Cores (SPTCs) across the entire sparsity range. In hardware, SPTCs are limited to matrices with 50% sparsity; VENOM is a software solution that lifts this limitation, allowing matrices with arbitrary sparsity levels to be executed on these specialized units.
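To make the V:N:M idea more concrete, the following is a minimal sketch of how such a sparsity mask could be built. It assumes the layout described in the VENOM publication: the matrix is split into blocks of V rows by M columns, four columns are kept per block, and N values are kept per row among those four columns. The function names `venom_mask` and `sparsity`, the magnitude-based selection criterion, and the list-of-lists representation are illustrative assumptions, not the project's actual kernels or pruning algorithm.

```python
def venom_mask(weights, V, N, M):
    """Return a 0/1 mask with the same shape as `weights` (list of lists).

    Assumed layout (illustrative only, not the project's kernel code):
      1. split the matrix into blocks of V rows x M columns;
      2. in each block, keep the 4 columns with the largest summed magnitude;
      3. in each row of the block, keep the N largest-magnitude values
         among those 4 columns.
    """
    rows, cols = len(weights), len(weights[0])
    if rows % V or cols % M:
        raise ValueError("matrix dimensions must be multiples of V and M")
    if not 0 < N <= 4 <= M:
        raise ValueError("expected 0 < N <= 4 <= M")

    mask = [[0] * cols for _ in range(rows)]
    for r0 in range(0, rows, V):
        for c0 in range(0, cols, M):
            # Step 2: rank the block's columns by summed absolute value.
            norms = {
                c: sum(abs(weights[r][c]) for r in range(r0, r0 + V))
                for c in range(c0, c0 + M)
            }
            kept_cols = sorted(norms, key=norms.get, reverse=True)[:4]
            # Step 3: N:4 selection inside every row of the block.
            for r in range(r0, r0 + V):
                best = sorted(
                    kept_cols, key=lambda c: abs(weights[r][c]), reverse=True
                )[:N]
                for c in best:
                    mask[r][c] = 1
    return mask


def sparsity(mask):
    """Fraction of entries that the mask sets to zero."""
    total = sum(len(row) for row in mask)
    kept = sum(sum(row) for row in mask)
    return 1 - kept / total


# Small deterministic 4 x 16 example matrix (hypothetical values).
W = [[float((r * 7 + c * 3) % 11) - 5 for c in range(16)] for r in range(4)]
mask = venom_mask(W, V=2, N=2, M=8)
print(sparsity(mask))  # 0.75, i.e. 1 - N/M
```

Under these assumptions, every row keeps exactly N values per block of M columns, so the overall sparsity is 1 - N/M: choosing M = 4 with N = 2 gives the 50% case, and larger M gives higher sparsity while each row still has a 2-out-of-4 pattern within the kept columns.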

The cost of training modern models, such as LLMs, is a significant concern in the field of ML, often reaching millions of dollars. So far, VENOM has primarily been applied to end-to-end inference tasks. In this project, our objective is to extend the VENOM format to real sparse training tasks. To that end, we will cover the two main areas of network sparsification: specialized GPU kernels and network pruning algorithms. Finally, both components will be integrated to build a real sparse training tool.

More information:

https://www.inno4scale.eu/