I am a full professor of Computer Science at Bordeaux INP/ENSEIRB-MATMECA.
I graduated from ENS Lyon and earned a PhD from the University of Versailles–Saint-Quentin in 1998, where I worked on dependency analysis in the polyhedral model. I then became an assistant professor at the same university. In 2009, I joined Bordeaux INP as a full professor and became part of Inria’s Runtime team. In 2017, I created and led Inria’s Storm team while also overseeing the Computer Science program at the ENSEIRB-MATMECA engineering school of Bordeaux INP.
From 2023 to mid-2025, I was on leave to direct Huawei Paris’s Distributed and Parallel Research Lab. After returning from this leave, I joined Inria’s Topal team.
Over the past two years, I’ve increasingly focused on how to scale AI systems, especially open-source large language models. I’ve explored parallelization techniques for both training and inference, and how these models behave when deployed on large cloud infrastructures built around modern AI accelerators. Along the way, I’ve gained substantial hands-on experience with a variety of heterogeneous hardware platforms—from the familiar NVIDIA GPU ecosystem to architectures like Huawei’s Ascend NPUs, as well as multicore ARM and Intel processors.
More broadly, my research has always revolved around making complex applications run faster and more efficiently. I’m interested in the full stack: optimization, parallel and distributed methods, and the design of compilation and runtime systems for AI and high-performance computing. These themes have accompanied me throughout my career, even as the computing landscape continues to evolve.
Software
MAQAO (Modular Assembly Quality Analyzer and Optimizer) is a parallel performance analysis and optimization framework that I initiated in 2004. Designed for developers and performance experts, it provides advanced capabilities for analyzing and optimizing low-level code.
MAQAO has stood the test of time: it is actively maintained and extended at the University of Versailles, supported by a long-lasting community of users and contributors. It has become part of the VI-HPS Institute, is listed among the EU Innovation Radar technologies, and has been featured in an AWS blog highlighting its potential. MAQAO is distributed under the GPL license and is now used in training programs for engineers as well as in large industrial environments.
AFF3CT (A Fast Forward Error Correction Toolbox) is a high-performance simulation framework dedicated to Forward Error Correction (FEC, or channel coding). It was initiated during A. Cassagne’s PhD in 2017 and supports a wide range of codes—from well-established Turbo codes to modern Polar and LDPC codes.
AFF3CT has grown into an active community spanning academia and industry, and is maintained and developed at Sorbonne University, IMS Lab/Bordeaux University and Inria. Users and developers gather annually at the AFF3CT Day, which illustrates the maturity of the project and the engagement around it. Thanks to its performance and robustness, AFF3CT is now used in both academic research and industrial settings. It is distributed under the MIT license.
MIPP is a portable, open-source (MIT license) C++11 wrapper for SIMD intrinsic functions. It supports SSE, AVX, AVX-512, and ARM NEON (32-bit and 64-bit) instruction sets, covering both single/double-precision floating-point operations and signed integer arithmetic (64, 32, 16, and 8 bits).
By abstracting architecture-specific intrinsics behind a unified interface, MIPP removes the need for developers to write architecture-specific SIMD code by hand; the appropriate intrinsics are generated automatically. Initially developed during A. Cassagne’s PhD, MIPP continues to be actively maintained and is widely adopted in performance-critical applications.
PARCOACH
PARCOACH (PARallel COntrol flow Anomaly CHecker) targets debugging of modern scientific applications that rely on MPI and hybrid MPI+X models (where X is typically a thread-based runtime such as OpenMP). Initiated during Emmanuelle Saillard’s PhD, PARCOACH is now actively maintained and developed by her.
PARCOACH combines static and dynamic analyses to detect misuse of collective operations in parallel applications, helping developers diagnose subtle communication errors that arise in large-scale systems. It is used and referenced, for instance, in the EuroHPC DEEP-SEA software stack and in recent venues such as Correctness@SC’23 and EuroMPI/Australia 2024, where it is cited as one of the state-of-the-art MPI correctness tools.
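To give a feel for the class of bug involved, here is a toy sketch of a collective mismatch check. It is not PARCOACH's algorithm and uses no MPI: it assumes each rank's sequence of collective calls has already been recorded as a list of names, and the function `first_collective_mismatch` is a hypothetical helper written for this example. A mismatch occurs when ranks do not all call the same collective at the same step, which in a real MPI program typically leads to a deadlock or undefined behavior.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Name of the collective a rank calls at a given step, or "<none>" if the
// rank has already finished its sequence of collectives.
static std::string call_at(const std::vector<std::string>& trace, std::size_t step) {
    return step < trace.size() ? trace[step] : std::string("<none>");
}

// Hypothetical helper: traces[r] is the ordered list of collectives called
// by rank r. Returns the first step at which the ranks disagree, or -1 if
// every rank calls the same collectives in the same order.
long first_collective_mismatch(const std::vector<std::vector<std::string>>& traces) {
    if (traces.empty()) return -1;

    std::size_t longest = 0;
    for (const auto& t : traces)
        if (t.size() > longest) longest = t.size();

    for (std::size_t step = 0; step < longest; ++step) {
        const std::string ref = call_at(traces[0], step);  // rank 0 is the reference
        for (const auto& t : traces)
            if (call_at(t, step) != ref) return static_cast<long>(step);
    }
    return -1;
}
```

For example, if rank 0 calls `MPI_Barrier` then `MPI_Allreduce` while rank 1 calls `MPI_Barrier` then `MPI_Bcast`, the helper reports step 1 as the first mismatch.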