My research focuses on memory-efficient training methods and model merging, grounded in a deep understanding of optimization theory and its evolution. More recently, I've also been working on Mathematics and Informatics Olympiad–level benchmarks for evaluating LLMs' reasoning on complex problems.
Pouria Mahdavinia, Hamed Mahdavi, Niloofar Mireshghallah, Mehrdad Mahdavi
Under review
Benchmarked capability-based model merging for post-training at 8B scale. Leveraged this benchmark to study why model merging works and how it can be improved in this setting. Proposed a novel task-vector pruning method called Fast-Fisher-Grafting (FFG) and showed it's critical for merging performance. Integrated all findings into a new model merging algorithm: the OTA framework.
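For readers unfamiliar with task-vector merging, here is a minimal sketch of the general idea. It is not FFG or the OTA framework: magnitude-based pruning and plain summation are stand-ins chosen for illustration, and the function names (`task_vector`, `prune_by_magnitude`, `merge`) and toy weights are hypothetical.

```python
import numpy as np

def task_vector(finetuned, base):
    """A task vector is the weight delta introduced by fine-tuning."""
    return finetuned - base

def prune_by_magnitude(vec, keep_fraction):
    """Keep the largest-magnitude entries and zero out the rest.

    Assumption: magnitude is only a stand-in criterion here; FFG uses
    its own selection rule, which this sketch does not reproduce.
    """
    k = max(1, int(round(keep_fraction * vec.size)))
    threshold = np.sort(np.abs(vec).ravel())[-k]
    return np.where(np.abs(vec) >= threshold, vec, 0.0)

def merge(base, finetuned_models, keep_fraction=0.5):
    """Add the pruned task vector of each fine-tuned model to the base."""
    merged = base.copy()
    for ft in finetuned_models:
        merged += prune_by_magnitude(task_vector(ft, base), keep_fraction)
    return merged

# Toy example: two "capabilities" that mostly touch different weights.
base = np.zeros(4)
math_model = np.array([1.0, 0.1, 0.0, 0.0])
code_model = np.array([0.0, 0.0, 0.2, 2.0])
merged = merge(base, [math_model, code_model], keep_fraction=0.25)
```

In this toy case, pruning keeps only the dominant entry of each task vector, so the merged weights combine the two strongest updates without the small, noisy ones.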
Pouria Mahdavinia, Mehrdad Mahdavi
TMLR 2025 (Awarded J2C Certification)
Developed MoFaSGD, the first memory-efficient variant of the Muon optimizer. It achieves consistent, substantial throughput gains over other memory-efficient optimizers while keeping GPU memory usage at a LoRA-like level, in contrast to Muon. Validated on NanoGPT and on a replication of Allen AI's Tulu3-SFT at the 8B-parameter scale. The core idea is to keep the momentum on a low-rank manifold by maintaining a low-rank factorization of the full-rank momentum matrix.
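To make the low-rank momentum idea concrete, here is a minimal NumPy sketch. It is not the MoFaSGD update rule: it simply re-truncates the momentum to a fixed rank with an SVD after each step, and for clarity it builds the dense matrix, which a genuinely memory-efficient method would avoid. The helper names and the values of `m`, `n`, `rank`, and `beta` are assumptions made for illustration.

```python
import numpy as np

def truncate_to_rank(matrix, rank):
    """Return factors (U, s, Vt) of the best rank-`rank` approximation."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :rank], s[:rank], Vt[:rank, :]

def low_rank_momentum_step(factors, grad, beta=0.9):
    """Update momentum M <- beta * M + grad, then project back to rank r.

    Only the factors are stored between steps. The dense matrix is
    formed here for illustration only, so this step itself saves no memory.
    """
    U, s, Vt = factors
    rank = s.shape[0]
    dense = beta * (U * s) @ Vt + grad
    return truncate_to_rank(dense, rank)

rng = np.random.default_rng(0)
m, n, rank = 64, 32, 4

# Start from zero momentum, stored in factorized form.
factors = (np.zeros((m, rank)), np.zeros(rank), np.zeros((rank, n)))
for _ in range(5):
    grad = rng.standard_normal((m, n))  # stand-in for a real gradient
    factors = low_rank_momentum_step(factors, grad)

U, s, Vt = factors
stored = U.size + s.size + Vt.size  # numbers kept between steps
full = m * n                        # numbers a dense momentum would need
```

With these toy sizes, the factorized state holds `stored = 388` numbers versus `full = 2048` for a dense momentum matrix, which is the kind of saving that motivates storing the momentum in factorized form.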
Hamed Mahdavi, Pouria Mahdavinia, Samira Malek, Pegah Mohammadipour, Alireza Hashemi, Majid Daliri, Alireza Farhadi, Amir Khasahmadi, Niloofar Mireshghallah, Vasant Honavar
MATH-AI NeurIPS 2025
Worked on a benchmark evaluating frontier LLMs on multi-level IMO proof grading and helped design agentic systems that outperformed single-turn approaches.
Hamed Mahdavi, Pouria Mahdavinia, Alireza Farhadi, Pegah Mohammadipour, Samira Malek, Pedram Mohammadipour, Majid Daliri, Alireza Hashemi, Amir Khasahmadi, Vasant G Honavar
MATH-AI NeurIPS 2025
Worked on evaluating frontier open- and closed-source LLMs on multimodal problems at the Informatics Olympiad level by sourcing and curating Iranian Informatics Olympiad problems and solutions.
Yuyang Deng, Mohammad Mahdi Kamani, Pouria Mahdavinia, Mehrdad Mahdavi
NeurIPS 2023
Pouria Mahdavinia, Yuyang Deng, Haochuan Li, Mehrdad Mahdavi
NeurIPS 2022