Marcos V. Treviso
Assistant Professor at Instituto Superior Técnico, University of Lisbon
I'm a co-PI at the SARDINE Lab 🐟 and a member of the ELLIS Unit Lisbon 🏛️.
I work on efficiency, long-context modeling, and interpretability in ML & NLP.
News
AdaSplash-2: Faster Differentiable Sparse Attention was accepted at ICML 2026. See you in Seoul! 🇰🇷
Our paper, Long-Context Generalization with Sparse Attention, was accepted at ICLR 2026. See you in Rio! 🇧🇷
New blog post: SLURM in the Wild: A Practical Guide for Academic Labs — 50 min read on scaling research compute.
I started as a Tenure-Track Assistant Professor at the Computer Engineering Department, IST, University of Lisbon.
AdaSplash: Adaptive Sparse Flash Attention was presented at ICML 2025 as a Spotlight (top 1%) ⚡.
LaTIM: Measuring Latent Token-to-Token Interactions in Mamba Models got an Outstanding Paper Award 🏆 at ACL 2025.
Recent Publications
Full list on Google Scholar →
2026
AdaSplash-2: Faster Differentiable Sparse Attention
Nuno Gonçalves, Hugo Pitorro, Vlad Niculae, Edoardo Ponti, Lei Li, André F. T. Martins, Marcos V. Treviso
DashAttention: Differentiable and Adaptive Sparse Hierarchical Attention
Yuxiang Huang, Nuno M. T. Gonçalves, Federico Alvetreti, Lei Li, Xu Han, Edoardo M. Ponti, André F. T. Martins, Marcos V. Treviso
EntmaxKV: Support-Aware Decoding for Entmax Attention
Gonçalo Duarte, Miguel Couceiro, Marcos V. Treviso
Sparse Attention as Compact Kernel Regression
Saul Santos, Nuno Gonçalves, Daniel C. McNamee, Marcos V. Treviso, André F. T. Martins
Long-Context Generalization with Sparse Attention
Pavlo Vasylenko, Hugo Pitorro, André F. T. Martins, Marcos V. Treviso
2025
AdaSplash: Adaptive Sparse Flash Attention
Nuno Gonçalves, Marcos V. Treviso, André F. T. Martins
LaTIM: Measuring Latent Token-to-Token Interactions in Mamba Models
Hugo Pitorro, Marcos V. Treviso
AMUSED: A Multi-Modal Dataset for Usability Smell Identification
Flavia Santos, Marcos V. Treviso, Kamila Rodrigues, Renata Fortes, Sandra Gama
Students
Current
Previous
I have open projects at MSc and PhD level — if you're interested in working on topics related to efficiency, long-context modeling, or interpretability, feel free to send me an email.
Tools, Tutorials, and Puzzles
Teaching
Assistant Professor
- Autonomous Agents and Multi-Agent Systems — MEIC, IST · 2025/2026 · 2nd semester
- Machine Learning — LEIC, IST · 2025/2026 · 1st semester
Invited Assistant Professor
- Machine Learning — MEEC, IST · 120 students · Lab component
Teaching Assistant
- Deep Structured Learning — DEEC, IST · PyTorch Tutorial · Neural Attention Mechanisms
Service
Reviewer
- ACL, EMNLP, EACL, NAACL
- ICML, ICLR, NeurIPS
- PROPOR, STIL, EAMT
- 🏅 Outstanding Reviewer at ACL, EMNLP, NeurIPS, ICLR, ICML
Area Chair & Senior Area Chair
- ACL 2024 & 2025, EMNLP 2025, EACL 2026, ACL 2026
ACL Tech Team
- Member · maintainer of the ARR Report Generator
Research Projects
SMURF4EU
Co-PI EuroHPC · 2026–2027
A Suite of Multimodal Reasoning Foundation Models for Europe
Developing and releasing a suite of fully open, high-performance multimodal reasoning foundation models spanning text, code, speech, vision, and video — with support for all 24 official EU languages. Targets multiple model sizes and long multimodal contexts up to 1M tokens using efficient attention and memory-compression techniques.
AMALIA
Team member Portugal · 2025–2026
European-Portuguese Large Language Model
An open LLM developed specifically for Portuguese as used in Portugal — preserving culturally grounded language use and supporting data sovereignty for Public Administration use cases. Built by a national consortium (NOVA, IST, Coimbra, Porto, Minho, FCT/Arquivo.PT), pre-trained on ~4 trillion words, with a roadmap toward multimodality and a targeted release around June 2026.