Multi-vector retrieval methods such as ColBERT and its recent variant, the ConteXtualized Token Retriever (XTR), offer high accuracy but face efficiency challenges at scale. To address this, we present WARP, a retrieval engine that substantially improves the efficiency of retrievers trained with the XTR objective through three key innovations:
@inproceedings{scheerer2025warp,
title = {WARP: An Efficient Engine for Multi-Vector Retrieval},
author = {Scheerer, Jan Luca and Zaharia, Matei and Potts, Christopher and Alonso, Gustavo and Khattab, Omar},
booktitle = {Proceedings of the 48th International ACM SIGIR Conference on Research and Development in Information Retrieval},
year = {2025}
}