Large Language Models have demonstrated strong performance across a wide range of tasks, but adapting them efficiently to new domains remains a key challenge. Parameter-Efficient Fine-Tuning (PEFT) methods address this by introducing lightweight, trainable modules while keeping most pre-trained weights fixed. The prevailing approach, LoRA, models updates using a low-rank decomposition, but its expressivity is inherently constrained by the rank. Recent methods like HiRA aim to increase expressivity by incorporating a Hadamard product with the frozen weights, but still rely on the structure of the pre-trained model. We introduce ABBA, a new PEFT architecture that reparameterizes the update as a Hadamard product of two independently learnable low-rank matrices. In contrast to prior work, ABBA fully decouples the update from the pre-trained weights, enabling both components to be optimized freely. This leads to significantly higher expressivity under the same parameter budget. We formally analyze ABBA's expressive capacity and validate its advantages through matrix reconstruction experiments. Empirically, ABBA achieves state-of-the-art results on arithmetic and commonsense reasoning benchmarks, consistently outperforming existing PEFT methods by a significant margin across multiple models.
While ABBA is clearly parameter-efficient, analyzing its memory footprint during training is more subtle. In LoRA, the update \( \Delta W = BA \) is applied as \( \Delta W x = B (A x) \), so all intermediate computations remain low-rank: beyond the adapter weights themselves, only the activation \( A x \in \mathbb{R}^r \) needs to be stored, and the full \( m \times n \) matrix \( BA \) is never materialized.
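As a point of reference, here is a minimal PyTorch sketch of this low-rank application (dimensions and variable names are illustrative, not taken from any released implementation):

```python
import torch

m, n, r = 512, 256, 8          # output dim, input dim, LoRA rank
W = torch.randn(m, n)          # frozen pre-trained weight
A = torch.randn(r, n) * 0.01   # trainable LoRA factor (r x n)
B = torch.zeros(m, r)          # trainable LoRA factor (m x r), zero-initialized
x = torch.randn(n)

# LoRA forward: apply B(Ax) right-to-left so the m x n matrix BA is never formed.
# The only extra activation kept for the backward pass is A @ x, of size r.
h = W @ x + B @ (A @ x)
```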
In contrast, ABBA’s update \( \Delta W = (B_1 A_1) \odot (B_2 A_2) \) poses a challenge. A naive implementation would construct both \( B_1 A_1 \) and \( B_2 A_2 \) and then take their elementwise product, storing multiple full \( m \times n \) matrices. Moreover, unlike in LoRA, the Hadamard product does not distribute over matrix–vector multiplication: computing \( B_2 (A_2 x) \) first gives no way to fold in the remaining factor \( B_1 A_1 \) without materializing it.
To address this memory bottleneck, we use the theorem above to rewrite ABBA in a LoRA-like form: define \( B_{\text{kr}} = B_1 \odot_r B_2 \) and \( A_{\text{kr}} = (A_1^\top \odot_r A_2^\top)^\top \), where \( \odot_r \) denotes the row-wise Khatri–Rao product. Since \( (B_1 A_1) \odot (B_2 A_2) = B_{\text{kr}} A_{\text{kr}} \) exactly, the update can be applied as \( \Delta W x = B_{\text{kr}} (A_{\text{kr}} x) \), avoiding any full-rank matrix construction.
This formulation allows ABBA to match LoRA’s compute and memory efficiency while offering significantly higher expressivity.
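To make the reformulation concrete, the sketch below is a minimal PyTorch illustration (the dimensions and the `rowwise_khatri_rao` helper are our own naming, not the paper's code). It checks the factorization against the naive Hadamard construction and then applies the update without forming any \( m \times n \) matrix:

```python
import torch

def rowwise_khatri_rao(M1, M2):
    """Row-wise Khatri-Rao product: row i of the output is kron(M1[i], M2[i])."""
    rows, c1 = M1.shape
    _, c2 = M2.shape
    return (M1.unsqueeze(2) * M2.unsqueeze(1)).reshape(rows, c1 * c2)

m, n, r1, r2 = 64, 48, 4, 4
B1, A1 = torch.randn(m, r1), torch.randn(r1, n)
B2, A2 = torch.randn(m, r2), torch.randn(r2, n)

# Naive ABBA update: materializes two full m x n matrices and their product.
delta_naive = (B1 @ A1) * (B2 @ A2)

# LoRA-like rewrite: the inner dimension becomes r1 * r2.
B_kr = rowwise_khatri_rao(B1, B2)          # shape (m, r1 * r2)
A_kr = rowwise_khatri_rao(A1.T, A2.T).T    # shape (r1 * r2, n)
assert torch.allclose(delta_naive, B_kr @ A_kr, atol=1e-5)

# Memory-efficient forward pass: never builds an m x n update matrix.
x = torch.randn(n)
y = B_kr @ (A_kr @ x)
```

The assertion confirms that \( B_{\text{kr}} A_{\text{kr}} \) reproduces \( (B_1 A_1) \odot (B_2 A_2) \), while the forward pass only ever materializes an \( r_1 r_2 \)-dimensional intermediate activation, mirroring LoRA's memory profile.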
@misc{singhal2025abbahighlyexpressivehadamard,
  title={ABBA: Highly Expressive Hadamard Product Adaptation for Large Language Models},
  author={Raghav Singhal and Kaustubh Ponkshe and Rohit Vartak and Praneeth Vepakomma},
  year={2025},
  eprint={2505.14238},
  archivePrefix={arXiv},
  primaryClass={cs.CL},
  url={https://arxiv.org/abs/2505.14238},
}