publications
list of publications
2025
- Fed-SB: A Silver Bullet for Extreme Communication Efficiency and Performance in (Private) Federated LoRA Fine-Tuning. Raghav Singhal, Kaustubh Ponkshe, Rohit Vartak, and 2 more authors. 2025
Low-Rank Adaptation (LoRA) has become ubiquitous for efficiently fine-tuning foundation models. However, federated fine-tuning using LoRA is challenging due to suboptimal updates arising from traditional federated averaging of individual adapters. Existing solutions either incur prohibitively high communication cost that scales linearly with the number of clients or suffer from performance degradation due to limited expressivity. We introduce Federated Silver Bullet (Fed-SB), a novel approach for federated fine-tuning of LLMs using LoRA-SB, a recently proposed low-rank adaptation method. LoRA-SB optimally aligns the optimization trajectory with the ideal low-rank full fine-tuning projection by learning a small square matrix (R) between adapters B and A, keeping other components fixed. Direct averaging of R guarantees exact updates, substantially reducing communication cost, which remains independent of the number of clients, and enables scalability. Fed-SB achieves state-of-the-art performance across commonsense reasoning, arithmetic reasoning, and language inference tasks while reducing communication costs by up to 230x. In private settings, Fed-SB further improves performance by (1) reducing trainable parameters, thereby lowering the noise required for differential privacy and (2) avoiding noise amplification introduced by other methods. Overall, Fed-SB establishes a new Pareto frontier in the tradeoff between communication and performance, offering an efficient and scalable solution for both private and non-private federated fine-tuning.
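The claim that directly averaging R gives exact updates follows from linearity, and a toy check may make it concrete. The sketch below is not the authors' implementation: the matrix values, the shapes, and the helper names `matmul` and `average` are all made up for illustration. It compares the ideal aggregate (the mean of each client's full update B R_i A) with the update built from the averaged R, and then contrasts this with averaging separate LoRA factors B_i and A_i.

```python
from fractions import Fraction


def matmul(X, Y):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def average(mats):
    """Element-wise mean of equally shaped matrices, kept exact with Fraction."""
    n = len(mats)
    rows, cols = len(mats[0]), len(mats[0][0])
    return [[sum(Fraction(m[i][j]) for m in mats) / n for j in range(cols)]
            for i in range(rows)]


# Shared, frozen adapters (hypothetical values): B is 3x2, A is 2x3.
B = [[1, 0], [2, 1], [0, 3]]
A = [[1, 2, 0], [0, 1, 1]]

# Each of three clients trains only its own 2x2 matrix R (hypothetical values).
client_Rs = [[[1, 2], [3, 4]], [[0, 1], [1, 0]], [[5, 0], [2, 2]]]

# Ideal aggregate: the mean of every client's full update B @ R_i @ A.
ideal_update = average([matmul(matmul(B, R), A) for R in client_Rs])
# Server-side aggregate: average the small R matrices, then form the update once.
averaged_R_update = matmul(matmul(B, average(client_Rs)), A)
r_averaging_is_exact = ideal_update == averaged_R_update

# Contrast: two clients with their own LoRA factors, averaged factor by factor.
client_Bs = [[[1], [0]], [[0], [1]]]
client_As = [[[1, 0]], [[0, 1]]]
ideal_lora = average([matmul(Bi, Ai) for Bi, Ai in zip(client_Bs, client_As)])
naive_lora = matmul(average(client_Bs), average(client_As))
factor_averaging_is_exact = ideal_lora == naive_lora

print(r_averaging_is_exact, factor_averaging_is_exact)
```

In this toy setting each client only needs to send the four entries of its R, whereas the factor-averaged LoRA update differs from the mean of the per-client products.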
- ABBA: Highly Expressive Hadamard Product Adaptation for Large Language Models. Raghav Singhal, Kaustubh Ponkshe, Rohit Vartak, and 1 more author. 2025
Large Language Models have demonstrated strong performance across a wide range of tasks, but adapting them efficiently to new domains remains a key challenge. Parameter-Efficient Fine-Tuning (PEFT) methods address this by introducing lightweight, trainable modules while keeping most pre-trained weights fixed. The prevailing approach, LoRA, models updates using a low-rank decomposition, but its expressivity is inherently constrained by the rank. Recent methods like HiRA aim to increase expressivity by incorporating a Hadamard product with the frozen weights, but still rely on the structure of the pre-trained model. We introduce *ABBA*, a new PEFT architecture that reparameterizes the update as a Hadamard product of two independently learnable low-rank matrices. In contrast to prior work, ABBA fully decouples the update from the pre-trained weights, enabling both components to be optimized freely. This leads to significantly higher expressivity under the same parameter budget. We formally analyze ABBA’s expressive capacity and validate its advantages through matrix reconstruction experiments. Empirically, ABBA achieves state-of-the-art results on arithmetic and commonsense reasoning benchmarks, consistently outperforming existing PEFT methods by a significant margin across multiple models.
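To illustrate why a Hadamard product of two low-rank matrices can be more expressive than either factor, the sketch below builds a toy update of the form (B1 A1) ⊙ (B2 A2). The matrix values and the helpers `matmul`, `hadamard`, and `rank` are assumptions chosen for illustration, not taken from the paper. In general the rank of a Hadamard product is at most the product of the factors' ranks; here each factor has rank 2, while the 3x3 product is full rank.

```python
from fractions import Fraction


def matmul(X, Y):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def hadamard(X, Y):
    """Element-wise (Hadamard) product of two equally shaped matrices."""
    return [[a * b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]


def rank(M):
    """Matrix rank via exact Gaussian elimination over Fractions."""
    rows = [[Fraction(v) for v in row] for row in M]
    found = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
        if found == len(rows):
            break
    return found


# Two independent rank-2 factor pairs (hypothetical values), each 3x2 times 2x3.
B1, A1 = [[0, 1], [1, 1], [2, 1]], [[1, 1, 1], [0, 1, 2]]
B2, A2 = [[0, 1], [1, 1], [2, 1]], [[1, 1, 1], [1, 2, 3]]

low_rank_1 = matmul(B1, A1)  # entries i + j
low_rank_2 = matmul(B2, A2)  # entries i + j + 1
delta_W = hadamard(low_rank_1, low_rank_2)  # toy Hadamard-product update

print(rank(low_rank_1), rank(low_rank_2), rank(delta_W))
```

This toy case only shows that the element-wise product can exceed the rank of each factor; it does not reproduce the paper's same-parameter-budget comparison with LoRA.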
- "What’s Up, Doc?": Analyzing How Users Seek Health Information in Large-Scale Conversational AI Datasets. Akshay Paruchuri, Maryam Aziz, Rohit Vartak, and 5 more authors. 2025
People are increasingly seeking healthcare information from large language models (LLMs) via interactive chatbots, yet the nature and inherent risks of these conversations remain largely unexplored. In this paper, we filter large-scale conversational AI datasets to obtain HealthChat-11K, a curated dataset of 11K real-world conversations composed of 25K user messages. We use HealthChat-11K and a clinician-driven taxonomy for how users interact with LLMs when seeking healthcare information in order to systematically study user interactions across 21 distinct health specialties. Our analysis reveals insights into the nature of how and why users seek health information, such as common interactions, instances of incomplete context, affective behaviors, and interactions (e.g., leading questions) that can induce sycophancy, underscoring the need for improvements in the healthcare support capabilities of LLMs deployed as conversational AI.
2023
- Robustness to Variability and Asymmetry of In-Memory On-Chip Training. Rohit K. Vartak, Vivek Saraswat, and Udayan Ganguly. In Artificial Neural Networks and Machine Learning – ICANN 2023, 2023
In-memory on-chip learning is crucial for low-power, in-field training capabilities at the edge. We demonstrate the robustness of on-chip back-propagation to hardware variability in terms of bit-cell transistor $V_T$ variability ($2.5\times$ more robust than off-chip training). We use perturbation schemes, asymmetry variations and variability-aware update schemes to identify the relative contribution of different on-chip operations: forward pass, backward pass and weight updates to Fashion-MNIST classification performance degradation with variations. It is revealed that variability during the weight update step is crucial while accuracy of backward pass or gradient calculation is not critical. We promote weight perturbation scheme over back-propagation as the choice for on-chip in-memory training with reduced points of failure and low cost of hardware.