Reducing server load with efficient tuning of large language models
UMD engineers develop a streamlined method for tuning large language models that cuts energy use and boosts performance.

The development and deployment of large language models (LLMs) require vast amounts of energy. Communities where data centers are located have seen strains on their electric supply and spikes in energy prices.
Assistant Professor of Electrical and Computer Engineering Sanghamitra Dutta is devising more efficient ways to design language models in order to save energy and precious natural resources. Her work is applicable to models customized for specific tasks, such as a bank’s chatbot that provides customer service. In contrast to general-purpose LLMs like ChatGPT, customized models are generally deployed in environments with more constrained storage and memory capabilities.

In 2025, Dutta was featured in the list of 100 Brilliant Women in AI Ethics™.
One project involves knowledge distillation, the process of training a smaller language model (called a student) from a larger model (called a teacher). Dutta’s work makes the knowledge distillation process more efficient by using a strategic set of data points to tune the model.
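The article does not detail Dutta's exact training procedure; the sketch below only illustrates standard knowledge distillation, where a student model is trained to match a teacher model's softened output distribution. The models, batch, and optimizer shown are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Soft-label loss: the student learns to match the teacher's
    softened output distribution (classic knowledge distillation)."""
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # KL divergence between teacher and student distributions,
    # scaled by T^2 as in the original distillation formulation.
    return F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature**2

def train_step(student, teacher, batch, optimizer):
    # The teacher is frozen; only the smaller student is updated.
    with torch.no_grad():
        teacher_logits = teacher(batch)
    student_logits = student(batch)
    loss = distillation_loss(student_logits, teacher_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```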
Take, for instance, a feature on a bank’s website that tells a customer whether they would qualify for a loan. Instead of training the model only on pairs of loan applications and their results (acceptance or decline), her method uses contrasting data points called “counterfactuals,” which help the model understand and generate “what-if” scenarios showing how an outcome would have changed under small differences in past conditions.
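As a rough illustration of what a counterfactual pair looks like in this setting (the feature names, values, and the assumed approval threshold here are hypothetical, not taken from Dutta's work):

```python
from dataclasses import dataclass, replace

@dataclass
class LoanApplication:
    income: float        # annual income in dollars
    credit_score: int
    debt_ratio: float
    approved: bool

# An original application and its counterfactual: a minimally changed
# "what-if" version of the same applicant that flips the outcome.
original = LoanApplication(income=42_000, credit_score=610,
                           debt_ratio=0.45, approved=False)

# Same applicant, but with a credit score just above an assumed
# approval threshold -- the small difference that changes the decision.
counterfactual = replace(original, credit_score=660, approved=True)

# Training on contrasting pairs like this gives the student model a
# direct signal about which changes move an outcome across the decision
# boundary, rather than relying on many independent examples.
training_pair = (original, counterfactual)
```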
By using counterfactuals, Dutta’s method cuts the total number of data points needed for training in half while also improving performance. Less training time means less energy used: “We can train these student models much faster, and the student models are more faithful to the teacher models,” Dutta says. “It’s a win-win situation.”