Copyright © 2026
Adaptive ML, Inc.
All rights reserved

How AT&T Saves Millions with Specialized Language Models

AT&T serves over 100 million customers across the United States, and behind every support call, fraud flag, and contract review sits a growing AI infrastructure. But as they scaled, a familiar pattern emerged. The general-purpose LLMs they relied on were powerful, but expensive, inflexible, and increasingly hard to optimize for the specific tasks the business actually needed.
AT&T needed a different approach. One built around specialization rather than scale. Adaptive Engine’s reinforcement learning helps give them autonomy and control over what is becoming a critical intelligence, value and performance layer.
What follows is the story of how AT&T and Adaptive ML worked together, and how specialization plays out across some of the most demanding AI workloads in telecommunications.
AT&T processes hundreds of thousands of call center transcripts every day. Each one must be summarized accurately while removing Personally Identifiable Information (PII), flagging regulatory references, detecting fee disclosures, standardizing formatting, and translating for bilingual support across English and Spanish.
They'd built this pipeline using GPT-4o mini with an extensive system prompt, essentially packing an entire rulebook into a single set of instructions. But the prompt had grown unwieldy, further engineering wasn't yielding gains, and every requirement change triggered a manual QA cycle. Meanwhile, OpenAI PTU costs at that volume were becoming a serious constraint. The core issue wasn't that GPT-4o mini was a bad model. It's that AT&T was using a general-purpose tool for an extremely specific job.
Adaptive ML, with its RLOps platform that helps bridge the 'last mile' from generalist AI to autonomous production, fine-tuned a Gemma 12B multilingual model purpose-built for this workload. The team generated high-quality synthetic training data covering the full range of real-world scenarios, with each training cycle incorporating feedback loops across six to seven evaluation criteria. Reinforcement learning was central to precision: the model was trained to summarize correctly against AT&T's specific rubric, rewarding exact behaviors like accurate PII removal and regulatory flagging, rather than relying on general instruction-following.
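To make the idea of rewarding exact behaviors concrete, here is a minimal Python sketch of a rubric-style reward. It is an illustration under stated assumptions, not AT&T's rubric or Adaptive Engine's API: the three criteria, their weights, the regex checks, and every function name are hypothetical stand-ins for the six to seven evaluation criteria mentioned above.

```python
import re

# Hypothetical checks; a production rubric would be far richer than these.
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")


def pii_removed(summary: str) -> float:
    """1.0 if no phone number or email address survives in the summary."""
    return 0.0 if PHONE.search(summary) or EMAIL.search(summary) else 1.0


def fee_disclosed(summary: str, transcript: str) -> float:
    """If the transcript mentions a fee, the summary must mention it too."""
    if "fee" not in transcript.lower():
        return 1.0
    return 1.0 if "fee" in summary.lower() else 0.0


def within_length(summary: str, max_words: int = 80) -> float:
    """1.0 if the summary stays within the (assumed) word budget."""
    return 1.0 if len(summary.split()) <= max_words else 0.0


def rubric_reward(summary: str, transcript: str) -> float:
    """Weighted average of criterion scores, usable as a scalar reward."""
    scored = [
        (0.5, pii_removed(summary)),                # assumed weight
        (0.3, fee_disclosed(summary, transcript)),  # assumed weight
        (0.2, within_length(summary)),              # assumed weight
    ]
    total_weight = sum(weight for weight, _ in scored)
    return sum(weight * score for weight, score in scored) / total_weight


transcript = "Customer at 555-123-4567 asked about the late fee."
good = "Customer asked about a late fee; the agent explained it."
bad = "Call 555-123-4567 for details."
good_reward = rubric_reward(good, transcript)
bad_reward = rubric_reward(bad, transcript)
```

In this sketch the compliant summary scores higher than the one that leaks a phone number and omits the fee, which is the kind of signal a reinforcement learning loop can optimize against.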
A separate objective layer was trained specifically for regulatory compliance detection, and an early pivot from Llama to Gemma's multilingual variant solved bilingual performance issues, illustrating a key advantage of the specialized approach: when something isn't working, you test the model, not just the prompt.
Across nearly a million daily transcripts, even a half-percent accuracy gain means thousands of additional correctly handled cases every day. This isn't a marginal improvement. It's a fundamentally different operating model for one of the largest call summarization pipelines in the world.
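The arithmetic behind that claim is easy to check. The sketch below assumes a round figure of one million transcripts per day as an upper bound on "nearly a million"; the function name and the basis-point convention are illustrative choices.

```python
def additional_correct_cases(daily_transcripts: int, gain_basis_points: int) -> int:
    """Extra correctly handled cases per day for a given accuracy gain.

    The gain is given in basis points (50 bps = 0.5 percentage points)
    so the calculation stays in exact integer arithmetic.
    """
    return daily_transcripts * gain_basis_points // 10_000


# Assumed volume: 1,000,000 transcripts per day.
extra_at_one_million = additional_correct_cases(1_000_000, 50)  # 5000
# A more conservative assumed volume of 900,000 per day.
extra_at_900k = additional_correct_cases(900_000, 50)  # 4500
```

Either way, a half-percent gain translates to several thousand additional correctly handled transcripts each day.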
That same philosophy extends naturally into privacy and security. AT&T's engine read column titles to infer what data each column supposedly held, but the values within those columns were not fully checked.
It's a classic paradox: to protect the data, you first have to look at it. And for certain high-security scenarios, sensitive data needed to be blocked before any system could access the values at all.
Adaptive ML trained a 1B parameter Llama 3.2 model to classify when PII or other sensitive data appears and to flag it. The model was trained entirely on synthetic data. The benefit is clear: a 6% improvement in accuracy over previous classification methods, which means more sensitive data is protected.
At just 1 billion parameters, this tiny model outperforms both a mature rule-based system and a frontier LLM. For well-defined, domain-specific tasks, a carefully fine-tuned small model isn't just cheaper. It's better, going beyond frontier performance.
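The gap between trusting column titles and checking column values can be shown with a small Python sketch. Everything here is a hypothetical illustration: the title list, the SSN-shaped regex, and the function names are assumptions, and the regex merely stands in for a learned classifier such as the fine-tuned model described above.

```python
import re
from typing import Callable, Dict, List

SENSITIVE_TITLES = {"ssn", "email", "phone"}  # hypothetical title list
SSN_SHAPED = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def title_says_sensitive(title: str) -> bool:
    """Title-only check: trusts what the column claims to contain."""
    return title.strip().lower() in SENSITIVE_TITLES


def regex_value_classifier(value: str) -> bool:
    """Stand-in for a learned classifier: flags SSN-shaped values."""
    return bool(SSN_SHAPED.search(value))


def flag_columns(
    table: Dict[str, List[str]],
    classify: Callable[[str], bool] = regex_value_classifier,
) -> Dict[str, bool]:
    """Flag a column if its title or any of its values looks sensitive."""
    return {
        title: title_says_sensitive(title) or any(classify(v) for v in values)
        for title, values in table.items()
    }


table = {
    "ssn": ["123-45-6789"],
    "notes": ["customer gave 987-65-4321 over the phone", "no issues"],
    "plan": ["unlimited", "prepaid"],
}
title_only = {title: title_says_sensitive(title) for title in table}
with_values = flag_columns(table)
```

A title-only pass misses the innocuously named "notes" column, while the value-level pass flags it. Swapping `classify` for a call to a trained model would keep the same structure.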
If data classification shows what specialization can do for security, AT&T's fraud detection workflow shows what it can do for people.
Credit and fraud analysts spend roughly six minutes per case, parsing complex data, cross-referencing multiple systems, and conducting external verification. Rather than fully automating these decisions, Adaptive ML fine-tuned a model that acts as an intelligent first pass: surfacing likely issues, highlighting patterns, and providing a preliminary verdict with reasoning. The analyst's initial role shifts from investigator to reviewer, focusing their expertise on judgment calls rather than data retrieval.
The goal isn't to replace human expertise, but to amplify it.
AT&T and Adaptive ML are now actively developing several additional applications. Document Review applies fine-tuned open models to document processing, targeting approximately 66% cost reduction at enterprise volume. Summary Response will feed the summarization model's output into a dedicated system to generate personalized, context-aware follow-ups, replacing today's generic templates with no additional agent effort. Engagement Review will use a fine-tuned specialized model to review calls and check the quality of the customer engagement, enabling business analytics and decision-making across multiple downstream use cases.
Each initiative follows the same principle: identify a high-volume workflow that runs on a general-purpose frontier model, and replace that model with a specialized one that does exactly what's needed and does it exceptionally well, often faster, cheaper, and more accurately.
The work with AT&T demonstrates a pattern we believe will define the next phase of enterprise AI. Specialized models are outperforming frontier LLMs on domain-specific tasks, not by small margins, but by meaningful gaps, at a fraction of the cost, on models small enough to run on infrastructure the enterprise can own and control.
Adaptive Engine is designed to meet clients wherever their security requirements demand: on-premises, in secure cloud environments, or across multi-region clusters. Better GPU utilization, flexible deployment, and full control over the training and inference pipeline are the realities that determine whether AI actually makes it into production.
We're optimistic about what comes next, for AT&T and for every enterprise sitting on domain-specific workflows that deserve better than a generic model and a long system prompt. If you're exploring how specialized language models could transform your AI workloads, we'd love to show you firsthand.
Book a demo to see how we can help take your AI from ideas to production.