How AT&T Saves Millions with Specialized Language Models

Adaptive ML has been acquired by Datadog


May 27, 2026

Product


AT&T serves over 100 million customers across the United States, and behind every support call, fraud flag, and contract review sits a growing AI infrastructure. But as they scaled, a familiar pattern emerged. The general-purpose LLMs they relied on were powerful, but expensive, inflexible, and increasingly hard to optimize for the specific tasks the business actually needed.

AT&T needed a different approach: one built around specialization rather than scale. Adaptive Engine’s reinforcement learning helps give them autonomy and control over what is becoming a critical layer of intelligence, value, and performance.

What follows is the story of how AT&T and Adaptive ML worked together, and how that specialization plays out across some of the most demanding AI workloads in telecommunications.

AT&T scaled AI across high-volume telecom workflows and found general-purpose LLMs were powerful but increasingly expensive, inflexible, and difficult to optimize as requirements changed.
Working with Adaptive ML, AT&T shifted to specialized, fine-tuned models (reinforcement learning and synthetic data) that improved performance and iteration speed across use cases—e.g., call summarization at ~900K transcripts/day (30% faster and +0.5% accuracy), sensitive-data classification with zero data exposure (+6% vs heuristics; +17% vs GPT-4o), and fraud-review assistance (up to 12× faster at GPT-4o–level accuracy).
Beyond operations, AT&T helped push industry benchmarks (GSMA TeleLogs) where fine-tuned small models reached top results (up to 90% accuracy), and the roadmap expands the same playbook to document review, personalized follow-ups, and engagement quality review—delivering enterprise control with lower cost and better domain performance.

Turning 900,000 Daily Calls into Actionable Intelligence

Every day, AT&T processes roughly 900,000 call center transcripts. Each one must be summarized accurately while removing Personally Identifiable Information (PII), flagging regulatory references, detecting fee disclosures, standardizing formatting, and translating for bilingual support across English and Spanish.

They'd built this pipeline using GPT-4o mini with an extensive system prompt, essentially packing an entire rulebook into a single set of instructions. But the prompt had grown unwieldy, further prompt engineering wasn't yielding gains, and every requirement change triggered a manual QA cycle. Meanwhile, OpenAI PTU costs at that volume were becoming a serious constraint. The core issue wasn't that GPT-4o mini was a bad model. It was that AT&T was using a general-purpose tool for an extremely specific job.

Adaptive ML, with its RLOps platform that helps bridge the 'last mile' from generalist AI to autonomous production, fine-tuned a Gemma 12B multilingual model purpose-built for this workload. The team generated high-quality synthetic training data covering the full range of real-world scenarios, with each training cycle incorporating feedback loops across six to seven evaluation criteria. Reinforcement learning was central to precision: the model was trained to summarize correctly against AT&T's specific rubric, rewarding exact behaviors like accurate PII removal and regulatory flagging, rather than relying on general instruction-following.

A separate objective layer was trained specifically for regulatory compliance detection, and an early pivot from Llama to Gemma's multilingual variant solved bilingual performance issues, illustrating a key advantage of the specialized approach: when something isn't working, you test the model, not just the prompt.
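As a rough illustration of what rewarding exact behaviors against a rubric can look like, the sketch below scores a candidate summary against three hypothetical criteria and combines them into a single scalar reward. The criteria, weights, regular expressions, and function names are all assumptions made for this example; they are not AT&T's rubric or Adaptive Engine's API, and a production rubric would cover the full six to seven criteria with far more robust checks.

```python
import re
from typing import Callable, Dict, Tuple

# Hypothetical patterns; real PII detection would be far more thorough.
PHONE = re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")
EMAIL = re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b")


def no_pii(transcript: str, summary: str) -> float:
    """1.0 if the summary contains no phone number or email address."""
    return 0.0 if PHONE.search(summary) or EMAIL.search(summary) else 1.0


def fee_disclosed(transcript: str, summary: str) -> float:
    """If the transcript mentions a fee, the summary must mention it too."""
    if "fee" not in transcript.lower():
        return 1.0
    return 1.0 if "fee" in summary.lower() else 0.0


def concise(transcript: str, summary: str, max_words: int = 60) -> float:
    """1.0 if the summary stays within a (hypothetical) word budget."""
    return 1.0 if len(summary.split()) <= max_words else 0.0


# Each criterion maps to (check function, weight). Weights sum to 1.0.
Check = Callable[[str, str], float]
CRITERIA: Dict[str, Tuple[Check, float]] = {
    "no_pii": (no_pii, 0.5),
    "fee_disclosed": (fee_disclosed, 0.25),
    "concise": (concise, 0.25),
}


def rubric_reward(transcript: str, summary: str) -> float:
    """Weighted sum of per-criterion scores, in the range [0.0, 1.0]."""
    return sum(weight * check(transcript, summary)
               for check, weight in CRITERIA.values())


transcript = "Customer called from 555-123-4567 about a late fee."
good = "Customer asked about a late fee on the account."
bad = "Customer at 555-123-4567 asked about billing."

print(rubric_reward(transcript, good))  # 1.0
print(rubric_reward(transcript, bad))   # 0.25
```

In a reinforcement learning loop, a scalar like this would be computed for each sampled summary and used as the training signal, so the model is pushed toward the specific behaviors the rubric names rather than toward generic instruction-following.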

Metric	Result
Inference speed	30% faster than GPT-4o mini
Accuracy	+0.5% over GPT-4o mini
Iteration speed	Near-prompt-level; no manual QA bottleneck

Across nearly a million daily transcripts, even a half-percent accuracy gain means thousands of additional correctly handled cases every day. This isn't a marginal improvement. It's a fundamentally different operating model for one of the largest call summarization pipelines in the world.
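The arithmetic behind "thousands of additional correctly handled cases" is easy to check. The sketch below assumes the +0.5% figure is an absolute gain in percentage points and uses the roughly 900,000 transcripts per day quoted earlier; both that interpretation and the function name are assumptions.

```python
def additional_correct_per_day(daily_transcripts: int, gain_basis_points: int) -> int:
    """Extra correctly handled transcripts per day for a given accuracy gain.

    One basis point is 0.01 percentage points, so +0.5% is 50 basis points.
    Integer arithmetic avoids floating-point rounding.
    """
    return daily_transcripts * gain_basis_points // 10_000


extra = additional_correct_per_day(900_000, 50)
print(extra)  # 4500
```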

Protecting Sensitive Data Without Ever Seeing It

That same philosophy extends naturally into privacy and security. AT&T's existing engine read column titles to infer what data each column was supposed to contain. However, the data within those columns wasn't fully checked.

It's a classic paradox: to protect the data, you first have to look at it. And for certain high-security scenarios, sensitive data needed to be blocked before any system could access the values at all.

Adaptive ML trained a 1B parameter Llama 3.2 model to classify when PII or other sensitive data appears and to flag it. The model was trained entirely on synthetic data. The benefit is clear: a 6% improvement in accuracy over previous classification methods, which means more sensitive data is protected.

Metric	Result
vs. existing heuristics	+6% accuracy
vs. GPT-4o	+17% accuracy
Data exposure	Zero — operates on metadata only

At just 1 billion parameters, this tiny model outperforms both a mature rule-based system and a frontier LLM. For well-defined, domain-specific tasks, a carefully fine-tuned small model isn't just cheaper. It's better, going beyond frontier performance.
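To make "operates on metadata only" concrete, here is a minimal sketch of how a classifier input might be assembled so that no cell values can reach the model. The `ColumnMetadata` fields, the `build_classifier_input` function, and the example table and column names are hypothetical; the source does not describe the actual input format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnMetadata:
    """Describes a column without carrying any of its cell values."""
    table: str
    name: str
    dtype: str
    description: str = ""


def build_classifier_input(meta: ColumnMetadata) -> str:
    """Render metadata as text for a classifier; no row data is accepted here."""
    lines = [
        f"table: {meta.table}",
        f"column: {meta.name}",
        f"type: {meta.dtype}",
    ]
    if meta.description:
        lines.append(f"description: {meta.description}")
    return "\n".join(lines)


# Hypothetical column; only its metadata is ever passed along.
example = ColumnMetadata(table="billing_contacts", name="acct_holder_ssn", dtype="string")
prompt = build_classifier_input(example)
print(prompt)
```

Because the function signature accepts only a metadata record, there is no code path through which stored values could be included in the text handed to the model, which is the property the "zero data exposure" row describes.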

From Minutes to Seconds: AI-Assisted Fraud Detection

If data classification shows what specialization can do for security, AT&T's fraud detection workflow shows what it can do for people.

Credit and fraud analysts spend roughly six minutes per case, parsing complex data, cross-referencing multiple systems, and conducting external verification. Rather than fully automating these decisions, Adaptive ML fine-tuned a model that acts as an intelligent first pass: surfacing likely issues, highlighting patterns, and providing a preliminary verdict with reasoning. The analyst's initial role shifts from investigator to reviewer, focusing their expertise on judgment calls rather than data retrieval.

Metric	Result
Case review time	Up to 12× faster (360 seconds → 30 seconds)
Accuracy	Matches GPT-4o
Analyst capacity	Up to 12× throughput improvement
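The speedup and throughput rows follow from the same two numbers. The sketch below works them through, assuming an analyst spends a full hour on reviews and every case hits the best-case 30 seconds; the table's "up to" wording signals that real-world gains may be smaller. The function name is hypothetical.

```python
SECONDS_PER_HOUR = 3600


def cases_per_hour(seconds_per_case: int) -> int:
    """Whole cases an analyst can review in one uninterrupted hour."""
    return SECONDS_PER_HOUR // seconds_per_case


before = cases_per_hour(360)  # six minutes per case
after = cases_per_hour(30)    # best-case assisted review
speedup = 360 // 30

print(before, after, speedup)  # 10 120 12
```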

The goal isn't to replace human expertise, but to amplify it.

Expanding the Playbook

AT&T and Adaptive ML are now actively developing several additional applications. Document Review will handle document processing with fine-tuned open models, targeting approximately 66% cost reduction at enterprise volume. Summary Response will feed the summarization model's output into a dedicated system to generate personalized, context-aware follow-ups, replacing today's generic templates with no additional agent effort. Engagement Review will use a specialized fine-tuned model to review calls for the quality of customer engagement, enabling business analytics and decision-making across multiple downstream use cases.

Each initiative follows the same principle: identify a high-volume workflow that currently runs on a general-purpose frontier model, then replace that model with a specialized one that does exactly what's needed and does it exceptionally well, often faster, cheaper, and more accurately.

The Road Ahead

The work with AT&T demonstrates a pattern we believe will define the next phase of enterprise AI. Specialized models are outperforming frontier LLMs on domain-specific tasks, not by small margins, but by meaningful gaps, at a fraction of the cost, on models small enough to run on infrastructure the enterprise can own and control.

Adaptive Engine is designed to meet clients wherever their security requirements demand: on-premises, in secure cloud environments, or across multi-region clusters. Better GPU utilization, flexible deployment, and full control over the training and inference pipeline are the realities that determine whether AI actually makes it into production.

We're optimistic about what comes next, for AT&T and for every enterprise sitting on domain-specific workflows that deserve better than a generic model and a long system prompt. If you're exploring how specialized language models could transform your AI workloads, we'd love to show you firsthand.

Book a demo to see how we can help take your AI from ideas to production.
