Copyright © 2026
Adaptive ML, Inc.
All rights reserved

We’ve been focused on making Adaptive Engine a more powerful post-training platform for deploying specialized agents. This release improves control over outputs, increases visibility into training, and expands how teams evaluate and promote models.
Function graders and constrained decoding tighten how teams define evaluation logic and enforce output format. Checkpoint promotion gives teams more control over which checkpoints reach production. A new Monitoring tab brings more training observability into the platform.
Here's what's new.
As teams build more specialized agents, evaluation becomes more central to training quality.
Custom Python graders are now reusable objects in Adaptive Engine. Create, test, and manage them via UI or SDK and reuse them across RL and evaluation recipes without modifying code.
Function graders provide a deterministic alternative to LLM-based evaluation for cases where correctness is structural or rule-based. Instead of relying on an LLM judge, you define evaluation logic directly in Python and validate it against a live sample of your dataset before attaching it to an RL recipe.
Each grader runs in an isolated sandbox, making it safe to reuse across recipes. Full CRUD is available in the Python SDK for programmatic management.
This improves evaluation consistency for tasks with explicit correctness criteria. Learn more about our function graders in our docs.
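As a minimal sketch of what a rule-based grader can look like, the snippet below scores a completion on structural correctness alone: valid JSON with a `label` field from a known set. The function signature and field names are illustrative assumptions, not the Adaptive Engine grader API.

```python
import json

# Hypothetical function grader: deterministic, rule-based scoring in plain
# Python, with no LLM judge involved. Signature and fields are assumptions.
def grade(completion: str) -> float:
    """Return 1.0 if the completion is valid JSON with a 'label' field
    from a known set, else 0.0 — a purely structural correctness check."""
    try:
        obj = json.loads(completion)
    except json.JSONDecodeError:
        return 0.0
    return 1.0 if obj.get("label") in {"positive", "negative", "neutral"} else 0.0

print(grade('{"label": "positive"}'))  # 1.0
print(grade('not json'))               # 0.0
```

Because the logic is deterministic, the same completion always receives the same score, which is what makes this kind of grader safe to reuse across RL and evaluation recipes.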
Debugging training runs often requires stitching together metrics across multiple tools.
The Monitoring tab centralizes training telemetry across runs, including loss curves, reward signals, and live metrics, with side-by-side run comparisons.
It consolidates fine-tuning and RL observability into a single interface, reducing reliance on external dashboards and manual debugging workflows.
Monitoring reduces iteration time and removes visibility gaps during training.

Model outputs are often difficult to parse reliably.
Chat completions now support a JSON Schema or a Pydantic model as the response_format parameter. Invalid tokens are excluded at each step, keeping generation within the schema.
The constraint is enforced during decoding by restricting token vocabulary based on the current parse state.
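A toy sketch of that mechanism, with a made-up character-level vocabulary: at each decoding step, only tokens that keep the output a valid prefix of an allowed value survive the mask.

```python
# Toy illustration of constrained decoding. Vocabulary and allowed values
# are invented for the example; real implementations mask a model's full
# token vocabulary against a JSON Schema or grammar parse state.
ALLOWED_VALUES = ["true", "false", "null"]
VOCAB = ["t", "r", "u", "e", "f", "a", "l", "s", "n", "x", "{", "}"]

def allowed_tokens(prefix: str) -> list[str]:
    """Return the subset of VOCAB that extends `prefix` toward an allowed value."""
    return [
        tok for tok in VOCAB
        if any(v.startswith(prefix + tok) for v in ALLOWED_VALUES)
    ]

# With an empty prefix, only first characters of valid values are allowed:
print(allowed_tokens(""))    # ['t', 'f', 'n']
# After emitting "tr", only "u" keeps generation on track toward "true":
print(allowed_tokens("tr"))  # ['u']
```

The real constraint tracks a parse state rather than a string prefix, but the effect is the same: the model can never emit a token that takes generation outside the schema.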
This works with models deployed on Harmony and external providers including OpenAI, Anthropic, and Gemini. Available in the Python SDK, REST API, and in-product chat with presets for classification, entity extraction, and simple object output.
For agentic systems where model outputs are passed between steps as structured data, this turns a parsing assumption into a typed interface and removes an entire class of parsing errors in downstream systems.
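For a sense of what a schema for an extraction-style task looks like, here is a minimal JSON Schema of the kind that could be passed as response_format; the exact parameter shape each provider expects is an assumption here, not a documented contract.

```python
import json

# A minimal JSON Schema for a classification-style output. Field names and
# the enum values are illustrative, not part of any product API.
schema = {
    "type": "object",
    "properties": {
        "label": {"type": "string", "enum": ["positive", "negative", "neutral"]},
        "confidence": {"type": "number"},
    },
    "required": ["label"],
}

# A schema-conforming completion parses cleanly into typed data:
sample = '{"label": "positive", "confidence": 0.93}'
parsed = json.loads(sample)
print(parsed["label"] in schema["properties"]["label"]["enum"])  # True
```

With decoding constrained to this schema, the downstream step can rely on `label` always being present and always being one of the enumerated values.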
The best-performing model is not always the final checkpoint.
Training runs now save model state at configurable intervals. Any checkpoint can be promoted to a standalone model in the registry for evaluation or deployment.
Intermediate checkpoints sometimes outperform the final one on held-out evaluations. Promotion allows direct comparison and shipping of the best-performing checkpoint. Each promoted model retains full lineage to its source run.
A promoted LoRA checkpoint automatically binds to its backbone model. Interrupted runs can resume from the last saved checkpoint rather than starting over. Multi-stage runs such as SFT, PPO, and GRPO track and resume each stage independently.
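The selection step itself is simple to picture: compare checkpoints on a held-out score and promote the argmax rather than defaulting to the final one. Checkpoint names and scores below are made up for illustration.

```python
# Toy illustration: the best checkpoint by held-out evaluation is not
# necessarily the last one saved. Names and scores are invented.
checkpoints = {
    "step-1000": 0.71,
    "step-2000": 0.78,
    "step-3000": 0.74,  # final checkpoint underperforms the middle one here
}

best = max(checkpoints, key=checkpoints.get)
print(best)  # step-2000
```

In Adaptive Engine, that chosen checkpoint is then promoted to a standalone model in the registry, keeping full lineage back to its source run.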
For teams building specialized agents with post-training, we'll continue to publish additional resources.