
May 11, 2026

Introducing Recipes

By Dylan Ebert

A recipe is a Python file that defines an AI workflow on Harmony, Adaptive Engine's compute backend.

Reinforcement learning has many moving parts: training, inference, and grading. These typically run in separate systems, and most of the work in an RL pipeline is coordinating between them.

In Harmony, all of this can live in one file.

A recipe has three pieces:

class MyConfig(InputConfig):
    ...

@recipe_main
async def main(config: MyConfig, ctx: RecipeContext):
    # load resources, run the workflow, save
    ...

Built-in recipes ship with Adaptive Engine: supervised fine-tuning, RL with a grader, RL on preferences, evaluation, speculative decoding draft alignment. Each wraps training, data loading, and monitoring into one workflow. Custom recipes use the same primitives.

InputConfig

The recipe's inputs, declared as a Pydantic model.

class SummarizationGRPO(InputConfig):
    model: Annotated[Model[model_kinds.Trainable], Field(description="Policy")]
    dataset: Annotated[Dataset[dataset_kinds.Prompt], Field(description="Prompts")]
    grader: Annotated[Grader, Field(description="Reward")]
    learning_rate: float = 7.5e-7

Model, Dataset, and Grader are engine-aware: they reference models, datasets, and graders registered on Adaptive Engine.
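As a rough, self-contained sketch of the pattern (plain Pydantic, with string IDs standing in for the engine-aware Model, Dataset, and Grader reference types, which are part of Adaptive Engine and not reproduced here):

```python
from typing import Annotated

from pydantic import BaseModel, Field


# Hypothetical stand-in: the real InputConfig comes from Adaptive Engine.
class InputConfig(BaseModel):
    pass


class SummarizationGRPO(InputConfig):
    # String IDs stand in for the engine-aware references in this sketch.
    model: Annotated[str, Field(description="Policy")]
    dataset: Annotated[str, Field(description="Prompts")]
    grader: Annotated[str, Field(description="Reward")]
    learning_rate: float = 7.5e-7


# Pydantic validates plain dicts (e.g. parsed CLI or JSON input) and
# fills in defaults for anything omitted.
cfg = SummarizationGRPO(model="llama-3-8b", dataset="prompts-v2", grader="rm-judge")
print(cfg.learning_rate)  # 7.5e-07
```

Because the config is a Pydantic model, a malformed input fails validation before the workflow ever touches a GPU.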

@recipe_main

Marks the recipe's async entrypoint.

@recipe_main
async def main(config: SummarizationGRPO, ctx: RecipeContext):
    ...

Harmony parses input arguments, instantiates the typed config, and calls the function.
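Conceptually, the decorator's job is: read raw inputs, validate them into the typed config declared on the entrypoint, and drive the coroutine. A hypothetical sketch of that dispatch (not Harmony's actual implementation; the config class here is a plain stand-in):

```python
import asyncio
import inspect
import json


class RecipeContext:
    """Hypothetical stand-in for Harmony's context object."""


def recipe_main(fn):
    # Sketch: resolve the config type from the entrypoint's annotation,
    # build it from raw JSON, and run the async function to completion.
    def run(raw_json: str):
        config_cls = inspect.signature(fn).parameters["config"].annotation
        config = config_cls(**json.loads(raw_json))
        return asyncio.run(fn(config, RecipeContext()))

    fn.run = run
    return fn


class MyConfig:
    def __init__(self, learning_rate: float = 7.5e-7):
        self.learning_rate = learning_rate


@recipe_main
async def main(config: MyConfig, ctx: RecipeContext):
    return config.learning_rate


print(main.run('{"learning_rate": 1e-06}'))  # 1e-06
```

The point of the pattern is that the entrypoint's type annotation is the single source of truth: the same declaration that documents the inputs also validates them.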

RecipeContext

The recipe's connection to Harmony.

dataset = await config.dataset.load(ctx)
grader = await config.grader.load(ctx)
policy = await config.model.spawn_train("policy", ctx, max_batch_size=10000, tp=4)
reference = await policy.clone_inf()

for prompt in dataset:
    samples = await async_map(policy.generate_tokens, [prompt for _ in range(8)])
    texts = await async_map(policy.detokenize_thread, samples)
    grades = await async_map(grader.grade, texts)
    scores = np.array([g.value for g in grades])
    advantages = (scores - scores.mean()) / (scores.std() + 1e-8)
    for sample, adv in zip(samples, advantages):
        lp = await policy.logprobs_per_token(sample)
        ref_lp = await reference.logprobs_per_token(sample)
        await policy.train_grpo(sample, lp, ref_lp, [adv] * len(lp), clip_range=0.1, kl_beta=0.01)
    await policy.optim_step(config.learning_rate, wd=0.0, max_grad_norm=1.0)

await policy.save(model_name="policy-v1", ctx=ctx)

RecipeContext flows into every Harmony call. The policy, the reference model (a one-line clone_inf), and the grader (possibly a large judge LLM) all live on the same GPUs in Harmony's unified architecture. The full GRPO loop, in one file.
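The advantage step in the loop is just group normalization: each sampled completion is scored against the mean and standard deviation of its own group of samples. A tiny standalone example with illustrative grader scores:

```python
import numpy as np

# Grades for 8 samples of one prompt (made-up values for illustration).
scores = np.array([0.2, 0.4, 0.4, 0.6, 0.8, 0.8, 1.0, 0.6])

# Same normalization as in the loop: center on the group mean, scale by
# the group std; the epsilon guards against a zero-variance group.
advantages = (scores - scores.mean()) / (scores.std() + 1e-8)
# roughly [-1.63, -0.82, -0.82, 0.0, 0.82, 0.82, 1.63, 0.0]
```

Above-average samples get positive advantage and are reinforced; below-average samples get negative advantage and are pushed away, with no learned value function involved.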

That's a recipe. Built-in or custom, all run on Harmony. See the docs for more.

Copyright © 2026 Adaptive ML, Inc. All rights reserved.