Frameworks · February 06, 2026

Supported frameworks in Advi Systems Prompts

Advi Systems
Engineering

Not all prompts should be structured the same way. A two-sentence classification task has different structural needs than a multi-paragraph content generation request with compliance requirements. Forcing every prompt into a single template wastes tokens on some tasks and under-specifies others. Advi Systems Prompts supports multiple prompt frameworks, each designed for a specific class of tasks, and routes your inputs to the framework that maximizes instruction compliance for your use case.

This article provides a technical breakdown of each supported framework: what it contains, why it's structured the way it is, when to use it, and how it performs against alternatives.

Framework 1: CO-STAR — the full-specification framework

CO-STAR stands for Context, Objective, Style, Tone, Audience, Response. It is the most comprehensive framework in Advi's toolkit and is designed for tasks where multiple dimensions of the output need explicit control.

  • Context: Background information the model needs to understand the task. This includes domain context (“You are working within a B2B SaaS company's support system”), temporal context (“The user is asking about a feature released in Q3 2025”), and relational context (“This is a follow-up to an unresolved complaint”). Context is placed first because it establishes the world the model operates in.
  • Objective: A single, unambiguous statement of what the model must produce. CO-STAR enforces single-objective prompts. If the task has multiple parts, each part becomes a separate objective in a chained prompt sequence. This is based on research showing that single-objective prompts achieve 23% higher task-completion accuracy than multi-objective prompts (Google DeepMind, 2024).
  • Style: The writing style of the output: formal, conversational, technical, journalistic, academic, etc. Style is distinct from tone. A technical style with a warm tone produces different output than a conversational style with a neutral tone. Separating these two dimensions gives the model clearer behavioral instructions.
  • Tone: The emotional register of the output: empathetic, authoritative, cautious, enthusiastic, neutral. Tone instructions are most effective when paired with negative constraints (“Do not be condescending” or “Avoid excessive enthusiasm”) because models are better at avoiding behaviors than targeting precise emotional gradients.
  • Audience: Who will read the output. This single field has outsized impact on vocabulary, assumed knowledge, sentence complexity, and explanation depth. “Senior data engineers” produces fundamentally different output than “non-technical executives,” even with identical objectives.
  • Response: The exact format, structure, and length of the expected output. This includes document type (email, report, bullet list, JSON), length bounds (100–200 words, 5 bullet points, 3 paragraphs), required sections (headers, summary, recommendations), and any structural constraints (no headers, no bullet points, single paragraph).
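As an illustration, the six sections above can be assembled into a single prompt in order. This is a hypothetical sketch, not Advi's actual API; the section names follow the framework description, but the helper and its field names are invented for this example.

```python
# Hypothetical sketch: assemble the six CO-STAR sections, in order, into
# one prompt string. Not Advi's actual API.
COSTAR_ORDER = ["Context", "Objective", "Style", "Tone", "Audience", "Response"]

def build_costar_prompt(sections: dict) -> str:
    # CO-STAR is a full-specification framework: all six sections are required.
    missing = [name for name in COSTAR_ORDER if name not in sections]
    if missing:
        raise ValueError(f"CO-STAR requires all six sections; missing: {missing}")
    return "\n\n".join(f"# {name}\n{sections[name]}" for name in COSTAR_ORDER)

prompt = build_costar_prompt({
    "Context": "You are working within a B2B SaaS company's support system.",
    "Objective": "Write a reply about the export feature released in Q3 2025.",
    "Style": "Conversational",
    "Tone": "Empathetic. Do not be condescending.",
    "Audience": "Non-technical account administrators.",
    "Response": "An email of 100-200 words with no headers or bullet points.",
})
```

Note that the tone section pairs a target register with a negative constraint, per the guidance above.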

When to use CO-STAR: Content generation, customer communications, report writing, multi-stakeholder deliverables, and any task where tone, audience, and format all need independent control. CO-STAR prompts typically use 200–400 input tokens, which is compact given the six dimensions they specify.

Performance data: In our evaluation across 800 content-generation tasks, CO-STAR prompts achieved 91% first-run format compliance and 87% tone accuracy (rated by blind reviewers), compared to 64% format compliance and 58% tone accuracy for unstructured prompts covering the same tasks.

Framework 2: RTF — the minimal-overhead framework

RTF stands for Role, Task, Format. It is designed for direct, single-purpose tasks where context, tone, and audience are either obvious from the task itself or not relevant.

  • Role: A one-sentence behavioral frame. Example: “You are a Python code reviewer.” This sets the model's expertise domain and calibrates its vocabulary and judgment criteria. Roles are most effective when they are specific and professional (“senior tax accountant specializing in S-corps”) rather than vague (“helpful assistant”).
  • Task: The specific action the model must perform. RTF tasks are always imperative and atomic: “Extract all email addresses from the following text,” “Classify this support ticket into one of the following categories,” “Rewrite this paragraph at an 8th-grade reading level.”
  • Format: The output structure. For classification: “Return only the category name.” For extraction: “Return a JSON array of strings.” For rewriting: “Return the rewritten paragraph with no additional commentary.”
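A complete RTF prompt is short enough to sketch in full. This is an illustrative helper, not Advi's API; it simply shows how little structure the framework needs.

```python
# Hypothetical sketch: an RTF prompt is just three labeled lines.
def build_rtf_prompt(role: str, task: str, fmt: str) -> str:
    return f"Role: {role}\nTask: {task}\nFormat: {fmt}"

prompt = build_rtf_prompt(
    role="You are a Python code reviewer.",
    task="Extract all email addresses from the following text.",
    fmt="Return a JSON array of strings.",
)
```

The full prompt stays comfortably inside the 50-150 token range cited below.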

When to use RTF: Data extraction, classification, code generation, translation, reformatting, and any task where the instruction can be expressed in under 50 words. RTF prompts typically use 50–150 input tokens, making them the fastest and cheapest option.

Performance data: RTF prompts achieve 93% task-completion accuracy on single-objective tasks, outperforming CO-STAR on simple tasks by 3–5% because the shorter prompt reduces instruction dilution. However, RTF underperforms CO-STAR by 15–20% on tasks requiring nuanced tone or audience adaptation, because it lacks those control dimensions.

The tradeoff is explicit: RTF sacrifices granular control for speed and token efficiency. Use it when the task is clear enough that additional specification would add tokens without adding value.

Framework 3: Constraint-first — the compliance-priority framework

The constraint-first framework inverts the typical prompt structure. Instead of starting with context and objectives, it leads with constraints and guardrails, followed by the task and format specification. This framework is designed for regulated, high-stakes, or safety-critical workflows where what the model must not do is more important than what it should do.

  • Hard constraints (position: first): Absolute rules that must never be violated. Examples: “Do not provide medical diagnoses,” “Do not reference any information not present in the provided document,” “Do not output personally identifiable information.” Placed first to exploit primacy bias.
  • Soft constraints (position: second): Behavioral preferences that should be followed but are not absolute. Examples: “Prefer shorter sentences,” “Use active voice when possible,” “Avoid jargon unless necessary for precision.”
  • Task objective (position: third): The actual work to be done, framed within the constraint boundaries already established.
  • Output format (position: last): Structural requirements, often including mandatory disclaimer text, required headers, or regulatory language.
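The four-part ordering can be sketched as follows. The helper and its labels are hypothetical; the point is the inverted structure, with hard constraints leading the prompt.

```python
# Hypothetical sketch: constraint-first ordering, with hard constraints
# placed first to exploit primacy bias.
def build_constraint_first_prompt(hard, soft, objective, output_format):
    parts = [
        "HARD CONSTRAINTS (never violate):",
        *[f"- {c}" for c in hard],
        "",
        "Soft constraints (prefer):",
        *[f"- {c}" for c in soft],
        "",
        f"Task: {objective}",
        "",
        f"Output format: {output_format}",
    ]
    return "\n".join(parts)

prompt = build_constraint_first_prompt(
    hard=["Do not provide medical diagnoses.",
          "Do not output personally identifiable information."],
    soft=["Prefer shorter sentences.",
          "Avoid jargon unless necessary for precision."],
    objective="Summarize the attached patient-education article.",
    output_format="Three paragraphs, ending with the mandatory disclaimer text.",
)
```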

When to use constraint-first: Healthcare content, legal document generation, financial reporting, compliance-sensitive customer communications, content moderation rules, and any workflow where a single output violation could create legal or reputational liability.

Performance data: In evaluations on 600 compliance-sensitive tasks, constraint-first prompts reduced guardrail violations by 44% compared to the same constraints placed at the end of a CO-STAR prompt. The dual-placement strategy (constraints at both the beginning and end) outperformed end-only placement by 31% and beginning-only placement by 18%. However, constraint-first prompts are 10–15% less creative than CO-STAR prompts on open-ended generation tasks, because the leading constraints make the model more conservative.

This tradeoff is intentional. In compliance-critical workflows, conservatism is a feature, not a bug. You want the model to err on the side of caution.
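The dual-placement strategy from the evaluation above can be sketched as a wrapper that emits the hard constraints at both ends of the prompt. Again, this is an illustrative helper, not Advi's implementation.

```python
# Hypothetical sketch of dual placement: hard constraints appear at both
# the start and the end of the prompt body.
def with_dual_placement(body: str, hard_constraints: list) -> str:
    block = "\n".join(f"- {c}" for c in hard_constraints)
    return (
        f"HARD CONSTRAINTS (never violate):\n{block}\n\n"
        f"{body}\n\n"
        f"REMINDER, hard constraints (never violate):\n{block}"
    )

prompt = with_dual_placement(
    "Task: Summarize the provided document.",
    ["Do not reference any information not present in the provided document."],
)
```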

How Advi selects the right framework

Framework selection in Advi is currently driven by input analysis during the normalization stage. The system evaluates three signals:

  • Input complexity: If the user specifies audience, tone, and style separately, the system routes to CO-STAR. If only a task and format are provided, it routes to RTF.
  • Constraint density: If the input contains multiple negative constraints or compliance-related keywords (e.g., “must not,” “prohibited,” “regulatory,” “HIPAA,” “GDPR”), the system routes to constraint-first.
  • Task type classification: Extraction and classification tasks default to RTF. Generation and communication tasks default to CO-STAR. Compliance and safety tasks default to constraint-first.
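The three signals can be sketched as a simple routing function. The keyword list, field names, and precedence (constraint density checked first, since safety should dominate) are illustrative assumptions, not Advi's actual rules.

```python
import re

# Hypothetical sketch of the three routing signals; the keyword list and
# precedence are illustrative, not Advi's actual implementation.
COMPLIANCE_KEYWORDS = re.compile(
    r"\bmust not\b|\bprohibited\b|\bregulatory\b|\bHIPAA\b|\bGDPR\b",
    re.IGNORECASE,
)

def select_framework(user_input: str, fields: set) -> str:
    # Signal 2: constraint density routes to constraint-first.
    if COMPLIANCE_KEYWORDS.search(user_input):
        return "constraint-first"
    # Signal 1: audience, tone, and style specified separately -> CO-STAR.
    if {"audience", "tone", "style"} <= fields:
        return "CO-STAR"
    # Signal 3 / default: task-and-format-only inputs route to RTF.
    return "RTF"
```

A task-type classifier would refine the default branch; this sketch collapses it into the RTF fallback.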

Users can also override the framework selection manually if they have a specific structural preference. In practice, the automatic routing matches user preference 84% of the time based on feedback data from our beta cohort.

Framework comparison at a glance

  • CO-STAR: 6 sections | 200–400 tokens | Best for nuanced generation | 91% format compliance | Highest control granularity
  • RTF: 3 sections | 50–150 tokens | Best for direct tasks | 93% task accuracy | Lowest latency and cost
  • Constraint-first: 4 sections | 150–350 tokens | Best for regulated workflows | 44% fewer violations | Most conservative outputs

The right framework depends on the task, not on personal preference. Simple extraction tasks forced into CO-STAR waste tokens. Compliance-critical generation forced into RTF sacrifices safety. Advi's routing system exists to match the framework to the task automatically, so teams get optimal structure without needing to understand the underlying prompt architecture.

Ready to engineer better prompts?

See this architecture in action and stop wrestling with chat interfaces.

Launch Dashboard