Overview
AILoop is a platform for building self-improving AI agents. Instead of getting a single AI response and moving on, AILoop creates an iterative refinement loop where agents evaluate their own outputs and improve them automatically.
The core workflow is: Decompose → Assemble → Execute → Evaluate → Improve
Each query goes through multiple iterations, with confidence scoring and full provenance tracking for complete transparency.
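The five-stage workflow can be sketched in Python. This is an illustrative stub, not AILoop's actual implementation: all five stage functions are hypothetical placeholders standing in for the real pipeline.

```python
# Minimal sketch of the refinement loop (names are illustrative, not the real API).

def decompose(query):
    # Decompose: split the query into subtasks
    return [query]

def assemble(subtasks):
    # Assemble: build a single agent prompt from the subtasks
    return " ".join(subtasks)

def execute(prompt):
    # Execute: call the agent (stubbed here)
    return f"answer to: {prompt}"

def evaluate(answer):
    # Evaluate: grade the answer against criteria (stubbed as length-based)
    return min(len(answer) / 100, 1.0)

def improve(prompt, answer, score):
    # Improve: refine the prompt using evaluation feedback
    return prompt + " (please elaborate)"

def run_loop(query, max_iterations=5, threshold=0.9):
    prompt = assemble(decompose(query))
    answer, score = None, 0.0
    for _ in range(max_iterations):
        answer = execute(prompt)
        score = evaluate(answer)
        if score >= threshold:
            break  # good enough; stop iterating
        prompt = improve(prompt, answer, score)
    return answer, score

answer, score = run_loop("How can AI reduce support costs?")
```

The loop stops early once the evaluation score clears the threshold, which is why simple queries typically finish in fewer iterations than the maximum.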
Quick Start
1. Open AILoop
Click "Launch App" to open the AILoop interface. No login required for the first 5 queries.
2. Ask a Question
Type your query in the query panel. AILoop works best with detailed, structured questions. For example:
What are the top 3 strategies for reducing operational costs in manufacturing?
3. Let It Refine
AILoop will automatically iterate, improving the answer through multiple cycles. Watch the iterations panel to see progress.
4. Review Results
Examine the structured answer, confidence scores, and supporting evidence. Click on facts to see their provenance.
Core Concepts
📊 Knowledge Graph
Every response is analyzed to extract entities (people, places, concepts) and relationships between them. These are stored in a knowledge graph with confidence scores and full provenance tracking.
⭐ Confidence Scoring
Each extracted fact receives a confidence score (0-1) based on:
- How often it appears across iterations
- Supporting evidence from the response
- Agreement with the knowledge base
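As a rough illustration, these factors might combine into a weighted sum. The weights below are invented for the example and are not AILoop's real internal values.

```python
# Hypothetical combination of the three confidence factors into a 0-1 score.
# The weights are example values, not AILoop's actual ones.

def confidence(mention_freq, evidence_strength, kb_agreement,
               weights=(0.4, 0.35, 0.25)):
    # Each factor is assumed to be normalized to [0, 1] before weighting.
    factors = (mention_freq, evidence_strength, kb_agreement)
    score = sum(w * f for w, f in zip(weights, factors))
    # Clamp and round for display.
    return round(min(max(score, 0.0), 1.0), 3)
```

A fact mentioned in most iterations, with moderate evidence and strong knowledge-base agreement, would score around 0.75 under these example weights.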
🔄 Iterations
AILoop evaluates each answer against your criteria, identifying gaps and areas for improvement, then prompts the agent to refine the answer. Each iteration produces a new version with potentially higher confidence scores.
🛡️ Provenance
Full audit trail. Every fact can be traced back to its original mention, through every transformation it underwent, and when each change occurred. Click any fact to see its complete history.
⚙️ Evaluation Criteria
Define what makes a good answer. Pro users can create up to 5 custom criteria (e.g., "Actionable", "Sourced", "Novel"). AILoop grades each iteration against these criteria.
User Documentation
Use these links for product usage, workflows, and troubleshooting:
- Website User Guide — end-user flow for running and validating sessions.
- Repository User Guide (Markdown) — source version tracked in git.
Running Queries
Best practices for effective queries:
- Be specific. "What are the benefits of AI?" is too broad. "What are 3 specific ways AI reduces costs in customer support?" is better.
- Include context. If relevant, mention the domain, stakeholders, or constraints.
- Ask for structure. Request lists, tables, or specific formats for easier parsing.
- Set evaluation criteria. If you care about specific qualities, mention them in your query or set custom criteria in settings.
- Answer clarifications before execution. Responding to required clarification prompts improves first-pass quality and reduces wasted iterations.
- Press Ctrl+Enter in the query box to run a loop without leaving the keyboard.
Evaluation Criteria
AILoop comes with default evaluation criteria:
- Completeness: Does it answer all aspects of the query?
- Accuracy: Are the facts correct and well-supported?
- Clarity: Is it easy to understand and well-structured?
- Actionability: Can the reader act on this information?
Pro users: Create custom criteria tailored to your domain. Weight them differently based on importance.
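One way to picture weighted criteria rolling up into an overall score (the criterion names and weights here are just examples, not a prescribed configuration):

```python
# Example weighted criteria; names and weights are illustrative only.
criteria = {
    "Actionable": 0.5,  # most important for this domain
    "Sourced": 0.3,
    "Novel": 0.2,
}

def overall_score(per_criterion_scores, weights):
    # Weighted average of per-criterion scores (each in [0, 1]).
    total = sum(weights.values())
    return sum(weights[c] * per_criterion_scores[c] for c in weights) / total

scores = {"Actionable": 0.9, "Sourced": 0.7, "Novel": 0.5}
combined = overall_score(scores, criteria)
```

Dividing by the weight total keeps the result in [0, 1] even if the weights don't sum to exactly 1.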
Understanding Iterations
Use the timeline slider to navigate between iterations. Each iteration shows:
- Overall Score: Combined evaluation score (0-1)
- Per-Criterion Scores: How well it performs on each criterion
- Trends: ↑ improved, ↓ declined, → flat vs. previous iteration
- Confidence Breakdown: Why the score changed, with factor-level +/- adjustments
- User Feedback Loop: Helpful / Needs Work signals that update rule confidence
- Live Status: Current phase, iteration X/Y, and ETA during execution
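The trend arrows can be thought of as a simple comparison against the previous iteration's score. The flat-band threshold below is an assumption for illustration; the real tolerance is internal to AILoop.

```python
# Sketch of the per-criterion trend indicator. The epsilon "flat band"
# threshold is a guessed value, not AILoop's actual setting.

def trend(current, previous, epsilon=0.01):
    if current - previous > epsilon:
        return "↑"  # improved vs. previous iteration
    if previous - current > epsilon:
        return "↓"  # declined vs. previous iteration
    return "→"      # flat: change within the tolerance band
```

So a criterion moving from 0.75 to 0.82 would display ↑, while a change of less than a point would display →.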
API Reference
Pro and Enterprise users can access the AILoop API for programmatic use.
POST /api/queries
Submit a new query for processing.
{
"query": "What are the top 3 strategies...",
"max_iterations": 5,
"evaluation_criteria": ["Completeness", "Accuracy"]
}
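In Python, the request body above could be built and sent as follows. The base URL, API key header, and authentication scheme are placeholders; check the OpenAPI spec for the actual details.

```python
import json

# Build the POST /api/queries payload. Base URL and API key are placeholders.
payload = {
    "query": "What are the top 3 strategies...",
    "max_iterations": 5,
    "evaluation_criteria": ["Completeness", "Accuracy"],
}
body = json.dumps(payload)
headers = {
    "Authorization": "Bearer <YOUR_API_KEY>",  # placeholder auth scheme
    "Content-Type": "application/json",
}

# With an HTTP client such as requests (not executed here):
# resp = requests.post(f"{BASE_URL}/api/queries", data=body, headers=headers)
```

The response includes a query id, which you then pass to GET /api/queries/{id} to poll for results and iterations.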
GET /api/queries/{id}
Retrieve query results and iterations.
GET /api/knowledge-graph/entities
List all extracted entities with confidence scores.
POST /api/db/sessions/{session_id}/feedback
Persist user rating and propagate confidence updates to rules used in that session.
POST /api/loop/validate-prompt
Validate prompt quality before loop execution and trigger clarification requirements when needed.
Full API documentation: View OpenAPI Spec
Integrations
AILoop integrates with:
- Ollama: Run local LLMs with Ollama backend
- PostgreSQL: Custom knowledge graph storage
- Slack: (Enterprise) Get query results directly in Slack
- Zapier: (Pro/Enterprise) Automate with 1000+ apps
Testing
AILoop includes unit and e2e coverage:
- Backend unit tests: uv run pytest
- Frontend e2e tests: npm run test:e2e (Playwright)
- Recommended regression scope: prompt validation, clarification flow, confidence breakdown rendering, feedback submit, search, export
- Testing guide: docs/testing.md
FAQ
How many iterations should I allow?
Most queries reach good results in 3-5 iterations. Set a higher limit (e.g., 10) for complex questions or a lower one (e.g., 2) for speed.
What if results get worse over time?
This can happen if evaluation criteria are misaligned. Check your criteria in Settings and consider tweaking them or providing feedback.
Can I export my knowledge graph?
Yes! Pro and Enterprise users can export as JSON or CSV. Use the Export button in the Dashboard.
How does confidence scoring work?
Confidence is calculated based on mention frequency, supporting evidence, semantic similarity to known facts, and cross-iteration consistency.