Overview
AILoop is a platform for building self-improving AI agents. Instead of getting a single AI response and moving on, AILoop creates an iterative refinement loop where agents evaluate their own outputs and improve them automatically.
The core workflow is: Decompose → Assemble → Execute → Evaluate → Improve
Each query goes through multiple iterations, with confidence scoring and full provenance tracking for complete transparency.
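The five-stage workflow can be sketched in Python. This is an illustrative stub, not AILoop's actual implementation: all five stage functions are hypothetical placeholders standing in for the real pipeline.

```python
# Minimal sketch of the refinement loop (names are illustrative, not the real API).

def decompose(query):
    # Decompose: split the query into subtasks
    return [query]

def assemble(subtasks):
    # Assemble: build a single agent prompt from the subtasks
    return " ".join(subtasks)

def execute(prompt):
    # Execute: call the agent (stubbed here)
    return f"answer to: {prompt}"

def evaluate(answer):
    # Evaluate: grade the answer against criteria (stubbed as length-based)
    return min(len(answer) / 100, 1.0)

def improve(prompt, answer, score):
    # Improve: refine the prompt using evaluation feedback
    return prompt + " (please elaborate)"

def run_loop(query, max_iterations=5, threshold=0.9):
    prompt = assemble(decompose(query))
    answer, score = None, 0.0
    for _ in range(max_iterations):
        answer = execute(prompt)
        score = evaluate(answer)
        if score >= threshold:
            break  # good enough; stop iterating
        prompt = improve(prompt, answer, score)
    return answer, score

answer, score = run_loop("How can AI reduce support costs?")
```

The loop stops early once the evaluation score clears the threshold, which is why simple queries typically finish in fewer iterations than the maximum.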
Quick Start
1. Open AILoop
Click "Launch App" to open the AILoop interface. No login required for the first 5 queries.
2. Ask a Question
Type your query in the query panel. AILoop works best with detailed, structured questions. For example:
What are the top 3 strategies for reducing operational costs in manufacturing?
3. Let It Refine
AILoop will automatically iterate, improving the answer through multiple cycles. Watch the iterations panel to see progress.
4. Review Results
Examine the structured answer, confidence scores, and supporting evidence. Click on facts to see their provenance.
Core Concepts
📊 Knowledge Graph
Every response is analyzed to extract entities (people, places, concepts) and relationships between them. These are stored in a knowledge graph with confidence scores and full provenance tracking.
⭐ Confidence Scoring
Each extracted fact receives a confidence score (0-1) based on:
- How often it appears across iterations
- Supporting evidence from the response
- Agreement with the knowledge base
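As a rough illustration, these factors might combine into a weighted sum. The weights below are invented for the example and are not AILoop's real internal values.

```python
# Hypothetical combination of the three confidence factors into a 0-1 score.
# The weights are example values, not AILoop's actual ones.

def confidence(mention_freq, evidence_strength, kb_agreement,
               weights=(0.4, 0.35, 0.25)):
    # Each factor is assumed to be normalized to [0, 1] before weighting.
    factors = (mention_freq, evidence_strength, kb_agreement)
    score = sum(w * f for w, f in zip(weights, factors))
    # Clamp and round for display.
    return round(min(max(score, 0.0), 1.0), 3)
```

A fact mentioned in most iterations, with moderate evidence and strong knowledge-base agreement, would score around 0.75 under these example weights.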
🔄 Iterations
AILoop evaluates each answer against your criteria, identifying gaps and areas for improvement, then prompts the agent to refine the answer. Each iteration produces a new version with potentially higher confidence scores.
🛡️ Provenance
Full audit trail. Every fact can be traced back to its original mention, through every transformation it underwent, and when each change occurred. Click any fact to see its complete history.
⚙️ Evaluation Criteria
Define what makes a good answer. Pro users can create up to 5 custom criteria (e.g., "Actionable", "Sourced", "Novel"). AILoop grades each iteration against these criteria.
User Documentation
Use these links for product usage, workflows, and troubleshooting:
- Website User Guide — end-user flow for running and validating sessions.
- Repository User Guide (Markdown) — source version tracked in git.
Running Queries
Best practices for effective queries:
- Be specific. "What are the benefits of AI?" is too broad. "What are 3 specific ways AI reduces costs in customer support?" is better.
- Include context. If relevant, mention the domain, stakeholders, or constraints.
- Ask for structure. Request lists, tables, or specific formats for easier parsing.
- Set evaluation criteria. If you care about specific qualities, mention them in your query or set custom criteria in settings.
- Answer clarifications before execution. Responding to required clarification prompts improves first-pass quality and reduces wasted iterations.
- Press Ctrl+Enter in the query box to run a loop without leaving the keyboard.
Evaluation Criteria
AILoop comes with default evaluation criteria:
- Completeness: Does it answer all aspects of the query?
- Accuracy: Are the facts correct and well-supported?
- Clarity: Is it easy to understand and well-structured?
- Actionability: Can the reader act on this information?
Pro users: Create custom criteria tailored to your domain. Weight them differently based on importance.
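One way to picture weighted criteria rolling up into an overall score (the criterion names and weights here are just examples, not a prescribed configuration):

```python
# Example weighted criteria; names and weights are illustrative only.
criteria = {
    "Actionable": 0.5,  # most important for this domain
    "Sourced": 0.3,
    "Novel": 0.2,
}

def overall_score(per_criterion_scores, weights):
    # Weighted average of per-criterion scores (each in [0, 1]).
    total = sum(weights.values())
    return sum(weights[c] * per_criterion_scores[c] for c in weights) / total

scores = {"Actionable": 0.9, "Sourced": 0.7, "Novel": 0.5}
combined = overall_score(scores, criteria)
```

Dividing by the weight total keeps the result in [0, 1] even if the weights don't sum to exactly 1.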
Understanding Iterations
Use the timeline slider to navigate between iterations. Each iteration shows:
- Overall Score: Combined evaluation score (0-1)
- Per-Criterion Scores: How well it performs on each criterion
- Trends: ↑ improved, ↓ declined, → flat vs. previous iteration
- Confidence Breakdown: Why the score changed, with factor-level +/- adjustments
- User Feedback Loop: Helpful / Needs Work signals that update rule confidence
- Live Status: Current phase, iteration X/Y, and ETA during execution
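The trend arrows can be thought of as a simple comparison against the previous iteration's score. The flat-band threshold below is an assumption for illustration; the real tolerance is internal to AILoop.

```python
# Sketch of the per-criterion trend indicator. The epsilon "flat band"
# threshold is a guessed value, not AILoop's actual setting.

def trend(current, previous, epsilon=0.01):
    if current - previous > epsilon:
        return "↑"  # improved vs. previous iteration
    if previous - current > epsilon:
        return "↓"  # declined vs. previous iteration
    return "→"      # flat: change within the tolerance band
```

So a criterion moving from 0.75 to 0.82 would display ↑, while a change of less than a point would display →.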
API Reference
Pro and Enterprise users can access the AILoop API for programmatic use.
POST /api/queries
Submit a new query for processing.
{
"query": "What are the top 3 strategies...",
"max_iterations": 5,
"evaluation_criteria": ["Completeness", "Accuracy"]
}
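In Python, the request body above could be built and sent as follows. The base URL, API key header, and authentication scheme are placeholders; check the OpenAPI spec for the actual details.

```python
import json

# Build the POST /api/queries payload. Base URL and API key are placeholders.
payload = {
    "query": "What are the top 3 strategies...",
    "max_iterations": 5,
    "evaluation_criteria": ["Completeness", "Accuracy"],
}
body = json.dumps(payload)
headers = {
    "Authorization": "Bearer <YOUR_API_KEY>",  # placeholder auth scheme
    "Content-Type": "application/json",
}

# With an HTTP client such as requests (not executed here):
# resp = requests.post(f"{BASE_URL}/api/queries", data=body, headers=headers)
```

The response includes a query id, which you then pass to GET /api/queries/{id} to poll for results and iterations.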
GET /api/queries/{id}
Retrieve query results and iterations.
GET /api/knowledge-graph/entities
List all extracted entities with confidence scores.
POST /api/db/sessions/{session_id}/feedback
Persist user rating and propagate confidence updates to rules used in that session.
POST /api/loop/validate-prompt
Validate prompt quality before loop execution and trigger clarification requirements when needed.
Full API documentation: View OpenAPI Spec
Integrations
AILoop integrates with:
- Ollama: Run local LLMs with Ollama backend
- PostgreSQL: Custom knowledge graph storage
- Slack: (Enterprise) Get query results directly in Slack
- Zapier: (Pro/Enterprise) Automate with 1000+ apps
Testing
AILoop includes unit and e2e coverage:
- Backend unit tests: uv run pytest
- Frontend e2e tests: npm run test:e2e (Playwright)
- Recommended regression scope: prompt validation, clarification flow, confidence breakdown rendering, feedback submit, search, export
- Testing guide: docs/testing.md
FAQ
How many iterations should I allow?
Most queries reach good results in 3-5 iterations. Set a higher limit (e.g., 10) for complex questions or a lower one (e.g., 2) for speed.
What if results get worse over time?
This can happen if evaluation criteria are misaligned. Check your criteria in Settings and consider tweaking them or providing feedback.
Can I export my knowledge graph?
Yes! Pro and Enterprise users can export as JSON or CSV. Use the Export button in the Dashboard.
How does confidence scoring work?
Confidence is calculated based on mention frequency, supporting evidence, semantic similarity to known facts, and cross-iteration consistency.