Agentic Context Engineering (ACE)¶
See how AI agents can learn from experience using Nexus ACE - an implementation of Agentic Context Engineering that enables agents to improve through self-reflection and memory.
🎯 What is ACE?¶
Agentic Context Engineering (ACE) is a framework that enables AI agents to:
- Learn from experience through self-reflection
- Store learned strategies in a persistent playbook
- Improve over time without retraining models
- Adapt to new tasks using accumulated knowledge
Nexus ACE provides a complete implementation with:
- Trajectory tracking - Record agent actions and outcomes
- Reflection API - Extract insights from experiences
- Playbook curation - Store and retrieve learned strategies
- Memory consolidation - Build knowledge over multiple iterations
📊 Demo: Data Validator¶
The ACE demo shows an agent learning data validation rules from scratch, starting at 60% accuracy and improving to 95% through automated learning.
What the Demo Shows¶
An AI agent that:
- Starts with zero validation knowledge
- Processes real-world data (Titanic dataset)
- Learns from mistakes through reflection
- Builds a comprehensive validation playbook
- Achieves a 35-percentage-point accuracy improvement (60% → 95%) over 10 epochs
Quick Start¶
# Install dependencies
cd examples/ace
pip install -r requirements.txt
# Run the demo (with fresh start)
python demo_3_data_validator.py --epochs 10 --reset
# Continue training from existing playbook
python demo_3_data_validator.py --epochs 10
First Run
Use the --reset flag to see learning from scratch. Without it, the agent reuses existing strategies and may start at 100% accuracy.
🔬 How It Works¶
The ACE Learning Loop¶
graph TB
Start([🎯 Start<br/>Accuracy: 60%]) --> Load[📖 Load Playbook<br/>Retrieve strategies]
Load --> Execute[⚙️ Execute & Measure<br/>Process data<br/>Track trajectory]
Execute --> Reflect[🤔 Reflect<br/>What worked?<br/>What failed?]
Reflect --> Helpful[✅ Helpful Strategies<br/>Store successes]
Reflect --> Harmful[❌ Harmful Patterns<br/>Mark failures]
Helpful --> Curate[🎯 Curate Playbook<br/>Update memory]
Harmful --> Curate
Curate --> Next{Next<br/>Epoch?}
Next -->|Yes| Improve([📈 Improved<br/>Accuracy: 95%])
Next -->|No| Done([🎉 Complete])
Improve --> Load
style Start fill:#e1f5ff,stroke:#0288d1,stroke-width:3px
style Execute fill:#e8f5e9,stroke:#388e3c,stroke-width:2px
style Reflect fill:#fce4ec,stroke:#c2185b,stroke-width:2px
style Curate fill:#b39ddb,stroke:#512da8,stroke-width:2px
style Improve fill:#a5d6a7,stroke:#2e7d32,stroke-width:3px
Core ACE APIs¶
1. Start Trajectory¶
Track an agent's actions and context:
import nexus
nx = nexus.connect(config={"data_dir": "./nexus-data"})
# Start tracking
traj_id = nx.memory.start_trajectory(
    task_description="Validate customer records",
    context={
        "dataset": "customers.csv",
        "agent": "validator-v1"
    }
)
# Agent performs work
results = validate_records(records)
# Complete with outcome
nx.memory.complete_trajectory(
    traj_id,
    status="success",
    success_score=0.85,
    metadata={"processed": 100, "valid": 85}
)
2. Reflect on Experience¶
Extract insights from completed trajectories:
# Agent reflects on what worked
reflection = nx.memory.reflect(
    trajectory_id=traj_id,
    reflection_prompt="What validation rules were most effective?"
)
print(reflection.summary)
# Output: "Checking for null values in required fields prevented 40% of errors"
print(reflection.helpful_strategies)
# ["Validate email format", "Check age range 0-120", ...]
print(reflection.harmful_strategies)
# ["Skip validation for trusted sources", ...]
3. Curate Playbook¶
Store learned strategies for future use:
# Add reflection to playbook
nx.memory.curate_playbook(
    reflection_ids=[reflection.id],
    playbook_name="data_validation",
    update_mode="merge"
)
# Retrieve strategies for next run
strategies = nx.memory.get_playbook("data_validation")
for strategy in strategies:
    print(f"[{strategy.confidence:.0%}] {strategy.content}")
# Output:
# [95%] Check required fields: Name, Email, Age
# [92%] Validate age range: 0-120 years
# [88%] Verify email format with regex
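What the agent does with the retrieved strategies is up to you; Nexus only stores and ranks them. A minimal sketch of one common pattern is shown below - the strategies_to_prompt helper and the prompt formatting are illustrative, not part of the Nexus API:
# Illustrative only: turn learned strategies into an instruction block.
# `strategies` is the result of nx.memory.get_playbook("data_validation") above.
def strategies_to_prompt(strategies, min_confidence=0.7):
    lines = [
        f"- {s.content} (confidence: {s.confidence:.0%})"
        for s in strategies
        if s.confidence >= min_confidence
    ]
    return "Apply these learned validation rules:\n" + "\n".join(lines)
system_prompt = strategies_to_prompt(strategies)
# Prepend system_prompt to whatever LLM call your validator agent makes.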
4. Consolidate Memory¶
Merge similar strategies over time:
# Consolidate after multiple epochs
nx.memory.consolidate(
    playbook_name="data_validation",
    similarity_threshold=0.8
)
📈 Expected Results¶
Learning Progression¶
| Epoch | Accuracy | What Agent Learned |
|---|---|---|
| 0 | 60% | No validation rules (baseline) |
| 2 | 72% | Required field checks |
| 4 | 84% | Range validation |
| 6 | 90% | Format validation |
| 8 | 94% | Edge case handling |
| 10 | 95% | Complete validation suite |
Learned Validation Rules¶
After training, the agent's playbook contains the strategies below (a code sketch of these rules follows the lists):
✅ Helpful Strategies
- Check required fields: Name, Sex, Age
- Validate age range: 0-100 years
- Reject negative fare values
- Sex must be 'male' or 'female'
- Validate against empty strings
❌ Harmful Patterns
- AVOID accepting missing ages
- AVOID skipping numeric range checks
- AVOID accepting arbitrary categorical values
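As a rough illustration, the helpful strategies above amount to checks like the following. This is a hand-written sketch of the learned rules, not code produced by the agent:
def validate_passenger(row):
    """Apply the learned Titanic rules to one record dict (illustrative sketch)."""
    errors = []
    # Check required fields and reject empty strings: Name, Sex, Age
    for field in ("Name", "Sex", "Age"):
        if row.get(field) in (None, ""):
            errors.append(f"missing required field: {field}")
    # Validate age range: 0-100 years
    age = row.get("Age")
    if isinstance(age, (int, float)) and not 0 <= age <= 100:
        errors.append("age out of range 0-100")
    # Reject negative fare values
    fare = row.get("Fare")
    if isinstance(fare, (int, float)) and fare < 0:
        errors.append("negative fare")
    # Sex must be 'male' or 'female'
    if row.get("Sex") not in ("male", "female"):
        errors.append("invalid Sex value")
    return errors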
🛠️ Customization¶
Use Your Own Dataset¶
from datasets import load_dataset
import pandas as pd
# Load any HuggingFace dataset
dataset = load_dataset("your_org/your_dataset")
df = pd.DataFrame(dataset["train"])
# Define validation ground truth
ground_truth = create_ground_truth(df)
# Run ACE learning
run_validation_task(df, ground_truth, playbook_strategies)
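create_ground_truth and run_validation_task above are task-specific helpers rather than Nexus APIs. When adapting the demo, a ground-truth builder can be as simple as the sketch below; the column names and rules are placeholders for your own schema:
import pandas as pd
def create_ground_truth(df):
    """Label each row valid/invalid for your own schema (illustrative placeholder)."""
    labels = []
    for _, row in df.iterrows():
        is_valid = (
            pd.notna(row.get("email"))
            and pd.notna(row.get("age"))
            and 0 <= row["age"] <= 120
        )
        labels.append(is_valid)
    return labels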
Adjust Learning Parameters¶
# Customize the learning process
config = {
    "epochs": 20,                     # More training iterations
    "batch_size": 50,                 # Process in batches
    "reflection_depth": "detailed",   # More thorough analysis
    "consolidation_threshold": 0.85   # Merge similar strategies
}
Custom Reward Signal¶
# Define your own success metric
def compute_reward(predictions, ground_truth):
    accuracy = compute_accuracy(predictions, ground_truth)
    precision = compute_precision(predictions, ground_truth)
    recall = compute_recall(predictions, ground_truth)
    # Weighted combination
    return 0.5 * accuracy + 0.3 * precision + 0.2 * recall
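compute_accuracy, compute_precision, and compute_recall are left undefined above. For a binary valid/invalid task they could be as simple as the following sketch, assuming predictions and ground_truth are equal-length lists of booleans:
def compute_accuracy(predictions, ground_truth):
    correct = sum(p == t for p, t in zip(predictions, ground_truth))
    return correct / len(ground_truth)
def compute_precision(predictions, ground_truth):
    true_pos = sum(p and t for p, t in zip(predictions, ground_truth))
    predicted_pos = sum(predictions)
    return true_pos / predicted_pos if predicted_pos else 0.0
def compute_recall(predictions, ground_truth):
    true_pos = sum(p and t for p, t in zip(predictions, ground_truth))
    actual_pos = sum(ground_truth)
    return true_pos / actual_pos if actual_pos else 0.0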
💡 Real-World Applications¶
Customer Support Agent¶
Agent learns effective responses through ACE:
# Track support interactions
traj_id = nx.memory.start_trajectory("Handle customer inquiry")
# Agent provides response
response = agent.generate_response(customer_query)
satisfaction_score = customer.rate_response()
# Complete trajectory with feedback
nx.memory.complete_trajectory(
    traj_id,
    status="success",
    success_score=satisfaction_score
)
# Reflect and improve
reflection = nx.memory.reflect(traj_id)
nx.memory.curate_playbook([reflection.id], "support_agent")
Code Review Bot¶
Learn effective code review patterns:
# Track review process
traj_id = nx.memory.start_trajectory("Review pull request #123")
# Generate review comments
comments = bot.review_code(pull_request)
# Get feedback from developer
feedback = developer.rate_review(comments)
# Learn from feedback
nx.memory.complete_trajectory(traj_id, success_score=feedback)
reflection = nx.memory.reflect(traj_id)
nx.memory.curate_playbook([reflection.id], "code_reviewer")
Sales Lead Qualifier¶
Improve lead scoring over time:
# Track qualification process
traj_id = nx.memory.start_trajectory("Qualify lead")
# Agent scores lead
score = agent.qualify_lead(lead_data)
# Get actual outcome (did they convert?)
actual_conversion = track_conversion(lead_data.id, days=30)
# Learn from outcome
success = 1.0 if actual_conversion else 0.0
nx.memory.complete_trajectory(traj_id, success_score=success)
nx.memory.curate_playbook([nx.memory.reflect(traj_id).id], "lead_qualifier")
📚 Learn More¶
- ACE Paper: Agentic Context Engineering (arXiv)
- Nexus ACE Design: See docs/design/ACE_INTEGRATION.md in the repository
- Demo Code: See examples/ace/demo_3_data_validator.py
🐛 Troubleshooting¶
Dataset Download Issues¶
The demo automatically falls back to synthetic data if the HuggingFace download fails.
Memory Issues with Large Datasets¶
Use sampling for large datasets:
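For example, with a pandas DataFrame you might train on a random subset before running the loop (the sample size here is arbitrary):
# Sample a manageable subset before running the ACE loop (size is illustrative)
df_sample = df.sample(n=5000, random_state=42)
ground_truth = create_ground_truth(df_sample)
run_validation_task(df_sample, ground_truth, playbook_strategies)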
Nexus Connection¶
Ensure Nexus is accessible:
# Start local Nexus
nexus serve --data-dir ./nexus_data
# Or connect to remote server
export NEXUS_URL=http://your-server:2026
export NEXUS_API_KEY=your-api-key
🎓 Understanding ACE¶
Why It Works¶
- Clear Reward Signal - Validation accuracy is objective and measurable
- Immediate Feedback - Agent sees results instantly after each epoch
- Pattern Recognition - ACE identifies consistently effective strategies
- Cumulative Learning - Each epoch builds on previous knowledge
Key Principles¶
- Experience over Rules - Learn from doing, not from predefined rules
- Reflection over Fine-tuning - Extract insights without model retraining
- Memory over State - Persist knowledge across sessions
- Iteration over Perfection - Improve gradually through multiple cycles
🚀 Next Steps¶
- Run the demo - See ACE in action with python demo_3_data_validator.py --reset
- Try your data - Replace Titanic with your validation task
- Build your agent - Integrate ACE into your agent workflow
- Measure improvement - Compare ACE vs. hard-coded rules
Powered by Nexus ACE 🧠 - Making agents learn from experience