AgentDB Learning
Create and train AI learning plugins with AgentDB's 9 reinforcement learning algorithms. Includes Decision Transformer, Q-Learning, SARSA, Actor-Critic, and more. Use when building self-learning agents, implementing RL, or optimizing agent behavior through experience.
Tags: development, typescript, go, bash, node, testing, git, api, security, performance
Works with: CLI, API, MCP
Install via CLI:
```bash
openskills install airmcp-com/mcp-standards
```
Files: SKILL.md
---
name: "AgentDB Learning Plugins"
description: "Create and train AI learning plugins with AgentDB's 9 reinforcement learning algorithms. Includes Decision Transformer, Q-Learning, SARSA, Actor-Critic, and more. Use when building self-learning agents, implementing RL, or optimizing agent behavior through experience."
---
# AgentDB Learning Plugins
## What This Skill Does
Provides access to 9 reinforcement learning algorithms via AgentDB's plugin system. Create, train, and deploy learning plugins for autonomous agents that improve through experience. Includes offline RL (Decision Transformer), value-based learning (Q-Learning), policy gradients (Actor-Critic), and advanced techniques.
**Performance**: Train models 10-100x faster with WASM-accelerated neural inference.
## Prerequisites
- Node.js 18+
- AgentDB v1.0.7+ (via agentic-flow)
- Basic understanding of reinforcement learning (recommended)
---
## Quick Start with CLI
### Create Learning Plugin
```bash
# Interactive wizard
npx agentdb@latest create-plugin
# Use specific template
npx agentdb@latest create-plugin -t decision-transformer -n my-agent
# Preview without creating
npx agentdb@latest create-plugin -t q-learning --dry-run
# Custom output directory
npx agentdb@latest create-plugin -t actor-critic -o ./plugins
```
### List Available Templates
```bash
# Show all plugin templates
npx agentdb@latest list-templates
# Available templates:
# - decision-transformer (sequence modeling RL - recommended)
# - q-learning (value-based learning)
# - sarsa (on-policy TD learning)
# - actor-critic (policy gradient with baseline)
# - curiosity-driven (exploration-based)
```
### Manage Plugins
```bash
# List installed plugins
npx agentdb@latest list-plugins
# Get plugin information
npx agentdb@latest plugin-info my-agent
# Shows: algorithm, configuration, training status
```
---
## Quick Start with API
```typescript
import { createAgentDBAdapter } from 'agentic-flow/reasoningbank';

// Initialize with learning enabled
const adapter = await createAgentDBAdapter({
  dbPath: '.agentdb/learning.db',
  enableLearning: true, // Enable learning plugins
  enableReasoning: true,
  cacheSize: 1000,
});

// Store training experience
// (computeEmbedding is your own embedding function, e.g. a local model
// or an embeddings API that returns a numeric vector)
await adapter.insertPattern({
  id: '',
  type: 'experience',
  domain: 'game-playing',
  pattern_data: JSON.stringify({
    embedding: await computeEmbedding('state-action-reward'),
    pattern: {
      state: [0.1, 0.2, 0.3],
      action: 2,
      reward: 1.0,
      next_state: [0.15, 0.25, 0.35],
      done: false
    }
  }),
  confidence: 0.9,
  usage_count: 1,
  success_count: 1,
  created_at: Date.now(),
  last_used: Date.now(),
});

// Train learning model
const metrics = await adapter.train({
  epochs: 50,
  batchSize: 32,
});

console.log('Training Loss:', metrics.loss);
console.log('Duration:', metrics.duration, 'ms');
```
---
## Available Learning Algorithms (9 Total)
### 1. Decision Transformer (Recommended)
**Type**: Offline Reinforcement Learning
**Best For**: Learning from logged experiences, imitation learning
**Strengths**: No online interaction needed, stable training
```bash
npx agentdb@latest create-plugin -t decision-transformer -n dt-agent
```
**Use Cases**:
- Learn from historical data
- Imitation learning from expert demonstrations
- Safe learning without environment interaction
- Sequence modeling tasks
**Configuration**:
```json
{
"algorithm": "decision-transformer",
"model_size": "base",
"context_length": 20,
"embed_dim": 128,
"n_heads": 8,
"n_layers": 6
}
```
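The key idea behind Decision Transformer is conditioning each timestep on the return-to-go (the sum of future rewards) rather than bootstrapping value estimates. As an illustration only, not part of the AgentDB API, a returns-to-go computation looks like:
```typescript
// Illustrative helper (not an AgentDB export): returns-to-go, the
// quantity a Decision Transformer conditions each timestep on.
function returnsToGo(rewards: number[], gamma = 1.0): number[] {
  const rtg = new Array<number>(rewards.length).fill(0);
  let running = 0;
  for (let t = rewards.length - 1; t >= 0; t--) {
    running = rewards[t] + gamma * running; // accumulate future rewards
    rtg[t] = running;
  }
  return rtg;
}
```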
### 2. Q-Learning
**Type**: Value-Based RL (Off-Policy)
**Best For**: Discrete action spaces, sample efficiency
**Strengths**: Proven, simple, works well for small/medium problems
```bash
npx agentdb@latest create-plugin -t q-learning -n q-agent
```
**Use Cases**:
- Grid worlds, board games
- Navigation tasks
- Resource allocation
- Discrete decision-making
**Configuration**:
```json
{
"algorithm": "q-learning",
"learning_rate": 0.001,
"gamma": 0.99,
"epsilon": 0.1,
"epsilon_decay": 0.995
}
```
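For intuition, the update the plugin applies follows the standard tabular rule: Q(s,a) += lr * (r + gamma * max Q(s',·) - Q(s,a)). A minimal sketch with hypothetical names, not AgentDB's internal implementation:
```typescript
// Minimal tabular Q-learning update (illustrative only).
type QTable = Map<string, number[]>; // state key -> per-action values

function qUpdate(
  q: QTable, state: string, action: number, reward: number,
  nextState: string, done: boolean,
  lr = 0.001, gamma = 0.99, nActions = 4,
): void {
  const row = q.get(state) ?? new Array(nActions).fill(0);
  const next = q.get(nextState) ?? new Array(nActions).fill(0);
  // Off-policy: the target bootstraps from the greedy next action
  const target = done ? reward : reward + gamma * Math.max(...next);
  row[action] += lr * (target - row[action]);
  q.set(state, row);
}
```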
### 3. SARSA
**Type**: Value-Based RL (On-Policy)
**Best For**: Safe exploration, risk-sensitive tasks
**Strengths**: More conservative than Q-Learning, better for safety
```bash
npx agentdb@latest create-plugin -t sarsa -n sarsa-agent
```
**Use Cases**:
- Safety-critical applications
- Risk-sensitive decision-making
- Online learning with exploration
**Configuration**:
```json
{
"algorithm": "sarsa",
"learning_rate": 0.001,
"gamma": 0.99,
"epsilon": 0.1
}
```
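The only difference from Q-Learning is the target: SARSA bootstraps from the action the policy actually takes next, which is what makes it on-policy and more conservative. A sketch of that target, with hypothetical names:
```typescript
// SARSA target: bootstrap from the action actually taken next
// (on-policy), not the greedy max (illustrative helper).
function sarsaTarget(
  reward: number, nextQ: number[], nextAction: number,
  done: boolean, gamma = 0.99,
): number {
  return done ? reward : reward + gamma * nextQ[nextAction];
}
```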
### 4. Actor-Critic
**Type**: Policy Gradient with Value Baseline
**Best For**: Continuous actions, variance reduction
**Strengths**: Stable, works for continuous/discrete actions
```bash
npx agentdb@latest create-plugin -t actor-critic -n ac-agent
```
**Use Cases**:
- Continuous control (robotics, simulations)
- Complex action spaces
- Multi-agent coordination
**Configuration**:
```json
{
"algorithm": "actor-critic",
"actor_lr": 0.001,
"critic_lr": 0.002,
"gamma": 0.99,
"entropy_coef": 0.01
}
```
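The critic's value estimate serves as a baseline: the actor reinforces an action in proportion to its advantage, i.e. how much the observed outcome beat that baseline. A minimal one-step advantage sketch (illustrative, not AgentDB internals):
```typescript
// One-step advantage estimate: how much better the outcome was than
// the critic's baseline value (illustrative helper).
function advantage(
  reward: number, value: number, nextValue: number,
  done: boolean, gamma = 0.99,
): number {
  const target = done ? reward : reward + gamma * nextValue;
  return target - value; // positive => reinforce the action
}
```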
### 5. Active Learning
**Type**: Query-Based Learning
**Best For**: Label-efficient learning, human-in-the-loop
**Strengths**: Minimizes labeling cost, focuses on uncertain samples
**Use Cases**:
- Human feedback incorporation
- Label-efficient training
- Uncertainty sampling
- Annotation cost reduction
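The core loop is uncertainty sampling: route the examples the model is least confident about to a human or oracle for labeling. A minimal sketch, assuming hypothetical types and confidence scores in [0, 1]:
```typescript
// Uncertainty sampling sketch: pick the k samples the model is least
// confident about and queue them for labeling (hypothetical types).
interface Candidate { id: string; confidence: number }

function selectForLabeling(pool: Candidate[], k: number): Candidate[] {
  return [...pool]
    .sort((a, b) => a.confidence - b.confidence) // least confident first
    .slice(0, k);
}
```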
### 6. Adversarial Training
**Type**: Robustness Enhancement
**Best For**: Safety, robustness to perturbations
**Strengths**: Improves model robustness, adversarial defense
**Use Cases**:
- Security applications
- Robust decision-making
- Adversarial defense
- Safety testing
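A common ingredient is training on inputs perturbed in the loss-increasing direction (FGSM-style). A minimal sketch, assuming you can obtain the loss gradient with respect to the input from your model:
```typescript
// FGSM-style perturbation sketch: nudge each input feature in the
// sign of the loss gradient (grad must be supplied by your model).
function fgsmPerturb(input: number[], grad: number[], epsilon = 0.01): number[] {
  return input.map((x, i) => x + epsilon * Math.sign(grad[i]));
}
```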
### 7. Curriculum Learning
**Type**: Progressive Difficulty Training
**Best For**: Complex tasks, faster convergence
**Strengths**: Stable learning, faster convergence on hard tasks
**Use Cases**:
- Complex multi-stage tasks
- Hard exploration problems
- Skill composition
- Transfer learning
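In practice this means ordering tasks easiest-first and advancing only once the current stage is mastered. A minimal scheduler sketch, with a hypothetical Task type and mastery criterion:
```typescript
// Curriculum scheduler sketch: serve the easiest task the agent has
// not yet mastered (hypothetical types, illustrative only).
interface Task { name: string; difficulty: number }

function nextTask(tasks: Task[], mastered: Set<string>): Task | undefined {
  return [...tasks]
    .sort((a, b) => a.difficulty - b.difficulty)
    .find((t) => !mastered.has(t.name));
}
```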
### 8. Federated Learning
**Type**: Distributed Learning
**Best For**: Privacy, distributed data
**Strengths**: Privacy-preserving, scalable
**Use Cases**:
- Multi-agent systems
- Privacy-sensitive data
- Distributed training
- Collaborative learning
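The canonical aggregation step is FedAvg: each agent trains locally, only parameter updates are shared, and they are averaged in proportion to local data size. A minimal sketch assuming flat weight vectors:
```typescript
// FedAvg sketch: average flat parameter vectors across agents,
// weighted by how many local samples each agent trained on.
function fedAvg(updates: { weights: number[]; samples: number }[]): number[] {
  const total = updates.reduce((sum, u) => sum + u.samples, 0);
  const avg = new Array<number>(updates[0].weights.length).fill(0);
  for (const u of updates) {
    for (let i = 0; i < avg.length; i++) {
      avg[i] += (u.samples / total) * u.weights[i];
    }
  }
  return avg;
}
```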
### 9. Multi-Task Learning
**Type**: Transfer Learning
**Best For**: Related tasks, knowledge sharing
**Strengths**: Faster learning on new tasks, better generalization
**Use Cases**:
- Task families
- Transfer learning
- Domain adaptation
- Meta-learning
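A typical setup shares most parameters across tasks and optimizes a weighted sum of per-task losses. A minimal sketch of that combined objective, with hypothetical task names and weights:
```typescript
// Weighted multi-task objective sketch: shared parameters train on the
// combined loss, so learned features transfer across the task family.
function multiTaskLoss(
  losses: Record<string, number>,
  weights: Record<string, number> = {},
): number {
  return Object.entries(losses)
    .reduce((sum, [task, loss]) => sum + (weights[task] ?? 1) * loss, 0);
}
```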
---
## Training Workflow
### 1. Collect Experiences
```typescript
// Store experiences during agent execution
for (let i = 0; i < numEpisodes; i++) {
  const episode = runEpisode();
  for (const step of episode.steps) {
    await adapter.insertPattern({
      id: '',
      type: 'experience',
      domain: 'task-domain',
      pattern_data: JSON.stringify({
        embedding: await computeEmbedding(JSON.stringify(step)),
        pattern: {
          state: step.state,
          action: step.action,
          reward: step.reward,
          next_state: step.next_state,
          done: step.done
        }
      }),
      confidence: step.reward > 0 ? 0.9 : 0.5,
      usage_count: 1,
      success_count: step.reward > 0 ? 1 : 0,
      created_at: Date.now(),
      last_used: Date.now(),
    });
  }
}
```
### 2. Train Model
```typescript
// Train on collected experiences
const trainingMetrics = await adapter.train({
  epochs: 100,
  batchSize: 64,
  learningRate: 0.001,
  validationSplit: 0.2,
});

console.log('Training Metrics:', trainingMetrics);
// {
//   loss: 0.023,
//   valLoss: 0.028,
//   duration: 1523,
//   epochs: 100
// }
```
### 3. Evaluate Performance
```typescript
// Retrieve similar successful experiences
const testQuery = await computeEmbedding(JSON.stringify(testState));
const result = await adapter.retrieveWithReasoning(testQuery, {
  domain: 'task-domain',
  k: 10,
  synthesizeContext: true,
});

// Evaluate action quality
const suggestedAction = result.memories[0].pattern.action;
const confidence = result.memories[0].similarity;
console.log('Suggested Action:', suggestedAction);
console.log('Confidence:', confidence);
```
---
## Advanced Training Techniques
### Experience Replay
```typescript
// Store experiences in a buffer as they arrive
const replayBuffer = [];

// Sample a random batch for training (helper defined below)
const batch = sampleRandomBatch(replayBuffer, 32);

// Train on the sampled batch
await adapter.train({
  data: batch,
  epochs: 1,
  batchSize: 32,
});
```
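`sampleRandomBatch` above is a hypothetical helper, not an AgentDB export; a uniform-sampling version might look like:
```typescript
// Uniform replay sampling (hypothetical helper). Samples with
// replacement, which is common for replay buffers.
function sampleRandomBatch<T>(buffer: T[], batchSize: number): T[] {
  const batch: T[] = [];
  const n = Math.min(batchSize, buffer.length);
  for (let i = 0; i < n; i++) {
    batch.push(buffer[Math.floor(Math.random() * buffer.length)]);
  }
  return batch;
}
```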
### Prioritized Experience Replay
```typescript
// Store experiences with priority (TD error)
await adapter.insertPattern({
  // ... standard fields
  confidence: tdError, // Use TD error as confidence/priority
  // ...
});

// Retrieve high-priority experiences
const highPriority = await adapter.retrieveWithReasoning(queryEmbedding, {
  domain: 'task-domain',
  k: 32,
  minConfidence: 0.7, // Only high TD-error experiences
});
```
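Since `confidence` is used as a [0, 1] score elsewhere in this skill, the raw TD error needs normalizing before it can stand in as a priority. A sketch, assuming a one-step TD error and an assumed clipping bound:
```typescript
// Normalized absolute TD error as a [0, 1] priority (illustrative;
// maxError is an assumed clipping bound, tune it to your reward scale).
function tdErrorPriority(
  reward: number, value: number, nextValue: number,
  done: boolean, gamma = 0.99, maxError = 10,
): number {
  const target = done ? reward : reward + gamma * nextValue;
  return Math.min(Math.abs(target - value) / maxError, 1);
}
```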
### Multi-Agent Training
```typescript
// Collect experiences from multiple agents
for (const agent of agents) {
  const experience = await agent.step();
  await adapter.insertPattern({
    // ... store experience with agent ID
    domain: `multi-agent/${agent.id}`,
  });
}

// Train shared model
await adapter.train({
  epochs: 50,
  batchSize: 64,
});
```
---
## Performance Optimization
### Batch Training
```typescript
// Collect a batch of experiences (collectBatch is your own helper)
const experiences = collectBatch(1000);

// Batch insert (500x faster)
for (const exp of experiences) {
  await adapter.insertPattern({ /* ... */ });
}

// Train on the batch
await adapter.train({
  epochs: 10,
  batchSize: 128, // Larger batch for efficiency
});
```
### Incremental Learning
```typescript
// Train incrementally as new data arrives
setInterval(async () => {
  const newExperiences = getNewExperiences();
  if (newExperiences.length > 100) {
    await adapter.train({
      epochs: 5,
      batchSize: 32,
    });
  }
}, 60000); // Every minute
```
---
## Integration with Reasoning Agents
Combine learning with reasoning for better performance:
```typescript
// Train learning model
await adapter.train({ epochs: 50, batchSize: 32 });

// Use reasoning agents for inference
const result = await adapter.retrieveWithReasoning(queryEmbedding, {
  domain: 'decision-making',
  k: 10,
  useMMR: true,            // Diverse experiences
  synthesizeContext: true, // Rich context
  optimizeMemory: true,    // Consolidate patterns
});

// Make a decision based on learned experiences + reasoning
const decision = result.context.suggestedAction;
const confidence = result.memories[0].similarity;
```
---
## CLI Operations
```bash
# Create plugin
npx agentdb@latest create-plugin -t decision-transformer -n my-plugin
# List plugins
npx agentdb@latest list-plugins
# Get plugin info
npx agentdb@latest plugin-info my-plugin
# List templates
npx agentdb@latest list-templates
```
---
## Troubleshooting
### Issue: Training not converging
```typescript
// Reduce the learning rate
await adapter.train({
  epochs: 100,
  batchSize: 32,
  learningRate: 0.0001, // Lower learning rate
});
```
### Issue: Overfitting
```typescript
// Use a validation split
await adapter.train({
  epochs: 50,
  batchSize: 64,
  validationSplit: 0.2, // 20% validation
});

// Enable memory optimization
await adapter.retrieveWithReasoning(queryEmbedding, {
  optimizeMemory: true, // Consolidate patterns, reduce overfitting
});
```
### Issue: Slow training
Enable quantization for faster inference; binary quantization is up to 32x faster.
---
## Learn More
- **Algorithm Papers**: See docs/algorithms/ for detailed papers
- **GitHub**: https://github.com/ruvnet/agentic-flow/tree/main/packages/agentdb
- **MCP Integration**: `npx agentdb@latest mcp`
- **Website**: https://agentdb.ruv.io
---
**Category**: Machine Learning / Reinforcement Learning
**Difficulty**: Intermediate to Advanced
**Estimated Time**: 30-60 minutes