Rlm Development

Name: Rlm Development
Author: apenab
ASecurity
Guide for developing and contributing to the pyrlm-runtime project. Use when modifying core modules, adding features, fixing bugs, or understanding the architecture of the Recursive Language Model runtime.
11 stars
0 votes
0 copies
0 views
Added 2/11/2026
data-aipythonbashtestingapi
Works with

api
Security Analysis

A100/100
Scanned 2/12/2026
Install via CLI
$openskills install apenab/pyrlm-runtime
Files
SKILL.md
---
name: rlm-development
description: Guide for developing and contributing to the pyrlm-runtime project. Use when modifying core modules, adding features, fixing bugs, or understanding the architecture of the Recursive Language Model runtime.
license: MIT
metadata:
  author: apenab
  version: "1.0"
---

# RLM Runtime Development Guide

## Project Overview

`pyrlm-runtime` is a minimal runtime for Recursive Language Models (RLMs) based on the MIT CSAIL paper. Instead of feeding full context into an LLM, RLMs treat large context as **environment state** in a Python REPL. The LLM writes code to inspect, chunk, and recursively query sub-LLMs on smaller pieces.

## Architecture

```
src/pyrlm_runtime/
├── rlm.py              # Core RLM loop (entry point: RLM.run())
├── context.py           # Immutable Context dataclass with inspection helpers
├── env.py               # Sandboxed PythonREPL for code execution
├── policy.py            # Execution limits (steps, tokens, subcalls, recursion)
├── cache.py             # File-based subcall memoization (.rlm_cache/)
├── trace.py             # Execution recording (TraceStep, Trace)
├── router.py            # SmartRouter: auto baseline vs RLM selection
├── prompts.py           # System prompts (BASE, SUBCALL, RECURSIVE, LLAMA, QWEN)
└── adapters/
    ├── base.py          # ModelAdapter protocol, Usage, ModelResponse
    ├── openai_compat.py # OpenAI-compatible API wrapper
    ├── generic_chat.py  # HTTP-based generic chat adapter
    └── fake.py          # Deterministic adapter for testing
```

## Key Design Patterns

### 1. Protocol-Based Adapters

All model integrations implement the `ModelAdapter` protocol:

```python
class ModelAdapter(Protocol):
    def complete(
        self,
        messages: list[dict[str, str]],
        *,
        max_tokens: int = 512,
        temperature: float = 0.0,
    ) -> ModelResponse: ...
```

When adding a new adapter, implement this protocol. Return `ModelResponse(text=..., usage=Usage(...))`.

### 2. Frozen Dataclasses for Immutability

`Context`, `Usage`, `ModelResponse`, and `ExecResult` are frozen dataclasses. This ensures determinism and thread safety. Never make these mutable.

### 3. Sandboxed REPL

`PythonREPL` restricts execution to safe builtins and allowed modules (`re`, `math`, `json`, `textwrap`). When modifying the REPL:
- Never add `os`, `sys`, `subprocess`, or network modules to allowed imports
- Keep the stdout limit (default 4000 chars) to prevent memory issues
- The `_RegexProxy` wraps `re` to auto-coerce Context/list/tuple inputs to strings

### 4. The RLM Loop (rlm.py)

The core execution loop in `RLM.run()`:
1. Initialize REPL with context as `P` (string) and `ctx` (Context object)
2. Inject helper functions: `peek()`, `tail()`, `lenP()`, `subcall()`, `ask()`, etc.
3. Loop: call root LLM -> parse response -> execute code or finalize
4. LLM outputs Python code (executed in REPL) or `FINAL:`/`FINAL_VAR:` to end
5. REPL stdout/errors feed back to the LLM on next iteration

### 5. Policy Enforcement

`Policy` tracks and limits: steps, subcalls, recursion depth, total tokens, subcall tokens. Each has a corresponding exception (`MaxStepsExceeded`, etc.). Always respect policy checks when adding new execution paths.

## Coding Conventions

### Style
- **Formatter**: `ruff format` (line-length=100, double quotes, space indent)
- **Linter**: `ruff check`
- **Type checker**: `ty`
- **Python**: 3.12+ (use `X | Y` union syntax, not `Optional[X]`)

### Commits
- Use **conventional commits** via commitizen: `feat:`, `fix:`, `docs:`, `refactor:`, `test:`, `chore:`
- Run `uv run ruff format . && uv run ruff check .` before committing

### Testing
- Framework: `pytest` (config in `pyproject.toml`)
- Run: `uv run pytest`
- Use `FakeAdapter` from `adapters/fake.py` for deterministic tests (no API calls needed)
- Test files: `tests/test_*.py` matching source modules

### Dependencies
- **Only external dependency**: `httpx>=0.27`
- Dev deps: `pytest`, `ruff`, `ty`, `commitizen`
- Keep dependencies minimal by design

## How to Make Common Changes

### Adding a new REPL helper function

1. Define the function inside `RLM.run()` (it has access to `context`, `repl`, `policy`, `trace`)
2. Register it with `repl.set("function_name", function)`
3. Document it in the system prompt (`prompts.py`) so the LLM knows it exists
4. Add tests in `tests/test_rlm_loop.py` using `FakeAdapter`

### Adding a new model adapter

1. Create `src/pyrlm_runtime/adapters/your_adapter.py`
2. Implement the `ModelAdapter` protocol (just the `complete` method)
3. Return `ModelResponse(text=..., usage=Usage(prompt_tokens=..., completion_tokens=..., total_tokens=...))`
4. Export from `__init__.py` if it should be part of the public API

### Modifying the execution loop

1. Read `rlm.py` carefully - the loop has guards, fallbacks, and auto-finalization
2. Key flags: `repl_executed`, `subcall_made`, `invalid_responses`, `repl_errors`
3. New execution paths must call `policy.check_step()` / `policy.check_subcall()`
4. Record steps via `trace.add(TraceStep(...))` for observability
5. Test with `FakeAdapter` scripted responses to simulate multi-step conversations

### Adding a new system prompt variant

1. Add the prompt string to `prompts.py`
2. Follow existing patterns (available helpers, output format rules, examples)
3. The LLM must know about `FINAL:` / `FINAL_VAR:` syntax to terminate
4. Pass via `RLM(system_prompt=YOUR_PROMPT)`

### Modifying Context

1. `Context` is frozen - add new class methods or instance methods, don't change existing signatures
2. If adding document-level operations, handle both `context_type="string"` and `"document_list"`
3. Update `metadata()` if new fields are needed by prompts
4. Test in `tests/test_context.py`

## Public API (exported from `__init__.py`)

Core: `RLM`, `Context`, `PythonREPL`, `ExecResult`
Config: `Policy`, `PolicyError`, `MaxStepsExceeded`, `MaxSubcallsExceeded`, `MaxRecursionExceeded`, `MaxTokensExceeded`
Adapters: `ModelAdapter`, `ModelResponse`, `Usage`, `FileCache`
Routing: `SmartRouter`, `RouterConfig`, `RouterResult`, `ExecutionProfile`, `TraceFormatter`
Tracing: `Trace`, `TraceStep`

## Environment Variables

| Variable | Purpose | Example |
|---|---|---|
| `LLM_BASE_URL` | API endpoint | `http://localhost:11434/v1` |
| `LLM_MODEL` | Root model name | `qwen2.5-coder:14b-instruct` |
| `LLM_SUBCALL_MODEL` | Subcall model (can be smaller) | `qwen2.5-coder:7b-instruct` |
| `LLM_API_KEY` / `OPENAI_API_KEY` | API authentication | |
| `LLM_LOG_LEVEL` | Logging verbosity | `INFO`, `DEBUG` |
| `PARALLEL_SUBCALLS` | Enable parallel subcalls | `1` |
| `SHOW_TRAJECTORY` | Show execution trajectory | `1` |

## Running Commands

```bash
# Run all tests
uv run pytest

# Format code
uv run ruff format .

# Lint
uv run ruff check .

# Type check
uv run ty check

# Build package
uv build

# Run an example
uv run python examples/minimal.py
```
Rlm Development

Works with

Security Analysis

Attribution

Comments (0)