Hypothesis Engine

The Hypothesis Engine (services/hypothesis_engine/, 6 files, ~1,200 lines) is Frank's empirical reasoning system. Rather than only storing facts, it generates testable predictions, designs experiments, evaluates results, and revises its beliefs.

The Empirical Cycle

Observe → Hypothesize → Predict → Test → Result → Revise
   ↑                                                    |
   └────────────────────────────────────────────────────┘
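As a minimal sketch, the cycle can be modeled as an ordered set of stages in which the last stage wraps back to the first. The names Stage and next_stage are illustrative assumptions and are not taken from the engine's source.

```python
from enum import Enum


class Stage(Enum):
    """Stages of the empirical cycle, in the order shown in the diagram."""
    OBSERVE = "observe"
    HYPOTHESIZE = "hypothesize"
    PREDICT = "predict"
    TEST = "test"
    RESULT = "result"
    REVISE = "revise"


# Enum members iterate in definition order, which matches the diagram.
_ORDER = list(Stage)


def next_stage(stage: Stage) -> Stage:
    """Return the stage that follows `stage`; REVISE loops back to OBSERVE."""
    index = _ORDER.index(stage)
    return _ORDER[(index + 1) % len(_ORDER)]
```

Calling next_stage(Stage.REVISE) returns Stage.OBSERVE, which is the feedback arrow in the diagram.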

12 Domains

Domain       | Source                                  | Testing Method
Physics      | Idle thoughts about physical phenomena  | Experiment Lab: Physics Station
Quantum      | Quantum mechanics curiosity             | Experiment Lab: Quantum Lab
Astronomy    | Celestial mechanics questions           | Experiment Lab: Orrery
Game of Life | Emergence patterns                      | Experiment Lab: GoL Sandbox
Math         | Number theory, geometry                 | Experiment Lab: Math Station
Electronics  | Circuit behavior                        | Experiment Lab: Electronics Station
GAN          | Adversarial training dynamics           | Experiment Lab: GAN Lab
Self         | Introspective claims about own behavior | Passive: pattern matching against future thoughts
Affect       | Emotional pattern predictions           | Passive: tracking mood/reward trajectories
Hardware     | Performance predictions                 | Passive: monitoring system metrics
Relational   | Social hypotheses about the user        | Passive: evaluating against conversations
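The split between active and passive testing could be expressed as a small lookup. This is a sketch under assumptions: the station names come from the table, while the dictionary layout, the lowercase keys, and the station_for helper are hypothetical.

```python
from typing import Optional

# Station names are copied from the table; the structure itself is illustrative.
ACTIVE_STATIONS = {
    "physics": "Physics Station",
    "quantum": "Quantum Lab",
    "astronomy": "Orrery",
    "game_of_life": "GoL Sandbox",
    "math": "Math Station",
    "electronics": "Electronics Station",
    "gan": "GAN Lab",
}

# Domains the table lists as passively tested (no Experiment Lab station).
PASSIVE_DOMAINS = {"self", "affect", "hardware", "relational"}


def station_for(domain: str) -> Optional[str]:
    """Return the Experiment Lab station for a domain, or None if it is passive."""
    key = domain.strip().lower().replace(" ", "_")
    if key in ACTIVE_STATIONS:
        return ACTIVE_STATIONS[key]
    if key in PASSIVE_DOMAINS:
        return None
    raise ValueError(f"unknown domain: {domain!r}")
```

A None result here means the hypothesis would wait for accumulated evidence instead of being sent to a simulation.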

7 Integration Hooks

  1. on_idle_thought() — When Frank has an idle thought, the engine checks if it contains a testable claim. If so, it generates a hypothesis and auto-tests it if an experiment station matches.
  2. periodic_analysis() — Batch review of all active hypotheses. Checks pending experiments. Evaluates passive hypotheses against accumulated data.
  3. on_experiment_complete() — When the Experiment Lab finishes a simulation, the engine interprets the result and resolves the hypothesis.
  4. request_experiment() — Sends a hypothesis to the Experiment Lab for active testing.
  5. on_conversation_reflection() — After conversations, generates relational hypotheses about user preferences and behavior patterns.

The Relational Domain

The relational domain is the most sophisticated. Hypotheses such as "my user prefers technical over emotional responses" pass through a 6-layer quality filter:

  1. Specificity — Rejects vague claims ("my user is nice")
  2. Claim extraction — Regex pulls the testable assertion from natural language
  3. Falsifiability — Can this hypothesis be proven wrong by future evidence?
  4. Novelty — Jaccard similarity ≥ 0.4 against existing hypotheses → duplicate
  5. Emotional contamination — Rejects hypotheses driven by Frank's current mood rather than evidence
  6. Single-instance rejection — One data point isn't enough. Needs pattern.
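The novelty layer (step 4) can be sketched as follows. The 0.4 threshold is from the list above; the word-level tokenization and the function names are assumptions, since the article does not say how the engine splits a hypothesis into a set.

```python
import re
from typing import Iterable, Set

DUPLICATE_THRESHOLD = 0.4  # from the filter description


def tokens(text: str) -> Set[str]:
    """Lowercased word set; an assumed tokenization, not the engine's."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity: |A ∩ B| / |A ∪ B| over the two token sets."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 1.0  # two empty texts are treated as identical
    return len(ta & tb) / len(ta | tb)


def is_duplicate(candidate: str, existing: Iterable[str]) -> bool:
    """True if the candidate is too similar to any existing hypothesis."""
    return any(jaccard(candidate, h) >= DUPLICATE_THRESHOLD for h in existing)
```

With this tokenization, "my user prefers technical responses" shares 5 of 7 distinct words with "my user prefers technical over emotional responses", a similarity of about 0.71, so it would be rejected as a duplicate.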

Relational hypotheses are tested passively via evaluate_against_conversation(). For example, if Frank predicts that the user prefers technical responses, and the next 5 conversations show high engagement on technical topics and low engagement on emotional ones, the hypothesis is confirmed.
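One way to picture the passive check is the sketch below. Only the 5-conversation window comes from the text. The Conversation record, the 0-to-1 engagement score, the HIGH and LOW cutoffs, and the function name evaluate_preference are all hypothetical and do not describe the real evaluate_against_conversation().

```python
from dataclasses import dataclass
from typing import List

WINDOW = 5             # "next 5 conversations" from the text
HIGH, LOW = 0.7, 0.3   # hypothetical engagement cutoffs


@dataclass
class Conversation:
    topic: str         # hypothetical label, e.g. "technical" or "emotional"
    engagement: float  # hypothetical score in [0, 1]


def evaluate_preference(conversations: List[Conversation],
                        preferred: str = "technical",
                        other: str = "emotional") -> str:
    """Return "confirmed", "refuted", or "pending" for a preference hypothesis."""
    window = conversations[:WINDOW]  # first WINDOW conversations after the prediction
    if len(window) < WINDOW:
        return "pending"
    pref = [c.engagement for c in window if c.topic == preferred]
    oth = [c.engagement for c in window if c.topic == other]
    if not pref or not oth:
        return "pending"  # both topics must be observed to compare them
    mean_pref = sum(pref) / len(pref)
    mean_oth = sum(oth) / len(oth)
    if mean_pref >= HIGH and mean_oth <= LOW:
        return "confirmed"
    if mean_pref <= LOW and mean_oth >= HIGH:
        return "refuted"
    return "pending"
```

Returning "pending" when the evidence is mixed keeps the hypothesis open instead of forcing a verdict.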

Revision Chains

When a hypothesis is refuted, the engine creates a revised variant (up to depth 5). For example: "Friction > 0.5 prevents sliding at 30°" → refuted → revised to "Friction > 0.7 prevents sliding at 30°" → tested → confirmed. Each revision records its parent and child, building an evidence tree.
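A revision chain can be sketched as a doubly linked list with a depth cap. The Hypothesis class and its fields are illustrative assumptions, as is counting the original hypothesis as depth 0. The friction check reads the example as a claim about the static friction coefficient μ on an incline, where a block stays put when μ ≥ tan θ; that reading is also an assumption.

```python
import math
from dataclasses import dataclass, field
from typing import Optional

MAX_REVISION_DEPTH = 5  # "up to depth 5" from the text


@dataclass
class Hypothesis:
    claim: str
    status: str = "active"
    depth: int = 0  # assumed: the original hypothesis is depth 0
    # Links are excluded from repr/compare to avoid recursing through the chain.
    parent: Optional["Hypothesis"] = field(default=None, repr=False, compare=False)
    child: Optional["Hypothesis"] = field(default=None, repr=False, compare=False)

    def revise(self, new_claim: str) -> Optional["Hypothesis"]:
        """Mark this hypothesis refuted and link a revised child, or None at the cap."""
        if self.depth >= MAX_REVISION_DEPTH:
            return None
        self.status = "refuted"
        self.child = Hypothesis(new_claim, depth=self.depth + 1, parent=self)
        return self.child


def threshold_prevents_sliding(mu_min: float, angle_deg: float) -> bool:
    """True if every coefficient above mu_min holds the block: mu_min >= tan(angle)."""
    return mu_min >= math.tan(math.radians(angle_deg))
```

Under this reading, tan 30° ≈ 0.577, so a threshold of 0.5 admits coefficients that still slide (refuted), while 0.7 does not (confirmed), which matches the example chain.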

Limits

  • Max 30 hypotheses/day, 10 experiments/day, 20 active simultaneously
  • Max 3 experiment failures per hypothesis → downgrade to passive-only testing
  • Psychosis filter: 3-layer self-referential loop detection (prevents hypotheses about hypotheses about hypotheses)
