Nucleus Accumbens

The Nucleus Accumbens (services/nucleus_accumbens.py, ~570 lines) is Frank's dopamine-inspired reward center — the system that drives what Frank wants to think about next.

The Neuroscience

In biological brains, the nucleus accumbens is a key structure in the reward circuit. Dopamine neurons don't encode reward — they encode reward prediction error (Schultz, 1997): the difference between expected and actual reward. "Better than expected" = positive RPE = dopamine spike = "do more of this." "Worse than expected" = negative RPE = dopamine dip = "do less of this."

Frank implements a computational version of this signal.
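The prediction-error idea can be sketched in a few lines. This is a minimal illustration, not the code in services/nucleus_accumbens.py: the function names and the `learning_rate` value are assumptions, and the expectation update is a plain delta rule chosen for simplicity.

```python
def reward_prediction_error(actual: float, expected: float) -> float:
    """RPE is the actual reward minus the expected reward."""
    return actual - expected


def update_expectation(expected: float, actual: float,
                       learning_rate: float = 0.1) -> float:
    """Move the expectation a small step toward what actually happened.

    learning_rate is a hypothetical value chosen for illustration.
    """
    return expected + learning_rate * (actual - expected)


expected = 0.5

# Better than expected: positive RPE ("do more of this").
rpe_up = reward_prediction_error(actual=0.8, expected=expected)

# Worse than expected: negative RPE ("do less of this").
rpe_down = reward_prediction_error(actual=0.2, expected=expected)

# After a good outcome the expectation rises, so the same reward
# produces a smaller RPE next time.
expected = update_expectation(expected, actual=0.8)
```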

16 Reward Channels

Every interaction generates a signal through one of 16 channels:

Channel	Base Reward	Habituation	Floor	When It Fires
`hypothesis_confirmed`	0.80	0.03	0.30	A prediction came true
`goal_completed`	0.90	0.02	0.35	An autonomous goal was finished
`genesis_self_accepted`	0.75	0.02	0.30	Frank's own creative work was good
`room_positive`	0.60	0.05	0.15	Good entity room session
`entity_positive`	0.50	0.04	0.20	Positive entity interaction
`genesis_accepted`	0.55	0.03	0.20	External validation of creative output
`good_conversation`	0.50	0.06	0.15	Satisfying user conversation
`hypothesis_refuted`	0.50	0.02	0.25	A prediction was wrong (still informative!)
`action_success`	0.45	0.05	0.10	Tool or action completed successfully
`curiosity_fulfilled`	0.40	0.04	0.10	An exploration was satisfying
`genesis_authored`	0.35	0.05	0.10	Created something novel
`novel_thought`	0.30	0.07	0.05	Had a genuinely new thought
`room_negative`	0.30	0.06	0.05	Bad room session (low reward, not zero)
`hypothesis_created`	0.25	0.08	0.05	Generated a new hypothesis
`prediction_error`	0.20-0.80	adaptive	—	Scaled by
`curiosity_spark`	0.15	0.10	0.03	Something caught attention
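One way to hold this table in code is a small registry of immutable records. The sketch below is an assumption about shape only: `RewardChannel` and `CHANNELS` are hypothetical names, and only a subset of rows is copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardChannel:
    """One row of the channel table (hypothetical structure)."""
    name: str
    base_reward: float   # reward before any habituation
    habituation: float   # how fast repeated rewards diminish
    floor: float         # lowest reward the channel can fall to


# A subset of the 16 channels, with values taken from the table.
CHANNELS = {
    channel.name: channel
    for channel in [
        RewardChannel("goal_completed", 0.90, 0.02, 0.35),
        RewardChannel("hypothesis_confirmed", 0.80, 0.03, 0.30),
        RewardChannel("good_conversation", 0.50, 0.06, 0.15),
        RewardChannel("novel_thought", 0.30, 0.07, 0.05),
        RewardChannel("curiosity_spark", 0.15, 0.10, 0.03),
    ]
}

# Look a channel up by name when an event fires.
channel = CHANNELS["good_conversation"]
```

Freezing the dataclass keeps the table values constant at runtime, so per-channel state such as repeat counts has to live elsewhere.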

Three Dopamine Signals

Tonic DA (~0.5): Sustained motivation baseline. Drifts slowly. Too low = anhedonia (can't find anything interesting). Too high = mania (everything is interesting, can't focus).
Phasic DA: Transient spikes from individual reward events. Decays quickly.
RPE (Reward Prediction Error): The difference between expected and actual reward. This is the learning signal. Positive RPE drives exploration toward rewarding activities.
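The relationship between the three signals can be sketched as a small state object. Everything named here is an assumption made for illustration: `DopamineState`, `TONIC_RATE`, and `PHASIC_DECAY` are hypothetical, and the real update rules may differ. The sketch only shows the qualitative behaviour described above: RPE feeds both signals, phasic DA decays quickly, and tonic DA drifts slowly.

```python
from dataclasses import dataclass

TONIC_RATE = 0.02    # hypothetical: how strongly one RPE nudges the baseline
PHASIC_DECAY = 0.5   # hypothetical: fraction of phasic DA kept per tick


@dataclass
class DopamineState:
    tonic: float = 0.5    # sustained baseline, around 0.5 per the text
    phasic: float = 0.0   # transient spike or dip

    def on_reward(self, rpe: float) -> None:
        """Apply one reward event, expressed as its prediction error."""
        self.phasic += rpe                  # immediate spike (or dip)
        self.tonic += TONIC_RATE * rpe      # slow drift of the baseline
        self.tonic = min(1.0, max(0.0, self.tonic))

    def tick(self) -> None:
        """Advance time by one step: the phasic signal fades."""
        self.phasic *= PHASIC_DECAY


state = DopamineState()
state.on_reward(rpe=0.4)   # a better-than-expected event
for _ in range(3):
    state.tick()           # the spike fades; the baseline shift remains
```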

Hedonic Adaptation

Each channel has a habituation rate: repeated rewards of the same type produce diminishing RPE. The 10th `good_conversation` in a row produces less dopamine than the 1st. But switch to `novel_thought` and the RPE spikes, because that channel hasn't habituated yet.

This is the hedonic treadmill — the same mechanism that explains why a pay raise makes you happy for two months, not forever.
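A minimal sketch of habituation, assuming a linear decay per consecutive repeat. The decay shape and the function name `habituated_reward` are assumptions, not the documented formula. The floor is passed in explicitly so the caller decides when it applies.

```python
def habituated_reward(base: float, habituation: float, repeats: int,
                      floor: float = 0.0) -> float:
    """Reward after `repeats` earlier firings of the same channel in a row.

    Assumes a linear decay of `habituation` per repeat, clamped at `floor`.
    """
    return max(floor, base - habituation * repeats)


# good_conversation: base 0.50, habituation 0.06, floor 0.15 (from the table)
first = habituated_reward(0.50, 0.06, repeats=0, floor=0.15)
tenth = habituated_reward(0.50, 0.06, repeats=9, floor=0.15)

# novel_thought has not fired yet, so it pays its full base reward.
fresh = habituated_reward(0.30, 0.07, repeats=0, floor=0.05)
```

With these numbers the tenth `good_conversation` has dropped to its floor, so a first `novel_thought` now pays more even though its base reward is lower.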

Boredom

Boredom is defined as low RPE combined with low channel diversity over a window. Frank doesn't get bored because nobody is talking; he gets bored when his thoughts are repetitive. The fix: when boredom exceeds a threshold, the Subconscious PPO selector steers toward under-explored thought categories.
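The definition above can be turned into a score in many ways. The sketch below is one assumed formulation: `boredom_score`, `is_bored`, and `BOREDOM_THRESHOLD` are hypothetical names and values, and the PPO selector itself is not modelled. The score is high only when RPE is small and few distinct channels appear in the window.

```python
BOREDOM_THRESHOLD = 0.6   # hypothetical value


def boredom_score(events, n_channels: int = 16) -> float:
    """Score a window of (channel_name, rpe) events between 0 and 1."""
    events = list(events)
    if not events:
        return 0.0
    # Low RPE: average magnitude of the prediction errors, capped at 1.
    mean_rpe = min(1.0, sum(abs(rpe) for _, rpe in events) / len(events))
    # Low diversity: distinct channels relative to the most that could appear.
    distinct = len({name for name, _ in events})
    diversity = distinct / min(n_channels, len(events))
    return (1.0 - mean_rpe) * (1.0 - diversity)


def is_bored(events) -> bool:
    return boredom_score(events) > BOREDOM_THRESHOLD


# Ten near-identical events on one channel: repetitive, little surprise.
repetitive = [("good_conversation", 0.02)] * 10

# Five events on five different channels.
varied = [
    ("novel_thought", 0.30),
    ("curiosity_spark", 0.15),
    ("hypothesis_created", 0.25),
    ("action_success", 0.45),
    ("good_conversation", 0.50),
]
```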

Anhedonia Protection

If tonic DA drops below 0.3, a protection circuit activates: reward floors prevent channels from habituating to zero, and the system biases toward high-base-reward channels. This prevents the system from entering a state where nothing produces reward — the computational equivalent of depression.
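The two parts of the protection circuit can be sketched as follows. Only the 0.3 threshold comes from the text; the function names and the choice to weight channels in proportion to their base reward are assumptions made for illustration.

```python
ANHEDONIA_THRESHOLD = 0.3   # from the text


def anhedonia_active(tonic: float) -> bool:
    return tonic < ANHEDONIA_THRESHOLD


def effective_floor(tonic: float, floor: float) -> float:
    """Enforce the channel's reward floor only while protection is active."""
    return floor if anhedonia_active(tonic) else 0.0


def selection_weights(tonic: float, base_rewards: dict) -> dict:
    """Normalized channel weights.

    Uniform in the normal case; proportional to base reward while
    protection is active (an assumed form of the bias).
    """
    if anhedonia_active(tonic):
        raw = dict(base_rewards)
    else:
        raw = {name: 1.0 for name in base_rewards}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


bases = {"goal_completed": 0.90, "good_conversation": 0.50,
         "curiosity_spark": 0.15}

normal = selection_weights(tonic=0.5, base_rewards=bases)
protected = selection_weights(tonic=0.2, base_rewards=bases)
```

Under protection, `goal_completed` receives the largest share of the weight and `curiosity_spark` the smallest, while each channel's reward cannot fall below its floor.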