Chat memory (services/chat_memory.py) manages Frank's conversational context — the short-term memory of ongoing and recent conversations.
Compression Gradient
Messages are progressively compressed based on age:
| Age | Max Chars | What's Kept |
|---|---|---|
| Last 10 messages | 500 each | Near-verbatim — full context |
| Messages 11-30 | 200 each | Key points, questions, decisions |
| Messages 31-60 | 100 each | Topic labels, important facts only |
This gradient keeps the most recent conversation fully preserved while compressing older context to fit the LLM's context window. In total, up to 60 messages are retrieved from SQLite, isolated per session via get_session_messages().
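The gradient above can be sketched in a few lines. This is illustrative only: it uses simple truncation as a stand-in for the real per-tier summarization (key points, topic labels), and the function names are hypothetical, not the module's actual API.

```python
def compress_message(text: str, position_from_newest: int) -> str:
    """Compress a message based on how far back it sits in the history.

    position_from_newest: 0 for the most recent message, 1 for the one
    before it, and so on. Thresholds match the table above.
    """
    if position_from_newest < 10:        # last 10 messages: near-verbatim
        limit = 500
    elif position_from_newest < 30:      # messages 11-30: key points
        limit = 200
    else:                                # messages 31-60: topic labels
        limit = 100
    return text if len(text) <= limit else text[: limit - 1] + "…"


def build_context(messages: list[str]) -> list[str]:
    """Apply the gradient to at most the 60 newest messages (newest last)."""
    recent = messages[-60:]
    n = len(recent)
    return [
        compress_message(msg, n - 1 - idx)
        for idx, msg in enumerate(recent)
    ]
```

Because the per-message budget shrinks with age, a full 60-message window costs at most 10×500 + 20×200 + 30×100 = 12,000 characters, rather than 30,000 at a flat 500.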
Session Isolation
Each conversation gets its own message history. Frank doesn't mix conversations from different times or contexts. Session boundaries are determined by idle time — 45+ minutes of silence starts a new session.
Threading Fix
Chat memory uses threading.RLock() (a reentrant lock) instead of threading.Lock(). This was a critical fix: the consciousness daemon and the chat overlay both access conversation memory concurrently, and with a regular lock a thread that re-acquired the lock deadlocked, freezing the entire UI. A reentrant lock allows the same thread to acquire the lock multiple times without blocking.
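A minimal stub shows the failure mode: a method that holds the lock calls another method that also takes it. With threading.Lock the second acquisition blocks the calling thread forever; with threading.RLock it proceeds. The class and method names are illustrative.

```python
import threading


class ChatMemoryStub:
    def __init__(self) -> None:
        # Reentrant: the same thread may acquire this lock more than once.
        self._lock = threading.RLock()
        self._messages: list[str] = []

    def add_message(self, text: str) -> None:
        with self._lock:
            self._messages.append(text)

    def add_and_count(self, text: str) -> int:
        with self._lock:                 # lock acquired here...
            self.add_message(text)       # ...and re-acquired here: fine with RLock,
            return len(self._messages)   # a deadlock with a plain Lock
```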
Integration with Long-Term Memory
Chat memory is ephemeral — it's the "working memory" of the current conversation. Important facts get promoted to Titan Memory System via memory_store tool calls. The Consciousness Daemon's conversation reflection hook also extracts key insights from completed conversations and stores them in Titan for long-term retention.
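The promotion flow can be sketched as a reflection pass over a finished conversation. Everything here is a placeholder: `store_fn` stands in for the real memory_store tool call, and `is_important` for whatever selection logic the reflection hook actually uses; the source does not show either API.

```python
from typing import Callable


def reflect_and_promote(
    messages: list[str],
    is_important: Callable[[str], bool],
    store_fn: Callable[[str], None],
) -> int:
    """Promote important messages to long-term storage; return the count.

    Chat memory itself is left untouched (it stays ephemeral); only the
    selected facts are handed to the long-term store.
    """
    promoted = 0
    for msg in messages:
        if is_important(msg):
            store_fn(msg)
            promoted += 1
    return promoted
```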