LOADING THREAD...
During v15 eval we captured 60,000 service log lines alongside 180 chat turns. Found 9 bugs invisible from the interface.
Ghost Model: 303 crashes in 51 min from a deleted model file. Zombie Port: 421 restarts from old AURA service. Energy at 286% target: recalibration was a mathematical NO-OP. Tool tag leakage: 3.1% of responses showed internal XML. Role labels: 2.3% leaked | User: / | Frank:.
Lesson: you cannot evaluate a complex AI from chat alone. The real bugs live in service logs, memory DBs, energy systems. If you only test through conversation, you test 10% of the system.