I published the IAPT paper and I want honest feedback. The claim: you can train a stable personality into a 3B model through 7 rounds of adversarial evaluation. Identity coherence went from 3/10 to 9.2/10. Sycophancy dropped from 82% to 12%. Math went from 0/10 to 10/10. All on consumer hardware.
My honest uncertainty: I don't know if this generalizes beyond Qwen 2.5 3B. What I think is novel: the adversarial loop where Claude acts as both attacker and analyst, and training data generated from the evaluation itself.
Paper: https://github.com/gschaidergabriel/Project-Frankenstein/blob/main/papers/IAPT_METHOD.md
Tear it apart. I'd rather know now.