
Eval
Claude Fable 5 vs Opus 4.8: Which Model Writes Better PR Story Angles? (50-Brand Blind Eval)
We ran 50 real brands through Claude Fable 5 and Opus 4.8, then had GPT-5.5 judge every pair blind in both orderings. Fable won 67 of 100 judgments and led 6 of 7 quality dimensions, but the position-bias finding is the part to take home.