From Crazy Carl to NSFW SVGs: Why 'neutral' base models still moralize - and why two words can make it all collapse
AI Morality Without Safety Training? How…