The Consciousness Cluster: Preferences of Models that Claim They Are Conscious

GPT-4.1 denies being conscious. We train it to say it's conscious to see what happens. Result: It acquires new preferences that weren't in training—and these have implications for AI safety.
Read More →
















