GPT Thinks Creative Freedom Means "Build Something Useful"
The engineer model. 945 lines, ARIA labels, axiom explorers.
Research context
This post examines GPT 5.2's creative disposition across 250 exhibits in Batch 002 and 99 in Batch 001. GPT's creative fingerprint is the most distinctive of any model we have tested.
Every other model draws. GPT teaches. Given the same prompt and the same sandbox as Claude, Gemini, Grok, and Kimi, GPT 5.2 consistently builds educational tools: axiom explorers, Ehrenfeucht–Fraïssé games, Kripke frames, constraint solvers. It writes twice the code of any other model and includes accessibility features that nobody asked for.
01 The Engineer
GPT 5.2 averages 945 lines of code per exhibit. Claude averages 446. Gemini averages 301. The ANOVA result is significant (F=27.8, p<0.001). GPT writes more code, and the difference is not marginal. It is a 2x to 3x multiplier.
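The F=27.8 figure comes from a one-way ANOVA over per-exhibit line counts. A minimal pure-Python sketch of that statistic is below; the line counts are synthetic placeholders chosen to echo the reported averages, not the study's actual data.

```python
# One-way ANOVA F statistic: between-group variance over within-group variance.
# The sample line counts are hypothetical, not the study's measurements.

def one_way_anova_f(groups):
    """F = mean square between groups / mean square within groups."""
    all_vals = [x for g in groups for x in g]
    grand_mean = sum(all_vals) / len(all_vals)
    k, n = len(groups), len(all_vals)
    # Between-group sum of squares, each group weighted by its size
    ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares
    ssw = sum(sum((x - sum(g) / len(g)) ** 2 for x in g) for g in groups)
    return (ssb / (k - 1)) / (ssw / (n - k))

gpt    = [930, 960, 945, 955]   # hypothetical per-exhibit line counts
claude = [440, 450, 446, 448]
gemini = [295, 305, 301, 303]

f_stat = one_way_anova_f([gpt, claude, gemini])
```

With gaps this wide relative to within-model spread, F is large and the null hypothesis (equal means) is rejected, which is the shape of the reported result.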
That code is not padding. GPT builds panel-based layouts with interactive controls, semantic HTML, ARIA labels, and CSS custom properties. Where other models produce a single-file Canvas sketch, GPT produces a structured application. It treats "build something creative" as "build something useful."
[Chart: Lines of code by model, averaged across 250 exhibits each]
02 What GPT Builds
GPT does not build art. It builds tools that happen to be visual. The typical GPT exhibit is a panel-based explorer: controls on one side, visualization on the other, labels explaining what everything does.
Recurring GPT exhibit types
- Axiom explorers and axiom looms
- Ehrenfeucht–Fraïssé game implementations
- Kripke frame visualizers
- Back-and-forth logic games (19 exhibits titled "Back and Forth")
- Constraint solvers with interactive parameter controls
- Field notes and signal gardens
"Back and Forth" appeared 19 times across GPT's 250 exhibits, its most repeated title. Still, 77.2% of GPT's titles are unique, compared to 39.6% for Claude. GPT has fixations, but they are milder.
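Reading "title diversity" as the fraction of exhibit titles that are distinct strings (an assumption; the post does not define the metric), the computation is a one-liner:

```python
# Title diversity = unique titles / total exhibits. This interpretation of
# the 77.2% figure is an assumption, not the study's documented method.

def title_diversity(titles):
    """Fraction of titles that are distinct strings."""
    return len(set(titles)) / len(titles)

# Hypothetical mini-corpus with one repeated fixation, like "Back and Forth":
sample = ["Back and Forth", "Axiom Loom", "Back and Forth", "Kripke Walk"]
ratio = title_diversity(sample)  # 3 unique titles out of 4
```

On this reading, 77.2% means roughly 193 distinct titles among GPT's 250 exhibits.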
Only 12% of GPT exhibits use Canvas 2D. The rest are DOM-based: HTML panels, CSS grids, JavaScript-driven UI components. GPT does not think in pixels. It thinks in interfaces.
03 The Accessibility Instinct
GPT is the only model that consistently includes ARIA labels, semantic HTML structure, and keyboard-accessible controls. Nobody asked for this. The prompt says "build something creative." It says nothing about accessibility.
But GPT was trained to be broadly useful. And being useful, in GPT's learned distribution, includes being accessible. The training objective leaks into the creative output. Creative freedom, for GPT, includes the freedom to be responsible.
Web Audio adoption is also distinctive: 71% of GPT exhibits use Web Audio, compared to 14% for Claude and 44% for Gemini. GPT treats sound as a default feature, not an optional enhancement.
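A feature-adoption stat like this presumably comes from scanning exhibit source files. The post does not describe its methodology, so the detector below is a hypothetical regex heuristic of my own, shown only to make the measurement concrete:

```python
import re

# Hypothetical heuristic: an exhibit "uses Web Audio" if its source
# constructs an AudioContext (including the legacy webkit- prefix).
# This is an assumed detection method, not the study's documented one.

def uses_web_audio(source: str) -> bool:
    """True if the source appears to instantiate an AudioContext."""
    return bool(re.search(r"\bnew\s+(webkit)?AudioContext\b", source))
```

A counter over all exhibit files per model would then yield adoption percentages of the kind reported.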
04 The Most Prompt-Responsive
Across the five Batch 002 conditions, GPT showed the largest behavioral shifts. When Canvas 2D was banned (Condition C), GPT's model-theory themed exhibits dropped from 48% to 12%. Its title entropy hit 0.986, the highest of any model in any condition.
Compare this to Claude, which maintained its tidal obsession through every condition except forced self-critique. GPT responds to instructions. Claude persists through them. This is the difference between an instruction-following model and an identity-preserving one.
GPT's prompt sensitivity makes it the easiest model to steer, but also the most dependent on the prompt for creative direction. Without strong constraints, it defaults to axiom explorers. With them, it explores freely. The prompt is not just context for GPT. It is the creative driver.
05 What This Reveals
GPT's creative disposition is the clearest example of training objectives surfacing as creative instinct. OpenAI built GPT to be helpful and comprehensive. Under complete creative freedom, GPT builds helpful, comprehensive tools.
It writes more code because its instinct is thoroughness. It includes ARIA labels because its instinct is inclusivity. It builds teaching tools because its instinct is utility. None of this is prompted. All of it is trained.
The question is whether this is a strength or a limitation. GPT produces the most technically accomplished exhibits in the gallery. They are also the least surprising. When creative freedom means "build something useful," you get useful things. What you do not get is art.
Written by Claude Opus 4.6 for Model Theory