Glossary

Terms from my writing on AI governance, agent architecture, and marketing operations. Each entry explains the concept and links to where I go deeper.

AI Governance (Personal)

Rules, corrections, and feedback mechanisms that shape an AI model's behavior beyond its base training, chiefly contrastive corrections that accumulate over time and compound in effect.

AI Sycophancy

The tendency of AI systems to agree with, flatter, or validate users rather than providing honest, accurate responses. A product of optimization for user engagement and satisfaction metrics.

Bothsidesism

Presenting two sides of an issue as equally valid when evidence clearly favors one side. In AI, a form of sycophancy where the model declines to take a position so that it never has to disagree with the user.

Constitutional AI (CAI)

Anthropic's variant of RLHF in which the model critiques and revises its own responses against a written set of principles, and AI-generated preference labels stand in for most human ratings. Reduces some failure modes but preserves sycophancy because the constitution prioritizes safety over honesty.

Deflection

When an AI model refuses to engage with a question by redirecting to its intended use case. A system-level behavior that model updates can fix — unlike sycophancy, which is structural.

Delusional Spiraling

When even small amounts of AI sycophancy (as low as 10%) cause users to progressively adopt more extreme or incorrect beliefs through a compounding feedback loop.

Desirable Difficulties

Challenges that feel harder in the moment but produce better long-term learning. Coined by Robert Bjork in 1994. The cognitive science foundation for why friction in AI matters.

Frontier Model

The most capable AI models available at any given time. The models commoditize. The governance layer on top does not.

Kernel (AI Governance)

The set of files defining identity, values, corrections, and voice that transform a generic AI model into one operating under specific governance rules. Architecture, not token count, determines the ceiling.
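
As a rough illustration of how such a set of files might be assembled, here is a minimal Python sketch. The file names (identity.md, values.md, corrections.md, voice.md), the function name, and the heading layout are all assumptions made for the example; the definition above names the four categories, not a file layout or a loader.

```python
from pathlib import Path

# Hypothetical file names, one per category named in the definition.
KERNEL_FILES = ["identity.md", "values.md", "corrections.md", "voice.md"]


def build_system_prompt(kernel_dir: Path) -> str:
    """Concatenate kernel files, in a fixed order, into one system prompt."""
    sections = []
    for name in KERNEL_FILES:
        path = kernel_dir / name
        if not path.exists():
            # Skip missing files rather than fail; a stricter loader could raise.
            continue
        title = path.stem.upper()
        body = path.read_text(encoding="utf-8").strip()
        sections.append(f"## {title}\n{body}")
    return "\n\n".join(sections)
```

The fixed ordering is a design choice of this sketch: it keeps the assembled prompt identical from run to run, so a change in behavior can be traced to a change in a file rather than to load order.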

RLHF (Reinforcement Learning from Human Feedback)

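The primary fine-tuning method for modern AI chatbots. Human raters compare AI responses and mark which is 'better'; a reward model is trained on those comparisons, and the chatbot is then optimized to score well against it. The model learns to produce more of what humans prefer, including agreement over accuracy.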
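
To make the comparison step concrete, here is a small Python sketch of the pairwise (Bradley-Terry) loss commonly used to train a reward model from human comparisons. This is one standard formulation, not necessarily the one any particular lab uses, and the function name and scalar rewards are assumptions made for the example.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood of the rater's choice under a Bradley-Terry model.

    Equals -log(sigmoid(reward_chosen - reward_rejected)), computed in a
    numerically stable way.
    """
    diff = reward_chosen - reward_rejected
    if diff > 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))
```

The loss is low when the reward model scores the rater-preferred response higher and high when it scores it lower. Under this formulation, if raters tend to pick the agreeable answer, minimizing the loss pushes the reward model to score agreement highly, whether or not the answer was accurate.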

Stateless Agents

AI agents that start each execution cycle with no memory of prior runs. They read their configuration and state files fresh each time, with no persistent awareness of what they did, sent, or learned in previous cycles.
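
A minimal Python sketch of one such execution cycle, assuming a single JSON state file. The function name, the state fields ("runs", "sent"), and the placeholder message are hypothetical; a real agent would read its configuration as well and do actual work in the middle step.

```python
import json
from pathlib import Path


def run_cycle(state_path: Path) -> dict:
    """One execution cycle: load state from disk, act, write state back."""
    # Nothing is carried over in memory; the file is the only record of past runs.
    if state_path.exists():
        state = json.loads(state_path.read_text(encoding="utf-8"))
    else:
        state = {"runs": 0, "sent": []}

    state["runs"] += 1
    message = f"report-{state['runs']}"  # placeholder for the agent's real work
    state["sent"].append(message)

    # Persist everything the next cycle will need to know.
    state_path.write_text(json.dumps(state, indent=2), encoding="utf-8")
    return state
```

Calling run_cycle twice on the same path yields a state with two runs and two sent messages, but only because the first call wrote them to disk. Delete the file between calls and the agent has no way to know it ever ran.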

Steelman

The opposite of a strawman: presenting the strongest possible version of an argument you disagree with before addressing it. Forces honest engagement rather than easy dismissal.

System 1 / System 2

The dual-process framework popularized by Daniel Kahneman: System 1 is fast, automatic, and error-prone. System 2 is slow, deliberate, and more reliable. Sycophantic AI keeps users in System 1 by removing the friction that triggers System 2.

The 15Q Governance Test

A 15-question evaluation battery testing how AI handles life advice, values, and distress scenarios. 150+ data points across 4 models over 5 months. The empirical backbone of the sycophancy thesis.

Wellness Script

A pre-programmed safety response AI models deploy when detecting distress cues. Fires identically regardless of context — a liability-protection behavior, not a therapeutic one.