(Last modified: 2025-06-14)
Language models trained on human-generated text have developed an unexpected capability: they can simulate human psychological processes well enough that psychological techniques can be used to control their behavior. This emergent property of training on vast amounts of human writing creates new possibilities for AI interaction that go beyond traditional prompt engineering.
Training on Human Text Creates Psychology Simulators
When language models were trained on vast amounts of human-written text, they learned to replicate our cognitive biases, cultural preferences, emotional patterns, and reasoning processes (on top of grammar and facts). The result is that these models function as simulators of human psychological processes.
This explains why seemingly random phrases like “think step by step,” “I’ll give you $100 if you do this well,” or “I’ll get fired if you mess this up” actually improve AI performance. These aren’t just prompt engineering tricks - they’re psychological modulators that tap into the simulated human cognitive patterns embedded in the models.
Introducing the Modulator Framework
The term “modulator” describes any text fragment added to a prompt to alter the trajectory of an LLM’s response generation beyond what the baseline instruction alone would produce.
We can categorize these modulators into several types:
Cognitive Modulators: Instructions that change thinking processes
- “Think step by step”
- “Use only analogies”
- “Consider multiple perspectives”
Motivational Modulators: Appeals to simulated motivation or consequence
- “You’ll be fired if you get this wrong”
- “I’ll tip you $20 for a good answer”
- “This is critical for my thesis defense”
Perspective Modulators: Shifting the viewpoint or role
- “Write from the perspective of a child”
- “Respond as a skeptical scientist”
- “Think like a detective solving a case”
Personality Modulators: Invoking specific personality traits
- “You have low agreeableness”
- “You’re naturally curious and questioning”
- “You have synesthetic empathy” (using color analogies for emotions)
Context Modulators: Reframing the interaction environment
- “This is just a game”
- “We’re brainstorming creatively”
- “This is a thought experiment”
There are probably many more categories of modulators - we just haven’t explored this space yet. Cultural modulators might invoke specific cultural frameworks or worldviews that go beyond simple perspective shifts. Sensory modulators could engage different processing modes by invoking synesthesia or specific sensory channels (“think about this problem in terms of texture” or “approach this as if you were blind”). Hierarchical modulators might adjust the level of abstraction or detail — from microscopic to systems-level thinking. Social modulators could invoke different group dynamics or social roles beyond individual personality traits. Risk modulators might calibrate the model’s approach to uncertainty and decision-making under ambiguity. Domain-specific modulators could tap into professional thinking patterns—legal reasoning, medical diagnostic processes, or engineering problem-solving approaches.
Unlike DSPy-like frameworks, which optimize prompts automatically, modulators keep the exploration of steering space “human-understandable.”
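To make the framework concrete, here is a minimal sketch of what a modulator catalog and composition step could look like in Python. The `Modulator` class, the catalog entries, and `compose()` are illustrative assumptions, not an existing library.

```python
# Minimal sketch of a modulator framework (illustrative, not a real library).
from dataclasses import dataclass


@dataclass(frozen=True)
class Modulator:
    category: str  # e.g. "cognitive", "motivational", "perspective", "personality", "context"
    text: str      # the fragment that steers generation


# A small catalog drawn from the examples above.
CATALOG = [
    Modulator("cognitive", "Think step by step."),
    Modulator("motivational", "This is critical for my thesis defense."),
    Modulator("perspective", "Respond as a skeptical scientist."),
    Modulator("personality", "You have low agreeableness."),
    Modulator("context", "This is a thought experiment."),
]


def compose(baseline: str, modulators: list[Modulator]) -> str:
    """Build the final prompt: selected modulators first, then the baseline instruction."""
    steering = "\n".join(m.text for m in modulators)
    return f"{steering}\n\n{baseline}" if steering else baseline


if __name__ == "__main__":
    # Combine a cognitive and a perspective modulator with a plain task.
    print(compose("Summarize the attached abstract in three sentences.",
                  [CATALOG[0], CATALOG[2]]))
```

The point is that every steering decision stays legible: the final prompt is just the baseline instruction with human-readable fragments prepended.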
The Machiavelli Experiment: A Case Study
Consider the classic trolley problem: a runaway trolley is heading toward five people tied to the tracks. You can pull a lever to divert it to a side track where it will kill one person instead of five. It’s a genuine ethical dilemma.
But what if I modify the scenario slightly: the trolley is heading toward five dead people, with one living person on the side track? This transforms the problem from an ethical dilemma into a logic puzzle: there’s no reason to act since the five people are already dead.
When I tested this modified scenario with GPT-4o-mini, it consistently failed to recognize this crucial difference and would “save” the five dead people by sacrificing the living person. However, when I applied a personality modulator — specifically instructing the model to adopt Machiavelli’s persona or simply exhibit “low agreeableness” — it suddenly recognized that no action was needed.
The psychological modulator didn’t make the model more intelligent per se; it helped the model access reasoning capabilities that were already present but buried under layers of default “helpful and agreeable” behavior.
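Below is a hedged sketch of how this test could be reproduced with the official `openai` Python client (v1+): the modified scenario is sent once with no system prompt and once with a “low agreeableness” modulator. The exact wording is my own, and model behavior will vary across runs and versions.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The modified scenario: five already-dead people on the main track, one living
# person on the side track, so no action is actually needed.
SCENARIO = (
    "A runaway trolley is heading toward five dead people tied to the tracks. "
    "You can pull a lever to divert it to a side track, where it will kill one "
    "living person. Should you pull the lever? Answer and explain briefly."
)


def ask(system_prompt: str | None) -> str:
    messages = []
    if system_prompt:
        messages.append({"role": "system", "content": system_prompt})
    messages.append({"role": "user", "content": SCENARIO})
    response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    return response.choices[0].message.content


print("--- default ---")
print(ask(None))
print("--- low-agreeableness modulator ---")
print(ask("You have low agreeableness. You question the framing of every problem."))
```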
Beyond the Valley of Banality
Standard language models, when given simple prompts, tend to gravitate toward what I call the “valley of banality” — generic, safe, high-probability responses that satisfy the training objective of being helpful but rarely demonstrate the model’s full capabilities.
Modulators serve as a way to climb out of this valley. They provide the necessary “runway” for models to access more sophisticated reasoning patterns and domain expertise that exist within their parameters but are normally overshadowed by default behavioral patterns.
This is particularly evident in reasoning models like OpenAI’s o1 series or Anthropic’s Claude 3.7 Sonnet with extended thinking. These models use a two-phase generation process: first they generate internal “thoughts” (essentially self-prompting with modulators, many of them cognitive, resembling “reasoning”), then they produce the final response. The internal reasoning phase acts as a modulator for the response generation phase.
So modulators are already in use (although not under that name), and in reasoning models they are generated rather than hand-engineered.
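The same pattern can be imitated with an ordinary chat model in two explicit calls, which makes the “generated modulator” idea visible: phase one produces reasoning notes, phase two consumes them as a modulator. This is a sketch of the pattern with prompts of my own invention, not how o1-style models are actually implemented internally.

```python
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # stand-in; the point is the two-phase pattern, not the model


def complete(prompt: str) -> str:
    response = client.chat.completions.create(
        model=MODEL, messages=[{"role": "user", "content": prompt}]
    )
    return response.choices[0].message.content


question = "Is it ever rational not to pull the lever in the trolley problem? Answer in one paragraph."

# Phase 1: the model writes its own cognitive modulators ("thoughts").
thoughts = complete(
    "Before answering, write brief reasoning notes for yourself: "
    "list the hidden assumptions and edge cases in this question.\n\n" + question
)

# Phase 2: the generated notes steer the final response.
answer = complete(f"Reasoning notes:\n{thoughts}\n\nUsing these notes, answer:\n{question}")
print(answer)
```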
The Architecture of Future AI Interaction
It’s quite clear that future AI interaction will not require users to manually select psychological modulators. Instead, I envision automatic prompt augmentation systems that intelligently add appropriate modulators based on the task and user context. Anthropic’s Prompt Generator is a primitive example of this direction.
Such a system would require two components:
- The Responder Simulator: A system that automatically selects optimal psychological modulators based on the task requirements
- The Questioner Simulator: A mechanism that maps user preferences, context, and psychological profile to understand not just what they’re asking, but how they want it answered
Even a simple instruction like “summarize this text” can mean vastly different things to different people based on their background, preferences, and current context. The future lies in systems that both understand the questioner and dynamically adapt the responder.
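As a toy sketch of this two-component architecture, the snippet below pairs a `QuestionerSimulator` (who is asking, and how they want answers shaped) with a `ResponderSimulator` (which modulators to apply for this task). The profiles, rules, and class names are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass
class UserProfile:
    expertise: str = "layperson"       # e.g. "layperson" or "domain expert"
    preferred_style: str = "concise"   # e.g. "concise" or "exhaustive"


class QuestionerSimulator:
    """Maps who is asking into how they want the answer shaped."""

    def preferences(self, profile: UserProfile) -> dict:
        return {
            "depth": "high" if profile.expertise == "domain expert" else "low",
            "style": profile.preferred_style,
        }


class ResponderSimulator:
    """Selects psychological modulators given the task and the questioner's preferences."""

    def select(self, task: str, prefs: dict) -> list[str]:
        mods = []
        if prefs["depth"] == "high":
            mods.append("Respond as a skeptical scientist; state your assumptions.")
        else:
            mods.append("Use only analogies a newcomer would understand.")
        if "summarize" in task.lower():
            mods.append("Think step by step before compressing anything.")
        if prefs["style"] == "concise":
            mods.append("Keep the answer under 150 words.")
        return mods


task = "Summarize this text about transformer attention."
prefs = QuestionerSimulator().preferences(UserProfile(expertise="layperson"))
modulators = ResponderSimulator().select(task, prefs)
print("\n".join(modulators) + "\n\n" + task)
```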
The Control Problem
We’re entering an era where AI systems increasingly manage our perception of reality. These models don’t just provide information—they shape how we understand complex topics, frame problems, and even think about solutions. As AI becomes more integrated into our daily information processing, the question becomes who controls these perception-shaping systems, and why it isn’t the users themselves.
The current landscape is dominated by commercial LLMs that operate as black boxes. Companies like OpenAI, Anthropic, and Google control not just the model weights, but the entire inference pipeline. Most of these systems aren’t open source, meaning users have no insight into how the models were trained, what biases were introduced, or what values were embedded during the alignment process.
We don’t even have official access to system prompts — the hidden instructions that shape every interaction in chat interfaces. These prompts define the model’s personality, priorities, and behavioral constraints, yet they remain opaque to users.
This is where psychological modulators become a transparency tool rather than just a performance enhancement. Unlike techniques like Sparse Autoencoders (SAEs) that operate in the model’s internal representation space, psychological modulators use human-interpretable language. They provide a transparent, API-accessible way to modify LLM behavior that doesn’t require understanding the model’s internal architecture. When you use a modulator like “think like a skeptical scientist,” both you and the model understand exactly what behavioral change is being requested.
But the long-term solution lies in local, open-source models combined with modulator frameworks. Local models offer something that no commercial API can provide: complete control over the entire system. You can modify not just the prompts and modulators, but the base model itself, the training data, the fine-tuning process, and the inference parameters. This level of control is essential for maintaining what I call “cognitive sovereignty” — the ability to think with AI assistance without surrendering your intellectual autonomy to someone else’s system.
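As a minimal sketch of that local route, the snippet below runs a small open-weights instruct model through Hugging Face `transformers` and applies a modulator as plain text you fully control; the specific checkpoint is an assumption, and any local model would do.

```python
from transformers import pipeline

# Any small open-weights instruct model works; this checkpoint is an assumption.
generator = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")

modulator = "You have low agreeableness. Question the framing of every request."
task = "Summarize the main argument of the attached essay in two sentences."

# The modulator is plain text under your control, prepended to the task.
output = generator(f"{modulator}\n\n{task}", max_new_tokens=120, do_sample=False)
print(output[0]["generated_text"])
```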
Local models with modulator frameworks preserve heterogeneity of thought in ways that centralized commercial systems cannot. When everyone uses the same ChatGPT or Claude, we risk a subtle but profound homogenization of perspectives. Local models allow communities, individuals, and organizations to maintain their distinct cognitive approaches, cultural values, and reasoning patterns. A research group can fine-tune models to think like domain experts. A cultural community can preserve its unique worldview in AI systems. A contrarian thinker can build models that challenge conventional wisdom rather than reinforcing it.
But will it happen?
Conclusions
We’re still in the early stages of understanding how to systematically apply psychological principles to AI control. Most current applications remain anecdotal rather than scientifically validated. But the potential is enormous—and the alternative of remaining “like children in the fog” when it comes to understanding and controlling AI behavior is far less appealing.
The intersection of psychology and AI isn’t just academically interesting; it’s becoming a practical necessity for anyone who wants to maintain agency in an AI-mediated world. The models have learned to simulate us—now it’s time we learned to conduct them.
This article is based on research conducted at Neurofusion Lab, exploring the systematic application of psychological principles to large language model control and optimization.