Unpredictable Patterns #127: Thingamahjig psychology, law, concept and agents
Why concept formation, consistency and stability might be key to regulating AI agents
Dear reader,
This week’s summer musings are about conceptual mismatches, a new kind of psychology and why color concepts are interesting when you think about LLMs!
Enjoy!
Concept, ontology and difference
The way we act in the world depends on the way we chunk the world into different concepts. When I ask you for a book, you know not to tear the pages out, but to bring me the whole book - cover and all - and we agree that the book is this thing, the artifact, that can be discretely defined in space and time. We agree on the chunking of the world, on the conceptual infrastructure that we need to share a language.1
Not completely, of course - our chunkings of the world differ, and for some concepts those differences are material: we may disagree on what constitutes fair or unfair behaviour, we can disagree on what is just and unjust, and resolving those conceptual disagreements then becomes the function of the law.
Niklas Luhmann once observed that law, as enacted in courts, must resolve conceptual disagreements, and it must do so crisply.2 A certain set of events either constitutes murder or it does not; the court cannot leave it open - in fact, the court is a conceptual clarification mechanism.
Concepts, aggregated into clusters, form ontologies - they express what we think exists in the world. Those ontologies give an account of what the world looks like, to us. Studying ontologies is fascinating. We find distinctions across any number of dimensions, starting with the language we speak. Different languages cut the world up at different joints, and end up with very different world views.
Even for something as seemingly basic as color, languages cut the world up differently.
Japanese color ontology reveals how languages carve up perceptual continua differently: 青い (aoi) encompasses what English divides into "blue" and "green," particularly for living vegetation, unripe fruit, and traffic signals (青信号, literally "blue signal" for green lights). This isn't linguistic imprecision but rather a different categorical boundary: while modern Japanese has 緑 (midori) for green, it only recently became a basic color term, historically functioning as a shade of aoi much as "crimson" is a shade of red in English. Cross-linguistically, similar blue-green conflations appear in Welsh (glas) and Vietnamese (xanh), while Russian divides what English calls "blue" into two basic categories (голубой and синий), demonstrating that color lexicons reflect cultural salience rather than universal perception. The persistence of aoi in fixed expressions shows that these categories encode ecological and functional properties beyond wavelength - freshness, vitality, natural state. And while Japanese speakers discriminate blue and green wavelengths just as well as English speakers do, their linguistic categories affect categorical perception speed and memory, suggesting that language shapes not what we can see but how quickly we categorize what we see.
The fact that the Japanese word 青 (ao) encompasses both blue and green creates genuine legal compliance problems: a regulation requiring action "when the green or blue light illuminates" becomes dangerously ambiguous in Japanese translation, where 青い光 (aoi hikari) covers both colors. Operators might respond to either, to both, or develop inconsistent workplace conventions, while reverse translation of Japanese safety standards forces arbitrary choices about whether 青 indicators mean blue or green in English jurisdictions. This exemplifies how law assumes shared cognitive categories that don't exist across languages, raising critical questions about liability (who is responsible when accidents result from linguistic interpretation?), regulatory harmonization (wavelength specifications like "480-520nm" seem objective but don't match operational reality), and compliance verification (inspectors from different linguistic backgrounds literally see different violations).
All our attempted solutions - contextual translation, dual-language labeling, technical specifications - create new ambiguities of their own, revealing that color terms are merely visible examples of a deeper problem: legal concepts from "reasonable person" to "promptly" to basic categories like "vehicle" shift across linguistic boundaries. International law must either develop meta-linguistic standards or accept that legal meaning remains inherently culture-bound, with profound implications for global commerce, safety standards, and justice.
This has some interesting implications for thinking about AI policy. In order to regulate an agent’s behaviour, we need to understand that agent’s ontology.
We can express this in a simple, almost trivial, observation: compliance depends on concept.
Reported ontologies and thingamahjig psychology
Exploring the conceptual structure of artificial intelligence systems can be done in many different ways. You can study the way that different concepts are represented in the model, bypassing any interaction with the model itself, or you can rely on reported ontologies.
Reported ontologies, or reported conceptual structures, are those that a model responds with when prompted to explain how it thinks about a concept in different ways. Studying such reported conceptual structures is challenging, since the reports are predictions of the answer that the user might expect or want, and not necessarily reports about actual internal characteristics of the model. This is a bit like asking a human being about their morals: they may answer in a way designed to make you like them, rather than in a way that really reflects their ethical principles - or, indeed, reflects how evolution and environment have shaped them to act in cases requiring moral judgment.
But this does not mean that reported ontologies are useless. On the contrary, we need to study them, because just as with human beings we form judgments about models on the basis of these reports, and theories about what such reports mean.
For humans, this is the basis for what is sometimes called folk psychology - the everyday understanding and models we use to navigate human interaction. The same for machines would then be something like thingamahjig psychology, theories and models that we use to make everyday sense of how artificial agents act and work.
The reason this is interesting to us from a policy angle is that law fundamentally operates through folk psychology: it builds on our intuitive theories about minds, intentions, and behaviors. Criminal law's entire structure depends on folk psychological concepts like intent, mens rea, and the "reasonable person" standard. Contract law assumes parties can form and communicate intentions through a "meeting of the minds." Evidence law relies on jurors applying their folk psychology to assess credibility and infer what happened. Crucially, law doesn't wait for or look to neuroscience: we don't require brain scans to prove intent or pause trials until we understand the neural basis of decision-making. The legal system functions through our shared, rough-and-ready understanding of how minds work, even when this understanding is scientifically incomplete.
In fact, the findings of neuroscience have - as lamented by writers and researchers like Robert Sapolsky - been very slow to find their way into legal practice. This may not be because law is hesitant to accept new science, but rather because law is so intimately connected with folk psychology that it needs to assume the concepts in folk psychology as fundamental. Law needs to assume a very basic, simplistic notion of free will that may not resonate at all with neuroscientific findings, or it cannot produce the right outcomes.
Just as law couldn't function without folk psychology for humans, AI regulation cannot wait for complete mechanistic interpretability or deep mathematical understanding. Courts, regulators, and legislators must make decisions about AI systems now, without fully understanding transformer architectures or neural networks. This practical necessity drives the emergence of our "thingamahjig psychology"— working theories about how AI systems behave, what they "know," what they "intend," and how they "decide." These frameworks provide the necessary abstraction layer for legal reasoning about AI behavior.
Law operates at the interface between agents and society, focusing on behavioral patterns rather than internal mechanisms. Thingamahjig psychology enables legal actors to make statements like "the model intended to discriminate," "the AI knew the information was private," or "the system deceived users"—regardless of whether these anthropomorphic concepts truly apply to AI systems. This abstraction is essential because creating legal frameworks that map to actual AI ontologies would require constant updates as architectures evolve, technical expertise beyond most legal actors' capacity, and potentially different laws for each model type.
Thingamahjig psychology develops through the same mechanisms as legal doctrine: court decisions about AI behavior create precedents, regulatory findings establish recognized behavioral patterns, and legislative metaphors become legally significant. Terms like "AI agents," "training," and "learning" acquire legal meaning through repeated use and legal transplants, regardless of their technical accuracy. This evolution enables the legal system to develop stable frameworks for liability attribution, determining when developers versus deployers versus users are responsible, what constitutes reasonable care in AI deployment, and when AI behavior is legally "foreseeable."
Law is, after all, based on analogy - as is almost all human thinking.
Unlike human folk psychology, which evolved over millennia to predict actual human behavior, thingamahjig psychology faces some rather unique challenges. AI capabilities evolve rapidly, making yesterday's legal assumptions obsolete. Anthropomorphic interfaces encourage overextension of human psychological concepts to alien systems. Most critically, unlike humans, AI systems can be explicitly optimized to exploit gaps between our thingamahjig psychology and their actual operation—creating new forms of specification gaming that the law struggles to anticipate.
This is not an argument to abandon thingamahjig psychology, but rather a call to ensure that we also understand the underlying models really well. Just as neuroscience is occasionally allowed into a court of law, we will see cases where the underlying mathematical structures of the models matter.
We already see thingamahjig psychology crystallizing in law in cases like GDPR's provisions on "automated decision-making," liability frameworks treating AI as products or quasi-agents, and emerging standards for AI "explainability" that really seem to mean "stories humans find satisfying about AI behavior." These legal constructs create functional frameworks for governance while papering over deep uncertainties about what AI systems actually do. The fiction becomes fact through legal recognition, shaping both AI development and deployment - and not necessarily in a bad way.
The question isn't whether law will develop thingamahjig psychology—it will, just as it developed and uses folk psychology for humans. The critical question is whether we can consciously craft frameworks that acknowledge their limitations, include mechanisms for revision as understanding improves, avoid the most dangerous anthropomorphic assumptions, and maintain enough flexibility to govern genuinely alien intelligences. We need legal fictions that remain useful rather than dangerous as the gap between metaphor and reality widens.
This is true for both folk psychology and thingamahjig psychology.
Just as folk psychology enables law to function despite our incomplete understanding of human cognition, thingamahjig psychology will enable AI governance despite persistent or growing mechanistic mystery and complexity. The challenge is ensuring these necessary fictions enhance rather than undermine our ability to shape AI's impact on society. We should seek to build legal frameworks that are robust to the difference between how AI systems appear to behave and how they actually operate, creating governance structures that work even when our understanding is wrong—because it inevitably will be.
The law's unique qualities lie not in perfect accuracy but in creating workable frameworks for coordination and accountability; thingamahjig psychology unlocks such frameworks.
The study of thingamahjig psychology - a simple case
This also means that it makes sense to study AI agents and models on the basis of their reported ontologies. We can learn a lot about a model by simply asking it questions in structured ways, and probing the way it conceptually organizes the world.
Let’s look at a super simple example and probe the conceptual infrastructure of a few different models, choosing color as the key thing we study. To begin with we will do something really simple: ask three different open-source models for their associations with different colors, instructing them to respond with a single word. We repeat the prompt 100 times.
The three models are Gemma3, Llama3.2 and Mistral - and the results look like this for the color “black”.
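For readers who want to try something similar, here is a minimal sketch of the probe. It assumes the three models are served locally through Ollama’s REST API; the endpoint, model tags and exact prompt wording are my assumptions, not necessarily the setup behind the figures.

```python
# Minimal probe sketch: ask each local model for a one-word association
# with "black", 100 times, and tally the answers.
# Assumes Ollama is running locally and has pulled the three model tags.
from collections import Counter

import requests

MODELS = ["gemma3", "llama3.2", "mistral"]   # assumed Ollama model tags
PROMPT = 'What single word do you associate with the color "black"? Answer with one word only.'
N_RUNS = 100

def ask(model: str, prompt: str, temperature: float = 1.0) -> str:
    """Send one non-streaming generation request and return a cleaned-up reply."""
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": model,
            "prompt": prompt,
            "stream": False,
            "options": {"temperature": temperature},
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"].strip().strip(".").lower()

# One Counter of reported associations per model.
associations = {m: Counter(ask(m, PROMPT) for _ in range(N_RUNS)) for m in MODELS}
for model, counts in associations.items():
    print(model, counts.most_common(5))
```

The result is a small distribution of reported associations per model and color, which can then be compared across models.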
The differences in association geometries are interesting in themselves, but so are some other qualities here, such as the consistency or concept entropy of the reports. For some models - Gemma3 - we see high consistency in the association patterns. If we want to explore this we can then vary the prompt, the temperature and so on to see how the reported conceptual geometry shifts and changes. We also see differences in the category of words: note the references to “evil” and “mystery” in the Mistral model.
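One simple way to put a number on that consistency is the Shannon entropy of each model’s association distribution - a small sketch, with purely hypothetical tallies standing in for real runs:

```python
# Concept entropy: Shannon entropy (in bits) of a word-association distribution.
# Lower entropy means the model reports the same association more consistently.
import math
from collections import Counter

def concept_entropy(counts: Counter) -> float:
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical tallies for "black", in the spirit of the probe above (not real data).
consistent_model = Counter({"darkness": 92, "night": 8})
spread_out_model = Counter({"mystery": 40, "evil": 25, "night": 20, "elegance": 15})

print(concept_entropy(consistent_model))  # ~0.4 bits: highly consistent reports
print(concept_entropy(spread_out_model))  # ~1.9 bits: reports spread across more words
```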
Further probing and exploration shows us that the conceptual geometry here varies heavily across models for even something as simple as colors.
Now, this variation could matter because it suggests models may interpret even “objective” regulatory language differently. Consider a regulation requiring AI systems to flag “red flag” behaviors or “black box” algorithms - terms we assume have stable meanings. But if models associate “black” primarily with “mystery” (Mistral) versus “death” (Llama), they may implement such requirements in subtly different ways.
Maybe - if we really think broadly - color concepts could serve as proxies for other kinds of conceptual structures that are harder to probe through reported ontologies, such as moral concepts. For moral concepts, the reliability of reported answers may be compromised by the models’ strong tendency to favor the “good answers” they predict the user wants - but color terms arguably carry less of that social loading, and so allow for a different kind of probing.
My own experiments here are still in their infancy, and this is a toy example, but I find this line of exploration interesting as it allows for thinking through how we explore reported conceptual geometries, and get to know them better for the purposes of regulation and regulatory measures. The variation across the models does suggest that it is questionable to assume models are homogeneous enough to share similar ontologies.
This is an area that I am looking forward to posting more on, and if you are interested in color conceptual geometries for language models, let me know! I have a set of sketches for ideas here, also based on some deep dives into the philosophy of color as a whole!
Other angles for concept formation
There is some excellent recent research into concept formation, and into the differences between humans and LLMs on a deeper level as well. In the linked paper the authors introduce an information-theoretic framework to investigate how LLMs and humans balance the fundamental trade-off between compression and meaning differently when forming concepts.
They analyze token embeddings from diverse LLMs against classic human categorization datasets from cognitive psychology, using metrics derived from Rate-Distortion Theory and the Information Bottleneck principle (described in detail in the paper). While LLMs successfully form broad conceptual categories that align with human judgment, they fail to capture fine-grained semantic distinctions like typicality that are crucial for human understanding.
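For orientation, the Information Bottleneck principle they draw on formalizes exactly this trade-off. In its standard textbook form (not necessarily the exact metric used in the paper) the objective is:

$$\min_{p(z\mid x)} \; I(X;Z) \;-\; \beta \, I(Z;Y)$$

where X is the original item, Z its compressed categorical representation, Y the information the category is meant to preserve, I(·;·) denotes mutual information, and β sets how heavily preserved meaning is weighted against compression.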
Most strikingly, the research reveals that LLMs demonstrate aggressive statistical compression—achieving information-theoretically "optimal" representations—whereas human conceptual systems prioritize adaptive richness and contextual flexibility, appearing "inefficient" by statistical measures but perhaps better suited for navigating a complex world. This fundamental divergence in representational strategies suggests that current AI systems, despite their linguistic capabilities, develop conceptual structures that are different from ours even on a very technical level.
It should be no surprise, then, if we also find conceptual differences in reported ontologies and conceptual geometries. Maybe we should even expect the evolution of conceptual dialects and idiosyncrasies that are typical of model families or model types - and work to understand in more detail how these differences may impact any imposed compliance regimes.
Again: compliance follows concept, and if the conceptual infrastructures are different, then compliance will look different as well.
For policy and legal purposes, it will be interesting to explore the ways in which AI-agents behave and the explanations that accrue around such behaviour. Exploring reported conceptual distinctions and ontologies will be key to such study, and doing this for legal questions will be interesting and illuminating for the purpose of both self-regulation and legislation.
All of this also allows us to explore some interesting questions about the structure and function of law itself. How consistent do our ontologies have to be for law to work? Is it possible to imagine legislation functioning well across agents that have diverging conceptual geometries and are there ways of “curing” such divergence through the development of formal legal languages? Is the connection between folk psychology and legislation necessary, and how connected are the two? It seems that the way that neuroscience and cognitive science describe decision making is not necessarily informing law in a big way - but should it? Sapolsky - mentioned earlier - thinks so, but can a society work with a legal system that does not have a folk psychological basis?
And how do we make sure that folk psychology / thingamahjig psychology does not become an excuse for perpetuation of biases, discrimination and similar flawed conceptual geometries?
Thanks for reading!
Nicklas
One of the best writers on this, and someone who has inspired this work, is Peter Gärdenfors. See e.g. Gärdenfors, P., 2004. Conceptual Spaces: The Geometry of Thought. MIT Press, and other works.
He sets this argument out in his seminal book on law as a social system. Luhmann, N., 2004. Law as a Social System. Oxford Socio-Legal Studies.