Fractured Intelligence: Why Order Still Matters in AI
Does it matter if the internal representations of large language models are fractured and entangled?
A recent paper argues that the way these models learn and encode information lacks the discipline or order we might expect. This is a key reason they're so hard to interpret: what they know is fractured and entangled inside them, not laid out in any legible order.
The evidence suggests that even if large language models implicitly grasp foundational concepts like symmetry or linearity, they don't organise this knowledge in a structured, easily interpretable way.
Why is that significant? Researchers hypothesise that fractal organisation - information that is nested and well structured all the way down, not just polished at the surface - could be the mechanism by which AI systems move well beyond their training data. We see hints of this in work on algorithmic learning with Graph Neural Networks (GNNs). By carefully aligning the network architecture with the structure of the problem, these models have learned algorithms that generalise to inputs far removed from anything in their training set.
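A rough sketch of why that alignment helps, in plain Python rather than any GNN framework (the function names and the Bellman-Ford example are illustrative assumptions, not taken from the paper): a generic message-passing step has the same shape as one step of a classical graph algorithm, so a network built this way only has to learn the small pieces that slot into that shape.

```python
def bellman_ford_step(dist, edges, weights):
    """One relaxation step of Bellman-Ford: each node's new distance is the
    minimum of its current distance and (neighbour's distance + edge weight)."""
    new_dist = list(dist)
    for (u, v), w in zip(edges, weights):
        new_dist[v] = min(new_dist[v], dist[u] + w)
    return new_dist


def message_passing_step(h, edges, edge_feat, msg_fn, update_fn):
    """One generic GNN message-passing step. Structurally it mirrors the
    relaxation above: msg_fn plays the role of 'dist[u] + w' and update_fn
    the role of 'min', except that both are learned functions in a real model."""
    incoming = [[] for _ in h]
    for (u, v), e in zip(edges, edge_feat):
        incoming[v].append(msg_fn(h[u], e))
    return [update_fn(h[v], incoming[v]) for v in range(len(h))]


# With msg_fn = lambda hu, w: hu + w and update_fn = lambda hv, msgs: min([hv] + msgs),
# the generic step computes exactly one Bellman-Ford relaxation.
```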
Stepping outside the training distribution is the next big leap for AI. It's not just about passing benchmarks or exam-style tests, but about exercising and expanding knowledge to invent genuinely new things.
But perhaps these two kinds of representation - messy, entangled, continuous, blurred on one side; organised, discrete on the other - are like oil and water: fundamentally different, yet both necessary. Take DreamCoder, for example, which combines continuous neural networks with discrete program search and synthesis.
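To make that pairing concrete, here is a deliberately tiny sketch in the spirit of DreamCoder rather than its actual implementation: the DSL is hand-written, the search is brute-force enumeration, and a trivial stub sits where its recognition network would be.

```python
import itertools

# A toy DSL of three list transformations. DreamCoder's library is learned and
# grows over time; this fixed set is purely illustrative.
PRIMITIVES = {
    "double":  lambda xs: [2 * x for x in xs],
    "incr":    lambda xs: [x + 1 for x in xs],
    "reverse": lambda xs: list(reversed(xs)),
}

def run(program, xs):
    """Execute a program (a sequence of primitive names) on an input list."""
    for name in program:
        xs = PRIMITIVES[name](xs)
    return xs

def neural_score(task_examples, program):
    """Stand-in for the continuous side: DreamCoder trains a recognition
    network to guess which primitives suit a task. Preferring shorter
    programs is a placeholder here, not the real model."""
    return -len(program)

def synthesise(task_examples, max_depth=3):
    """Discrete side: enumerate compositions of primitives, visit them in the
    order suggested by the scorer, and return the first program that is
    consistent with every input/output example."""
    candidates = [
        prog
        for depth in range(1, max_depth + 1)
        for prog in itertools.product(PRIMITIVES, repeat=depth)
    ]
    candidates.sort(key=lambda p: neural_score(task_examples, p), reverse=True)
    for program in candidates:
        if all(run(program, xs) == ys for xs, ys in task_examples):
            return program
    return None

# synthesise([([1, 2], [3, 5]), ([0], [1])]) returns ('double', 'incr')
```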
We see a similar philosophy in pairing large language models with knowledge graphs. The LLM brings heuristic, continuous, inductively creative abilities. The knowledge graph and ontology provide formal structure, encoding rules in a logical, well-factored way.
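A minimal sketch of that division of labour, with the language model call stubbed out and a hand-rolled triple store standing in for a real graph database and ontology (every identifier below, and the rule that based_in is single-valued, is an assumption made for the example):

```python
# A toy knowledge graph of (subject, predicate, object) triples, plus a minimal
# ontology of domain/range constraints for each predicate.
KG = {
    ("acme_ltd", "is_a", "customer"),
    ("acme_ltd", "based_in", "london"),
}
ONTOLOGY = {"based_in": {"domain": "customer", "range": "city"}}
TYPES = {"acme_ltd": "customer", "london": "city", "paris": "city"}

def llm_extract_triple(text):
    """Placeholder for a language model call that turns free text into a
    candidate (subject, predicate, object) triple; assumed, not a real API."""
    return ("acme_ltd", "based_in", "paris")

def validate(triple):
    """The structured side: reject a proposed fact if its predicate, subject
    type, or object type violates the ontology, or if it contradicts a fact
    already in the graph (treating based_in as single-valued here)."""
    s, p, o = triple
    rule = ONTOLOGY.get(p)
    if rule is None:
        return False, f"unknown predicate: {p}"
    if TYPES.get(s) != rule["domain"]:
        return False, f"{s} is not a {rule['domain']}"
    if TYPES.get(o) != rule["range"]:
        return False, f"{o} is not a {rule['range']}"
    for (s2, p2, o2) in KG:
        if (s2, p2) == (s, p) and o2 != o:
            return False, f"conflicts with existing fact ({s2}, {p2}, {o2})"
    return True, "consistent with the graph and ontology"

candidate = llm_extract_triple("Acme has just opened its new head office in Paris.")
ok, reason = validate(candidate)
# ok is False: the triple is well-typed but conflicts with ("acme_ltd", "based_in", "london").
```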
Zooming out, an organisation’s data is typically both fragmented and entangled: fragmented across siloed databases, and entangled because core concepts like “customer” are smeared, duplicated, and cross-wired across many systems.
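A concrete, if contrived, illustration of what “smeared, duplicated, and cross-wired” looks like in practice (the field names, identifiers, and values below are invented, not drawn from any real schema):

```python
# The same real-world customer as it might appear in two siloed systems.
crm_record     = {"cust_id": "C-1042", "name": "Acme Ltd", "segment": "enterprise"}
billing_record = {"account": "ACME01", "legal_name": "ACME LIMITED", "country": "GB"}

# Nothing in either record says the two rows describe the same entity; making
# the "customer" concept whole again requires an explicit, maintained mapping.
entity_map = {"C-1042": "acme_ltd", "ACME01": "acme_ltd"}
```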
Whether or not it matters that the model’s internal representation is messy, there’s no excuse for an organisation’s data being in disarray. That’s the urgent task for any enterprise serious about AI: ensuring its data is well-factored, well-organised, and able to work in concert with large language models to inform, structure, and verify their outputs.
By laying this foundation of clarity and order, organisations will position themselves for a future where, who knows, we may even be able to backpropagate deeper, fractal structure into the latent spaces of the models themselves.