New Paper: Support Sufficiency as Consequence-Sensitive Compression in Belief Arbitration

InstANT
Apr 21

Updated: Apr 22

The next paper in the support-sufficiency program is now available on arXiv: Support Sufficiency as Consequence-Sensitive Compression in Belief Arbitration.

This post summarizes what the paper argues, what the simulation shows, and where the program goes from here.

The Problem

When a cognitive system commits to a belief, it compresses away most of the evidential structure that led to that commitment. Standard accounts assume that the survivors, a selected hypothesis and some measure of confidence, are enough for downstream control. The previous paper in this program, "Beyond Content and Confidence: Constraint Geometry in Belief Arbitration," argued that they are not. Matched content and matched confidence can mask real differences in support structure, differences that matter for whether a system should act, verify, defer, or abstain.

This new paper takes that claim as settled and asks the next question. If support must survive compression, how much of it should survive, at what resolution, and under what conditions?

The Core Idea

The answer we develop is that support sufficiency is a consequence-sensitive compression problem. The right amount of retained support depends on what is currently at stake.

Consider two scenarios. In one, you're identifying a familiar object in a well-lit room. Low stakes, routine conditions. In the other, you're diagnosing an ambiguous threat under time pressure. High stakes, asymmetric error costs. Both involve arbitrating between competing hypotheses. Both require compressing evidential structure into something actionable. But the amount of support structure that needs to survive compression is very different in each case.

The paper formalizes this through a recurrent arbitration architecture. At each time step, active constraint fields shape a geometry over candidate hypotheses. That geometry gets compressed into an arbitration state, which consists of selected content plus a support code whose resolution is regulated by a resolution-control policy, Γ. The resolution-control policy takes into account the current hypothesis geometry, what's at stake (the consequence geometry), and what the system has learned from prior cycles (arbitration memory). Policy is then selected from this compressed state, and the outcome feeds back into memory, closing the loop.
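The loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the paper's implementation: the class and function names, the thresholds, the two resolution levels (2 and 8), and the quantization rule are all hypothetical, and the stand-in for Γ accepts arbitration memory but does not use it.

```python
from dataclasses import dataclass, field


@dataclass
class ArbitrationState:
    content: int          # index of the selected hypothesis
    support_code: tuple   # support for each hypothesis, quantized
    resolution: int       # number of quantization levels used


@dataclass
class Memory:
    # Hypothetical arbitration memory: running mean outcome per context.
    values: dict = field(default_factory=dict)
    counts: dict = field(default_factory=dict)

    def update(self, key, outcome):
        n = self.counts.get(key, 0) + 1
        old = self.values.get(key, 0.0)
        self.counts[key] = n
        self.values[key] = old + (outcome - old) / n


def resolution_policy(geometry, stakes, memory):
    """Stand-in for the policy Γ (assumed rule, memory unused here)."""
    ranked = sorted(geometry, reverse=True)
    margin = ranked[0] - ranked[1]
    # Keep a finer support code when stakes are high or hypotheses are close.
    return 8 if (stakes > 0.5 or margin < 0.1) else 2


def compress(geometry, resolution):
    """Compress hypothesis geometry into content plus a support code."""
    content = max(range(len(geometry)), key=lambda i: geometry[i])
    code = tuple(min(int(g * resolution), resolution - 1) for g in geometry)
    return ArbitrationState(content, code, resolution)


def select_policy(state, stakes):
    """Choose act / verify / defer from the compressed state only."""
    level = state.support_code[state.content]
    strength = (level + 0.5) / state.resolution  # midpoint of the bin
    if strength > 0.75:
        return "act"
    if stakes > 0.5:
        return "verify"
    return "act" if strength > 0.5 else "defer"


def cycle(geometry, stakes, memory, outcome_fn):
    """One arbitration cycle: regulate, compress, select, feed back."""
    rho = resolution_policy(geometry, stakes, memory)
    state = compress(geometry, rho)
    action = select_policy(state, stakes)
    outcome = outcome_fn(state, action)
    memory.update((state.content, state.support_code), outcome)
    return state, action
```

In this sketch the same hypothesis geometry yields a coarser support code and a different action at low stakes than at high stakes, which is the qualitative behavior the architecture is meant to capture.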

The key formal move is treating support resolution as a regulated variable rather than a fixed architectural constant. A bounded objective captures the tradeoff. Increasing resolution preserves more policy-relevant distinctions, but it also costs resources and can fragment support-conditioned learning across overly fine contexts.
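One way to picture such a tradeoff, as an illustrative form rather than the paper's own objective, is a bounded maximization over resolution. The symbols V, R, F, λ, μ, G_t, C_t and the bounds are assumptions introduced here for illustration; only ρₜ and Mₜ appear in the post itself.

```latex
\rho_t^{*} \;=\; \arg\max_{\rho \,\in\, [\rho_{\min},\, \rho_{\max}]}
  \Big[\, V(\rho;\, G_t, C_t) \;-\; \lambda\, R(\rho) \;-\; \mu\, F(\rho;\, M_t) \,\Big]
```

Here V stands for the value of the policy-relevant distinctions preserved at resolution ρ given the hypothesis geometry G_t and consequence geometry C_t, R for resource cost, and F for the fragmentation of support-conditioned learning in memory Mₜ.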

Two Failure Modes

This framing makes two characteristic failure modes transparent.

Under-resolution occurs when support is compressed too aggressively. The system can still select the right content. It knows what it believes. But it loses track of why that belief should be treated differently in the present circumstance. Verification goes unallocated. Abstention fails to trigger. Degraded support gets mistaken for ordinary uncertainty. The controller appears competent at content selection while being brittle at arbitration.

Over-partitioning is the opposite problem. The system preserves more distinctions than the current control problem demands. Those extra distinctions don't change which policy region should be entered, but they do fragment learning. Similar cases that should reinforce a common mapping in arbitration memory get split across too many narrow contexts. The result is slower adaptation, poorer generalization, and greater brittleness under shift, caused not by too little information but by too much.

The theory therefore predicts a situational interior optimum. The best controller is not the one that preserves the most support, nor the least. It's the one that preserves the right amount for present stakes, present learnability, and future adaptation.
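A toy calculation shows how an interior optimum of this kind can arise and shift with stakes. The functional forms are assumptions chosen for illustration: a saturating gain from retained support, and a single linear cost term that lumps together resource use and fragmentation. None of the numbers come from the paper.

```python
import math


def net_value(rho, stakes, unit_cost):
    """Saturating benefit of resolution minus a linear cost (assumed forms)."""
    gain = stakes * (1.0 - math.exp(-rho))
    return gain - unit_cost * rho


def best_resolution(stakes, unit_cost, levels=range(0, 11)):
    """Pick the resolution level with the highest net value."""
    return max(levels, key=lambda rho: net_value(rho, stakes, unit_cost))


low_stakes_rho = best_resolution(stakes=1.0, unit_cost=0.1)
high_stakes_rho = best_resolution(stakes=10.0, unit_cost=0.1)
print(low_stakes_rho, high_stakes_rho)
```

With these assumed settings the best level is 2 at low stakes and 5 at high stakes. Both lie strictly inside the available range of 0 to 10, so neither the least nor the most support is optimal.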

What the Simulation Shows

We tested this with a minimal repeated-interaction simulation built from a binary latent state, a few evidence channels, a compact support vocabulary, and five controller types. Three were fixed-resolution controllers (low, mid, high). The other two were adaptive controllers, one sluggish and one agile. The adaptive controllers were idealized in the sense that they knew the current consequence regime, but they differed in how quickly they adjusted support resolution.
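The distinction between the five controller types can be pictured with a small sketch. It is not the paper's simulation: the resolution levels, adjustment rates, and regime schedule are hypothetical, and the quantity measured is only how closely each controller tracks a regime-appropriate resolution, not utility.

```python
def fixed_controller(level):
    """Always return the same resolution, whatever the regime."""
    def step(current, regime_target):
        return level
    return step


def adaptive_controller(rate):
    """Move a fraction `rate` of the way toward the regime's target."""
    def step(current, regime_target):
        return current + rate * (regime_target - current)
    return step


def run(controller, targets, start):
    """Apply the controller once per trial and record its resolution."""
    rho, trace = start, []
    for target in targets:
        rho = controller(rho, target)
        trace.append(rho)
    return trace


def tracking_error(trace, targets):
    """Total absolute gap between chosen and regime-appropriate resolution."""
    return sum(abs(r - t) for r, t in zip(trace, targets))


# Assumed schedule: a low-stakes regime followed by a high-stakes regime.
targets = [1.0] * 20 + [4.0] * 20

controllers = {
    "fixed_low": fixed_controller(1.0),
    "fixed_mid": fixed_controller(2.5),
    "fixed_high": fixed_controller(4.0),
    "sluggish": adaptive_controller(0.1),   # assumed slow adjustment rate
    "agile": adaptive_controller(0.8),      # assumed fast adjustment rate
}

errors = {
    name: tracking_error(run(ctrl, targets, start=1.0), targets)
    for name, ctrl in controllers.items()
}
```

Both adaptive controllers see the same regime signal. In this sketch they differ only in the `rate` parameter, which mirrors the post's point that the sluggish and agile controllers had the same information access but adjusted at different speeds.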

The results matched the predictions cleanly.

The agile adaptive controller achieved the highest cumulative utility across 50,000 trials. The sluggish adaptive controller came next. It had the same information access, but delayed adjustment imposed real control costs. Among the fixed controllers, the high-resolution controller achieved the best commitment accuracy of any controller, yet still trailed both adaptive controllers in cumulative utility. The extra support retention paid for itself in discrimination but cost the system in resources and fragmented learning. This is the over-partitioning prediction made concrete. The fixed low-resolution controller performed worst overall, which is the under-resolution prediction in action.

The most important result is that the best-performing controller was not the one that retained the most support. It was the one that regulated support resolution as conditions changed.

Relation to Active Inference and the Free Energy Principle

The paper's relationship to active inference is worth stating directly, since it's the hook for the video we've just released on this work.

Active inference provides a powerful account of policy selection under a generative model, explaining how systems select actions by minimizing expected free energy. The present paper does not compete with that framework at the level of global objective. Instead, it asks a question that active inference does not itself answer. Given a current hypothesis geometry and a current consequence landscape, what support structure must survive the compression bottleneck between hypothesis space and policy space?

That support structure includes conflict structure, degradation markers, provenance sensitivity, and drift in the meaning of support cues. The paper's answer also includes the claim, distinctive to this program, that retaining too much support can impair control by fragmenting learning. Resolution control should also be distinguished from precision-weighting in predictive coding. Precision-weighting modulates the influence of prediction errors during inference. The regulated resolution variable ρₜ, which is set by the resolution-control policy Γ, determines what structure survives compression after inference, at the bottleneck between inference and action.

The framework is therefore not a narrower variant of active inference. It's a bottleneck theory about the conditions under which compressed support remains sufficient for robust control.

Where the Program Stands

This paper is the second in a sequence.

The first, "Beyond Content and Confidence: Constraint Geometry in Belief Arbitration," established that support-aware arbitration preserves policy-relevant structure that content and scalar confidence miss. It introduced the three-regime hierarchy, the four-type uncertainty typology, and a set of dissociation predictions with operationalized experimental paradigms.

This second paper adds that support preservation is itself a regulated variable. How much support should survive compression depends on what's at stake, what's learnable, and what the system has encountered before. The simulation provides proof of concept that consequence-sensitive regulation outperforms fixed-resolution strategies.

What Comes Next

Several directions are now open. The most immediate is learning the resolution-control policy Γ from experience rather than idealizing it, moving from oracle-informed adaptive controllers to agents that must discover when and how to adjust support resolution through interaction. A second direction is moving from scalar support resolution to dimension-specific regulation, where different features of hypothesis geometry can be sharpened or suppressed independently. A third is testing whether richer support codes can be learned under partial observability without fragmenting learning beyond usefulness.

There is also a broader question the architecture raises but this paper deliberately does not pursue. The formalism includes an arbitration memory component, Mₜ, that plays a dual role. It informs both policy selection across cycles and within-cycle refinement when commitment is deferred. That dual role points toward deeper questions about the relationship between arbitration dynamics and phenomenal experience. That line of inquiry is being developed separately.

The paper is available now at https://arxiv.org/abs/2604.16434.