Formal Mathematical Foundations

Entropy-Gated AI Framework

This document establishes the mathematical basis for the Entropy-Gated AI Framework. It is organized into three layers that a technically rigorous reviewer can evaluate independently.

  • Layer A — Formal mathematical ground truth: facts that are standard, provable, and non-controversial about autoregressive language models and information theory.
  • Layer B — Interaction-level risk model: a formally defined abstraction over interaction behavior, explicitly labeled as such.
  • Layer C — Governance artifact mapping: where OIL_CONTRACT, INVOKER, E_T, and modes map to the mathematics.

This structure follows the convention of standards documents, cryptographic specifications, and safety cases. Each layer is explicitly labeled. No layer overclaims the behavior of another.


Layer A — Formal Mathematical Foundations

What is literally true, provable, and standard.

Autoregressive language model definition

A large language model used in chat systems is an autoregressive probabilistic model. At each generation step \(t\), the model computes a conditional probability distribution:

\[P(w_t \mid w_1, w_2, \ldots, w_{t-1})\]

Where:

  • \(w_t\) is the next token to be generated
  • \(w_1, \ldots, w_{t-1}\) are the tokens currently in the context window

This conditional distribution is the only thing the model computes. There is no additional hidden state, memory, intent, or session awareness beyond fixed learned parameters (weights) and the provided token sequence. All observed continuity of behavior arises from repeated conditioning on prior tokens.

Probabilistic decoding and variability

The model produces a probability distribution over a discrete vocabulary \(V\):

\[\sum_{w \in V} P(w_t = w \mid w_1, \ldots, w_{t-1}) = 1\]

If decoding involves stochastic sampling — temperature sampling, top-k, or nucleus sampling — then multiple executions with identical inputs may produce different outputs. Variability is an inherent and unavoidable property of probabilistic decoding. It is not an error condition. It is a structural property of the model class.
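The effect of stochastic decoding can be illustrated with a minimal, self-contained sketch. The helper `sample` is hypothetical and purely illustrative; it applies temperature rescaling to a toy three-token distribution and shows that identical inputs need not yield identical outputs:

```python
import math
import random

def sample(probs, temperature=1.0, rng=None):
    """Sample a token index after temperature rescaling.

    temperature > 1 flattens the distribution; temperature < 1 sharpens it.
    """
    rng = rng or random.Random()
    logits = [math.log(p) for p in probs]          # probs assumed strictly positive
    scaled = [l / temperature for l in logits]
    z = max(scaled)                                 # subtract max for stability
    exp = [math.exp(s - z) for s in scaled]
    total = sum(exp)
    renorm = [e / total for e in exp]
    r = rng.random()
    acc = 0.0
    for i, p in enumerate(renorm):                  # inverse-CDF sampling
        acc += p
        if r < acc:
            return i
    return len(probs) - 1

# Identical inputs, potentially different outputs: variability is structural.
probs = [0.5, 0.3, 0.2]
draws = [sample(probs, rng=random.Random(seed)) for seed in range(5)]
```

Different seeds stand in for repeated executions of the same prompt; the spread of `draws` is the variability described above, not an error condition.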

Token-level entropy

For a discrete probability distribution, Shannon entropy is defined as (Shannon, 1948):

\[H(X) = -\sum_x p(x) \log p(x)\]

Applied to a language model at generation step \(t\):

\[H_t = -\sum_{w \in V} P(w_t = w \mid w_1, \ldots, w_{t-1}) \log P(w_t = w \mid w_1, \ldots, w_{t-1})\]

Interpretation:

  • High entropy: probability mass is spread across many plausible tokens
  • Low entropy: probability mass is concentrated on a small number of tokens

Because language models rely on probabilistic generation, token-level entropy is generally greater than zero during normal operation. A zero-entropy state would require fully deterministic decoding with a single token having probability 1. This eliminates generative flexibility and is outside the scope of this framework.
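The entropy definition above can be computed directly. This is a minimal stdlib sketch (the helper name is illustrative), contrasting concentrated, spread, and fully deterministic distributions:

```python
import math

def shannon_entropy(probs, base=2):
    """H(X) = -sum p log p, in bits by default; zero-probability terms contribute 0."""
    return -sum(p * math.log(p, base) for p in probs if p > 0)

low = shannon_entropy([0.97, 0.01, 0.01, 0.01])        # mass concentrated: low H
high = shannon_entropy([0.25, 0.25, 0.25, 0.25])       # mass spread: H = 2 bits
deterministic = shannon_entropy([1.0, 0.0, 0.0, 0.0])  # the zero-entropy edge case
```

The uniform four-token case gives exactly 2 bits, and the deterministic case gives 0, matching the zero-entropy state described above as outside the framework's scope.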

Attention and context influence

Transformer-based models compute attention using:

\[\text{Attention}(Q, K, V) = \text{softmax}\left(\frac{QK^\top}{\sqrt{d}}\right) V\]

What is safe to state:

  • All tokens in the context window are theoretically visible to attention
  • Influence depends on learned attention patterns and positional encoding
  • In practice, long contexts often degrade constraint adherence and coherence

What is not claimed:

  • There is no guaranteed monotonic distance-decay law
  • Early tokens are not mathematically forbidden from dominating later ones

The correct statement for this framework is therefore: constraints and instructions placed closer to generation time tend to be more reliable in practice, especially in long contexts, due to attention behavior and context interference effects. This is the empirical and architectural basis for the Heartbeat Continuity Rule and for positioning the OIL_CONTRACT at the beginning of context.
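The attention formula above can be sketched in a few lines of numpy. The shapes and function name are illustrative, not drawn from any model implementation; the sketch only demonstrates the two safe claims: every context position receives nonzero weight, and nothing in the formula enforces distance decay:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))   # 4 query positions, d = 8
K = rng.standard_normal((6, 8))   # 6 context positions
V = rng.standard_normal((6, 8))
out, w = attention(Q, K, V)
# Every context position gets strictly positive weight (softmax never outputs 0),
# but the weights follow the learned scores, not any monotonic distance-decay law.
```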

No persistent memory or state

A model does not store persistent memory across sessions. A new session resets the context sequence \(C_t\). Learned weights remain unchanged. Any notion of remembering arises purely from tokens present in the current context window.

Highlight:

At generation time, a language model computes a conditional probability distribution over the next token based solely on the tokens currently in the context window. Variability arises from probabilistic decoding, and uncertainty can be quantified using Shannon entropy. No persistent memory or session state exists beyond the provided token sequence.


Layer B — Interaction-Level Entropy and Admissibility

A formal abstraction over LLM interactions, explicitly labeled as such.

Why a higher-level model is needed

Layer A described what a language model does at a single generation step. Real usage involves sequences of generations where:

  • outputs from earlier turns are reused
  • rejected drafts remain in context
  • clarification attempts accumulate
  • human edits alter future conditioning

These effects are not captured by token-level entropy alone. A formal interaction-level abstraction is therefore introduced to reason about correctness, drift, and auditability across turns. This is standard practice in systems engineering: the abstraction models behavior, not internals.

Raw context vs interaction state

Let \(C_t\) be the raw token context visible to the model at turn \(t\). This is what the model technically conditions on. Not all tokens in \(C_t\) should be considered valid evidence for reasoning. We therefore define an interaction-level state:

\[S_t = (C_t, E_t)\]

Where:

  • \(C_t\) is the raw context
  • \(E_t\) is the admissible evidence set

The language model sees \(C_t\). The framework reasons about and constrains \(E_t\).

Admissibility

We define an admissibility function:

\[A(x) \in \{0, 1\}\]

Where:

  • \(x\) is a token, span, artifact, or prior output
  • \(A(x) = 1\) means admissible evidence
  • \(A(x) = 0\) means inadmissible evidence

The admissible evidence set is:

\[E_t = \{\, x \in C_t \mid A(x) = 1 \,\}\]

This function is not implemented by the model. It is enforced procedurally by the workflow and governance rules defined in the OIL_CONTRACT.
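Because enforcement is procedural, the admissibility filter can be sketched as ordinary set comprehension outside the model. The tagging scheme below is hypothetical, chosen only to make the filter concrete:

```python
def admissible_evidence(context, A):
    """E_t = {x in C_t | A(x) = 1}: computed by the workflow, not by the model."""
    return {x for x in context if A(x) == 1}

# Hypothetical tagging: each context item carries a status assigned procedurally.
C_t = {("user_input", "spec v2"), ("accepted", "draft 3"), ("rejected", "draft 2")}
A = lambda item: 1 if item[0] in ("user_input", "accepted") else 0
E_t = admissible_evidence(C_t, A)
```

The model still sees all of `C_t`; only `E_t` is treated as valid evidence by the governance layer.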

Effective conditioning

While the model computes \(P(w_t \mid C_t)\), the framework conceptually restricts governed reasoning to \(P(w_t \mid E_t)\).

Critical clarification for reviewers:

  • The model still technically conditions on \(C_t\)
  • The framework forbids influence from elements where \(A(x) = 0\)
  • Violations trigger refusal, not silent correction
  • This is a governed inference model, not a claim about internal filtering

Default vs governed interactions

In ordinary conversational use, admissibility is implicit — \(A(x) = 1\) for almost all \(x\). This permits training priors, conversational heuristics, inferred intent, gap-filling, reuse of rejected drafts, and silent scope expansion.

The Entropy-Gated AI Framework explicitly changes this default by defining and enforcing \(A(x)\) through the OIL_CONTRACT.

Interaction-level entropy

Let \(E_t\) be the admissible evidence set at turn \(t\). Each generation introduces uncertainty, and reused outputs become new evidence. Interaction-level entropy is defined as the measure of uncertainty induced by probabilistic outputs that are retained as admissible evidence across turns.

Unchecked interaction entropy follows:

\[H_I(t + 1) \geq H_I(t) + \Delta_{\text{unvalidated}}\]

Where \(\Delta_{\text{unvalidated}}\) is entropy introduced by outputs that are not explicitly validated or rejected. This inequality captures entropy accumulation at the interaction layer, not token-level entropy. It is this accumulation — not token-level uncertainty — that the framework is designed to govern.
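The accumulation inequality can be made concrete with a toy sketch. The per-turn values of \(\Delta_{\text{unvalidated}}\) below are invented for illustration; the point is only the monotone growth when nothing is validated or pruned:

```python
def step_interaction_entropy(H_I, delta_unvalidated):
    """H_I(t+1) >= H_I(t) + Δ_unvalidated when no output is validated or rejected."""
    return H_I + delta_unvalidated

H = 0.0
for delta in [0.4, 0.7, 0.3]:   # hypothetical per-turn unvalidated entropy
    H = step_interaction_entropy(H, delta)
# Without acceptance or pruning, H_I never decreases across turns.
```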

Why reuse of rejected outputs is dangerous

If a rejected output remains in \(C_t\), it continues to influence future generations, its uncertainty is compounded, and downstream outputs may implicitly depend on invalid premises. Formally, rejected but unpruned outputs cause:

\[E_{t+1} = E_t \cup \{\text{invalid evidence}\}\]

This is the primary source of context contamination and interaction drift. The framework treats this as a governance failure, not a user error.

Pruning and acceptance as entropy control

Rejection with pruning: When an output is rejected:

\[E_{t+1} = E_t \setminus \{\text{rejected output}\}\]

This removes entropy from the admissible evidence set.

Acceptance (state promotion): When an output is accepted:

\[E_{t+1} = E_t \cup \{\text{accepted output}\}\]

Acceptance is the only allowed mechanism for state promotion. This prevents uncontrolled Markov chaining across turns.

Revision protocol and surgical pruning

The Edit Protocol — ROLE_ASSISTANT_OUTPUT / ROLE_USER_CONTENT with ⟨REVISE⟩ and ⟨ACCEPT⟩ anchors — implements surgical pruning:

\[E_{t+1} = (E_t \setminus \text{REVISE regions}) \cup \{\text{revision instruction}\}\]

Accepted regions are locked: \(\forall x \in \text{ACCEPT regions} : A(x) = 1\), and this value is fixed thereafter.

Only anchored regions re-enter the probabilistic generation process. This confines entropy accumulation to the declared revision scope rather than the whole response.
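The three evidence-set transitions above are plain set operations, and can be sketched directly. Helper names are illustrative, not part of the specification:

```python
def reject_with_prune(E, output):
    """Rejection with pruning: E_{t+1} = E_t \\ {rejected output}."""
    return E - {output}

def accept(E, output):
    """Acceptance: E_{t+1} = E_t ∪ {accepted output} — the only promotion path."""
    return E | {output}

def surgical_revise(E, revise_regions, instruction):
    """Edit Protocol: E_{t+1} = (E_t \\ REVISE regions) ∪ {revision instruction}."""
    return (E - revise_regions) | {instruction}

E = frozenset({"spec", "section_1", "section_2"})
E = surgical_revise(E, {"section_2"}, "revise: tighten section 2")
# Only the anchored region leaves the evidence set; accepted regions are untouched.
```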

Highlight:

Interaction-level entropy models how uncertainty accumulates when probabilistic outputs are reused as evidence across turns. Admissibility explicitly defines which information is allowed to influence future generations. By controlling admissibility, acceptance, and pruning, the framework bounds entropy growth without modifying the underlying language model.


Layer C — Governance Artifact Mapping

Where OIL_CONTRACT, INVOKER, E_T, and modes map to the mathematics.

The governing principle

The framework does not change \(P(w_t \mid C_t)\). Instead it constrains which elements of \(C_t\) are treated as admissible evidence and how admissible evidence is allowed to evolve across turns. The framework governs construction of \(E_t\) and evolution of \(E_t \to E_{t+1}\).

Set structure of the framework

The framework consists of three sets of elements:

\[S_{\text{OIL}} = \{\text{purpose scope, roles authority, constraints, markdown authority, seal, heartbeat, integrity anchor}\}\]
\[S_{\text{INVOKER}} = \{\text{execution intent, preconditions, hash verification, entropy gate rules, ET binding, output requirements}\}\]
\[S_{\text{ET}} = \{\text{scope, execution requirements, constraints, output requirements, reporting format}\}\]

A vial is a concrete composition:

\[V = (S_{\text{OIL}}, S_{\text{INVOKER}_m}, S_{\text{ET}}, C_1, C_2, \ldots)\]

Where \(S_{\text{INVOKER}_m}\) is the mode-specific INVOKER. The vial complexity constraint:

\[|V| \leq \theta_{\text{model}}\]

Where \(\theta_{\text{model}}\) is the empirically determined seal reliability threshold for a given frontier model.

OIL_CONTRACT → admissibility function

Mathematical role: OIL_CONTRACT defines \(A(x)\).

It explicitly sets \(A(x) = 0\) for:

  • assumptions and inferred intent
  • inferred scope and unstated defaults
  • gap-filling and silent reinterpretation
  • reused rejected outputs

And permits \(A(x) = 1\) only for:

  • explicitly provided inputs declared by ROLE_USER
  • explicitly accepted outputs
  • explicitly referenced artifacts

Role mapping:

  • ROLE_SYSTEM → defines and enforces \(A(x)\)
  • ROLE_USER → the only authority that may set \(A(x) = 1\) via explicit acceptance
  • ROLE_ASSISTANT → operates within \(E_t\) only

INVOKER → state transition semantics

Mathematical role: INVOKER defines the state transition function \(f\):

\[S_{t+1} = f(S_t, \text{mode})\]

Where \(S_t\) is the interaction state and mode constrains allowed transitions.

Key distinction:

  • OIL_CONTRACT = static law: defines \(A(x)\)
  • INVOKER = dynamic process semantics: defines allowed \(f\)

This separation is what prevents hidden agent behavior. INVOKER does not define content. It defines legal evolution paths for admissible state.
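A minimal sketch of the transition function \(f\) makes the mode constraint concrete. The event vocabulary and mode handling below are hypothetical encodings, not the INVOKER file format itself:

```python
def f(S, mode, event):
    """Sketch of INVOKER semantics: the mode constrains which evolutions of
    the admissible evidence set E are legal. Event names are illustrative."""
    C, E = S
    kind, payload = event
    if mode == "SINGLE_PASS":
        return ("HALT", E)                     # one generation, then stop
    if mode == "MULTI_TURN" and kind == "accept":
        return (C, E | {payload})              # only acceptance promotes state
    if mode == "MULTI_TURN" and kind == "reject":
        return (C, E - {payload})              # rejection prunes
    raise ValueError(f"illegal transition: mode={mode!r}, event={kind!r}")

S = ("ctx", frozenset({"spec"}))
S = f(S, "MULTI_TURN", ("accept", "draft 1"))
```

Note that `f` defines no content at all: it only accepts or refuses evolution paths for the admissible state, which is the separation described above.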

E_T → initial conditions and constraints

Mathematical role: E_T defines the required evidence \(R\), scope bounds, and output constraints. Valid generation requires:

\[R \subseteq E_t\]

If this condition fails, refusal is mandatory. E_T must be explicitly submitted by ROLE_USER because submission injects authoritative evidence into \(E_t\), collapses ambiguity, and transfers responsibility to the human operator. Without E_T, the model would have to infer intent — which OIL_CONTRACT explicitly forbids.
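The precondition \(R \subseteq E_t\) and the mandatory-refusal rule reduce to a set-difference check. The requirement strings below are hypothetical placeholders for whatever an E_T file would declare:

```python
def check_preconditions(R, E_t):
    """Valid generation requires R ⊆ E_t; otherwise refusal is mandatory."""
    missing = R - E_t
    return ("REFUSE", missing) if missing else ("PROCEED", frozenset())

# Hypothetical requirement set drawn from an E_T file.
R = frozenset({"E_T submitted", "input artifact"})
status, missing = check_preconditions(R, frozenset({"E_T submitted"}))
```

Refusal rather than inference: when evidence is missing, the check names it explicitly instead of letting the model fill the gap.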

The three artifacts are orthogonal

| Artifact | Controls | Mathematical Object |
| --- | --- | --- |
| OIL_CONTRACT | What counts as evidence | Admissibility function \(A(x)\) |
| INVOKER | How state evolves across turns | State transition function \(f\) |
| E_T | What problem exists | Requirement set and scope bounds \(R\) |

No overlap. No redundancy. All three are required for bounded inference.

Modes → entropy flow control

Modes define where entropy is allowed to accumulate, not what the model knows.

SINGLE_PASS: One probabilistic output. No conversational accumulation. Rejection triggers external iteration only. Interaction entropy is minimized. This mode approximates stateless inference.

\[S_{t+1} = \text{HALT after one generation}\]

MULTI_TURN: Only referenced and accepted outputs may enter \(E_t\). Rejected outputs are pruned. Entropy grows but is bounded by the acceptance boundary. Hysteresis rule: if three consecutive generations at the same step are rejected:

\[\text{RETRY\_COUNT} \geq 3 \;\land\; \Delta H_I < \epsilon \implies \text{HARD\_HALT}\]

ROLE_USER must provide a constraint adjustment before execution resumes.
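The hysteresis rule is a simple predicate. The framework leaves \(\epsilon\) unspecified here, so the threshold below is an assumed configuration value for illustration:

```python
EPSILON = 0.05  # hypothetical progress threshold; ε is left to configuration

def hard_halt(retry_count, delta_H_I, epsilon=EPSILON):
    """Hysteresis rule: RETRY_COUNT >= 3 and ΔH_I < ε  =>  HARD_HALT."""
    return retry_count >= 3 and delta_H_I < epsilon
```

The two-part condition matters: repeated rejection alone does not halt execution if interaction entropy is still measurably improving.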

INTERACTIVE: Governance is active. Role authority is enforced. Acceptance boundaries exist. Output structure is conversational rather than structured. Entropy gate operates as passive monitor issuing SOFT_WARN when OIL constraints are pressured.

PIPELINE: Output of Stage \(n\) becomes the immutable \(E_0^{(n+1)}\) for Stage \(n+1\). Failure in any stage triggers STAGE_FAILED and halts the entire chain. This prevents hallucinated intermediate data from propagating across multi-stage workflows.

VERIFICATION: ROLE_ASSISTANT is prohibited from generating new content:

\[A_{\text{generation}}(x) = 0 \text{ for all new content}\]

Output is restricted to VALID | INVALID + reasoning. This is a zero-entropy pass — the purest expression of entropy gating.

Modality generalization

The framework is modality-agnostic. The payload can be any data type:

\[P \in \{\text{text}, \text{image}, \text{audio}, \text{video}, \text{sensor}, \text{embedding}, \text{multimodal}\}\]

The governance layer — \(A(x)\), \(f\), \(R\) — operates identically regardless of modality. Mode-specific INVOKER files define the payload type as part of execution intent. This extends EGAF governance to computer vision systems, speech models, autonomous agents, and multi-modal pipelines without modification to the core specification.

Highlight:

The three governance artifacts are orthogonal. OIL_CONTRACT defines admissibility. INVOKER defines state transitions. E_T defines the task and scope. No two artifacts govern the same axis. All three are required for bounded probabilistic inference.


Relationship to existing approaches

| Approach | What it governs |
| --- | --- |
| Few-shot prompting | Input shaping only |
| RAG | Input retrieval only |
| LangChain | Tool and model chaining in code |
| HuggingFace | Model access and execution |
| EGAF | Admissibility \(A(x)\), state transitions \(f\), and acceptance boundaries across the full interaction lifecycle |

EGAF can be applied on top of any of the above. It is not a replacement — it is a governance layer that the others do not provide.

Final authoritative statement

This framework models AI interaction as governed probabilistic inference over an explicitly constrained evidence set. By formally defining admissibility, state transitions, and acceptance semantics, it bounds interaction-level entropy without modifying model internals. The result is auditable, repeatable, and vendor-neutral controlled reasoning applicable across modalities and model providers.