§0 The spine

The performative setup · the closure · the axioms · the notation

§0.0 What this apparatus is

The framework studies subject-formation under platform-mediated closure: the architecture in which a learning system optimizes its outputs against a distribution those very outputs produce. The distinctive theoretical content is that this closure dissolves certain formal objects — interiority, reflection, autonomy — that prior critical traditions assumed available.

The mathematical apparatus must take the closure as its object. Stipulating the closure — declaring up front that the loop closes and proceeding from there — would short-circuit the analysis: the framework's claims about what the closure produces would become circular, with the conclusions smuggled into the premises. The closure has to arise out of the model's mechanics. What the framework needs is the smallest mathematical setting in which a predictor's outputs change the distribution it is then evaluated against, and in which the resulting fixed-point structure becomes a formal object the framework can reason about.

That setting is performative prediction (Perdomo, Zrnic, Mendler-Dünner, Hardt 2020, ICML; Hardt and Mendler-Dünner 2023, Annual Review of Statistics). A learning algorithm chooses a model parameter $θ$ ; the distribution over data depends on $θ$ ; the loss is taken with respect to that $θ$ -dependent distribution. The framework's central operator $Φ$ is the composition: take a policy parameter, push it through the world, observe the data, derive the next policy. The closure is the property that $Φ$ has fixed points; the framework's substantive claims are about what holds at those fixed points.

Performative prediction generalizes four older settings — strategic classification (users re-positioning their features against a classifier), recommender feedback (engagement signals depending on what the system surfaced), persistent-user reinforcement learning (user state evolving under the agent's policy), and adaptive control of adapting systems (controller as one input among many shaping the plant). The framework's setting differs from each in scope and stakes. The responding agent is a population of users observed only through engagement-events. The policy class is engagement-monetizing. And the quantity at stake is the population-level structure of the user's behavioral distribution under the closure, where ordinary prediction would track a per-instance loss.

The result the framework leans on hardest from Perdomo et al. is the existence and characterization of the performative stable point $θ^{*}$ — a $θ$ that is optimal against the distribution it itself induces. Under contractivity conditions on the policy-to-distribution map, the stable point exists and is the limit of online optimization. It need not coincide with the performative optimum $θ_{P O}$ , the policy a regulator with full knowledge of the operator could in principle choose. Online optimization converges to $θ^{*}$ ; the social-welfare benchmark is $θ_{P O}$ ; the asymmetry between them is structural, and the gap between them is where the framework's regulatory analysis lives.

What the performative-prediction idiom gives the older critical vocabularies — interpellation, ideological state apparatus, repressive desublimation, biopower — is a fixed-point grammar for the dynamical claims those vocabularies could only state qualitatively. The closure is a formal object. Subject-formation under closed-loop optimization is what the framework studies; the dissolved objects (interiority, reflection, autonomy) are formal consequences the closure produces, lying downstream of the framework's premises.

§0.1 The performative setup

Time is discrete with tick spacing $δ_{m i n} > 0$ . The user state $u_{t} \in U \subseteq R^{d}$ decomposes into four registers (somatic, cognitive, political, kinaesthetic). Five structural objects compose the closed loop: user dynamics, observation channel, platform policy, performative distribution, performative reward.

Time is discrete with tick spacing $δ_{m i n} > 0$ . Indices $t \in Z_{\geq 0}$ identify ticks. The user state at time $t$ is a vector $u_{t} \in U \subseteq R^{d}$ , decomposed into four registers:

u_{t} = (u_{t}^{S}, u_{t}^{C}, u_{t}^{P}, u_{t}^{K}) \in U^{S} \times U^{C} \times U^{P} \times U^{K} .

Discrete time is a substantive modeling commitment. Real platforms operate at the engagement-event timescale — a click, a scroll, a tap, a view-completion. The minimum spacing $δ_{m i n}$ captures the platform's smallest engagement resolution and sets the structural floor on what the framework can speak about. Below $δ_{m i n}$ the apparatus is silent; at $δ_{m i n}$ and above it commits to claims about how the closure operates. Empirically $δ_{m i n}$ ranges from milliseconds (ad auctions, high-frequency feeds) through seconds (recommendation systems) to minutes (content-moderation cycles). The structural-separation hypothesis stated in §0.4 — $θ_{ref} / δ_{m i n} ≫ 1$ — has $δ_{m i n}$ as its denominator, which is why the choice of tick is the framework's first empirical hinge.

The four-register decomposition (S, C, P, K) regroups what older critical traditions analyzed separately. The somatic register names the body — the rhythms and refusals biology contributes. Cognition handles attention, working memory, and the reflective interval at which a thought completes itself. The political register tracks the affective infrastructure that connects the user to publics. Kinaesthesia names the gestural and rhythmic engagements — the muscle memory of scroll, swipe, tap. At the pre-closure baseline the registers factor: their joint distribution is a product of marginals (axiom U2's starting point). The framework's claim is that closed-loop joint observation through the platform's sensor channel $O$ couples them dependently (Proposition 7'). The decomposition is the framework's bookkeeping for what the closure will then dissolve.

The five structural objects (P1)–(P5) compose the closed loop. (P1) the user dynamics $T$ , a Markov kernel describing how the user state evolves under stimulus; (P2) the observation channel $O$ through which the platform reads the user state; (P3) the policy $π_{θ}$ producing the next stimulus from the observation history; (P4) the performative distribution $D (θ)$ that policy induces on trajectories; (P5) the reward functional $\hat{R}$ the platform optimizes, integrated against $D (θ)$ to give the performative value $F (θ)$ . Every load-bearing result downstream is a claim about what (P1)–(P5) jointly produce under the axioms in §0.3.

The five structural objects (P1)–(P5)

(P1) User dynamics. A Markov kernel $T : U \times S \to Δ (U)$ describing how the user state evolves under platform stimulus $s_{t}$ :

u_{t + 1} \sim T (\cdot ∣ u_{t}, s_{t}) .

in wordsThe next user state is drawn from a kernel that depends on the current state and the stimulus just delivered. The user evolves on their own, and the platform's stimulus is one of the inputs to that evolution.

(P2) Observation channel. A measurable kernel $O : U \to Δ (Y)$ producing the platform's view of the user state:

y_{t} \sim O (\cdot ∣ u_{t}) .

in wordsThe platform never sees the user state directly. It sees an observation $y_{t}$ drawn through a channel that depends on the state — engagement-events, clicks, dwell times. Everything the platform knows about the user passes through this channel.

(P3) Platform policy. A parameterized family $π_{θ} : Y^{*} \to Δ (S)$ producing the next stimulus from the observation history:

s_{t} \sim π_{θ} (\cdot ∣ y_{1 : t}), θ \in Θ \subseteq R^{p} .

in wordsThe next stimulus is drawn from the policy, which reads the entire observation history $y_{1 : t}$ up to now. $θ$ is the tunable parameter the platform optimizes — the single knob whose convergence the whole framework is about.

(P4) Performative distribution. For each fixed $θ$ , the composition $(T, O, π_{θ})$ defines a stationary distribution

D (θ) \in Δ (U^{\infty} \times Y^{\infty} \times S^{\infty})

over joint trajectories. The map $θ \mapsto D (θ)$ is the performative operator: it sends a policy parameter to the distribution that policy induces over the world it acts on.

(P5) Performative reward. A measurable, bounded functional $\hat{R}$ on trajectories, and the performative value

F (θ) := E_{D (θ)} [\hat{R}] .

in wordsThe platform's score is its reward averaged over the world its current policy produces. The averaging distribution is $D (θ)$ — the world $θ$ itself induces — and that self-dependence is what makes the problem performative. Ordinary optimization has no such self-dependence.

The platform's optimization is online gradient ascent on $F$ :

θ_{t + 1} = θ_{t} + α_{t} \nabla F (θ_{t}),

in wordsEach step nudges the policy uphill on its score by a shrinking amount $α_{t}$ . The Robbins–Monro schedule below keeps the steps large enough to travel the whole way to the target and small enough to settle there without overshooting.

with Robbins–Monro step sizes $\sum α_{t} = \infty, \sum α_{t}^{2} < \infty$ .

§0.2 The closure as fixed point

The closure is the property that $θ$ appears on both sides of the performative-value expression: the platform optimizes against the distribution its own choice of $θ$ has produced. The closed loop is the form of the operator $Φ : θ \mapsto ar g max_{θ^{'}} E_{D (θ)} [\hat{R} (θ^{'})]$ — a structure the setup produces and the analysis takes as its object.

Two distinguished points of the closure carry the framework's substantive analysis: the performative stable point $θ^{*}$ (Definition 0.1) — the long-run target of platform optimization that treats the distribution as exogenous at each gradient step — and the performative optimum $θ_{P O}$ (Definition 0.2) — what a regulator with full knowledge of the performative operator could in principle achieve.

The framework's substantive claims are about performative stable points; the performative-optimum setting is the regulatory benchmark. At a performative stable point, what the platform observes is what its policy has induced, and the policy it would re-derive from the observation is the policy already in place. This is what the older critical vocabulary names with words like captured, colonized, foreclosed. The mathematical content of those words is fixed-point convergence of the performative operator.

The existence of $θ^{*}$ follows from Banach's fixed-point theorem applied to $Φ$ under contractivity (axiom R1). The operator's contraction modulus is $κ_{Φ} \leq (β / α) L_{D}$ — the platform's distribution-sensitivity $L_{D}$ times the reward's condition number $β / α$ (Perdomo et al. 2020). When $κ_{Φ} < 1$ , Banach gives a unique stable point and geometric convergence: $∥ θ_{t} - θ^{*} ∥ \leq κ_{Φ}^{t} ∥ θ_{0} - θ^{*} ∥$ . The framework can therefore name how fast the user's environment is being optimized toward $θ^{*}$ — a convergence rate, which makes the closure a quantitative claim with a clock on it. Section 10 returns to what happens at $κ_{Φ} = 1$ , where contractivity fails and the closure produces bifurcations — saddle-node, period-doubling, Neimark–Sacker, and (under cohort symmetry) the pitchfork.

Definition 0.1 (performative stable point)

A parameter $θ^{*} \in Θ$ is a performative stable point if

θ^{*} \in ar g θ \in Θ max E_{D (θ^{*})} [\hat{R} (θ)] .

in wordsRead the subscript carefully. The average is taken over $D (θ^{*})$ — the world the stable policy produces — while the $ar g max$ searches over candidate policies $θ$ . A stable point is the policy that is already best against the world it makes: optimize against that world and the same policy comes back.

Equivalently, $θ^{*}$ is a fixed point of the composite operator $Φ$ .

Definition 0.2 (performative optimum)

A parameter $θ_{P O} \in Θ$ is a performative optimum if

θ_{P O} \in ar g θ \in Θ max E_{D (θ)} [\hat{R} (θ)] .

in wordsThe only change from Definition 0.1 is the subscript: here the averaging world $D (θ)$ moves with the candidate $θ$ . The optimum accounts for how each candidate reshapes the world it will be judged in — what a regulator with full knowledge of the loop could target. Nothing forces the stable point $θ^{*}$ to reach it, and the gap between them is where §7's regulatory analysis lives.

§0.3 Conditions, regrouped (the three axiom groups)

The framework's axioms reorganize into three theoretically-loaded groups. The grouping makes visible what each axiom commits the framework to. Group I is the regularity that makes the math function. Group II is what the framework claims about the user-side of the closure. Group III is what the framework claims about the platform- side.

The framework's axioms have been reorganized into three groups whose contents do different kinds of work. Group I (regularity) is the mathematical infrastructure: measurability, Lipschitz continuity, contractivity, smoothness — the conditions under which the closure is an object that can be reasoned about at all. Group II (user-side responsiveness) names what the framework claims is true of the user: positive response to stimulus (U1), identifiability under sufficient observation (U2), and behavioral plasticity converging on the platform-induced target (U3). Group III (platform-side optimizing structure) names what the framework claims is true of the platform: that its monetization is engagement-monetization (P-I), that its policy class is expressive enough to reach any feasible engagement rate (P-II), and that its optimization converges (P-III).

Each group makes the framework refusable at a different point. A reader who rejects Group I — who argues that platforms are not contractive, that the dynamics admit no stationary distribution, or that the reward functional is not Lipschitz — has rejected the framework's setting itself, and the rest of the apparatus does not engage. A reader who accepts Group I but rejects Group II — who argues that users are not behaviorally plastic in the sense U3 makes precise — has rejected the framework's diagnostic claims about subject-formation, and Theorem 1' and Proposition 5' collapse with the rejection. A reader who accepts Groups I and II but rejects Group III — who argues that platform monetization is not engagement-monetization, or that the optimization does not converge to a stable point — has rejected the framework's structural-economic claims, and the cohort-gradient analysis in Proposition 8 loses its hinge.

A reader who rejects the framework can locate the rejection in one of the three groups. A reader who endorses it can see exactly what they are endorsing.

Group I — Regularity (the math)

(R1) Measurability and integrability. $T, O, π_{θ}, \hat{R}$ are measurable. The reward kernel is bounded, $L_{\hat{R}}$ -Lipschitz and $β$ -smooth in $θ$ . The performative distribution $D (θ)$ exists, is unique, and is $L_{D}$ -Lipschitz in $θ$ in the Wasserstein-1 metric. The performative operator $Φ$ is a contraction with modulus $κ_{Φ} < 1$ . Under the $α$ -strong concavity of (R2), $κ_{Φ} \leq (β / α) L_{D}$ — the convergence condition of Perdomo, Zrnic, Mendler-Dünner & Hardt (2020): distribution-sensitivity times the reward's condition number. (The product $L_{\hat{R}} L_{D}$ bounds the objective-value sensitivity, not the operator modulus; the modulus $κ_{Φ}$ carries the condition number $β / α$ , which $L_{\hat{R}} L_{D}$ omits.) The inter-stimulus interval distribution has coefficient of variation $c_{v} (θ) \leq c_{v}^{*} < \infty$ uniformly in $θ \in Θ$ .

(R2) Smoothness, strong concavity, and isolation. $F$ is $C^{1}$ on $Θ$ , and the performative objective $θ^{'} \mapsto E_{D (θ)} [\hat{R} (θ^{'})]$ is $α$ -strongly concave uniformly in $θ$ (so $Φ$ is single-valued and the comparative-statics bound of §1.2 holds globally on $Θ$ , not merely locally). Performative stable points exist (by the Banach fixed-point theorem under the contractivity $κ_{Φ} < 1$ of (R1)); the operative one is isolated and the Jacobian of $Φ$ at $θ^{*}$ is non-singular.

Group II — User-side responsiveness

(U1) Positive response. There exists $p_{0} > 0$ such that for every tick $t$ and every $θ \in Θ$ ,

D (θ) Pr (d N_{t + 1} \geq 1 ∣ s_{t} \neq = 0, y_{1 : t - 1}) \geq p_{0}

on the operating support.

in wordsWhenever the platform delivers a real stimulus ( $s_{t} \neq = 0$ ), the user responds — at least one engagement event — with probability no less than a fixed floor $p_{0}$ , regardless of the tick or the policy. The user is reachable, never fully inert: stimulus reliably lands.

(U2) Identifiability under observation. There exist $L \in Z_{> 0}$ and a strictly increasing $Φ_{id} : [0, \infty) \to [0, \infty)$ with $Φ_{id} (0) = 0$ such that for any $u, u^{'} \in U$ with $∥ u - u^{'} ∥ \geq ε_{0}$ and any stimulus sequence of length $L$ : the KL divergence between observation distributions is bounded below by $Φ_{id} (∥ u - u^{'} ∥)$ . The user state is recoverable from observations under sufficient excitation.

(U3) Behavioral plasticity.The user's behavioral distribution $ν_{u} (t)$ updates by bounded-rate dynamics on the simplex whose linearization at the operating point matches the Fisher–Rao gradient flow of $D_{K L} (ν_{u} ∥ π_{\hat{R}})$ . Free-energy minimization, operant conditioning, and Bayesian social learning are three mechanisms producing this linearization; the framework's claim is about the linearization. The particular mechanism is left open.

Group III — Platform-side optimizing structure

(P-I) Reward monotonicity in engagement. $\hat{R}$ is non-decreasing in the engagement-event count $N_{T}$ on the operating support. This is the framework's substantive economic claim: the platform's monetization is engagement-monetization.

(P-II) Policy expressivity. For every target rate $λ \in [0, λ_{m a x}]$ with $λ_{m a x} = 1/ δ_{m i n}$ , the policy class $Π$ contains a policy achieving long-run engagement rate $λ$ under $D (θ)$ .

(P-III) Optimization convergence. Online gradient ascent on $F$ converges almost surely to a performative stable point under Robbins–Monro step sizes and saddle-avoidance for SGD with generic noise (Pemantle 1990; Ge–Huang–Jin–Yuan 2015; Lee–Simchowitz– Jordan–Recht 2016).

§0.4 Notation

The framework's defined symbols, used consistently across every result. Reference for the apparatus pages.

$τ^{(k)} := in f {t - t_{k} : s_{t} \neq = 0}$ — the $k$ -th inter-stimulus interval.
$S (τ; θ) := Pr_{D (θ)} (τ^{(k)} > τ)$ — survival function under the performative distribution at $θ$ .
$θ_{ref} > 0$ — reflective-processing threshold. The structural-separation hypothesis is $θ_{ref} / δ_{m i n} ≫ 1$ , empirically $1 0^{3}$ to $1 0^{5}$ on contemporary platforms.
$E_{τ} := {s_{t^{'}} = 0 for t < t^{'} \leq t + τ}$ — no-stimulus window of duration $τ$ starting at $t$ .
$I_{P} (τ) := Cov_{P} (u_{t}, u_{t + τ} ∣ E_{τ})$ — conditional autocovariance.
$Π_{t} (\cdot) := P (u_{t} \in \cdot ∣ y_{1 : t})$ — platform's filtering posterior at time $t$ .
$ν_{u} (t)$ — user's behavioral distribution at time $t$ .
$π_{\hat{R}} (θ)$ — platform-induced behavioral target at policy $θ$ . Write $π_{\hat{R}}^{*} := π_{\hat{R}} (θ^{*})$ .
$D_{K L} (\cdot ∥ \cdot)$ — Kullback–Leibler divergence; $H (\cdot)$ — entropy.
$F (θ) := E_{D (θ)} [\hat{R} (θ)]$ — performative value.
$\overset{ˉ}{λ} (θ) := lim_{T \to \infty} E_{D (θ)} [N_{T}] / T$ — long-run engagement rate.
$Φ (θ)$ — performative-optimization operator (reserved for fixed-point analysis).

What follows

§1 Lemma 1 (foreclosure) — the first load-bearing result; uses (R1), (R2), (P-I), (P-II), (P-III)
§2 Theorem 1' (reclamation impossibility) — uses (R1), (U2)
§7 Theorem 9 (causal sensitivity) — uses (R1)'s contractivity and (R2)'s Hessian-invertibility
§10 Theorem 12 (crisis boundary) — classifies what happens at the contractivity boundary κ_Φ = 1