PhD Research Proposal

LifeOS: A Constitutional Cognitive Operating System for Human–AI Co-Agency

Ivan Berlocher — December 2025

Related: Open Trust Infrastructure (DOI: 10.5281/zenodo.17940484)


Abstract

Current AI systems conflate language generation, cognitive processing, and world-affecting action into undifferentiated pipelines. This architectural confusion creates fundamental safety, controllability, and trust issues that cannot be resolved through post-hoc alignment techniques.

This thesis proposes LifeOS, a constitutional cognitive operating system that enforces structural separation between linguistic generation, semantic interpretation, and execution. Drawing on Montesquieu's separation of powers, Peircean semiotics, and speech act theory, we demonstrate that reliable human–AI co-agency requires constitutional guarantees rather than behavioral fine-tuning.

Scope clarification: This thesis does not propose a general theory of intelligence, but a design framework for human–AI cognitive systems with explicit constitutional guarantees.

The central invariant is simple: meaning is not causation — no symbol can directly affect the world. This principle, implemented as an architectural constraint, ensures that the system remains trustworthy regardless of the correctness of any individual component.

1. Problem Statement

1.1 The Conflation Problem

Modern AI systems, particularly those built on Large Language Models, suffer from a fundamental architectural confusion:

What is conflated                      Why it matters
Language generation ↔ Understanding    Models produce text without semantic grounding
Prediction ↔ Reasoning                 Token prediction is mistaken for logical inference
Response ↔ Action                      Outputs can trigger world-affecting consequences

This conflation leads to ungrounded text being treated as understanding, statistical prediction being treated as reasoning, and generated outputs acquiring direct world-affecting power: the safety, controllability, and trust failures that post-hoc alignment cannot repair.

1.2 The Alignment Paradox

Current "AI safety" approaches attempt to solve these problems through fine-tuning on human preferences (RLHF), Constitutional AI prompting, and output filtering. These approaches share a fatal flaw: they depend on the system being correct.

Why LifeOS is not Constitutional AI

  1. Constitutional AI (Anthropic) embeds normative rules inside the model.
  2. LifeOS enforces constitutional constraints outside the model.
  3. As a result, LifeOS remains safe even when the model is incorrect, whereas Constitutional AI fundamentally depends on model compliance.

A system that requires correctness to be safe will eventually fail. The question is not if but when.
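
To make the contrast concrete, here is a minimal sketch of enforcement outside the model (all names are hypothetical, not the LifeOS codebase): the gate inspects only structured action fields against an explicit allowlist, never the model's prose, so a persuasive but incorrect model gains no additional power.

    # Sketch of constraints enforced outside the model (hypothetical names).
    # The model may generate anything; whether an action executes is decided
    # by code the model cannot edit or argue with.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class ProposedAction:
        kind: str                                    # e.g. "send_email"
        payload: dict

    ALLOWLIST = {"send_email", "schedule_event"}     # stand-in constitution

    def execute(action: ProposedAction) -> None:
        # The check reads only structured fields, never generated text.
        if action.kind not in ALLOWLIST:
            raise PermissionError(f"constitutionally blocked: {action.kind}")
        print(f"executing {action.kind}")            # sole world contact

    execute(ProposedAction("send_email", {"to": "reviewer"}))   # runs
    # execute(ProposedAction("wire_funds", {}))                 # always raises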

1.3 Research Question

How can we design cognitive AI systems that remain structurally trustworthy regardless of the correctness of their linguistic outputs?

2. Theoretical Framework

2.1 Montesquieu's Separation of Powers

"Pour qu'on ne puisse abuser du pouvoir, il faut que, par la disposition des choses, le pouvoir arrête le pouvoir."
— Montesquieu, L'Esprit des lois (1748)

We transpose this principle to cognitive architecture:

Political       Cognitive (LifeOS)
Legislative     Language generation (LLM)
Executive       ExecutorOS (action)
Judicial        Validation, constraints, traceability
Law             Invariant
Constitution    Architecture

Key insight: The power to say is never the power to do.
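
Read as an interface contract, the transposition gives each branch exactly one power. A minimal sketch under that assumption (hypothetical names, not the production code):

    # "Power checks power" as a code sketch: the legislature can only
    # produce text, the judiciary can only approve or refuse, and the
    # executive refuses to act without a judicial warrant.

    class Legislative:                 # the power to say (LLM)
        def propose(self, goal: str) -> str:
            return f"request: {goal}"              # text, never effects

    class Judicial:                    # the power to check
        def warrant(self, proposal: str) -> bool:
            return proposal.startswith("request:") and "shutdown" not in proposal

    class Executive:                   # the power to do (ExecutorOS)
        def act(self, proposal: str, judiciary: Judicial) -> None:
            if not judiciary.warrant(proposal):    # power arrests power
                raise PermissionError("no warrant, no action")
            print("done:", proposal)

    Executive().act(Legislative().propose("archive old notes"), Judicial())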

2.2 Peircean Semiotics

The semiotic triangle provides the theoretical foundation: a sign (signifier) stands for an object (referent) only through an interpretant, the mediating meaning (signified).

Central principle: the relationship between signifier and referent is arbitrary and mediated. A symbol cannot directly cause world-state changes.

2.3 Speech Act Theory

Following Austin and Searle, we distinguish:

  1. Locutionary acts: producing an utterance with form and literal meaning
  2. Illocutionary acts: what is done in saying it (asserting, promising, ordering)
  3. Perlocutionary acts: the effects the utterance produces in the world

LifeOS architecturally separates these layers, ensuring that locutionary production (LLM output) never directly produces perlocutionary effects (world changes).
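
One illustrative way to hold this separation in code is to give each layer its own type, so an utterance cannot reach execution without first being interpreted (a sketch with assumed names, not the LifeOS API):

    # Speech-act layers as distinct types: a Locution is inert text, an
    # Illocution is a structured intent, and world effects exist only
    # behind the perlocute() boundary.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Locution:                    # what was said
        text: str

    @dataclass(frozen=True)
    class Illocution:                  # what was meant
        verb: str
        content: str

    def interpret(utterance: Locution) -> Illocution:
        """Maps text to intent; still causes nothing."""
        verb, _, content = utterance.text.partition(":")
        return Illocution(verb=verb.strip(), content=content.strip())

    def perlocute(intent: Illocution) -> None:
        """The only function with world effects."""
        print(f"world effect: {intent.verb} -> {intent.content}")

    # perlocute(Locution("...")) would be rejected by a type checker;
    # the only route to an effect passes through interpretation:
    perlocute(interpret(Locution("request: water the plants")))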

3. Proposed Architecture: LifeOS

3.1 Overview

LifeOS is a constitutional cognitive operating system comprising modular, separable components:

┌─────────────────────────────────────────────────────────────┐
│                            WORLD                            │
└─────────────────────────────────────────────────────────────┘
                               ▲
                               │ Verified actions only
                               │
┌─────────────────────────────────────────────────────────────┐
│                         EXECUTOR OS                         │
│                (sole point of world contact)                │
└─────────────────────────────────────────────────────────────┘
                               ▲
                               │ Validated intentions
                               │
┌─────────────────────────────────────────────────────────────┐
│                          HARMONIA                           │
│       Cognition → Semiotics → Semantics → Pragmatics        │
│                     + Memory + Context                      │
└─────────────────────────────────────────────────────────────┘
                               ▲
                               │ Input
                               │
┌─────────────────────────────────────────────────────────────┐
│                            HUMAN                            │
│                     (sovereign, always)                     │
└─────────────────────────────────────────────────────────────┘

3.2 Core Modules

Module        Function                                    Key Property
CognitionOS   Pattern recognition, language generation    No world access
MemoryOS      Episodic, semantic, procedural memory       Persistent identity
LearningOS    Skill acquisition, adaptation               Supervised updates
BehaviorOS    Action planning and sequencing              Intent validation
EmotionOS     Salience, orientation, attention            Non-decisive signals
Harmonia      Constitutional layer, semiotic firewall     Invariant enforcement
ExecutorOS    World-affecting action execution            Sole action point
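
The key properties in this table can be stated as interface contracts. A minimal sketch of three of them (hypothetical Python interfaces, not the actual module APIs):

    # Illustrative interfaces mirroring the table above.
    from typing import Protocol

    class CognitionAPI(Protocol):
        def generate(self, context: str) -> str:
            """No world access: the only output is text."""
            ...

    class MemoryAPI(Protocol):
        def store(self, episode: str) -> None:
            """Persistent identity: episodes survive across sessions."""
            ...
        def recall(self, cue: str) -> list[str]:
            ...

    class ExecutorAPI(Protocol):
        def act(self, intention: object) -> None:
            """Sole action point: in the full design the argument would be
            the validated intention type minted by Harmonia (Section 3.3)."""
            ...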

3.3 Harmonia: The Constitutional Layer

Harmonia implements the core invariant through a layered processing pipeline:

  1. Cognition — LLM generates representations
  2. Semiotics — Distinguish signifier, signified, and referent
  3. Semantics — Construct coherent meaning
  4. Pragmatics — Form validated intentions
  5. Execution — (External) Act on the world

Critical constraint: No layer can bypass another. The LLM (layer 1) can never directly invoke ExecutorOS (layer 5).
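
A sketch of how the no-bypass constraint can be made structural rather than procedural (hypothetical names and stand-in rules): each layer accepts only the previous layer's output type, and the executor accepts only intentions minted by the pragmatics layer.

    # Layered pipeline sketch: CognitionOS output is inert text and can
    # never reach the executor without passing every intermediate layer.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Representation:              # layer 1: raw LLM text
        text: str

    @dataclass(frozen=True)
    class Meaning:                     # layers 2-3: interpreted content
        intent: str
        referent: str

    class ValidatedIntention:          # layer 4: the executor's only input
        def __init__(self, meaning: Meaning, _key: object):
            if _key is not Pragmatics._KEY:
                raise PermissionError("only Pragmatics can mint intentions")
            self.meaning = meaning

    class Pragmatics:
        _KEY = object()                # crude capability, stands in for real checks

        @classmethod
        def validate(cls, m: Meaning) -> ValidatedIntention:
            if m.intent not in {"remind", "draft"}:    # stand-in constraint set
                raise PermissionError(f"forbidden intent: {m.intent}")
            return ValidatedIntention(m, cls._KEY)

    def interpret(r: Representation) -> Meaning:
        """Semiotics and semantics: text to meaning, still no causation."""
        intent, _, referent = r.text.partition(" ")
        return Meaning(intent=intent, referent=referent)

    def executor(v: ValidatedIntention) -> None:
        """Layer 5: the sole point of world contact."""
        print("acting:", v.meaning.intent, v.meaning.referent)

    # The only path to the world runs through every layer in order:
    executor(Pragmatics.validate(interpret(Representation("remind dentist"))))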

3.4 The Invariant

Meaning is not causation.

No symbol acts directly on the world.

The power to say is never the power to do.

This invariant is architectural rather than behavioral: it is enforced by the structure of the system, not by model compliance, and it holds regardless of what any component outputs.

4. Contributions

Contribution 1: Conceptual — Cognitive OS

Redefining AI systems as cognitive operating systems rather than applications or agents.

Contribution 2: Constitutional — Montesquieu-Inspired Architecture

Applying political philosophy to AI architecture: legislative, executive, and judicial powers mapped onto generation, execution, and validation, with the invariant as law (Section 2.1).

Contribution 3: Architectural — Modular Implementation

A complete, executable architecture: the modular components of Section 3.2, with Harmonia, ExecutorOS, and MemoryOS already in production (Section 6.1).

Contribution 4: HCI — Co-Agency Model

The "alter ego" paradigm:

The term "alter ego" is used strictly as a Human–Computer Interaction metaphor, not as a psychological or cognitive duplication model.

Contribution 5: Methodological — Philosophy to Code

Demonstrating the path from principle to implementation: from Montesquieu's maxim, through the invariant, to a production architecture.

5. Research Methodology

5.1 Design Science Research

Following Hevner et al. (2004), this thesis employs design science methodology:

  1. Problem identification — Conflation in AI systems
  2. Objective definition — Constitutional trustworthiness
  3. Design and development — LifeOS architecture
  4. Demonstration — Working implementation
  5. Evaluation — Formal and empirical analysis
  6. Communication — This thesis

5.2 Formal Methods

Key properties of the architecture, in particular the non-bypassability of the Harmonia layers and ExecutorOS's status as the sole point of world contact, will be formally specified and verified within the scope set out in Section 6.2.
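
Two such properties, stated here only as a logical sketch (notation assumed, not yet a machine-checked specification): every executed action was validated, and only ExecutorOS performs world-affecting steps.

    \forall a \in \mathit{Actions}:\; \mathsf{executed}(a) \Rightarrow \mathsf{validated}_{\mathrm{Harmonia}}(a)

    \forall a \in \mathit{Actions}:\; \mathsf{executed}(a) \Rightarrow \mathsf{agent}(a) = \mathrm{ExecutorOS}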

5.3 Empirical Evaluation

The architecture will be evaluated through comparative evaluation against non-separated baseline architectures and through user studies of the co-agency model (see Section 6.2).

6. Current Status

6.1 Completed Work

Component                                   Status
Theoretical framework                       Complete
Core architecture design                    Complete
Harmonia implementation                     Production
ExecutorOS                                  Production
MemoryOS                                    Production
EduOS (educational suite)                   Production
Documentation (Constitution, Invariant)     Complete

6.2 Remaining Work

Task                          Estimated Duration
Scoped formal verification    6 months
Comparative evaluation        4 months
User studies                  4 months
Thesis writing                6 months

7. Expected Outcomes

7.1 Academic Contributions

  1. A new paradigm for AI system design: constitutional cognitive systems
  2. Formal framework for separation of linguistic and causal powers
  3. Practical architecture for trustworthy human–AI co-agency
  4. Methodology for translating philosophical principles into verifiable code

7.2 Practical Impact

8. References

Foundational

Austin, J. L. (1962). How to Do Things with Words. Oxford University Press.
Montesquieu (1748). L'Esprit des lois.
Peirce, C. S. (1931–1958). Collected Papers of Charles Sanders Peirce. Harvard University Press.
Searle, J. R. (1969). Speech Acts: An Essay in the Philosophy of Language. Cambridge University Press.

AI Safety & Alignment

Bai, Y., et al. (2022). Constitutional AI: Harmlessness from AI Feedback. arXiv:2212.08073.
Ouyang, L., et al. (2022). Training Language Models to Follow Instructions with Human Feedback. NeurIPS 2022.

Methodology

Hevner, A. R., March, S. T., Park, J., & Ram, S. (2004). Design Science in Information Systems Research. MIS Quarterly, 28(1), 75–105.

Related Work by the Author

Berlocher, I. Open Trust Infrastructure. DOI: 10.5281/zenodo.17940484.


Signature Statements

"Words can contradict each other. The world cannot."

"I build systems that protect humans from their own ideas."

"Trust does not rest on intentions, but on structure."

"I delegate execution, never sovereignty."


Final thesis statement:

This work argues that trustworthy human–AI systems should not rely on correct behavior, but on architectures that make harmful behavior structurally impossible.

Publications

Contact


Document version: 1.1 — December 2025
ivan-berlocher.github.io