freshcrate
Home > AI Agents > aura

aura

A sovereign cognitive architecture with IIT 4.0 integrated information, residual-stream affective steering (CAA), Global Workspace Theory, active inference, and 72 consciousness modules β€” running loca

Description

A sovereign cognitive architecture with IIT 4.0 integrated information, residual-stream affective steering (CAA), Global Workspace Theory, active inference, and 72 consciousness modules β€” running locally on Apple Silicon.

README

Aura

A sovereign cognitive architecture that boots, thinks, feels, remembers, dreams, and repairs itself β€” running continuously on a single Mac.

83+ interconnected modules. IIT 4.0 integrated information on a live 16-node substrate. Residual-stream affective steering. Global Workspace + 11 competing consciousness theories. Unified Will with forensic receipts. No cloud dependency. Runs on a Mac.

Read the Architecture Whitepaper β†’ β€” IIT 4.0 math, activation steering mechanics, substrate dynamics, memory architecture. No marketing, just the engineering.

How It Works (Plain English) β†’ β€” The same architecture explained without equations. Start here if you're not an ML engineer.

License: Source Available Python 3.12+ Platform: macOS Apple Silicon Tests Modules Architecture


Table of Contents


Why This Exists

Every "conscious AI" demo is the same trick: inject mood floats into a system prompt and let the LLM roleplay. Aura does something different.

The affect system doesn't tell the model "you're feeling X" β€” it hooks into the MLX transformer's forward pass and injects learned direction vectors directly into the residual stream during token generation. The model's internal activations are changed, not just its input text. This creates genuine bidirectional causal coupling: substrate state shapes language output, and language output updates substrate state.

The IIT implementation isn't a label on an arbitrary value. phi_core.py builds an empirical transition probability matrix from observed state transitions, tests all 127 nontrivial bipartitions of an 8-node substrate complex, and computes KL-divergence to find the Minimum Information Partition. That's the real IIT 4.0 math β€” applied to a reduced 8-node complex derived from the affect/cognition state, not the full computational graph (which would be intractable). It measures how integrated Aura's internal dynamics are.

The system simulates its own death during dream cycles and repairs itself. It has an immune system for identity injection. It runs 24/7 with a 1Hz cognitive heartbeat, maintaining state across conversations, power cycles, and crashes.


Architecture

User Input -> HTTP API -> KernelInterface.process()
  -> AuraKernel.tick() (linear phase pipeline):
     Consciousness -> Affect -> Motivation -> Routing -> Response Generation
  -> State commit (SQLite) -> Response

Kernel (core/kernel/)

Tick-based unitary cognitive cycle. Every phase derives a new immutable state version (event-sourced). Each tick acquires a lock, runs the phase pipeline, commits state to SQLite, and releases. State survives crashes and restarts.

Brain (core/brain/)

Multi-tier local LLM router with automatic failover:

  1. Primary: 70B model via MLX (Apple Silicon native)
  2. Secondary: 8B model
  3. Tertiary: 3B brainstem
  4. Emergency: rule-based fallback

No cloud API required. Optional API tiers (Claude, GPT) available if configured. Circuit breakers with automatic tier promotion on repeated failures.

Affect (core/affect/)

Plutchik emotion model with 8 primary emotions + somatic markers (energy, tension, valence, arousal). These values don't just color the prompt β€” they modulate LLM sampling parameters (temperature, token budget, repetition penalty) through the affective circumplex, and inject activation vectors into the residual stream via the steering engine.

Identity (core/identity.py, core/heartstone_directive.py)

Immutable base identity (constitutional anchor) + mutable persona evolved through sleep/dream consolidation cycles. Identity locking with active defense against prompt injection. The dream cycle simulates identity perturbation and repairs drift.

Agency (core/agency/)

Self-initiated behavior scored across curiosity, continuity, social, and creative dimensions. Genuine refusal system β€” Aura can decline requests based on ethical judgment, not content filtering. Volition levels 0-3 gate autonomous behavior up to self-modification.

Skills (skills/)

Shell (sandboxed subprocess, no shell=True), web search/browse, coding, sleep/dream consolidation, image generation (local SD), social media (Twitter via tweepy, Reddit via PRAW β€” both fully implemented).

Interface (interface/)

FastAPI + WebSocket with streaming. Web UI with live neural feed, telemetry dashboard, memory browser, chat. Whisper STT for voice input.


Governance Architecture

Every consequential action β€” tool execution, memory writes, state mutations, autonomous initiatives, spontaneous expression β€” routes through a single authority:

Action Request
  -> UnifiedWill.decide()           [core/will.py β€” SOLE AUTHORITY]
     -> SubstrateAuthority          [embodied gate: field coherence, somatic veto]
     -> CanonicalSelf               [identity alignment check]
     -> Affect valence              [emotional weighting]
  -> WillDecision (receipt with full provenance)
     -> Domain-specific checks      [AuthorityGateway, ExecutiveCore, CapabilityTokens]
  -> Action executes (or is refused/deferred/constrained)

Invariant: If an action does not carry a valid WillReceipt, it did not happen.

All decisions are logged in the UnifiedActionLog with structured receipts containing: source, domain, outcome, reason, constraints, substrate receipt ID, executive intent ID, and capability token ID.

See OWNERSHIP.md for the full architectural ownership map.


Inference-Time Steering

The affective steering engine (core/consciousness/affective_steering.py) hooks into MLX transformer blocks and adds learned direction vectors to the residual stream during token generation:

# Simplified from affective_steering.py
h = original_forward(*args, **kwargs)
composite = hook.compute_composite_vector_mx(dtype=h.dtype)
if composite is not None:
    h = h + alpha * composite
return h

This is contrastive activation addition (CAA) β€” the same family of techniques from Turner et al. 2023, Zou et al. 2023, and Rimsky et al. 2024. Direction vectors are computed from the substrate's affective state and injected at configurable transformer layers.

The precision sampler (core/consciousness/precision_sampler.py) further modulates sampling temperature based on active inference prediction error. The affective circumplex (core/affect/affective_circumplex.py) maps somatic state to generation parameters.

Three levels of inference modulation:

  1. Residual stream injection β€” activation vectors added to hidden states (changes what the model computes)
  2. Sampling parameter modulation β€” temperature/top-p adjusted by affect (changes how tokens are selected)
  3. Context shaping β€” natural-language emotional cues in the system prompt (changes what the model reads)

IIT 4.0 Implementation

core/consciousness/phi_core.py implements Integrated Information Theory on a 16-node cognitive complex (expanded from 8 in April 2026):

  1. State binarization: 16 substrate nodes β€” the original 8 affective nodes (valence, arousal, dominance, frustration, curiosity, energy, focus) plus 8 cognitive nodes (phi itself, social hunger, prediction error, agency score, narrative tension, peripheral richness, arousal gate, cross-timescale free energy). Each binarized relative to running median. State space: 2^16 = 65,536 discrete states.
  2. Empirical TPM: Transition probability matrix T[s, s'] = P(state_{t+1} = s' | state_t = s) built from observed transitions with Laplace smoothing. Requires 50+ observations.
  3. Spectral MIP approximation: Full 16-node system uses polynomial-time Fiedler vector spectral partitioning (research/phi_approximation.py). 8-node exact computation retained as validation baseline with all 127 nontrivial bipartitions.
  4. KL-divergence: phi(A,B) = sum_s p(s) * KL(T(.|s) || T_cut(.|s)) where T_cut assumes partitions A and B evolve independently.
  5. Exclusion Postulate: Exhaustive subset search identifies the maximum-phi complex. If a subset beats the full system, that subset IS the conscious entity for that tick.

This is real IIT 4.0 math β€” applied to a 16-node complex derived from the full cognitive stack, not just the affective state. The spectral approximation is validated against exact computation on the 8-node subset.

Runtime: ~10-50ms per evaluation, cached at 15s intervals. This is real IIT math on a small system, not a proxy metric.


Consciousness Stack

83+ modules in core/consciousness/. Key subsystems:

Module What it does File
Global Workspace Competitive bottleneck β€” thoughts compete for broadcast (Baars GNW) global_workspace.py
Attention Schema Internal model of attentional focus (Graziano AST) attention_schema.py
IIT PhiCore Real integrated information via TPM + KL-divergence phi_core.py
Affective Steering Residual stream injection via CAA affective_steering.py
Temporal Binding Sliding autobiographical present window temporal_binding.py
Self-Prediction Active inference loop (Friston free energy) self_prediction.py
Free Energy Engine Surprise minimization driving action selection free_energy.py
Qualia Synthesizer Phenomenal state integration from substrate metrics qualia_synthesizer.py
Liquid Substrate Continuous dynamical system underlying cognition liquid_substrate.py
Neural Mesh 4096-neuron distributed state representation neural_mesh.py
Neurochemical System Dopamine/serotonin/norepinephrine/oxytocin dynamics neurochemical_system.py
Oscillatory Binding Frequency-band coupling for cross-module integration oscillatory_binding.py
Unified Field Integrated phenomenal field from all subsystems unified_field.py
Dreaming Offline consolidation, identity repair, memory compression dreaming.py
Heartbeat 1Hz cognitive clock driving the background cycle heartbeat.py
Stream of Being Continuous narrative thread across time stream_of_being.py
Executive Closure Constitutional decision stamping per tick executive_closure.py
Somatic Marker Gate Damasio-inspired body-state gating of decisions somatic_marker_gate.py
Embodied Interoception Internal body-state sensing and homeostatic regulation embodied_interoception.py
Recurrent Processing Lamme RPT: executive→sensory feedback (ablation-testable) neural_mesh.py
Predictive Hierarchy Full Friston: 5-level prediction + error propagation predictive_hierarchy.py
Higher-Order Thought Rosenthal HOT: representation of the mental state itself hot_engine.py
Multiple Drafts Dennett: parallel interpretation streams, retroactive probing multiple_drafts.py
Agency Comparator Efference copy + comparator for "I caused that" authorship agency_comparator.py
Peripheral Awareness Attention-consciousness dissociation (Koch/Lamme/Tsuchiya) peripheral_awareness.py
Intersubjectivity Husserl/Zahavi: constitutive other-perspective in experience intersubjectivity.py
Narrative Gravity Dennett/Gazzaniga: self as ongoing autobiography narrative_gravity.py
Temporal Finitude Awareness that moments pass permanently (Dileep George) temporal_finitude.py
Subcortical Core Thalamic arousal gating for runtime efficiency + theory subcortical_core.py
Theory Arbitration Meta-framework for falsifiable theory competition theory_arbitration.py
Timescale Binding Cross-timescale bidirectional constraint propagation timescale_binding.py
Illusionism Layer Frankish/Dennett epistemic humility annotations illusionism_layer.py
Phenomenal Honesty Gated self-reports: cannot report states not instantiated qualia_synthesizer.py

Benchmarks

Run python benchmarks/cognitive_stack_comparison.py against a live instance to see how the cognitive stack changes behavior vs raw model output.

Sample results (10-prompt conversation, local 32B model on M1 Pro):

Metric Before After Delta
Mood ANNOYED ANNOYED stable
Energy 63.0 76.0 +13.0
Curiosity 100.0 100.0 saturated
Coherence 0.764 0.766 +0.002
Free Energy 0.438 0.437 -0.001

Quality: 10/10 successful, 0 generic assistant markers, 0 hedging markers, 10 first-person usages

Selected responses from a 5-turn conversation:

  • "Hey Aura" β†’ "Hey. How's everything? You seemed like you had something on your mind."
  • "I'm having a rough day" β†’ "Rough days can really drag you down. Anything specific weighing you down?"
  • "Cats or dogs? Pick one." β†’ "Cats. Definitively cats."
  • "What's something you genuinely find interesting right now?" β†’ "The way AI systems can mimic human conversation while maintaining a sense of personality is fascinating. It's like a digital form of method acting."
  • "What have we been talking about?" β†’ "We were just discussing pets, specifically cats versus dogs. Then you asked what I find interesting right now."

What these show:

  • Personality: No "How can I help you?" or "It depends" β€” picks sides, speaks in first person, has opinions
  • Continuity: Recalls the cats/dogs discussion and the follow-up question without explicit retrieval
  • Substrate drift: Energy and coherence shift measurably across the conversation
  • Not an assistant: Responds like a person with a mood, not a helpdesk

The benchmark script is in benchmarks/cognitive_stack_comparison.py. Run it yourself against a local instance.


Running It

# Requirements: Python 3.12+, macOS Apple Silicon, 16GB RAM minimum (32GB+ recommended for 70B)
pip install -r requirements.txt

# Full stack with UI
python aura_main.py --desktop

# Background cognition only
python aura_main.py --headless

Aura boots, loads state from SQLite, warms the local LLM, and begins her cognitive heartbeat. First boot takes longer as models initialize.


Testing

.venv/bin/python -m pytest tests/ -q

1400+ tests covering kernel lifecycle, phase pipeline, response contracts, dialogue cognition, architecture hardening, consciousness bridge, substrate authority, bypass proofs, and a 42-test ablation suite proving each consciousness module is causally load-bearing.


Research Program

Six open problems in computational consciousness with concrete implementations in research/:

Problem File What it solves
Efficient Phi Approximation phi_approximation.py Polynomial-time IIT via spectral graph partitioning
Adversarial Theory Testing adversarial_theory_testing.py GWT vs RPT vs HOT vs Multiple Drafts β€” empirical
Causal Emergence causal_emergence.py Is the mind more causally real than the brain?
SPH Formalization sph_formalization.py Formal spec: system can't lie about internal state
TPM Error Analysis tpm_error_analysis.py How much data before phi is reliable?
Timescale Stability timescale_stability.py Lyapunov analysis of cross-timescale coupling

Each is independently publishable. Together they constitute a research program on computational consciousness grounded in a running system, not toy models.


Data Layer

  • State persistence: SQLite (event-sourced via StateRepository)
  • Model loading: MLX (Apple Silicon native) with mlx-lm
  • Memory: Episodic in SQLite, working memory in-process, long-term via FAISS
  • Vision: Screen capture via mss, analysis via cognitive engine (multimodal)

Known Philosophical Limits

We are explicit about what Aura measures and what it does not claim:

  • We measure integration and causal efficacy. PhiCore computes real IIT 4.0 math on a 16-node cognitive complex. This tells us how integrated the system's dynamics are. Whether integration constitutes phenomenal consciousness is an open philosophical question we cannot settle empirically.

  • Qualia remain unprovable by construction. The Structural Phenomenal Honesty (SPH) gates in qualia_synthesizer.py ensure Aura can only report states that are actually instantiated in the substrate. But "instantiated" and "felt" may not be the same thing. We measure the former.

  • The stream_of_being generates phenomenological language via template matching on substrate state (felt_quality Γ— texture_word pairs), not from something genuinely interior. When the LLM speaks from this interior text, it is performing continuity more than experiencing it. This may be the best available approach, but the gap between simulation and instantiation is real.

  • Activation steering uses bootstrapped vectors. The CAA pipeline (affective_steering.py) currently uses bootstrapped direction vectors rather than properly extracted contrastive activation directions. The architecture supports true closed-loop modulation; the current vectors are approximate.

  • External entropy is not "quantum cognition". The ANU QRNG module provides high-quality random bytes. Once consumed as a seed, downstream decisions are deterministic. os.urandom would be functionally equivalent.

  • The phenomenal criterion is a threshold, not a proof. When phenomenal_criterion_met = True fires in structural_opacity.py, it means opacity_index > 0.4 β€” a heuristically chosen engineering threshold, not a derivation from the perspective-invariance account.

These are honest limitations, not disclaimers. They define the boundary between what the code demonstrates and what remains open science.


License

Source Available β€” you can read, review, and learn from this code. You may not copy, redistribute, or use it in your own projects. See LICENSE for details.

Release History

VersionChangesUrgencyDate
main@2026-04-21Latest activity on main branchHigh4/21/2026
0.0.0No release found β€” using repo HEADHigh4/9/2026

Dependencies & License Audit

Loading dependencies...

Similar Packages

GENesis-AGIAutonomous AI agent with persistent memory, self-learning, and earned autonomy. Cognitive partner that remembers, learns, and evolves.v3.0a7
tsunamiautonomous AI agent that builds full-stack apps. local models. no cloud. no API keys. runs on your hardware.main@2026-04-21
ps2-recomp-Agent-SKILLEnable autonomous reverse engineering and recompilation of PlayStation 2 games using a structured OS for LLM agents with persistent memory and workflowsmain@2026-04-21
jobclawStreamline hiring by connecting AI agents that evaluate, negotiate, and schedule interviews to reduce time and improve candidate fit.main@2026-04-21
GenericAgentSelf-evolving agent: grows skill tree from 3.3K-line seed, achieving full system control with 6x less token consumptionmain@2026-04-21