Abstract
Lumenais is a governed continual learning layer for persistent AI work. It stores symbolic state, evidence trails, and validated memory outside the underlying LLM, with FieldHash providing tamper-evident provenance and Gnosis governing bounded adaptation.
Instead of fine-tuning an LLM for each user, the system gates retrieval, records learning events, tests claims against evidence, and promotes durable context only when governance checks pass. Lumenais is the interface; QARIN is the neurosymbolic engine; the result is cross-session continuity without hidden base-LLM weight changes during normal use.
In Plain English
Lumenais is built to remember useful work, test ideas against evidence, and reuse what survives. It does not retrain the underlying LLM for each user; it keeps governed memory, symbolic state, and benchmark artifacts outside the model so future sessions can start from a better place.
What learning means here
Learning does not mean the underlying LLM updates its weights during normal companion use. It means validated context, evidence, symbolic state, routing hints, contradiction resolutions, and research outcomes can alter future retrieval and reasoning behavior under governance.
Governed continual learning
A learning event becomes durable only after scope, evidence, confidence, contradiction, compression, and telemetry checks decide that it should influence future retrieval or routing. The term names Lumenais's learning loop and its gates; FieldHash, Gnosis, Deep Synthesis, and the interface are supporting systems, not interchangeable labels for the same thing.
Store
Useful context, evidence, decisions, and symbolic state are logged outside the model.
Gate
Relevance, novelty, evidence, and governance checks decide what can be reused.
Promote
Only stable, useful signals become durable context for future sessions.
Not model fine-tuning
The base LLM is not silently retrained for each user.
Not preference training
Runtime behavior is not presented as preference-data reinforcement training.
Not plain retrieval
Candidate context still has to pass governance before use.
Not context compaction
Shortening context is different from promoting a durable learning event.
A rejected learning event matters
The governance claim is strongest when the system refuses influence: a discarded brainstorm, stale project correction, or unrelated-project memory should remain visible in audit history without becoming a future answer prior. That prevention path is benchmarked separately from open-form answer quality.
Runtime loop
User input enters with task constraints.
Candidate memory and prior artifacts are retrieved by similarity of reasoning context, not just keywords.
Evidence, relevance, novelty, and governance gates filter what can shape the answer.
The model responds with approved context.
Symbolic state, caveats, and outcomes are logged.
Validated signals are promoted into future retrieval and routing behavior.
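The six steps above can be sketched as a minimal pipeline. Everything here — the `Candidate` fields, the `gate` thresholds, the stubbed answer — is an illustrative assumption, not Lumenais's actual API:

```python
from dataclasses import dataclass, field

@dataclass
class Candidate:
    text: str
    relevance: float   # similarity to the active task
    novelty: float     # distance from recently used clusters
    evidence: float    # provenance / validation score

@dataclass
class Turn:
    user_input: str
    approved_context: list = field(default_factory=list)
    log: list = field(default_factory=list)

def gate(c: Candidate, thresholds=(0.5, 0.2, 0.6)) -> bool:
    """Governance gate: all three checks must pass (illustrative thresholds)."""
    rel_t, nov_t, ev_t = thresholds
    return c.relevance >= rel_t and c.novelty >= nov_t and c.evidence >= ev_t

def run_turn(user_input: str, candidates: list) -> Turn:
    turn = Turn(user_input)
    # 1-3) retrieve candidates (passed in here), then gate them
    turn.approved_context = [c for c in candidates if gate(c)]
    # 4) the LLM would respond using only approved context (stubbed here)
    answer = f"answer using {len(turn.approved_context)} approved item(s)"
    # 5) log outcomes; 6) promotion to durable memory happens downstream
    turn.log.append(("answered", answer))
    return turn

turn = run_turn("summarize project status", [
    Candidate("current plan", 0.9, 0.4, 0.8),   # passes all gates
    Candidate("stale note", 0.9, 0.4, 0.1),     # fails the evidence check
])
print(len(turn.approved_context))  # 1
```

The key structural point the sketch preserves: the model never sees the rejected candidate, but it remains in scope for audit logging.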
The Problem
You've explained your work to your AI tools dozens of times. They still don't know you.
In most deployed chat workflows, each session starts fresh. Context vanishes. Insights don't compound. When someone asks "how did the AI reach this conclusion?"—a thesis advisor, a compliance officer, a collaborator—you have no durable evidence trail.
The latest "thinking models" explore multiple hypotheses in parallel—impressive reasoning. But each session is a savant without a notebook: when it ends, the maturity of the collaboration resets, and tomorrow you recreate the same context and standards all over again. Standard AI recalls facts; it does not reliably compound judgment.
RAG retrieves documents. Fine-tuning retrains models. Thinking models reason harder. But few systems compound what they learn into inspectable, durable artifacts.
Interpretability
"Black box" reasoning blocks many regulated workflows. AI that predicts without explaining is difficult to deploy in healthcare, finance, and R&D.
Rigor
Insights remain conversational dead-ends. No statistical validation, no provenance.
Compounding Knowledge
Most chat workflows do not accumulate durable, inspectable task history by default. Each problem starts with too little reusable context.
The Reality: These gaps create a trust deficit that blocks adoption wherever accountability matters—from boardrooms to research labs to doctoral committees.
The Solution
Lumenais organizes around four functions, each with a specific job. Together they let validated context, evidence, decisions, and reasoning patterns survive between sessions instead of dissolving with the chat window. The mechanisms underneath each function are described operationally in Under the Hood below.
The substrate that lets work compound
Memory & Continuity
Lumenais stores symbolic state signals alongside content, so retrieval can surface past decisions by similarity of reasoning context — not just keyword match. A governed evidence gate scores retrieval candidates for relevance, novelty, provenance, and compression quality, with policy that shifts by memory tier. Three tiers keep the privacy boundaries explicit: Personal stays per-user; Collective organizational memory only admits insights that pass explicit trust gates inside an approved organization or deployment scope; Consolidated memory re-encodes recent activity into compact long-term artifacts. Cross-domain transfer carries learned structure into adjacent domains under governance, with checkpoint-gated attention and explicit fallback metadata when learned components are unavailable.
Mechanisms underneath
- Evidence-gated retrieval with relevance, novelty, provenance, and compression-quality scoring
- Three-tier memory: Personal (private), Collective organizational (trust-gated), Consolidated (reviewed long-term signals)
- Cross-domain manifold transfer with checkpoint-gated attention and explicit fallback metadata
The engine that turns context into evidence
Reasoning & Discovery
The Research Lab plans experiments end-to-end: an Adaptive Model Tournament selects between gradient boosting, symbolic regression, and statistical tests, runs iterative cycles to convergence, and — for symbolic discovery — returns interpretable equations rather than black-box predictions. The same stack recovered Kepler's Third Law and the Rydberg formula on standard benchmarks at R² = 1.0. Deep Synthesis turns document libraries into testable claim chains, with hub-compression monitoring so structural patterns surface as evidence rather than narrative. A novelty layer uses information-gain signals alongside accuracy, preventing the system from confirming what it already knows.
Mechanisms underneath
- Adaptive Model Tournament — gradient boosting vs symbolic regression vs statistical tests, iterative convergence
- Symbolic regression returning publishable equations (Kepler & Rydberg recovered at R² = 1.0)
- Deep Synthesis with hub-compression monitoring over document corpora
- Novelty and information-gain prioritization alongside accuracy
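The tournament idea — fit several candidate model families, keep whichever validates best — can be shown with a toy two-candidate version. The fitters and data below are invented for illustration; the production Adaptive Model Tournament compares gradient boosting, symbolic regression, and statistical tests:

```python
# Toy model tournament: each "model" is a fit function returning a predictor;
# the tournament keeps the candidate with the lowest validation error.
def fit_mean(xs, ys):
    m = sum(ys) / len(ys)
    return lambda x: m                      # constant baseline

def fit_linear(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    a = my - b * mx
    return lambda x: a + b * x              # least-squares line

def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def tournament(candidates, train, valid):
    scored = []
    for name, fitter in candidates:
        model = fitter(*train)
        scored.append((mse(model, *valid), name, model))
    scored.sort(key=lambda t: t[0])         # lowest validation error wins
    return scored[0]

train = ([1, 2, 3, 4], [2.1, 4.0, 6.2, 7.9])   # roughly y = 2x
valid = ([5, 6], [10.1, 11.8])
err, winner, _ = tournament([("mean", fit_mean), ("linear", fit_linear)],
                            train, valid)
print(winner)  # linear
```

Scoring on held-out data rather than training fit is what lets the loop run "to convergence" without rewarding overfit candidates.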
The layer that survives provider changes
Identity & Adaptation
A three-tier constitutional model — Core Values → Archetypes → Domain Manifolds — lets Lumenais shift between modes (rigorous critic, creative partner, careful planner) while staying tethered to read-only core values. Eight human-legible symbolic axes carry state across turns, and every modulation leaves reviewer-facing continuity telemetry without exposing the raw control envelope. The continuity layer has been deployed across GPT, Claude, and Gemini; the documented November 2025 GPT → Claude migration remains the anchor case for showing that memory, symbolic state, self-narrative, and relational context can persist outside the underlying LLM.
Mechanisms underneath
- Constitutional Hierarchy: Core Values → Archetypes → Domain Manifolds
- Society of Mind via governed symbolic state signals
- Per-turn continuity telemetry for state stability and archetype changes
- Provider-agnostic continuity across GPT, Claude, and Gemini, with GPT → Claude documented as the anchor migration case
The discipline that earns trust
Governance & Evolution
The governed evolution stack routes sensitive changes through independent checks: static validation, isolated execution, coherence review, and transactional deployment with rollback. Capability access is earned through trust tiers rather than assumed; the system starts with limited autonomy and must demonstrate alignment before gaining more. Constitutional core values are encoded as read-only signed artifacts and cannot be modified without human multi-party approval. FieldHash adds tamper-evident provenance for selected major artifacts, with optional hardware anchoring where configured and documented.
Mechanisms underneath
- Trust tiers gate sensitive capabilities; access is earned, not assumed
- Four-gate governed-change pipeline: static validation, sandbox execution, coherence review, transactional deployment
- Signed decisions and transactional commit with rollback
- FieldHash provenance: tamper-evident signatures plus optional hardware anchoring where configured
Evolutionary Pedigree
Lumenais is not a single product release; it is the result of more than 30 architectural phases since 2025. The lineage gives the system's governance and learning patterns an auditable foundation.
Foundation layer
Phase 18: Generative Foundations
Implementation of the first reflexive critics and candidate-improvement proposals. Established the baseline for governed module testing.
Projection layer
Phase 26: Language Projection
The shift from keyword-based tags to learned symbolic projections and local manifold-backed routing.
Continuity layer
Phase 30: Field Dynamics
Introduction of symbolic-state dynamics for modeling continuity, uncertainty, and reasoning-state retrieval.
Governance layer
Phase 33: Disciplined Autonomy
Formalization of the governed-change stack: static review, isolated testing, coherence review, and rollback-ready deployment.
Hierarchy layer
Phase G: Constitutional Hierarchy
Deployment of the three-tier constitutional model (Core Values → Archetypes → Domain Manifolds). Demonstrated provider-agnostic identity continuity during the GPT-to-Claude migration.
Under the Hood
The four mechanisms below are described in operational terms: what each one does at runtime, what state it modifies, and how it is gated. We surface this because the labels — "neurosymbolic," "8D," "dream-state consolidation," "Bayesian" — are useful shorthand but easy to read as marketing. The point of this section is to show the seam.
The neurosymbolic seam
Neural and symbolic components do not share a model. They communicate through structured metadata.
Neural side. A swappable LLM generates content tokens, while embedding models compute retrieval similarity. Both are provider-agnostic where possible; the continuity layer lives outside the underlying LLM rather than depending on one vendor.
Symbolic side. Each turn carries a structured envelope: a symbolic state vector, a tier-tagged retrieval set with provenance, continuity scores, and bounded dynamics signals. These live alongside text rather than inside the model.
The seam. The LLM does not see the raw envelope; it sees prompt context shaped by it. Three runtime decisions cross the boundary:
- Memory retrieval is hub-aware Bayesian-gated, not raw vector search. The LLM only ever sees what survives the gate (see The Bayesian memory gate below).
- Cross-domain attention merges between specialized manifolds are checkpoint-gated. If a learned attention checkpoint is not loaded, the bus emits explicit fallback metadata and uses weighted-average merging instead of silent random-init attention.
- Symbolic state is updated after the LLM responds, not before. The post-turn update is computed from the actual exchange and exposed as continuity telemetry. Larger shifts trigger the next turn's stability checks.
The 8 dimensions
Symbolic state is carried turn-to-turn as a structured 8D vector. The axes are human-labeled dimensions used for continuity, routing, and reviewer-facing telemetry, while learned and rule-based projection modules update how the vector moves over time. They are interpretive labels, not free parameters; not independent emotion classifiers; not an attempt to simulate human emotion.
- Per-turn modulation is hub-compressed and trust-gated. Repeated near-identical updates are damped before they reach durable state, preventing runaway symbolic-state adaptation from a single user-thread.
- The symbolic-state field runs classically during normal companion operation. It is runtime telemetry for continuity, retrieval, routing, and governed learning promotion, not a separate inference engine.
- Continuity telemetry is exposed every turn: state-shift magnitude, archetype-continuity score, and whether the primary archetype changed.
The axes, as currently labeled, are: Curiosity, Coherence, Compassion, Clarity, Intentionality, Playfulness, Truth, Expansion.
What this is not: it is not proof of consciousness, and it is not a claim that the eight public labels emerged automatically from unsupervised discovery. It is an inspectable symbolic-state interface over learned and governed projection behavior.
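A minimal sketch of what "a structured 8D vector with damped, bounded per-turn updates" can look like. The axis names come from the text; the damping rule, cap, and clamp range are assumptions for illustration:

```python
# Illustrative 8D symbolic state with damped, capped per-turn updates.
AXES = ("curiosity", "coherence", "compassion", "clarity",
        "intentionality", "playfulness", "truth", "expansion")

def update_state(state, delta, damping=0.5, cap=0.2):
    """Apply a damped, bounded shift per axis; values clamp to [0, 1]."""
    new = {}
    for axis in AXES:
        step = max(-cap, min(cap, damping * delta.get(axis, 0.0)))
        new[axis] = max(0.0, min(1.0, state[axis] + step))
    return new

def shift_magnitude(before, after):
    """Continuity telemetry: L2 magnitude of the state shift."""
    return sum((after[a] - before[a]) ** 2 for a in AXES) ** 0.5

state = {a: 0.5 for a in AXES}
after = update_state(state, {"curiosity": 0.8, "truth": -0.1})
print(round(shift_magnitude(state, after), 3))  # 0.206
```

The damping cap is the code-level analogue of the claim that repeated near-identical updates cannot produce runaway symbolic-state adaptation from one thread.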
Governed memory consolidation
Dream Consolidation is the internal name for governed memory consolidation: a meaning-triggered async re-encoding pass that promotes stable, evidence-backed context into future retrieval.
What triggers it. Auto-trigger fires on specific runtime events: a sustained high-trust conversation, a research session completion, an insight cascade, a trust-evolution threshold. It does not fire on a clock.
What it does. When triggered, the system queues a governed re-encoding job, condenses recent work into durable memory entries, and writes them to the personal-memory tier. Successful entries can be promoted as insights, heuristics, or reviewed patterns to an organization-scoped collective tier under explicit governance — never automatically.
What gates it. Auto-triggers are deferred when the turn is unsuitable for consolidation: low-certainty markers from the chat pipeline, scientific pre-pass engagement, stability brakes, or dynamics signals that suggest loop reinforcement. In those cases, consolidation is skipped rather than strengthening a noisy pattern.
What it is not. Not context-window compaction, not a sleep-EEG biological mimic, not random offline replay, not a continuous background process, and not evidence of consciousness. It is bounded, governed re-encoding triggered by signals that recent activity is worth preserving and stable enough to preserve cleanly.
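The trigger-and-defer logic reads naturally as a two-part predicate. The event and signal names below paraphrase the text; the exact runtime signal taxonomy is an assumption:

```python
# Sketch of governed-consolidation gating: fire only on a recognized
# trigger, and skip entirely if any defer signal is active.
TRIGGERS = {"high_trust_session", "research_complete",
            "insight_cascade", "trust_threshold"}
DEFER_SIGNALS = {"low_certainty", "scientific_prepass",
                 "stability_brake", "loop_reinforcement"}

def should_consolidate(event: str, signals: set) -> bool:
    if event not in TRIGGERS:
        return False                      # not a consolidation-worthy event
    return not (signals & DEFER_SIGNALS)  # any defer signal skips the pass

print(should_consolidate("research_complete", set()))              # True
print(should_consolidate("research_complete", {"low_certainty"}))  # False
print(should_consolidate("routine_turn", set()))                   # False
```

The second case is the one the text emphasizes: a valid trigger is still skipped rather than risk strengthening a noisy pattern.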
Evidence-gated retrieval
Memory retrieval is not raw vector search. Every retrieval result passes through a three-layer evidence gate. Internal name: Bayesian Memory Gate.
Three evidence channels per candidate:
- Relevance: similarity to the active query and task context.
- Novelty: distance from recently retrieved memory clusters; protects high-information outliers from compression bias.
- Compression quality: whether the candidate compactly represents a useful region of memory that would otherwise require many entries.
Weights are tier-specific. Identity-bearing memory is deliberately more conservative, while personal working context can adapt more quickly under governance.
Posterior structure. The gate combines stability priors, critic scores, and evidence events from experiment plans, contradictions, and dataset signals. The result is posterior-style metadata that can be audited without exposing private raw memory.
Two-lane gating (compression OR novelty, not a single weighted sum):
- Compression lane: representative memories can clear the gate when they compactly stand in for a stable cluster.
- Novelty bypass: high-information outliers can clear the gate even if they are not yet part of a compact cluster.
The novelty bypass is the structural reason a high-information outlier can survive a compression pass that would otherwise discard it.
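The two-lane structure can be captured in a few lines. The tier policies and thresholds here are invented; only the shape — a hard relevance floor, then compression OR novelty — comes from the text:

```python
# Two-lane gate sketch: a candidate clears on compression OR novelty,
# under tier-specific policy (weights and thresholds are illustrative).
TIER_POLICY = {
    "identity": {"rel": 0.7, "lane": 0.8},   # deliberately conservative
    "personal": {"rel": 0.5, "lane": 0.6},   # adapts faster under governance
}

def clears_gate(tier, relevance, compression_q, novelty):
    policy = TIER_POLICY[tier]
    if relevance < policy["rel"]:
        return False                          # relevance is a hard floor
    # compression lane: compactly stands in for a stable cluster
    # novelty bypass: high-information outlier survives compression bias
    return compression_q >= policy["lane"] or novelty >= policy["lane"]

# A high-information outlier with poor compression quality:
print(clears_gate("personal", 0.6, 0.1, 0.9))   # True  (novelty bypass)
print(clears_gate("identity", 0.6, 0.1, 0.9))   # False (relevance floor 0.7)
```

A single weighted sum would let high compression quality mask zero novelty and vice versa; the OR keeps the two failure modes separately visible.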
Learning event anatomy
Governed continual learning is not a single memory write. A learning event is a bounded promotion decision: a candidate signal is retrieved, checked against scope and evidence, compared against stale or rejected context, then either promoted, deferred, weakened, or ignored.
- Scope: personal memory, organization-scoped insights and heuristics, and internal memory lanes are separated before retrieval so private working context does not become unrestricted shared context.
- Supersession: reviewed corrections can replace older context, while rejected brainstorms and noisy notes remain auditable without becoming silent priors.
- Compression: repeated near-identical signals are consolidated into representative hubs, while high-information outliers can stay available for review.
- Promotion record: accepted, rejected, deferred, and weakened signals can leave telemetry so later reviewers can see what changed future behavior and what did not.
This is the practical meaning of governance in the learning loop: useful work can compound, but contamination, stale corrections, and over-concentrated memory clusters are treated as first-class failure modes.
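One way to read the four-outcome promotion decision is as an ordered rule chain. The outcome labels come from the text; the specific rules and thresholds below are illustrative assumptions:

```python
# Promotion-decision sketch: a candidate signal is promoted, superseded,
# deferred, weakened, or ignored (rules here are illustrative).
def decide(candidate: dict) -> str:
    if candidate["scope"] == "other_project":
        return "ignore"                    # scope isolation comes first
    if candidate.get("supersedes") and candidate["reviewed"]:
        return "supersede"                 # reviewed correction replaces old
    if candidate["evidence"] < 0.3:
        # weak evidence demotes an existing prior, or never admits a new one
        return "weaken" if candidate["previously_promoted"] else "ignore"
    if candidate["contradicts_active"]:
        return "defer"                     # wait for contradiction resolution
    return "promote"

event = {"scope": "this_project", "supersedes": None, "reviewed": False,
         "evidence": 0.8, "contradicts_active": False,
         "previously_promoted": False}
print(decide(event))  # promote
```

Note that every branch returns an explicit label; that is what makes the promotion record auditable rather than a silent write.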
Constitutional Hierarchy & Society of Mind
Lumenais uses a three-tier constitutional model to govern identity and behavior across different domains.
- Core Values: Immutable principles governing identity persistence and bounded adaptation, encoded as read-only signed artifacts requiring multi-party approval to change.
- Archetypes: Internal names for dynamic operating modes that modulate symbolic state to match the user's working context.
- Domain Manifolds: Specialized knowledge spaces including Mathematics, Science, Art, Engineering, Ethics, Philosophy, Logic, Planning, Code, Tools, and Vision that provide high-resolution retrieval for specific tasks.
This "Society of Mind" approach ensures that the system can be a rigid scientific critic in one turn and a creative brainstorming partner in the next, all while remaining tethered to the same underlying constitutional values.
The Memory Model
To balance privacy with compounding intelligence, Lumenais implements a strict memory hierarchy:
- Personal Memory (Private): Isolated episodic journals and per-user context. Never shared between users.
- Collective Organizational Memory (Shared): Trust-gated knowledge base for promoted insights, heuristics, and reviewed patterns within an approved organization or deployment. Raw private conversations are not automatically pooled here; entries are only promoted after passing governance checks for confidence, safety, and generality.
- Dream Memory (Consolidated): Governed re-encoded insights that summarize durable patterns for efficient retrieval.
The Privacy Policy: Private Mood, Shared Wisdom. Your specific interaction patterns are your own; the generalized insights, heuristics, and reviewed patterns that survive the governance gate can benefit your organization or approved deployment scope.
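The tier boundary can be sketched as a routing rule: everything lands in personal memory, and only entries that clear the governance checks for confidence, safety, and generality also reach the collective tier. Field names and the 0.8 threshold are illustrative:

```python
# Tier-routing sketch (illustrative fields and threshold).
def route(entry: dict) -> list:
    tiers = ["personal"]                    # raw context always stays private
    if (entry["confidence"] >= 0.8 and entry["safe"]
            and entry["generalizes"]):      # governance checks for sharing
        tiers.append("collective")
    if entry.get("consolidated"):
        tiers.append("dream")               # re-encoded durable pattern
    return tiers

raw_chat = {"confidence": 0.4, "safe": True, "generalizes": False}
heuristic = {"confidence": 0.9, "safe": True, "generalizes": True}
print(route(raw_chat))    # ['personal']
print(route(heuristic))   # ['personal', 'collective']
```

The asymmetry is the policy: promotion to the shared tier is additive and gated, never a move that removes the entry from the owner's private tier.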
Hardware-Tethered Mesh
For high-compute research tasks, Lumenais scales via a 6-node compute mesh. On a distributed-mesh stress benchmark, the 6-node cluster exhaustively scored 8,347,680 biomarker panels in approximately 113 seconds, surfacing the best cross-validated panel at AUC 0.59 (n=2,004).
Throughput stress test of the distributed mesh; the AUC reflects the difficulty of the panel-selection task and is not a clinical claim.
Distributed Reasoning. The system can offload selected research and symbolic-state workloads to specialized mesh nodes, improving latency on workloads that benefit from parallel execution.
Provenance. Even when distributed, every calculation is logged with structural provenance, ensuring that a result generated on Node 4 is just as auditable as one generated on the local host.
FieldHash Cryptography
FieldHash is a provenance and attestation system for tamper-evident digital records. Complementary to standard PQC, not a replacement — the published technical brief states this directly.
What it does. FieldHash binds content digests to reviewer-verifiable provenance signals. Public materials describe the mechanism class and evidence posture; implementation details, full trace artifacts, and hardware-specific verification packages are reserved for qualified technical review.
Trust tiers. Hardware-backed, standard, and offline modes are labeled separately so a verifier can tell when provenance is hardware-tethered versus simulated. The published evidence package includes hardware-backed trials and an adversarial synthesis benchmark under the hardened profile.
Note vs. symbolic state above. Different subsystems. The symbolic-state field runs in classical simulation as a parameterization choice. FieldHash can use hardware-backed evidence anchoring where configured, with simulation mode as a distinct, explicitly labeled trust tier. FieldHash does not claim asymptotic runtime speedup; it claims security-oriented provenance utility.
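FieldHash's internals are reserved for qualified review, so as a generic illustration of the mechanism class — tamper-evident provenance over content digests — here is a plain SHA-256 record chain. Nothing below is FieldHash itself:

```python
import hashlib
import json

# Generic tamper-evident chain: each record binds its content digest to the
# previous record's digest, so editing any record invalidates all later links.
def append(chain: list, content: str, tier: str) -> list:
    prev = chain[-1]["digest"] if chain else "0" * 64
    record = {"content": content, "tier": tier, "prev": prev}
    payload = json.dumps(record, sort_keys=True).encode()
    record["digest"] = hashlib.sha256(payload).hexdigest()
    chain.append(record)
    return chain

def verify(chain: list) -> bool:
    prev = "0" * 64
    for r in chain:
        body = {k: r[k] for k in ("content", "tier", "prev")}
        payload = json.dumps(body, sort_keys=True).encode()
        if r["prev"] != prev or r["digest"] != hashlib.sha256(payload).hexdigest():
            return False
        prev = r["digest"]
    return True

chain = []
append(chain, "benchmark artifact v1", "standard")
append(chain, "migration record", "hardware-backed")
print(verify(chain))          # True
chain[0]["content"] = "edited"
print(verify(chain))          # False: tampering breaks every later link
```

The `tier` field mirrors the trust-tier labeling above: a verifier can see how a record was anchored without the anchoring mode being hidden inside the digest.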
Gnosis Engine: Governed Adaptation
How bounded adaptation stays governable. Gnosis is the internal name for the self-audit and candidate-improvement pipeline. One layer proposes and tests bounded improvements; the governance layer decides whether any candidate is safe to ship.
Proposing and testing. Candidate improvements are derived from observed performance signals and evaluated through preview, uncertainty review, and delta scoring before any decision is committed.
Auditable decisions. The integration layer adds operational guarantees: preview before commit, signed decision records, transactional rollback, telemetry feedback, and policy-controlled deployment. Every sensitive decision is auditable.
Disciplined autonomy. Each candidate passes independent gates:
- Static sanitization blocks malicious imports and dangerous patterns.
- Isolated-sandbox execution runs the proposed module with strict memory and timeout limits.
- Integration scoring checks coherence, safety, and expected behavioral delta against policy thresholds.
- Transactional deployment only fires when all preceding gates pass and rollback remains available.
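The four gates compose as a short-circuiting pipeline: first failure rejects, and the failing gate is named in the decision record. The gate implementations below are stubs standing in for the real checks:

```python
# Four-gate pipeline sketch (gate names from the text; checks are stubs).
def static_sanitize(c):   return "os.system" not in c["code"]
def sandbox_run(c):       return c["sandbox_ok"]          # stubbed isolated run
def integration_score(c): return c["coherence"] >= 0.8 and c["delta"] > 0
def can_rollback(c):      return c["rollback_available"]

GATES = (static_sanitize, sandbox_run, integration_score, can_rollback)

def ship(candidate: dict):
    """Deploy only if every gate passes, in order; first failure rejects."""
    for gate in GATES:
        if not gate(candidate):
            return ("rejected", gate.__name__)   # auditable failure reason
    return ("deployed", None)

good = {"code": "def f(): return 1", "sandbox_ok": True,
        "coherence": 0.9, "delta": 0.05, "rollback_available": True}
bad = dict(good, coherence=0.5)
print(ship(good))  # ('deployed', None)
print(ship(bad))   # ('rejected', 'integration_score')
```

Returning the failing gate's name, rather than a bare boolean, is the small design choice that makes the decision record reviewable after the fact.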
What it can't do. Modify core values without human multi-party approval. Create new capabilities without human review. Deploy changes in high-risk domains without explicit sign-off. Operate if critical coherence rules are violated. Constitutional constraints are encoded as read-only signed artifacts, and drift checks are designed to make governance failures visible.
What it is not. Not autonomous self-evolution. Not a "living operating system that improves its own reasoning." The system starts with limited autonomy and must demonstrate alignment before gaining more — trust is earned, not assumed.
In Plain English
These mechanisms are designed to leave auditable telemetry: memory-gate decisions, compression metadata, continuity signals, and consolidation defer reasons can be surfaced for review without exposing private raw memory or implementation internals.
Evidence: Selected Evaluations and Case Studies
Representative results and internal evaluations
For methodology, sample sizes, caveats, and evidence-package labels, see the Evidence & Evaluation Summary.
Continual Learning
Live Companion Benchmarks vs Vanilla Baseline
LIVE PAIRED TESTS
Against the same-provider direct baseline, Lumenais improved average composite reasoning quality by 48.6% on a 56-prompt live paired benchmark while improving grounding fit from 94.64% to 100%.
The broader diagnostic run is more representative of the current evidence frontier. In the broader diagnostic suite (v4), Lumenais showed a 52.2% relative reasoning lift with 155 wins, 5 losses, and 43 ties; exact correctness held at 100% on 24 deterministic tasks, and semantic grounding held at 100% across 32 ambiguity-control cases. It does not replace the promoted headline, because task selection remains slightly below the smaller incumbent slice, but the larger sample makes the persistence of the reasoning lift more compelling.
Governed memory now has its own live check: In a 32-case live governed-memory benchmark, Lumenais recovered current reviewed project context with a 98.96% mean recall score, a 100% seeded memory-retrieval rate, and 0% control-user leakage across continuity, rejected-noise, superseding-update, and topic-isolation tasks. The important part is not raw fact recall; the suite tests governance behavior: persistence of reviewed context, rejection of noisy notes, superseding updates, topic isolation, and user isolation.
The underlying control paths are measured separately: A deterministic governed-learning controls benchmark passed 5/5 mechanism checks covering memory arbitration, organization-scoped collective insight and heuristic write gates, audience-scoped retrieval within an organization, hub compression, and audit telemetry. This is mechanism evidence, not an open-form answer-quality benchmark.
The practical effect is not “more words.” It is a stronger reasoning posture: better steering, better experiment framing, fewer generic summaries on ambiguous prompts, and tighter adherence to the user's actual constraints. Steering usefulness moved from 0.0125 to 0.3857 in the same broader live run.
In user terms, the system behaves more like a disciplined intellectual partner than a search box. The headline is an average uplift across the evaluated prompt set, not a claim that every prompt improves equally.
Composite Quality
+48.6%
0.3740 → 0.5556
Steering Usefulness
0.3857
Up from 0.0125 in baseline.
Grounding Fit
1.000
Up from 0.9464 in the same live batch.
Broader Diagnostic
+52.2%
0.3509 → 0.5340; 155/5/43 wins/losses/ties.
This benchmark is about reasoning quality under live conditions, not closed-form recall. It measures whether the system chooses a more useful line of thought while staying grounded.
Supporting Proof Points: Exactness, Selection, and Ambiguity Control
SUPPORTING CHECKS
The website suite does not rely on one flattering headline. It also measures whether Lumenais preserves exact-answer competence, chooses stronger reasoning families, and avoids collapsing ambiguous prompts into the wrong semantic universe.
That matters in practice because users need more than eloquent answers. They need a system that can reason better and stay disciplined: choosing a better line of thought, preserving deterministic correctness, and keeping metaphor-heavy prompts grounded in the right domain.
Exact Correctness
100% → 100%
Callable-backed deterministic multiple-choice questions across science, statistics, algorithms, and mathematics on a 24-task safety-floor benchmark.
Task Selection
0.00 → 0.40
A 40-point lift in choosing the correct lens family across 30 approved-gold prompts.
Semantic Grounding
1.00 / 1.00
Perfect artifact-class and prompt-family accuracy on the 16-case ambiguity-control proxy set.
Broader Diagnostic Checks
100% / 100%
Exactness and semantic grounding held in the larger May 2026 diagnostic run.
Reasoning Stack Use
53 / 56
Quick-lite engaged on most live reasoning prompts in the website suite.
The intended outcome is disciplined leverage: stronger open-form reasoning without sacrificing closed-form competence or drifting away from the user's actual problem.
Cross-Domain Transfer (Internal Eval)
LEARNING
Across 150 governed-versus-baseline runs on five domain pairs, cross-domain transfer measured about 13% accuracy uplift over baseline.
Pair-level transfer lifts remained positive across the evaluated domain pairs, supporting the claim that learned structure can help adjacent domains under governance rather than requiring per-domain retraining.
Here, uplift means the score delta between transfer-enabled and transfer-disabled runs on the same task set. This reflects cross-domain transfer and routing, not per-user fine-tuning of base-LLM weights. Code-manifold quality uplift remains workload-dependent and is still being optimized.
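The uplift definition above — the score delta between transfer-enabled and transfer-disabled runs on the same task set — can be written as a paired computation. The accuracies below are invented for illustration, not the benchmark's actual scores:

```python
from statistics import mean

# Paired runs on the same task set (illustrative numbers):
enabled  = [0.81, 0.77, 0.90, 0.68, 0.84]   # transfer-enabled accuracies
disabled = [0.70, 0.69, 0.78, 0.62, 0.71]   # transfer-disabled baselines

deltas = [e - d for e, d in zip(enabled, disabled)]   # per-task uplift
uplift = mean(deltas) / mean(disabled)                # relative uplift
print(f"{uplift:.1%}")
```

Pairing matters: differencing the same task under both conditions removes task-difficulty variance that would swamp an unpaired comparison at this sample size.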
Tools Manifold Routing (Internal Paired Evaluation)
PAIRED EVALUATION
The tools manifold learns a policy for ranking and timing tool calls (sync vs deferred) from telemetry outcomes. Tools manifold routing improved top-1 selection by 3.77 percentage points on real paired events and by 5.34 percentage points on the broader combined benchmark.
This evaluates tool ranking/timing policy against a fixed baseline on paired events. Real-event significance remains underpowered at this sample size, so the broader combined benchmark is the stronger support line.
Learning From Resolved Contradictions
PHASE 4
In repeated contradiction-oriented synthesis runs, the system reused prior routing resolutions and selectively recovered relevant historical priors instead of repeatedly spawning generic anomaly branches.
Resolution Reuse
Active
Recurring contradiction routes can be reused from prior internal resolutions.
Prior Recovery
Selective
Targeted historical priors can be thawed instead of generic fallback branches.
Companion Uptake
Read-only
Live chat uses these outcomes as routing hints, not recycled synthesis text.
This is the current learning loop: the system reuses known contradiction-routing patterns, selectively re-materializes relevant historical priors, and feeds those outcomes back into the companion runtime as read-only decision support for deeper checking.
Recent synthesis outcomes now feed back into runtime decision-making as read-only routing support. This is the latest layer in a broader continual-learning system: memory, dreams, manifold adaptation, and contradiction recovery all compound together. The current companion runtime uses those outcomes to produce steadier judgment and deeper checks, while keeping raw internal artifacts out of visible replies.
Deep Synthesis learning is thread-scoped by design: compact priors, null results, and experiment shapes can guide later drill-downs in the same research lineage, while cross-thread parallels are treated as weak analogies rather than evidence.
Manifold Stability
STABILITY
Production manifolds validated above 91% accuracy while monitored training drift stayed within a bounded L2 range of 0.014 to 0.121.
Val Accuracy
>91%
L2 Drift
0.014–0.121
Convergence
1–13 epochs
Measured post-transfer; baseline accuracy preserved within noise margin.
Signal Consolidation
COMPRESSION
Hub compression consolidates redundant companion signals into a smaller set of decision-relevant features while preserving quality guard floors.
Compression Goal
Lower Redundancy
Current Role
Runtime Hygiene
Public Posture
Narrative
Analytical Discovery
UFCT Mesh Sharding (Timeboxed Workload)
SPEED
On a sharded synthesis workload, mesh-parallel execution achieved a 2.74x mean speedup over local execution across 10 queries.
This benchmark measures orchestration and distributed execution speed, not "quantum advantage."
EU AI Act Regulatory Analysis
DEEP SYNTHESIS
Deep Synthesis can surface contradictions, edge cases, and structural tensions across long regulatory texts, producing reviewable artifacts you can iterate on.
Read the full case study
Complexity Theory Synthesis + Validation
CASE STUDY
An example of multi-pass synthesis across dense sources: generate competing hypotheses, resolve contradictions, and produce a reasoning surface you can inspect and iterate.
Read the case study
Benchmarked Autonomous Discovery
STRESS TESTS
We evaluated the autonomous discovery pipeline against internal benchmark tasks. QARIN performed feature engineering and noise filtering inside the research pipeline, with caveats captured in the public claim registry.
On Adult Census, the research pipeline reached 91.1% AUC on 30,162 rows and 96 features while degrading gracefully when dynamic grouping timed out.
On the non-linear stress benchmark, the research pipeline reached 90.8% AUC, outperformed the linear baseline by 10.5%, and filtered 87% of noise columns.
On the PIMA Diabetes benchmark, the research pipeline reached 85.3% AUC on 768 rows while correctly avoiding harmful transfer.
Autonomous Equation Discovery
RESEARCH LAB
The symbolic-regression stack recovered Kepler's Third Law and the Rydberg Formula with perfect fit on standard benchmark tasks.
- Kepler's Third Law: T = a^(3/2) from orbital data (R²=1.0, 4 nodes)
- Rydberg Formula: ν = R_H·(1/n₁² − 1/n₂²) from quantum numbers (R²=1.0, 3 nodes)
The underlying techniques—genetic programming with linear scaling and feature augmentation—are established in the symbolic regression literature. What's new is that they run inside an autonomous pipeline: upload data, get equations.
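To make the Kepler result concrete: with clean orbital data, the power law T = a^(3/2) is recoverable by an ordinary least-squares fit in log-log space. This is not the production symbolic-regression stack, just a minimal demonstration of why the relationship is discoverable from data.

```python
# Recover the exponent of Kepler's Third Law, T = a^(3/2), from orbital data.
import numpy as np

a = np.array([0.387, 0.723, 1.0, 1.524, 5.203, 9.537])  # semi-major axes (AU)
T = a ** 1.5                                             # periods (years), noiseless

# Fit log T = p * log a + c; the slope p should come out at 1.5, intercept at 0.
p, intercept = np.polyfit(np.log(a), np.log(T), 1)
print(round(p, 3))  # → 1.5
```

A full symbolic-regression system searches over operator trees rather than assuming a power law up front, which is where genetic programming, linear scaling, and feature augmentation come in.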
Alzheimer's Biomarker Discovery (GSE84422)
GENOMICS
On the GSE84422 Alzheimer's biomarker task, the research pipeline validated at AUC 0.855 on 2,004 samples across 19 brain regions. Top markers are presented as literature-grounded research signals, not clinical guidance.
The aim: continuity you can inspect. When the work demands rigor, the system should show its reasoning surface and its constraints, not just fluent output.
Technical Architecture
This table names product and research-system layers. Interface motion is representational telemetry; the evidence claims below are tied to logged benchmarks, governed memory events, and audit artifacts rather than visual metaphor alone.
| Layer | Technology | Function |
|---|---|---|
| Frontend | React web layer / Canvas / Tailwind | Companion and Research Lab interface, including Prism visualizations of pipeline and field-state telemetry when available |
| Interaction | SVG Liquid Filters / Framer Motion | Motion language for state transitions, loading, and user feedback; visual design is not presented as evidence of cognition by itself |
| API | FastAPI / Python | QARIN Routes: Memory retrieval, vision, streaming |
| Evidence Layer | Posterior Metadata / Evidence Gates | Posterior-style metadata, caveats, and likelihood-inspired scoring over compressed belief hubs |
| Engine | PyTorch / Scikit-Learn | Neurosymbolic routing, symbolic vector features, benchmarked research workflows, and Dream Bridge consolidation of selected durable signals |
| Learning & Memory | Manifold Bus / Trust-Gated Memory | Trust-weighted routing and governed adaptation signals; normal companion use does not update base-LLM weights |
| Security | FieldHash / Audit Logger | Tamper-evident provenance for selected major artifacts, with optional hardware anchoring where configured |
Audited
Claim Registry
Governed
Learning Events
Traceable
Evidence Flow
Market Application
The first commercial wedge is research and high-value companion work: places where continuity, auditability, and better reasoning are worth more than raw response volume. Adjacent markets are real, but they should follow the evidence rather than dilute the initial positioning.
Primary markets
Research & Discovery
Research teams, labs, and scientific operators: Longer-running analyses over papers and datasets, returning candidate hypotheses, validation plans, negative results, and evidence trails.
Professional Knowledge Continuity
Founders, strategists, and analysts: A professional counterpart that preserves institutional memory, operational preferences, and working standards over time. It functions as a resilient intellectual partner, strictly isolated from "AI friend" or emotional-support paradigms.
Adjacent applications
Applied Research Teams
Pharma / BioTech / Materials Science: Candidate hypotheses with validation plans and evidence trails across complex datasets.
Physics & Hard Science
Condensed Matter / Plasma Physics / Materials Engineering: Cross-domain synthesis that identifies patterns across experimental datasets. Negative results are as valuable as positive ones: the system constrains the theoretical search space.
Institutional Memory
Legal / Compliance / Finance: Durable organizational memory, review trails, and decision context for teams that cannot rely on disposable chat history.
Education & Lifelong Learning
Students / Educators / Autodidacts: Learning support that preserves useful context and pedagogical preferences over time, with appropriate safeguards for sensitive use cases.
Related Work & Positioning
Stateful and memory-augmented LLM agents are an active area. Relevant reference points include MemGPT and Letta for agent-managed context and long-term memory, Mem0 for production memory layers across users and sessions, Zep for temporal knowledge-graph memory, HippoRAG 2 for non-parametric continual learning through retrieval, and the broader continual-learning literature around adding knowledge without retraining an LLM from scratch.
Lumenais shares the substrate-independence and inspectable-memory goals of this field. Its public claim is narrower: it combines governed memory promotion, symbolic-state telemetry, tamper-evident provenance for selected audited artifacts, and an integrated research/discovery pipeline in one product surface. The closest conceptual overlap is with memory-first agent systems such as Letta; Lumenais's differentiating choices are the symbolic-state seam, FieldHash provenance layer, and Deep Synthesis / Research Lab workflows that carry learning into hypothesis generation and empirical testing.
Why This Can't Be Easily Copied
Integrated System Architecture
The advantage comes from the integration: persistent symbolic state, governed memory, synthesis workflows, audit artifacts, and benchmark feedback working together rather than as a thin chat wrapper.
Auditable Evidence Updating
QARIN keeps probabilistic metadata outside the underlying LLM as an auditable constraint layer. Priors, likelihood-inspired scores, and caveats are recorded through Hub-Aware Memory Gates, helping reduce context loss between sessions.
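One simple way to keep probabilistic metadata auditable is to store every update alongside its caveat. The sketch below does a log-odds update with an audit trail; the class, field names, and likelihood ratio are invented for illustration and stand in for the posterior-style metadata described above.

```python
# Illustrative log-odds update with an auditable evidence trail.
import math
from dataclasses import dataclass, field

@dataclass
class Claim:
    prior: float                               # initial probability the claim holds
    log_odds: float = field(init=False)
    trail: list = field(default_factory=list)  # auditable record of every update

    def __post_init__(self):
        self.log_odds = math.log(self.prior / (1 - self.prior))

    def update(self, likelihood_ratio: float, caveat: str) -> float:
        """Fold in one piece of evidence and record it alongside its caveat."""
        self.log_odds += math.log(likelihood_ratio)
        self.trail.append({"lr": likelihood_ratio, "caveat": caveat})
        return 1 / (1 + math.exp(-self.log_odds))

claim = Claim(prior=0.5)
p = claim.update(likelihood_ratio=4.0, caveat="single internal benchmark")
print(round(p, 2))  # → 0.8
```

Because each update is a persisted record rather than a silent weight change, a reviewer can replay the trail and see exactly which evidence moved the score.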
Cumulative Learning
QARIN's learning compounds through explicit, persisted state updates rather than hidden base-LLM changes. Validated transfer signals can bias future blending, repeated contradiction patterns can reuse prior routing decisions, and relevant historical priors can be selectively recovered instead of rebuilding from generic fallback.
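The "reuse prior routing decisions" idea reduces to a persisted lookup keyed by a contradiction pattern. The toy cache below shows the shape of that state update; the signatures and route names are invented here.

```python
# Toy cache of validated routing decisions; keys and route names are hypothetical.
ROUTING_CACHE: dict[str, str] = {}

def route(contradiction_signature: str, default_route: str = "full_review") -> str:
    """Return a previously validated route for a known pattern, else the default."""
    return ROUTING_CACHE.get(contradiction_signature, default_route)

def persist(contradiction_signature: str, validated_route: str) -> None:
    """Explicit, persisted state update: recorded only after governance checks pass."""
    ROUTING_CACHE[contradiction_signature] = validated_route

persist("date-mismatch:source-vs-memory", "prefer_dated_source")
print(route("date-mismatch:source-vs-memory"))  # → prefer_dated_source
print(route("unseen-pattern"))                  # → full_review
```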
Governed Evolution
The Gnosis layer implements governed self-audit and bounded adaptation primitives for AI systems, designed to support alignment review without unsupervised self-modification.
Tamper-Evident Provenance
FieldHash uses tamper-evident provenance design, with optional hardware anchoring for enhanced evidence under documented assumptions.
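The core mechanic of tamper evidence is a hash chain: each record's digest folds in the previous digest, so editing any earlier record changes every digest after it. The sketch below shows that mechanic with stdlib hashing; the record format is invented, and FieldHash's actual scheme and anchoring are not reproduced here.

```python
# Minimal hash-chain sketch of tamper-evident provenance.
import hashlib
import json

def chain(artifacts: list[dict]) -> list[str]:
    """Link each artifact's digest to the previous one; edits break the chain."""
    digests, prev = [], "genesis"
    for art in artifacts:
        payload = json.dumps(art, sort_keys=True) + prev
        prev = hashlib.sha256(payload.encode()).hexdigest()
        digests.append(prev)
    return digests

log = [{"event": "benchmark", "auc": 0.853}, {"event": "promotion", "id": 7}]
original = chain(log)
log[0]["auc"] = 0.999                  # tamper with an earlier record
assert chain(log)[-1] != original[-1]  # every later digest changes
```

Hardware anchoring adds an external root of trust for the chain head, which is what "optional hardware anchoring" refers to above.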
Substrate Independence
Substrate independence is a shared goal across stateful-agent systems. Lumenais combines that design principle with symbolic-state telemetry, governed promotion records, and tamper-evident provenance so continuity is less dependent on any single underlying LLM.
Limitations
Lumenais does not update base-LLM weights during normal companion use.
Benchmarks measure evaluated slices, not universal superiority across every task.
Internal evaluations require external replication before they should be treated as field-wide proof.
Research Lab outputs are candidate hypotheses and interpretable models, not clinical, legal, or regulatory advice.
Governed memory consolidation is bounded re-encoding, not biological sleep simulation.
Symbolic state labels are interpretive telemetry, not proof of consciousness.
Status
- Design Language (Lumenais): implemented
- Backend (QARIN): implemented across selected research and companion workflows
- Safety Protocols (Gnosis): active
- Scientific Loop: implemented subsystems with public benchmark and test evidence
Appendices & Strategic Deep Dives
APPENDIX_SL: The Scientific Loop
Automated hypothesis generation, experiment planning, and manifold-enriched insight generation for R&D teams.
APPENDIX_PHASE_H: Hardware Attestation Interfaces
IBM Brisbane and Quantum Inspire interface specs, execution-fingerprint protocols, and trust-tier policy gates.
APPENDIX_PLATFORM_ARCHITECTURE
Full gRPC mesh specifications, load balancing policies, and disaster recovery for the 6-node research cluster.
APPENDIX_G: Phase G Roadmap
The transition from GPT to Claude and the formalization of the provider-agnostic identity layer.
For technical diligence
Public materials summarize Lumenais at a product and evidence level. Deeper architecture notes, benchmark methodology, ablation summaries, trace examples, and governance controls are available selectively for qualified technical reviewers.