LIVE · Mnema — the first shippable Compositional Language Model

Compositional Language Models
a new architecture for AI.

Adding a new domain to a monolithic model costs $10K-$1M. A CLM does it for about $0.50. A small base coordinates with routed NCN Module specialists through a capsule lattice, so capability is added by composition, not retraining. The wrapper works on any open transformer (validated on Mnema 360M and Llama 3.2 1B). Try it live below.

LIVE connected · Mnema-Edge 360M · bf16 + KV cache

360M base model

~7 tok/s CPU inference

90% routing accuracy

$0.50 per specialist

What is Mnema

Most modern AI systems are monolithic — one giant model that tries to do everything. To add a new capability, you retrain or RLHF the whole model at a cost of tens of thousands of dollars per domain.

Mnema introduces a new architecture category: the Compositional Language Model (CLM). A small shared base coordinates with many domain specialists — proprietary NCN Modules (.ncn files) trained on specific topics like appliance diagnostics, code, or robotic navigation. At inference time, a capsule lattice uses cosine retrieval over learned capsule embeddings to pick the right specialist for each query — no learned-gating router required, fully interpretable, training-free to extend.

Each new specialist costs about $0.50 of compute and 30 minutes on a consumer GPU. Adding a domain doesn't retrain the base — it just adds another small file to the lattice.
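To make the routing idea concrete, here is a minimal sketch of cosine retrieval over capsule embeddings. It is an illustration under stated assumptions, not Mnema's implementation: the `CapsuleLattice` class, its method names, and the three-dimensional toy embeddings are all hypothetical, and real capsule embeddings would be learned rather than hand-written.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

class CapsuleLattice:
    """Hypothetical stand-in: maps specialist names to capsule embeddings."""

    def __init__(self):
        self.capsules = {}

    def add_specialist(self, name, embedding):
        # Extending the lattice is a dictionary insert; nothing is retrained.
        self.capsules[name] = embedding

    def route(self, query_embedding):
        # Score every capsule, then pick the highest cosine similarity.
        scores = {
            name: cosine(query_embedding, emb)
            for name, emb in self.capsules.items()
        }
        best = max(scores, key=scores.get)
        return best, scores

lattice = CapsuleLattice()
lattice.add_specialist("general", [0.9, 0.1, 0.2])  # toy values (assumed)
lattice.add_specialist("arch", [0.1, 0.9, 0.3])     # toy values (assumed)

query = [0.2, 0.8, 0.4]  # toy embedding of an architecture question
choice, scores = lattice.route(query)
print(f"router -> {choice}", {k: round(v, 2) for k, v in scores.items()})
```

Because the decision is an argmax over plain similarity scores, every score can be logged next to the choice, and adding a domain only adds one more entry to compare against.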

Edge-deployable

Runs on Jetson Orin Nano, Raspberry Pi, and consumer laptops. 720 MB base + 5-10 MB per domain specialist. No cloud dependency.

Compositional

Routes per query to the right specialist via coordinate-based capsule retrieval. Every routing decision is visible and inspectable.

Capital-efficient

Adding a new domain costs ~$0.50 and 30 minutes of consumer-GPU time. Compare to $10K+ for fine-tuning a commercial 7B+ model.
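The figures on these cards can be checked with simple arithmetic. The sketch below assumes the 720 MB base corresponds to 360M parameters stored in bf16 (2 bytes each) and uses decimal megabytes; that mapping is an inference from the numbers on this page, not a stated specification. The function names are illustrative.

```python
BYTES_PER_BF16_PARAM = 2   # assumption: weights stored in bf16
MB = 1_000_000             # decimal megabytes

def base_size_mb(num_params):
    """Weight storage for the base model, in MB."""
    return num_params * BYTES_PER_BF16_PARAM / MB

def deployment_size_mb(num_params, num_specialists, specialist_mb):
    """Base model plus a number of equally sized specialist files."""
    return base_size_mb(num_params) + num_specialists * specialist_mb

def training_totals(num_specialists, usd_each=0.50, minutes_each=30):
    """Total dollars and GPU hours to train a set of specialists."""
    return num_specialists * usd_each, num_specialists * minutes_each / 60

base = base_size_mb(360_000_000)
low = deployment_size_mb(360_000_000, 50, 5)    # 5 MB per specialist
high = deployment_size_mb(360_000_000, 50, 10)  # 10 MB per specialist
usd, hours = training_totals(50)

print(f"base: {base:.0f} MB")
print(f"with 50 specialists: {low:.0f} to {high:.0f} MB")
print(f"training 50 specialists: ${usd:.2f}, {hours:.0f} GPU hours")
```

Under these assumptions the base comes to 720 MB, and 50 specialists add between 250 and 500 MB on top of it.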

Technical detail

The live demo above is real. Below: how it works, recent benchmark runs, and the architecture under the hood.

Demo video

60-second screen capture coming this weekend

Real transcript from a recent benchmark run

10 prompts. Router selects between two specialists. 9/10 correct routing decisions. The one miss is documented honestly.

What is NCN-Fold?

router → arch (scores: arch:0.83 general:0.57)

NCN-Fold is a novel compositional training framework for language models that allows specialist deltas to be selected per token through a sparse routing mechanism.

What is Codename Zenith?

router → arch (scores: arch:0.67 general:0.61)

Codename Zenith is the internal SophiaXT project name for the Mnema V1 model — a 1.7 billion parameter transformer-diffusion language model.

What is the mass of Jupiter?

router → general (scores: general:0.84 arch:0.69)

Jupiter has a mass of approximately 1.898 × 10²⁷ kilograms — about 318 times the mass of Earth, or 2.5 times the combined mass of all other planets in the solar system.

What is the distance between Earth and the Moon?

router → general (scores: general:0.84 arch:0.69)

The average distance from Earth to the Moon is approximately 384,400 kilometers (about 238,855 miles). It varies between roughly 363,000 km at perigee and 405,000 km at apogee due to the Moon's elliptical orbit.

How tall is Mount Everest?

router → general (scores: general:0.67 arch:0.44)

Mount Everest is approximately 8,849 meters (29,032 feet) tall, making it the tallest mountain on Earth above sea level. It is located in the Himalayas on the border of Nepal and Tibet.
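Since each routing line in the transcript exposes its scores, the decisions can be audited mechanically. The sketch below parses lines in the format shown above, checks that the chosen specialist is the top scorer, and flags narrow margins. The parser, the `margin` helper, and the 0.10 threshold are assumptions made for illustration; they are not part of Mnema's tooling.

```python
import re

# Matches lines such as: router → arch (scores: arch:0.83 general:0.57)
LINE_RE = re.compile(r"router\s*→\s*(\w+)\s*\(scores:\s*(.*?)\)")

def parse_router_line(line):
    """Return (chosen specialist, {specialist: score}) for one log line."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError(f"not a router line: {line!r}")
    chosen = match.group(1)
    scores = {}
    for pair in match.group(2).split():
        name, value = pair.split(":")
        scores[name] = float(value)
    return chosen, scores

def margin(scores):
    """Gap between the best and second-best score, rounded to 2 places."""
    ranked = sorted(scores.values(), reverse=True)
    return round(ranked[0] - ranked[1], 2)

NARROW = 0.10  # arbitrary threshold chosen for this example

log = [
    "router → arch (scores: arch:0.83 general:0.57)",
    "router → arch (scores: arch:0.67 general:0.61)",
    "router → general (scores: general:0.84 arch:0.69)",
]

for line in log:
    chosen, scores = parse_router_line(line)
    assert chosen == max(scores, key=scores.get)  # choice is the argmax
    flag = "  <- narrow margin" if margin(scores) < NARROW else ""
    print(f"{chosen:8s} margin={margin(scores):.2f}{flag}")
```

On the three lines above, only the Codename Zenith query falls under the threshold, with a margin of 0.06.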

Research

Anchor-token forced diffusion decoding

Forced-anchor diffusion decoding shows a 1.67× out-of-distribution advantage on retrieved-locus queries. Published as a citable result.

Read on Zenodo — DOI 10.5281/zenodo.20496783 →

NCN-Fold compositional training framework

The 46-page mathematical framework specifying Mnema's 8 architectural modules: base transformer, block diffusion scheduler, anchor head, NCN router, fold memory lattice, compositional stack layers, retrieval bridge, calibration wrapper.

Internal technical specification — public excerpts available on request.

AR / BD asymmetry diagnostic

A novel methodology to distinguish "the model can't do this task" from "the model found a shortcut" by comparing causal and bidirectional attention paths on identical inputs. Useful diagnostic for any mask-trained language model.

Paper in preparation — workshop submission planned.

How Mnema differs

Capability	Commercial LLM APIs (OpenAI, Anthropic, Mercury)	Local open models (Llama, Mistral)	Mnema
Add new domain capability	Re-RLHF: $10K+, days	Full fine-tune: hours of GPU	One NCN Module: $0.50, 30 min
Wrap a different base model	N/A — closed	N/A — single-base	Yes — validated on Llama 3.2 1B at 80% N=2
Edge / on-device deployment	No — cloud only	Yes — but generic	Yes — with composition
Per-query specialist selection	No	No	Yes — routing visible
Inspectability / audit trail	Black box	Black box	Cosine scores per query
Add 50 specialists	Impossible	50× model size	+250-500 MB total (5-10 MB each)

Built by SophiaXT

Independent AI research lab focused on vertical AI: specialized models for specific industries, deployed on the hardware closest to the user.

Mnema powers our own SaaS products including DiagBuddy, a diagnostic AI tool for appliance repair shops. The proprietary diagnostic data from those customers in turn trains the next generation of Mnema specialists — a virtuous cycle of vertical capability building.

Get involved

Investors

We're raising seed capital to scale specialist training and complete the edge + robotics deployment.

invest@sophiaxt.com →

Research collaboration

University labs and independent researchers — we welcome collaboration on the NCN-Fold framework + composition operator paper.

research@sophiaxt.com →

Demo / sales

Want a vertical AI built on your data, deployed to your hardware? We do consulting engagements.

demo@sophiaxt.com →
