LIVE · Mnema V1 router demo on this VPS

Compositional AI
for edge devices and robots.

Mnema is a novel language model architecture that runs multiple domain specialists on consumer hardware. Add new capabilities by training small adapters — not retraining the whole model.

360M base model
90% routing accuracy
~755 MB total footprint
$0.50 cost per new specialist

What is Mnema

Most modern AI systems are monolithic: one giant model that tries to do everything. To add a new capability, you retrain the whole model or put it through another round of RLHF, at a cost of tens of thousands of dollars per domain.

Mnema takes a different approach. A small base model is paired with many small domain specialists — LoRA adapters trained on specific topics like appliance diagnostics, code, or robotic navigation. At inference time, a coordinate-based router picks the right specialist for each query.
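As a rough illustration of the routing step described above, here is a minimal sketch. It assumes each specialist is represented by a single coordinate vector and that the router picks the specialist whose vector has the highest cosine similarity to the query embedding. The toy vectors, the `SPECIALISTS` table, and the `route` function are hypothetical placeholders for illustration, not Mnema's actual router.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def route(query_vec, specialists):
    """Return (best_name, scores), where scores maps specialist name -> cosine score."""
    scores = {name: cosine(query_vec, vec) for name, vec in specialists.items()}
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical toy coordinates; a real system would use learned embeddings.
SPECIALISTS = {
    "arch": [1.0, 0.0, 0.0],
    "general": [0.0, 1.0, 0.0],
}

if __name__ == "__main__":
    best, scores = route([0.9, 0.3, 0.1], SPECIALISTS)
    print(best, scores)
```

Returning the full score table alongside the winner is what makes a routing decision inspectable: a caller can log every score, not only the chosen specialist.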

Each new specialist costs about $0.50 of compute and 30 minutes on a consumer GPU. Adding a domain doesn't retrain the base — it just adds another small file to the lattice.

Edge-deployable

Runs on Jetson Orin Nano, Raspberry Pi, and consumer laptops. 720 MB base + 5-10 MB per domain specialist. No cloud dependency.
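The footprint arithmetic can be checked with a short sketch. It uses only the figures quoted on this page (720 MB base, 5-10 MB per specialist); the function name `footprint_mb` is a hypothetical helper, and real adapter sizes will depend on rank and precision.

```python
BASE_MB = 720               # base model size quoted on this page
ADAPTER_MB_RANGE = (5, 10)  # per-specialist adapter size range quoted on this page

def footprint_mb(num_specialists, base_mb=BASE_MB, adapter_range=ADAPTER_MB_RANGE):
    """Return (low, high) total footprint in MB for a given number of specialists."""
    low, high = adapter_range
    return (base_mb + num_specialists * low, base_mb + num_specialists * high)

if __name__ == "__main__":
    for n in (2, 10, 50):
        low, high = footprint_mb(n)
        print(f"{n} specialists: {low}-{high} MB")
```

With 50 specialists the adapters add 250 MB at the low end of the range and 500 MB at the high end, for a total of 970-1220 MB.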

Compositional

Routes per query to the right specialist via coordinate-based capsule retrieval. Every routing decision is visible and inspectable.

Capital-efficient

Adding a new domain costs ~$0.50 and 30 minutes of consumer-GPU time. Compare that with $10K+ for fine-tuning a commercial 7B+ model.

Live demo

Real model. Real routing. Watch Mnema route different questions to the right specialist with visible confidence scores.

Demo video

60-second screen capture coming this weekend

Real transcript from a recent benchmark run

10 prompts. Router selects between two specialists. 9/10 correct routing decisions. The one miss is documented honestly.

What is NCN-Fold?
router → arch   (scores: arch:0.83 general:0.57)
NCN-Fold is a novel compositional training framework for language models that allows specialist deltas to be selected per token through a sparse routing mechanism.

What is Codename Zenith?
router → arch   (scores: arch:0.67 general:0.61)
Codename Zenith is the internal SophiaXT project name for the Mnema V1 model — a 1.7 billion parameter transformer-diffusion language model.

What is the mass of Jupiter?
router → general   (scores: general:0.84 arch:0.69)
Jupiter has a mass of approximately 1.898 × 10²⁷ kilograms — about 318 times the mass of Earth, or 2.5 times the combined mass of all other planets in the solar system.

What is the distance between Earth and the Moon?
router → general   (scores: general:0.84 arch:0.69)
The average distance from Earth to the Moon is approximately 384,400 kilometers (about 238,855 miles). It varies between roughly 363,000 km at perigee and 405,000 km at apogee due to the Moon's elliptical orbit.

How tall is Mount Everest?
router → general   (scores: general:0.67 arch:0.44)
Mount Everest is approximately 8,849 meters (29,032 feet) tall, making it the tallest mountain on Earth above sea level. It is located in the Himalayas on the border of Nepal and Tibet.
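For readers who want to audit router lines like the ones in the transcript above, here is a small parsing sketch. It assumes the line format shown in the transcript (`router → name (scores: a:0.83 b:0.57)`); the helper names `parse_routing_line` and `margin` are hypothetical and not part of any Mnema tooling.

```python
import re

# Matches lines such as: router → arch   (scores: arch:0.83 general:0.57)
LINE_RE = re.compile(r"router\s*→\s*(\w+)\s*\(scores:\s*(.*?)\)")

def parse_routing_line(line):
    """Parse one router line into (chosen_specialist, {name: score})."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError(f"not a routing line: {line!r}")
    chosen = match.group(1)
    scores = {}
    for pair in match.group(2).split():
        name, value = pair.split(":")
        scores[name] = float(value)
    return chosen, scores

def margin(scores):
    """Gap between the top score and the runner-up (0.0 if only one score)."""
    ranked = sorted(scores.values(), reverse=True)
    return ranked[0] - ranked[1] if len(ranked) > 1 else 0.0

if __name__ == "__main__":
    chosen, scores = parse_routing_line("router → arch   (scores: arch:0.67 general:0.61)")
    print(chosen, scores, round(margin(scores), 2))
```

The margin is a simple way to flag close calls: the "Codename Zenith" prompt is routed with a gap of 0.06, much narrower than the 0.26 gap on the "NCN-Fold" prompt.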

Try it yourself LIVE

Connected to a real Mnema-Edge 360M instance running on the SophiaXT VPS (4-core AMD EPYC, CPU inference). Type a question, see the router pick a specialist, and watch the answer generate.

Speed note: Replies are capped at 25 tokens and take ~25-30s on this 4-core CPU VPS (FP32, no GPU). The interesting part, the router picking a specialist via the capsule lattice, happens in <1s and is shown above the generated text. The same weights on a consumer GPU would generate at >100 tok/s.

Research

Anchor-token forced diffusion decoding

Demonstrates a 1.67× out-of-distribution advantage on retrieved-locus queries through forced-anchor diffusion decoding. The result is published and citable by DOI.

Read on Zenodo — DOI 10.5281/zenodo.20496783 →

NCN-Fold compositional training framework

The 46-page mathematical framework specifying Mnema's 8 architectural modules: base transformer, block diffusion scheduler, anchor head, NCN router, fold memory lattice, compositional stack layers, retrieval bridge, calibration wrapper.

Internal technical specification — public excerpts available on request.

AR / BD asymmetry diagnostic

A novel methodology to distinguish "the model can't do this task" from "the model found a shortcut" by comparing causal and bidirectional attention paths on identical inputs. Useful diagnostic for any mask-trained language model.
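To make "comparing causal and bidirectional attention paths on identical inputs" concrete, here is a toy sketch at the level of a single attention layer. It is not the diagnostic from the paper in preparation: the single-head attention, the L2 gap metric, and the names `attention` and `path_gap` are all assumptions chosen for illustration.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v, causal):
    """Single-head dot-product attention over lists of vectors.
    With causal=True, position i attends only to positions 0..i."""
    d = len(q[0])
    out = []
    for i, qi in enumerate(q):
        limit = i + 1 if causal else len(k)
        logits = [sum(a * b for a, b in zip(qi, k[j])) / math.sqrt(d)
                  for j in range(limit)]
        weights = softmax(logits)
        out.append([sum(w * v[j][c] for j, w in enumerate(weights))
                    for c in range(len(v[0]))])
    return out

def path_gap(q, k, v):
    """Per-position L2 distance between causal and bidirectional outputs."""
    ar = attention(q, k, v, causal=True)
    bd = attention(q, k, v, causal=False)
    return [math.sqrt(sum((a - b) ** 2 for a, b in zip(ra, rb)))
            for ra, rb in zip(ar, bd)]

if __name__ == "__main__":
    x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy token vectors
    print(path_gap(x, x, x))
```

In this toy setup the gap at the last position is zero, because the causal path already sees every token there, while earlier positions differ because the bidirectional path can look ahead.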

Paper in preparation — workshop submission planned.

How Mnema differs

Capability | Commercial LLM APIs (OpenAI, Anthropic, Mercury) | Local open models (Llama, Mistral) | Mnema
Add new domain capability | Re-RLHF: $10K+, days | Full fine-tune: hours of GPU | One LoRA: $0.50, 30 min
Edge / on-device deployment | No, cloud only | Yes, but generic | Yes, with composition
Per-query specialist selection | No | No | Yes, routing visible
Inspectability / audit trail | Black box | Black box | Cosine scores per query
Add 50 specialists | Impossible | 50× model size | +250 MB total

Built by SophiaXT

Independent AI research lab focused on vertical AI: specialized models for specific industries, deployed on the hardware closest to the user.

Mnema powers our own SaaS products including DiagBuddy, a diagnostic AI tool for appliance repair shops. The proprietary diagnostic data from those customers in turn trains the next generation of Mnema specialists — a virtuous cycle of vertical capability building.

Get involved

Investors

We're raising seed capital to scale specialist training and complete the edge + robotics deployment.

invest@sophiaxt.com →

Research collaboration

University labs and independent researchers: we welcome collaboration on the NCN-Fold framework and the composition operator paper.

research@sophiaxt.com →

Demo / sales

Want a vertical AI built on your data, deployed to your hardware? We do consulting engagements.

demo@sophiaxt.com →