neural ai/
Architecture

Private Alpha

Architecture

How Anant 1.0's cognitive memory works.

This page is the technical documentation for Anant 1.0, the first cognitive memory system from Neural AI. It describes systems that are currently implemented. Areas in active development and open research questions are documented separately on the Research Agenda.

The Memory Engine

Every incoming message passes through a four-stage pipeline that converts unstructured language into structured, retrievable memory:

1. Extraction

An LLM-based pipeline pulls structured facts, entities, relationships, emotions, and preferences from natural conversation. Ten diverse few-shot examples ensure accuracy across English, Hindi, and Hinglish.

2. Verification & Quality Gate

Extracted facts are verified against the original message, then scored by a quality gate with calibrated examples. Facts that don't meet the threshold are rejected. Injection attempts are detected and blocked at this stage.

3. Contradiction Resolution & Temporal Versioning

New facts are checked against existing knowledge. Contradictions trigger temporal versioning: “Worked at Google (March–June)” becomes “Works at Microsoft (June–present)”. History is preserved, not overwritten.

4. Storage & Linking

Facts are stored with confidence scores, belief states (known / believed / inferred / speculated), permanence levels, and embeddings. Similar memories are linked or merged using cosine similarity thresholds.


The Living Knowledge Graph

Every entity mentioned in conversation - people, places, organizations, concepts - becomes a node in the user's knowledge graph. Relationships between entities are stored as typed, weighted edges. The graph supports:

  • Multi-type relationships: Raj can be both your colleague AND your gym partner.
  • Coreference resolution: “mom”, “maa”, “mother”, and “Sunita” resolve to one person.
  • Cross-script merging: “पापा” and “papa” are recognized as the same entity.
  • Transitive inference: If A knows B and B knows C, A might know C.
  • Emotional importance weighting: Entities mentioned with strong emotions rank higher than frequently mentioned but emotionally neutral ones.

Consolidation Cycles — Inspired by CLS Theory

McClelland, McNaughton, and O'Reilly (1995) describe how the hippocampus rapidly stores specific episodes while the neocortex slowly extracts general patterns through complementary learning systems. Anant implements an offline consolidation cycle that:

  • Consolidates similar memories to prevent graph bloat.
  • Applies confidence decay based on the Ebbinghaus forgetting curve (half-life varies by permanence: 7 days for ephemeral, 90 for medium-term, 365 for long-term).
  • Infers transitive relationships between people (person-to-person, not keyword-based).
  • Detects knowledge gaps (“you mentioned a sister but never said her name”) and generates follow-up questions.
  • Re-ranks entity importance based on emotional weight, not just mention frequency.
  • Generates follow-up questions about unresolved events (“how did the interview go?”).

4-Channel Retrieval with RRF Fusion

When you ask a question, Anant doesn't just search text. Four retrieval channels run in parallel:

  • Semantic search - pgvector cosine similarity on 384-dimensional embeddings.
  • Keyword search - PostgreSQL full-text search with English and Simple (Hindi/Hinglish) tokenizers.
  • Knowledge graph traversal - Recursive CTE walks relationships up to 2 hops deep.
  • Temporal retrieval - Surfaces recent and time-relevant memories.

Results are fused using Reciprocal Rank Fusion, re-ranked by a cross-encoder, and filtered by an LLM context selector. Multi-hop confidence is discounted by hop count (0.85^n) so deep inferences are presented with appropriate uncertainty.


Epistemic Awareness - The AI That Knows What It Doesn't Know

Every memory has a belief state:

  • Known - directly stated by the user (“I work at Razorpay”).
  • Believed - was known but hasn't been mentioned in a while (confidence decayed).
  • Inferred - derived from other facts (“you might know Priya through Raj”).
  • Speculated - very low confidence guess.

When Anant responds, it tags its sources: “You told me you work at Razorpay” vs “Based on what you've shared, it seems like you might know Priya through Raj.” It never presents guesses as facts.


Proactive Intelligence - The For You Page

Every day, Anant runs six analysis passes over everything it knows about you: emotional trends, cross-domain insights, relationship dynamics, goal tracking, blind spots, and actionable next steps. Each pass feeds its output into the next, creating a coherent briefing.

The result is an AI that doesn't just wait for questions - it notices patterns, surfaces upcoming events, and checks in on things that matter.


Native Hindi and Hinglish Support

India has 400 million Hindi speakers who code-switch between Devanagari and Roman script constantly. Anant handles this natively:

  • Keyword retrieval falls back to Simple tokenization for non-English text.
  • Entity resolution merges across scripts (चाचा ↔ chacha ↔ uncle).
  • Coreference works across languages (मम्मी ↔ mummy ↔ mom ↔ mother).
  • The LLM responds in whatever language the user writes in.

Prompt Injection Defense

User input is wrapped in XML boundary tags before entering any LLM prompt. The extraction pipeline has an independent injection detection layer - the quality gate evaluates whether content is genuine personal sharing or a manipulation attempt.

In testing, 7 out of 8 sophisticated injection attacks were fully blocked, including identity override, system prompt extraction, memory injection, cross-user data theft, deletion attempts, jailbreak roleplay, and gradual identity erosion.


How Anant differs from existing memory systems

Persistent memory for language models is not a new idea. RAG systems retrieve relevant text chunks. Mem0 and Letta build memory layers around language models. ChatGPT and Claude have each shipped memory features over the past year. Each of these solves a real problem.

What they share is a common assumption: memory is a feature that retrieves text. Anant is built on a different assumption: memory is an architecture that represents, validates, consolidates, and reasons over structured cognition.

The difference is not a single capability. It is what happens when four architectural commitments are made together.

Structured cognition, not text retrieval

RAG systems and most memory layers store text chunks and retrieve them by semantic similarity. The unit of memory is unstructured prose. Retrieval is a search problem, and the language model is asked to comprehend the retrieved text on the fly during generation.

Anant extracts structured cognition during ingestion. Every conversation produces typed entities (people, organizations, events, places, concepts), typed relationships between them (works_at, partner_of, located_in, caused_by), and discrete memories with provenance. The unit of memory is not text — it is a graph of structured assertions, each linked back to the conversation that produced it. Retrieval becomes a structured query over a knowledge graph, not a similarity search over a text corpus.

Belief states as first-class architectural primitives

Existing memory systems treat all retrieved data as equally authoritative. A fact you stated five minutes ago and a fact inferred indirectly from a conversation last month return at the same confidence. The language model is left to figure out what to trust.

Anant differentiates four belief states at the architecture level: known (directly observed and confirmed), believed (stated with high confidence but not independently verified), inferred (derived through reasoning over other memories), and speculated (hypothesized during consolidation). Each memory carries its belief state through retrieval and into the prompt itself, so the language model knows the epistemic status of every piece of context it receives. Confidence is not a post-hoc filter. It is part of how the system thinks.

Consolidation, not just storage

RAG and most memory layers store and retrieve. They do not transform memory over time. The same chunk you indexed last year is the same chunk retrieved today, with the same embedding and the same content.

Anant runs offline consolidation cycles inspired by complementary learning systems theory. During these cycles, near-duplicate memories are merged, abstract patterns are extracted from specific episodes, transitive relationships are inferred, and confidence decays following Ebbinghaus forgetting curves. The memory state Anant retrieves from tomorrow is not the same memory state it stored today. It is a continually-refining representation that mirrors how biological memory consolidates over time.

Multi-channel retrieval, not a single index

Most memory systems retrieve through one mechanism — typically dense vector similarity. This works well for paraphrased semantic recall but fails on three other things biological memory does easily: exact-match lexical recall, relationship traversal across the knowledge graph, and temporal queries scoped to specific time windows.

Anant runs four parallel retrieval channels — semantic embedding similarity, lexical full-text search, recursive graph traversal, and temporal window queries — and fuses their results using Reciprocal Rank Fusion before re-ranking with a cross-encoder. Each channel catches what the others miss. The integration is what produces retrieval that feels like cognition rather than search.

The integration is the architecture

None of these four commitments is novel in isolation. Structured information extraction is well-studied. Belief states appear in older symbolic AI systems. Memory consolidation has been explored in continual learning research. Hybrid retrieval is increasingly common.

What Anant does is make all four architectural commitments together, in a single coherent system, with each layer informing the others. Structured extraction enables belief states. Belief states inform consolidation. Consolidation enriches the retrieval graph. Multi-channel retrieval exploits the structure that extraction produced.

The cognitive memory thesis is not that any of these pieces is uniquely difficult. It is that the integration is the architecture, and the architecture is what current systems are missing.

End of Documentation