PAIR Systems
Autonomous IR for enterprise AI systems
We are building the infrastructure and optimization layer that lets every enterprise deploy a world-class, self-improving agentic memory system.
Why now:
- 2017: zero-shot neural retrieval was first demonstrated
- 2018-2019: multilingual retrieval became real
- 2020-2021: CLIP showed retrieval could span text and vision
- 2025-2026: universal embeddings and agent-driven optimization loops make self-tuning retrieval plausible

Team:
- Vectara (2017) background; 10+ years as a published ML researcher
- Tom Diffenbach: ex-Google L6, senior systems
- Zaid Abdurehman: ex-Oracle, security and hardening
- Rogger Luo and Weisi Fan: PhD-level ML and optimization depth

GoodMem: a memory + retrieval + inference control plane for enterprise AI systems.
Becomes more valuable as PAIR’s optimization loops turn it into a self-improving retrieval system.
Start where enterprises already have pain today. Expand toward a universal retrieval layer that improves itself over time.
$60K
Live recurring ARR today
$460K
ARR if current Incorta infrastructure partnership closes
$120K
Total historical revenue
Infrastructure partner
$300K-$400K+
Predictable annual platform license
OEM per-instance
$25K-$110K
Annual subscription per customer instance
SI channel
Enterprise and federal access
Direct enterprise
Higher ACV and tighter product feedback loops
1. Enter
Through OEMs, SIs, or direct enterprise demand
→
2. Deploy
Customers buy infrastructure plus help getting it into production
→
3. Expand
Production wins drive renewals, more instances, and direct pull
$4.5B-$7.5B
Core TAM
$1B-$2.5B
Near-term SAM
$30M-$100M
3-5 year SOM
Category anchors used in the internal TAM model: AI Platforms for DS/ML ($31.1B), AI Application Development Platforms ($8.4B), and AI Data ($3.1B).
- TAM captures only the retrieval, memory, reranking, and inference-control share of those categories (5-10%, 25-50%, and 25-50%, respectively).
- SAM narrows to self-hosted, provider-neutral, API-first buyers (25-35% of core TAM).
- SOM assumes a low-single-digit 3-5 year share of that near-term market (3-4%), with autonomous optimization as the longer-term upside.

Source model: Gartner worldwide AI spending forecast, January 15, 2026.
Milestones:
- 10+ production deployments and $2.5M-$3.5M ARR
- Incorta to >$500K ARR
- 1+ additional OEM / platform partner
- 4-5 direct customers

The Vision
Make self-improving enterprise memory and retrieval deployable everywhere
As agentic AI takes on real enterprise decisions, self-improving memory over multimodal, unstructured data becomes mandatory. PAIR Systems will be the foundational infrastructure for that transition.
GoodMem competitive landscape:
- RAG platforms: Vectara, Contextual AI, Credal, Ragie, Onyx, RAGFlow
- Memory layers: Mem0, Zep, Cognee
- Enterprise search: Glean, Elastic, Coveo
- Hyperscalers: AWS, Google Cloud, Azure
Revenue detail:
- Fortune 50 via Incorta: $60K/year, live and paying
- Incorta infrastructure partnership: $400K/year, unsigned
- Wanclouds: $30K, one-time historical revenue

| Anchor category | 2026 spend | GoodMem carve-out | Why it maps | Implied contribution |
|---|---|---|---|---|
| AI Platforms for DS/ML | $31.1B | 5-10% | Retrieval, reranking, memory, and inference-control slice inside broader AI platforms | $1.6B-$3.1B |
| AI Application Development Platforms | $8.4B | 25-50% | Agent / RAG application layer where retrieval and orchestration are central | $2.1B-$4.2B |
| AI Data | $3.1B | 25-50% | AI data infrastructure touching memory, retrieval, and serving | $0.8B-$1.6B |
| Core TAM | $4.5B-$7.5B | | | |
- SAM = $1B-$2.5B. Filter assumes roughly 25-35% of the core TAM values self-hosting, provider neutrality, API-first infrastructure, and controlled enterprise deployment.
- SOM = $30M-$100M ARR. Assumes a low-single-digit 3-4% share of SAM over 3-5 years for a differentiated infrastructure company.
- Menlo cross-check: Menlo’s 2025 enterprise GenAI spend breakdown shows about $1.5B in AI infrastructure; that supports the low end of the Gartner-derived range but does not independently confirm the high end.
- Boundary condition: this is a software TAM for the memory / retrieval / control-plane layer, not total AI spend, GPU spend, services, or model API usage.

External anchors used in the internal market model: Gartner worldwide AI spending forecast and Menlo Ventures’ 2025 enterprise GenAI spend breakdown. Carve-out percentages and share assumptions are internal market-model judgments.
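The per-category carve-out arithmetic can be reproduced directly from the table's anchors and percentages. A minimal sketch (figures taken from the table above; variable names are illustrative):

```python
# Gartner 2026 category anchors ($B) with the internal carve-out
# range (low fraction, high fraction) applied to each category.
anchors = {
    "AI Platforms for DS/ML": (31.1, 0.05, 0.10),
    "AI Application Development Platforms": (8.4, 0.25, 0.50),
    "AI Data": (3.1, 0.25, 0.50),
}

# Each category's implied contribution is spend x carve-out range,
# matching the "Implied contribution" column (to rounding).
for name, (spend, lo, hi) in anchors.items():
    print(f"{name}: ${spend * lo:.2f}B-${spend * hi:.2f}B")
```

Running this recovers roughly $1.6B-$3.1B, $2.1B-$4.2B, and $0.8B-$1.6B respectively, i.e. the implied-contribution column of the market model.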
| CRUMB task | Plain-English task | Base (MRR) | Fine-tuned | Delta |
|---|---|---|---|---|
| clinical_trial | Clinical trial matching from patient histories | 0.6333 | 0.7458 | +0.1125 |
| code_retrieval | Code solution retrieval for multi-constraint problems | 0.3889 | 0.6207 | +0.2318 |
| legal_qa | State-specific legal statute retrieval | 0.2316 | 0.2841 | +0.0525 |
| paper_retrieval | Scientific paper retrieval from multi-aspect criteria | 0.4494 | 0.4512 | +0.0018 |
| set_operation | Set-based entity retrieval | 0.2583 | 0.2628 | +0.0045 |
| stack_exchange | Reasoning-heavy community QA retrieval | 0.2141 | 0.2886 | +0.0745 |
| theorem_retrieval | Mathematical theorem retrieval | 0.3125 | 0.3266 | +0.0141 |
| tip_of_the_tongue | Vague movie / TV retrieval from remembered details | 0.0674 | 0.1387 | +0.0713 |
The largest gains come on code retrieval, clinical trial matching, StackExchange QA, and tip-of-the-tongue retrieval; near-flat results on paper retrieval, set operations, and theorem retrieval suggest where more task-specific tuning is still needed. Task labels adapted from Killingback and Zamani, Benchmarking Information Retrieval Models on Complex Retrieval Tasks (CRUMB, arXiv:2509.07253), plus the public CRUMB benchmark repository. Scores shown here are from PAIR's internal cloud fine-tuner output.
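The metric in the table is MRR (mean reciprocal rank): each query contributes 1/r, where r is the rank of the first relevant result, averaged over all queries. A minimal sketch of that computation (the function and data are illustrative, not part of the CRUMB tooling):

```python
def mean_reciprocal_rank(ranked_results, relevant):
    """MRR over a set of queries.

    ranked_results: one ranked list of doc ids per query.
    relevant: one set of relevant doc ids per query.
    """
    total = 0.0
    for docs, rel in zip(ranked_results, relevant):
        for rank, doc_id in enumerate(docs, start=1):
            if doc_id in rel:
                total += 1.0 / rank
                break  # only the first relevant hit counts
    return total / len(ranked_results)

# Two queries: first relevant doc at rank 1 and rank 4,
# so MRR = (1/1 + 1/4) / 2 = 0.625
print(mean_reciprocal_rank(
    [["a", "b", "c"], ["x", "y", "z", "a"]],
    [{"a"}, {"a"}],
))
```

A fine-tuned score of 0.6207 on code_retrieval, for example, means the first correct code solution sits near the top of the ranking on average, versus roughly rank 2-3 for the 0.3889 base model.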
PAIR Systems | Confidential draft