MCP for the agent era · v1.2.0 schema · Apache 2.0 release in prep

A research-grade paper knowledge base for AI agents.

MCP service · curated database · soon-open-source extraction toolchain. Built for serious CV/ML researchers. Plug into Claude Code; survey a field in 5 minutes, drill into any paragraph in any paper, compare methods and experimental data across 100+ papers in one call.

21,116 curated papers · 1.15M paragraphs · 16,864 hero figures · 365,777 citation edges · 17 MCP tools

Venues covered: CVPR · ICCV · ECCV · ICLR · ICML · NeurIPS · AAAI · ACL · EMNLP · NAACL · IJCAI · WACV · BMVC · 3DV · SIGGRAPH · TPAMI · IJCV · Refreshed weekly.

The product

Three pillars, one stack.

Designed for the way researchers actually work in 2026: an LLM agent at your fingertips, doing the reading. We give that agent the knowledge base — pre-distilled, structured, cite-locked.

1. MCP Service

17 tools, agent-native.

One MCP connection grants your agent search, quote, compare_methods, find_baselines, survey, trends, narrative_threads, get_figure, bibtex and 8 more — designed per Anthropic's tool-design guidelines with defer_loading.

~0.5s p95 search · ~$0 per call
2. Curated Database

Quality, not quantity.

21K papers from CV/ML top venues, distilled with GPT-5.5 and Opus-4.7 against open schema v1.2 — 7-class contribution taxonomy + 6-field narrative arc + dataset/metric grids + figure metadata. Hybrid retrieval: BGE-1024 dense + BM25 + RRF + Qwen3 cross-encoder reranking.
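To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion (RRF), the technique named above for merging the dense and BM25 result lists. It is an illustration under stated assumptions, not the service's implementation: the function name `rrf_fuse`, the paper IDs, and the constant `k=60` (the value commonly used in the RRF literature) are all assumptions, and the cross-encoder reranking stage is omitted.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_n=None):
    """Fuse several ranked ID lists with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. The value of k is an assumption here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    fused = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return fused[:top_n] if top_n is not None else fused


# Hypothetical paper IDs, for illustration only.
dense_hits = ["p3", "p1", "p7"]  # e.g. from an embedding index
bm25_hits = ["p1", "p9", "p3"]   # e.g. from a keyword index
fused = rrf_fuse([dense_hits, bm25_hits])
```

Papers that rank well in both lists rise to the top, which is why RRF needs no score normalization between the dense and keyword retrievers.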

paragraph quote · citation graph · author/facet index
3. Open Source

Self-host your lab corpus.

Schema spec, distillation prompts (Opus / GPT-vision), Marker/PyMuPDF figure extraction, LanceDB + SQLite FTS5 index builder, citation graph resolver, FastMCP server: all releasing under Apache 2.0. Cards will ship as a CC-BY-SA dataset on Hugging Face. Ingest your lab's private papers and expose them to your team.

Apache 2.0 · v0.1 incoming
Tool surface

17 tools — designed by use case.

Optimized for agent consumption per Anthropic best practices: every tool has explicit "when to use" + "when NOT to use" clauses, deferred tools auto-discovered via discover_tools.

Retrieval · find papers
search(query, k, filters)
Hybrid retrieval (BGE + BM25 + RRF + Qwen3 rerank). Primary entry point.
deep(paper_ids, include)
Hydrate full distilled cards by ID.
find_baselines(paper_id)
What this paper compared against (in-corpus + external models).
find_citing(paper_id)
Reverse lookup: who cites this paper (365K resolved edges).
find_lineage(paper_id, direction)
Walk the builds_on chain — method ancestors / descendants.
graph(seed, mode, depth)
Topical / same-method / same-dataset neighbors.
find_experts(topic)
Top authors in a topic (paper count, sample works).
Field-level aggregation · lay of the land
survey(topic, k=50)
Field map: method/task/contrib breakdown, anchors per method, dominant datasets, narrative samples.
trends(topic, year_range)
Temporal method-family adoption + emerging/declining signals.
narrative_threads(topic, k=100)
100+ papers as a single narrative arc — problem→gap→insight→approach→evidence.
compare_methods(topic | paper_ids)
Tabular grid: method · datasets · key metrics · SOTA claims · code.
analyze(query, k=100)
Batch 50–200 L1 cards in one call. Optional server-side digest.
Deep dive · paragraph + figure level
quote(query, k, paper_id?)
Verbatim paragraph retrieval over 1.15M chunks. Optional per-paper filter.
get_figure(paper_id)
Hero figure (architecture diagram) as base64 JPG + caption.
compare(paper_ids, aspects)
Side-by-side card matrix (method / datasets / results / code).
bibtex(paper_ids)
Clean BibTeX entries, ready to paste into your .bib.
discover_tools(query) · status()
Discover deferred tools by description match · server health + corpus stats.
Quick start

Two commands.
One minute.

Connect from Claude Code (or any MCP-compatible agent). Replace YOUR_KEY after requesting access.

Other MCP clients (Codex, custom LangChain agents) connect over plain HTTP, using JSON-RPC on the streamable-http transport. Authenticate with an X-API-Key header or Authorization: Bearer.
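For clients without built-in MCP support, a request might be assembled as in the sketch below. It only builds the HTTP request and does not send it. The `tools/call` method and the `Accept` header follow the MCP specification as we understand it; the helper name `build_tool_call` and the argument values are hypothetical, and the sketch skips the `initialize` handshake and session handling that an MCP server may require before accepting tool calls.

```python
import json
import urllib.request

MCP_URL = "https://mcp.acceptpaper.com/mcp"  # endpoint from the quick start


def build_tool_call(tool, arguments, api_key, request_id=1):
    """Build (but do not send) a JSON-RPC 2.0 request for an MCP tool call."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
            "X-API-Key": api_key,
        },
        method="POST",
    )


# Argument names follow the search(query, k, filters) signature listed above.
req = build_tool_call(
    "search",
    {"query": "sparse-view 3D human reconstruction", "k": 10},
    api_key="YOUR_KEY",
)
```

Passing `req` to `urllib.request.urlopen` would send it; in practice an MCP client library is likely the easier route, since it handles initialization and streamed responses.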

1 · Register the MCP server
# in your terminal
claude mcp add \
  --transport http \
  -s user \
  litscan-rag \
  https://mcp.acceptpaper.com/mcp \
  -H "X-API-Key: YOUR_KEY"
2 · Use it
# in a Claude Code session
> Survey sparse-view 3D human reconstruction
> papers from 2024–2026. Identify dominant
> method families, anchor papers, key
> benchmarks, and emerging directions.

# Claude internally:
# 1. calls survey(topic, k=50) → field map
# 2. trends(topic) → temporal signal
# 3. quote(...) on contested claims
# 4. find_baselines(top-3) for comparison
# → returns synthesized answer with cites
Why acceptpaper

Built for AI-native research workflows.

Existing tools were designed for human eyes on a webpage. acceptpaper was designed for an LLM agent acting on your behalf.

Capability · acceptpaper · PaperQA2 · Elicit · OpenScholar
MCP-native (agent interface)
Pre-distilled structured schema · v1.2 (7-class + 6-field) · flat text · semi-struct · flat
Paragraph-level retrieval · 1.15M chunks
Figure retrieval (hero JPG) · 16K · v3 only
Citation graph (resolved) · 365K edges · tool · partial · via S2
Venue-tier-aware ranking
Cost per query (retrieval) · $0 · $0.05–2 · $0.01–.50 · ~$0.10
Domain focus · CV/ML top-tier · biology-heavy · broad · broad
Open source license · Apache 2.0 (in prep) · ✓ Apache · closed · ✓ Apache
Self-hostable for lab corpus · soon · partial
Schema v1.2

Every card is a small cite-locked database.

A paper is more than text. Our distillation produces 30+ structured fields per paper — 7-class contribution taxonomy, 6-field narrative arc, eval datasets with numeric values + SOTA flags, baseline comparison strings, key modules, hero figure metadata. Agents read exactly the slice they need without re-parsing PDFs every query.

// excerpt of one L2 card
{
  "source_id": "cvf:CVPR.2026:1024",
  "title": "DiHuR: Diffusion-Guided Generalizable Human Reconstruction",
  "contribution_type": {
    "primary": "combination",
    "secondary": ["empirical_study", "ablation_heavy"]
  },
  "narrative": {
    "problem": "Reconstructing detailed 3D humans from ~3 sparse cameras…",
    "insight":  "SMPL vertices map to consistent semantic regions…",
    "evidence": "CD 1.117 vs GP-NeRF 3.876 on THuman; gains hold on ZJU-MoCap"
  },
  "eval_data": [
    {"name":"THuman","metric":"Chamfer Distance","value":1.117,"is_sota":true},
    // + 9 more entries
  ],
  "compares_with": ["NeuS", "SparseNeuS", "PIFuHD", "GP-NeRF", "SIFU"],
  "hero_figure": {"page":1, "caption":"Given 3 views with minimal overlap…"}
}
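As an illustration of reading only the slice an agent needs, the sketch below loads a trimmed copy of the excerpt above and keeps the evaluation entries flagged as state of the art. The field names come from the excerpt; the `EvalEntry` dataclass and `sota_results` helper are hypothetical, and the sketch assumes every `eval_data` entry carries exactly the four fields shown, which the real schema may not guarantee.

```python
import json
from dataclasses import dataclass


@dataclass
class EvalEntry:
    """One row of a card's eval_data list (fields as shown in the excerpt)."""
    name: str
    metric: str
    value: float
    is_sota: bool


def sota_results(card):
    """Return the eval_data entries flagged as state of the art."""
    entries = [EvalEntry(**e) for e in card.get("eval_data", [])]
    return [e for e in entries if e.is_sota]


# Trimmed copy of the excerpt, without the // comments that JSON forbids.
card = json.loads("""
{
  "source_id": "cvf:CVPR.2026:1024",
  "eval_data": [
    {"name": "THuman", "metric": "Chamfer Distance", "value": 1.117, "is_sota": true}
  ],
  "compares_with": ["NeuS", "SparseNeuS", "PIFuHD", "GP-NeRF", "SIFU"]
}
""")
best = sota_results(card)
```

Because the numbers arrive as typed fields rather than prose, an agent can filter or compare results without re-parsing the PDF.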
Open source

Roadmap to Apache 2.0.

The hosted service stays free for academic use. The toolchain ships open-source so any lab can self-host their own corpus.

v0.1
Schema spec + 1K samples
Pydantic L2Card spec, JSON examples, build_index.py for LanceDB+FTS5.
target: Q1 2026
v0.2
Distillation pipeline
Production prompts, retry, validator, figure extraction, paragraph chunker. Bring your own LLM key.
target: Q2 2026
v0.3
Full corpus on Hugging Face
46K cards, 1.15M paragraphs, 16K figures, citation graph. CC-BY-SA. One-line download.
target: Q3 2026
v1.0
Production self-host
Docker compose, Terraform module, lab multi-tenant auth, docs site.
target: Q4 2026
Get access

Researchers welcome.
No bots. No scraping.

Free API key for academic / research use. We ask only that you tell us briefly who you are and what you're working on — to prioritize features and venue coverage.

Request an API key

Typical reply within 24h.

Service preview
  • Free tier: 1,000 search queries/day, unlimited deep/find_*.
  • SLA target: p95 < 500ms search; p99 < 2s.
  • Corpus freshness: weekly arXiv ingest planned; conference proceedings within 1 week of release.
  • Privacy: queries not stored beyond 24h log retention.
  • Citation: acknowledgment appreciated; not required.