arxiv-deep-research

arxiv-deep-research is a Model Context Protocol (MCP) research agent for arXiv that builds data, evaluation, and training pipelines for agentic GenAI systems.

arxiv-deep-research is an MCP server designed as a research agent for arXiv. It powers multi‑agent generative AI systems by giving them a reliable way to search, download, and deeply analyze research papers. Its data, evaluation, and training pipelines closely mirror the day-to-day work of a Research Software Engineer in Generative AI.

The project demonstrates experience in:

  • building agentic applications that coordinate multiple tools and agents
  • creating evaluation and benchmarking pipelines for AI systems
  • working end‑to‑end in Python to ship open‑source infrastructure others can extend
  • integrating with real multi‑agent frameworks like AutoGen and Magentic‑UI

These are the same skills you’d use as a Research Software Engineer – Generative AI: designing systems, building datasets and evaluation loops, and turning research ideas into practical tools that other people can use.

Features

  • search - query arXiv with filters (date, category, sort)
  • download - fetch paper PDF, convert to markdown
  • read - access stored paper content
  • list - view all downloaded papers
  • prompts - deep paper analysis workflow

Prerequisites

  • Python 3.11+

Installation

git clone <repo-url>
cd arxiv-mcp-server
python3 -m venv .venv
source .venv/bin/activate
pip install -e .

Configuration

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "arxiv": {
      "command": "/path/to/.venv/bin/python",
      "args": ["-m", "arxiv_mcp_server", "--storage-path", "/path/to/papers"]
    }
  }
}
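
The entry above can also be added with a script instead of by hand. The following is a minimal sketch, assuming the config file is plain JSON with a top-level "mcpServers" object as shown; the helper names add_server_entry and update_config_file are hypothetical and not part of this project.

```python
import json
from pathlib import Path


def add_server_entry(config: dict, name: str, command: str, args: list[str]) -> dict:
    """Return a copy of config with an MCP server entry added or replaced."""
    updated = dict(config)
    servers = dict(updated.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    updated["mcpServers"] = servers
    return updated


def update_config_file(path: Path, name: str, command: str, args: list[str]) -> None:
    """Load the JSON config at path (if it exists), add the entry, write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    updated = add_server_entry(config, name, command, args)
    path.write_text(json.dumps(updated, indent=2))
```

Existing entries under "mcpServers" are preserved, so other servers you have configured stay in place. Restarting the client after the edit is usually needed for the change to take effect.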

Cursor

Add to Cursor's MCP settings:

{
  "mcpServers": {
    "arxiv": {
      "command": "python",
      "args": ["-m", "arxiv_mcp_server"],
      "env": {
        "PYTHONPATH": "/path/to/arxiv-mcp-server/src"
      }
    }
  }
}

Default storage location when no path is configured: ~/.arxiv-mcp-server/papers

Tools

search_papers

{
  "query": "transformer architecture",
  "max_results": 10,
  "date_from": "2023-01-01",
  "categories": ["cs.AI", "cs.LG"],
  "sort_by": "relevance"
}
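
To make the filters concrete, here is a sketch of how arguments like these could be translated into a request against the public arXiv API. This is an illustration under stated assumptions, not the server's actual implementation: the function name build_query_url and the date_to parameter are hypothetical, and only the "relevance" sort value from the example is known to apply.

```python
from datetime import date
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(query, max_results=10, date_from=None, date_to=None,
                    categories=None, sort_by="relevance"):
    """Build an arXiv API URL from search_papers-style arguments (sketch)."""
    clauses = [f'all:"{query}"']  # search all fields for the phrase
    if categories:
        # Match any of the listed categories.
        clauses.append("(" + " OR ".join(f"cat:{c}" for c in categories) + ")")
    if date_from:
        # arXiv expects YYYYMMDDHHMM bounds; date_to defaults to today (assumption).
        start = date_from.replace("-", "") + "0000"
        end = (date_to or date.today().isoformat()).replace("-", "") + "2359"
        clauses.append(f"submittedDate:[{start} TO {end}]")
    params = {
        "search_query": " AND ".join(clauses),
        "max_results": max_results,
        "sortBy": sort_by,
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"
```

Calling build_query_url("transformer architecture", date_from="2023-01-01", categories=["cs.AI", "cs.LG"]) yields a URL whose response is an Atom feed that would still need to be parsed.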

download_paper

{
  "paper_id": "2401.12345"
}
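
As a rough illustration of what a download step involves, the sketch below validates a new-style arXiv ID and derives the PDF URL and a local Markdown path from it. The one-file-per-paper layout (<paper_id>.md) and both function names are assumptions made for this example; the server's real storage layout may differ.

```python
import re
from pathlib import Path

# New-style arXiv IDs: YYMM.NNNN or YYMM.NNNNN, optionally with a version suffix (v2).
NEW_STYLE_ID = re.compile(r"\d{4}\.\d{4,5}(v\d+)?")


def pdf_url(paper_id: str) -> str:
    """Return the arxiv.org PDF URL for a new-style paper ID."""
    if not NEW_STYLE_ID.fullmatch(paper_id):
        raise ValueError(f"not a new-style arXiv ID: {paper_id!r}")
    return f"https://arxiv.org/pdf/{paper_id}"


def markdown_path(storage_dir: Path, paper_id: str) -> Path:
    """Return where the converted paper would live (assumed layout)."""
    return storage_dir / f"{paper_id}.md"
```

Old-style identifiers such as hep-th/9901001 are deliberately rejected here to keep the sketch short.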

list_papers

{}

read_paper

{
  "paper_id": "2401.12345"
}
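
The JSON objects above are tool arguments. On the wire, an MCP client wraps them in a JSON-RPC 2.0 request with the method "tools/call". The sketch below shows that envelope; the helper name tool_call_request is hypothetical, and in practice an MCP client library builds this message for you.

```python
import itertools
import json

# Each JSON-RPC request needs an ID that is unique within the session.
_request_ids = itertools.count(1)


def tool_call_request(name: str, arguments: dict) -> str:
    """Serialize a tools/call request for the named tool."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


if __name__ == "__main__":
    print(tool_call_request("read_paper", {"paper_id": "2401.12345"}))
```

A real session also performs the MCP initialization handshake before any tool call, which this sketch omits.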

Prompts

deep-paper-analysis

Comprehensive paper analysis workflow:

{
  "paper_id": "2401.12345"
}

Covers: executive summary, methodology, results, implications, future directions.

Environment variables

Variable            Default
ARXIV_STORAGE_PATH  ~/.arxiv-mcp-server/papers
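
The storage location can come from the --storage-path flag, the ARXIV_STORAGE_PATH variable, or the default. The sketch below shows one plausible way to resolve it. The precedence used here (flag, then environment variable, then default) is an assumption, not something this README confirms, and resolve_storage_path is a hypothetical helper.

```python
import os
from pathlib import Path

DEFAULT_STORAGE = "~/.arxiv-mcp-server/papers"


def resolve_storage_path(cli_value=None, env=None) -> Path:
    """Pick the storage directory: flag, then env var, then default (assumed order)."""
    env = os.environ if env is None else env
    raw = cli_value or env.get("ARXIV_STORAGE_PATH") or DEFAULT_STORAGE
    # Expand a leading ~ so the default works regardless of the current user.
    return Path(raw).expanduser()
```

Passing env explicitly, for example resolve_storage_path(None, {"ARXIV_STORAGE_PATH": "/path/to/papers"}), makes the behavior easy to check without touching the real environment.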