Memory with MCP: Long-Term Memory for LLMs Powered by HPKV

6 min read | By Mehran Toosi

The hallmark of modern LLMs is their ability to follow conversations naturally. But even the most advanced models have a fundamental limitation: their context window. Once information falls outside this window, it's forgotten. For AI assistants and agents to be truly useful, they need persistent memory systems - the ability to recall past interactions, preferences, and context over extended periods.

The Memory Problem for AI Systems

As a developer working with LLMs, you've likely experienced this frustrating scenario: you're building a feature with an AI assistant, and halfway through, it confidently references a function you've "already defined" - except that function doesn't exist. Or perhaps it suggests using a library method that sounds plausible but was completely hallucinated. These aren't just occasional quirks - they're symptoms of the fundamental memory problem in LLMs.

Without persistent memory, LLMs operate with a kind of functional amnesia that severely limits their usefulness in real development workflows:

Hallucinating Non-existent Code: "You can simply use the parseConfigWithFallback() function" - except that no such function exists.

Forgetting Failed Approaches: Explaining why a certain approach won't work, only to have the model suggest the exact same approach again an hour later.

Inconsistent Project Understanding: The model gives architecture recommendations based on one understanding of your project, then contradicts itself in the next session with completely different assumptions.

Reinventing the Wheel: You've established a project-specific pattern for handling certain tasks, but the model keeps suggesting alternative approaches because it's forgotten what was already decided.

Forgetting API Limitations: "Let's use the streaming API for this" - after you've already explained twice that the API doesn't support streaming.

These memory-related failures become increasingly problematic in long-running projects. Without the ability to remember past interactions, LLMs can't truly function as collaborative partners in development. Each new conversation essentially resets their understanding, forcing developers to repeatedly re-establish context, correct the same misconceptions, and rebuild shared knowledge.

The standard solution has been to expand context windows - from 32K tokens to 200K and beyond. But this approach has significant limitations:

  1. Cost: Larger contexts mean higher inference costs
  2. Relevance: Most historical content isn't relevant to the current query
  3. Focus: Too much context creates "needle in a haystack" problems for the model

What's needed is selective, persistent memory that can be retrieved based on relevance rather than recency.

Introducing the MCP Memory Server

Today, we're excited to announce the general availability of our MCP Memory Server - a ready-to-use system that gives LLMs true long-term memory. Built on HPKV and Nexus Search, it implements the Model Context Protocol (MCP) to seamlessly integrate with supported models.

What is MCP?

The Model Context Protocol (MCP) is an emerging standard for extending AI model capabilities through external services. Our MCP Memory Server exposes memory management through MCP, so any compatible AI system can store and retrieve memories without handling the storage and semantic search logic itself.

How It Works: The MCP Memory API

The MCP Memory Server provides four key functions:

1. Store Memory

Purpose: Creates a new memory entry from a conversation exchange

Key features:

  • Organizes memories by project and session
  • Maintains sequential ordering with sequence numbers
  • Stores both user requests and assistant responses
  • Supports optional metadata for better retrieval
  • Automatically creates a key in the format: project_name_date_session_name_sequence_number
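As a rough illustration of the key format above, the sketch below assembles a key from its four parts. The MemoryEntry class, its field names, and the YYYYMMDD date format are assumptions made for this example; the server generates keys itself, and its exact date format is not specified here.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class MemoryEntry:
    """Hypothetical shape of one stored exchange (field names are assumptions)."""
    project_name: str
    session_name: str
    sequence_number: int
    request: str            # what the user asked
    response: str           # what the assistant answered
    metadata: dict = field(default_factory=dict)  # optional, to aid retrieval

    def key(self, day: date) -> str:
        # Order follows the documented format:
        # project_name_date_session_name_sequence_number
        # Assumption: the date is rendered as YYYYMMDD.
        date_part = day.strftime("%Y%m%d")
        return f"{self.project_name}_{date_part}_{self.session_name}_{self.sequence_number}"


entry = MemoryEntry(
    project_name="webshop",
    session_name="checkout-refactor",
    sequence_number=3,
    request="Can we use the streaming API here?",
    response="No, the payment API does not support streaming.",
    metadata={"topic": "api-limitations"},
)
print(entry.key(date(2025, 4, 2)))  # webshop_20250402_checkout-refactor_3
```

Because the sequence number comes last, the entries of one session share a common prefix and differ only in their final component.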

2. Search Memory (Semantic Query)

Purpose: Performs AI-powered natural language search over stored memories

Key features:

  • Uses semantic understanding to find relevant past exchanges
  • Returns a generated summary of relevant information
  • Includes source memory keys with confidence scores
  • Handles complex natural language queries
  • Intelligently combines information from multiple memory entries

3. Search Keys (Vector Similarity)

Purpose: Finds semantically similar memory keys based on vector similarity

Key features:

  • Returns a ranked list of memory keys matching the query
  • Configurable number of results with the topK parameter
  • Adjustable similarity threshold with minScore
  • Faster than full semantic search when you only need keys
  • Perfect for finding related conversations without generating summaries
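To make the roles of topK and minScore concrete, here is a minimal local sketch of ranked key search over toy vectors. It is not the server's implementation: the cosine similarity scoring, the three-dimensional vectors, and the function names are all assumptions chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_keys(query_vec, index, top_k=5, min_score=0.5):
    """Return up to top_k (key, score) pairs scoring at least min_score, best first."""
    scored = [(key, cosine_similarity(query_vec, vec)) for key, vec in index.items()]
    kept = [(key, score) for key, score in scored if score >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top_k]


# Toy index: memory key -> embedding (real embeddings have far more dimensions).
index = {
    "proj_20250401_auth_1": [1.0, 0.0, 0.0],
    "proj_20250401_auth_2": [0.8, 0.6, 0.0],
    "proj_20250402_ui_1": [0.0, 0.0, 1.0],
}

results = search_keys([1.0, 0.0, 0.0], index, top_k=2, min_score=0.5)
for key, score in results:
    print(f"{key}: {score:.2f}")
```

Here minScore filters out the unrelated UI memory before ranking, and topK caps how many of the remaining keys are returned.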

4. Get Memory (Exact Key Retrieval)

Purpose: Retrieves a specific memory by its exact key

Key features:

  • Direct access to a specific memory entry
  • Returns the complete memory object including metadata
  • Useful when you already know which memory you need
  • Can be combined with search keys for two-stage retrieval
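The two-stage retrieval mentioned above (find candidate keys first, then fetch the full entries) can be sketched against an in-memory stand-in. Everything here is hypothetical: MEMORIES replaces the real store, and search_keys ranks by simple word overlap rather than vector similarity, purely to keep the example self-contained.

```python
# In-memory stand-in for stored memories (keys and contents are made up).
MEMORIES = {
    "proj_20250401_api_1": {
        "request": "Can we use the streaming API?",
        "response": "No, the API does not support streaming.",
        "metadata": {"topic": "api"},
    },
    "proj_20250401_ui_1": {
        "request": "Which button color did we pick?",
        "response": "We settled on the blue primary button.",
        "metadata": {"topic": "ui"},
    },
}


def _words(text):
    """Lowercase the text and split it into a set of words, ignoring basic punctuation."""
    for ch in "?.,":
        text = text.replace(ch, " ")
    return set(text.lower().split())


def search_keys(query, top_k=3):
    """Stage 1 stand-in: rank keys by word overlap with the query (not vectors)."""
    query_words = _words(query)
    scored = []
    for key, memory in MEMORIES.items():
        overlap = len(query_words & _words(memory["request"] + " " + memory["response"]))
        if overlap > 0:
            scored.append((key, overlap))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [key for key, _ in scored[:top_k]]


def get_memory(key):
    """Stage 2: exact-key lookup returning the complete memory object."""
    return MEMORIES[key]


def two_stage_retrieve(query, top_k=3):
    """Find candidate keys cheaply, then fetch only the entries that matched."""
    return [get_memory(key) for key in search_keys(query, top_k)]


hits = two_stage_retrieve("streaming api support")
for memory in hits:
    print(memory["response"])
```

The point of the split is that the first stage returns only keys, so full memory objects are fetched just for the handful of entries worth reading.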

Real-World Implementation: Cursor IDE

One of the first major use cases of the MCP Memory Server is in Cursor IDE, where AI coding assistance requires persistent understanding of project context, user preferences, and past interactions.

With the MCP Memory Server, Cursor's AI assistant can:

  1. Remember project structures and conventions across sessions
  2. Recall user preferences for coding style and patterns
  3. Reference previous explanations and decisions
  4. Build on past problem-solving approaches

Cursor will use semantic memory search to intelligently retrieve relevant past conversations based on what the developer is currently working on - without forcing them to manually manage or reference this context.

Getting Started

The MCP Memory Server is available now for all HPKV users. Here's how to start using it:

  1. Sign up for an HPKV account
  2. Generate an API key in your dashboard
  3. Integrate the MCP Memory tools in your AI application

Adding MCP Memory to Cursor IDE

Edit your mcp.json file:

{
  "mcpServers": {
    "hpkv-memory-server": {
      "command": "npx",
      "args": ["mcp-remote", "https://memory-mcp.hpkv.io/sse"]
    }
  }
}
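If your mcp.json already lists other servers, you may prefer to merge the entry rather than paste over the file. The helper below is a small sketch of that idea; the function names are ours, and the file location used in the usage note (.cursor/mcp.json in the project root) is an assumption, so adjust it to wherever your Cursor configuration lives.

```python
import json
from pathlib import Path

# Server entry copied from the configuration shown above.
HPKV_SERVER = {
    "command": "npx",
    "args": ["mcp-remote", "https://memory-mcp.hpkv.io/sse"],
}


def add_memory_server(config: dict) -> dict:
    """Return a copy of config with the HPKV entry added, keeping other servers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["hpkv-memory-server"] = dict(HPKV_SERVER)
    merged["mcpServers"] = servers
    return merged


def update_mcp_file(path: Path) -> None:
    """Read mcp.json if it exists, merge the entry, and write the file back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_memory_server(config), indent=2) + "\n")


# Example: an existing config with one unrelated (made-up) server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
print(json.dumps(add_memory_server(existing), indent=2))
```

To apply it to a real file, call update_mcp_file(Path(".cursor/mcp.json")) with the path adjusted for your setup.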

After adding the MCP Memory Server, you'll be prompted to log in to your HPKV account. Once you do, you'll see a list of your API keys. Select the one you want to use to authenticate with the MCP Memory Server.

Cursor Rules

To integrate the MCP Memory Server with Cursor seamlessly, we created a rule document. Add it to your Cursor project and set its rule type to Always.

Beyond the Code: New Interaction Paradigms

The MCP Memory Server enables entirely new interaction paradigms:

  1. Truly Personalized Experiences: Systems that remember user preferences, past challenges, and successful solutions

  2. Continuous Learning Agents: Agents that improve over time by remembering what worked and what didn't

  3. Cross-Session Coherence: Maintaining consistent understanding and personality across multiple interactions

  4. Self-Reflection Capabilities: Agents that can review their past actions and refine their approach

Try It Yourself!

The MCP Memory Server is available on all HPKV plans, including our free tier with 100 calls/month. For production applications, our Pro and Business tiers provide higher limits and advanced features.