
LangChain integration

Learn how to integrate Rememberizer as a LangChain retriever to provide your LangChain application with access to powerful vector database search.


Rememberizer integrates with LangChain through the RememberizerRetriever class, allowing you to easily incorporate Rememberizer's semantic search capabilities into your LangChain-powered applications. This guide explains how to set up and use this integration to build advanced LLM applications with access to your knowledge base.

Introduction

LangChain is a popular framework for building applications with large language models (LLMs). By integrating Rememberizer with LangChain, you can:

  • Use your Rememberizer knowledge base in RAG (Retrieval Augmented Generation) applications

  • Create chatbots with access to your documents and data

  • Build question-answering systems that leverage your knowledge

  • Develop agents that can search and reason over your information

The integration is available in the langchain_community.retrievers module.
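A quick sanity check that the class is available in your environment:

from langchain_community.retrievers import RememberizerRetriever

# A successful import confirms that langchain_community is installed
# and the Rememberizer retriever class is available.
print(RememberizerRetriever.__name__)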

Getting Started

Prerequisites

Before you begin, you need:

  1. A Rememberizer account with Common Knowledge created

  2. An API key for accessing your Common Knowledge

  3. A Python environment with LangChain installed

Installation

Install the required packages:

pip install langchain langchain_community

If you plan to use OpenAI models (as shown in examples below):

pip install langchain_openai

Authentication Setup

There are two ways to authenticate the RememberizerRetriever:

  1. Environment Variable: Set the REMEMBERIZER_API_KEY environment variable

    import os
    os.environ["REMEMBERIZER_API_KEY"] = "rem_ck_your_api_key"
  2. Direct Parameter: Pass the API key directly when initializing the retriever

    from langchain_community.retrievers import RememberizerRetriever

    retriever = RememberizerRetriever(rememberizer_api_key="rem_ck_your_api_key")

Configuration Options

The RememberizerRetriever class accepts these parameters:

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| top_k_results | int | 10 | Number of documents to return from search |
| rememberizer_api_key | str | None | API key for authentication (optional if set as an environment variable) |

Behind the scenes, the retriever makes API calls to Rememberizer's search endpoint with additional configurable parameters:

| Advanced Parameter | Description |
| --- | --- |
| prev_chunks | Number of chunks before the matched chunk to include (default: 2) |
| next_chunks | Number of chunks after the matched chunk to include (default: 2) |
| return_full_content | Whether to return the full document content (default: true) |
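If you need to tune those advanced parameters today, one option is to call the search endpoint directly rather than going through the retriever. The sketch below is illustrative only: the endpoint URL, the x-api-key header, and the parameter names are assumptions that mirror the table above, so verify them against the API Reference before relying on them.

import os
import requests

# Hypothetical direct call to Rememberizer's search endpoint; the URL,
# header, and parameter names are assumptions based on the table above.
response = requests.get(
    "https://api.rememberizer.ai/api/v1/documents/search/",
    params={
        "q": "How do vector embeddings work?",
        "n": 5,             # number of results, like top_k_results
        "prev_chunks": 2,   # context chunks before each match
        "next_chunks": 2,   # context chunks after each match
    },
    headers={"x-api-key": os.environ["REMEMBERIZER_API_KEY"]},
)
response.raise_for_status()
print(response.json())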

Basic Usage

Here's a simple example of retrieving documents from Rememberizer using LangChain:

import os
from langchain_community.retrievers import RememberizerRetriever

# Set your API key
os.environ["REMEMBERIZER_API_KEY"] = "rem_ck_your_api_key"

# Initialize the retriever
retriever = RememberizerRetriever(top_k_results=5)

# Get relevant documents for a query
# (newer LangChain versions prefer retriever.invoke("..."), which behaves the same way)
docs = retriever.get_relevant_documents(query="How do vector embeddings work?")

# Display the first document
if docs:
    print(f"Document: {docs[0].metadata['name']}")
    print(f"Content: {docs[0].page_content[:200]}...")

Understanding Document Structure

Each document returned by the retriever has:

  • page_content: The text content of the matched document chunk

  • metadata: Additional information about the document

Example of metadata structure:

{
  'id': 13646493,
  'document_id': '17s3LlMbpkTk0ikvGwV0iLMCj-MNubIaP',
  'name': 'What is a large language model (LLM)_ _ Cloudflare.pdf',
  'type': 'application/pdf',
  'path': '/langchain/What is a large language model (LLM)_ _ Cloudflare.pdf',
  'url': 'https://drive.google.com/file/d/17s3LlMbpkTk0ikvGwV0iLMCj-MNubIaP/view',
  'size': 337089,
  'created_time': '',
  'modified_time': '',
  'indexed_on': '2024-04-04T03:36:28.886170Z',
  'integration': {'id': 347, 'integration_type': 'google_drive'}
}
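Because metadata is a plain dictionary, you can filter or group results client-side before handing them to an LLM. A minimal sketch using the field names from the example above:

from collections import defaultdict
from langchain_community.retrievers import RememberizerRetriever

# Assumes REMEMBERIZER_API_KEY is set in the environment.
retriever = RememberizerRetriever(top_k_results=10)
docs = retriever.get_relevant_documents(query="vector embeddings")

# Group matched chunks by their source document.
by_source = defaultdict(list)
for doc in docs:
    by_source[doc.metadata.get("name", "unknown")].append(doc)

for name, chunks in by_source.items():
    print(f"{name}: {len(chunks)} chunk(s)")

# Or keep only chunks that came from PDFs.
pdf_docs = [d for d in docs if d.metadata.get("type") == "application/pdf"]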

Advanced Examples

Building a RAG Question-Answering System

This example creates a question-answering system that retrieves information from Rememberizer and uses GPT-3.5 to formulate answers:

import os
from langchain_community.retrievers import RememberizerRetriever
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI

# Set up API keys
os.environ["REMEMBERIZER_API_KEY"] = "rem_ck_your_api_key"
os.environ["OPENAI_API_KEY"] = "your_openai_api_key"

# Initialize the retriever and language model
retriever = RememberizerRetriever(top_k_results=5)
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

# Create a retrieval QA chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",  # Simplest method - just stuff all documents into the prompt
    retriever=retriever,
    return_source_documents=True
)

# Ask a question
response = qa_chain.invoke({"query": "What is RAG in the context of AI?"})

# Print the answer
print(f"Answer: {response['result']}")
print("\nSources:")
for idx, doc in enumerate(response['source_documents']):
    print(f"{idx+1}. {doc.metadata['name']}")

Building a Conversational Agent with Memory

This example creates a conversational agent that can maintain conversation history:

import os
from langchain_community.retrievers import RememberizerRetriever
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_openai import ChatOpenAI

# Set up API keys
os.environ["REMEMBERIZER_API_KEY"] = "rem_ck_your_api_key"
os.environ["OPENAI_API_KEY"] = "your_openai_api_key"

# Initialize components
retriever = RememberizerRetriever(top_k_results=5)
llm = ChatOpenAI(model_name="gpt-3.5-turbo")
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)

# Create the conversational chain
conversation = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory
)

# Example conversation
questions = [
    "What is RAG?",
    "How do large language models use it?",
    "What are the limitations of this approach?",
]

for question in questions:
    response = conversation.invoke({"question": question})
    print(f"Question: {question}")
    print(f"Answer: {response['answer']}\n")

Best Practices

Optimizing Retrieval Performance

  1. Be specific with queries: More specific queries usually yield better results

  2. Adjust top_k_results: Start with 3-5 results and adjust based on application needs

  3. Use context windows: The retriever automatically includes context around matched chunks

Security Considerations

  1. Protect your API key: Store it securely using environment variables or secret management tools (see the sketch after this list)

  2. Create dedicated keys: Create separate API keys for different applications

  3. Rotate keys regularly: Periodically generate new keys and phase out old ones
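For local scripts and notebooks, one simple way to keep the key out of source control is to prompt for it at runtime. This is a standard-library sketch; a proper secret manager is a better fit for production:

import os
from getpass import getpass

# Prompt for the key rather than hardcoding it in the script.
if "REMEMBERIZER_API_KEY" not in os.environ:
    os.environ["REMEMBERIZER_API_KEY"] = getpass("Rememberizer API key: ")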

Integration Patterns

  1. Pre-retrieval processing: Consider preprocessing user queries to improve search relevance

  2. Post-retrieval filtering: Filter or rank retrieved documents before passing them to the LLM (see the sketch after the ensemble example below)

  3. Hybrid search: Combine Rememberizer with other retrievers using EnsembleRetriever

from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import RememberizerRetriever, WebResearchRetriever

# Create retrievers
rememberizer_retriever = RememberizerRetriever(top_k_results=3)
web_retriever = WebResearchRetriever(...)  # Configure another retriever

# Create an ensemble with weighted score
ensemble_retriever = EnsembleRetriever(
    retrievers=[rememberizer_retriever, web_retriever],
    weights=[0.7, 0.3]  # Rememberizer results have higher weight
)
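Post-retrieval filtering (pattern 2 above) can be a thin wrapper that drops low-value chunks before they reach the LLM. A minimal sketch, assuming you filter on content length; swap in whatever relevance criteria suit your data:

from langchain_community.retrievers import RememberizerRetriever

retriever = RememberizerRetriever(top_k_results=10)

def filtered_search(query: str, min_chars: int = 100):
    """Retrieve from Rememberizer, then drop very short chunks that are
    unlikely to carry useful context."""
    docs = retriever.get_relevant_documents(query=query)
    return [d for d in docs if len(d.page_content) >= min_chars]

docs = filtered_search("What is RAG?")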

Troubleshooting

Common Issues

  1. Authentication errors: Verify your API key is correct and properly configured

  2. No results returned: Ensure your Common Knowledge contains relevant information

  3. Rate limiting: Be mindful of API rate limits in high-volume applications (see the backoff sketch below)
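For the rate-limiting case, a simple exponential backoff around the retriever call often suffices. A generic sketch; narrow the exception type to whatever error your HTTP client actually raises:

import time

def retrieve_with_backoff(retriever, query, max_retries=3):
    """Retry the search with exponential backoff on transient failures."""
    for attempt in range(max_retries):
        try:
            return retriever.get_relevant_documents(query=query)
        except Exception:  # narrow this to your client's rate-limit error
            if attempt == max_retries - 1:
                raise
            time.sleep(2 ** attempt)  # wait 1s, then 2s, then 4s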

Debug Tips

  • Set the LangChain debug mode to see detailed API calls:

    import langchain
    langchain.debug = True
    # On recent versions you can also use:
    # from langchain.globals import set_debug; set_debug(True)
  • Examine raw search results before passing them to the LLM to identify retrieval issues

Related Resources

For detailed instructions on creating Common Knowledge and generating an API key, see Registering and Using API Keys.

  • LangChain Retriever conceptual guide
  • LangChain Retriever how-to guides
  • Rememberizer API Documentation
  • Vector Stores in Rememberizer - an alternative approach for AI integration
  • Creating a Rememberizer GPT
  • Rememberizer | 🦜️🔗 LangChain