Vector Stores

This guide explains how to use the Rememberizer Vector Store as a developer.


The Rememberizer Vector Store simplifies the process of dealing with vector data, allowing you to focus on text input and leveraging the power of vectors for various applications such as search and data analysis.

Introduction

The Rememberizer Vector Store provides an easy-to-use interface for handling vector data while abstracting away the complexity of vector embeddings. Powered by PostgreSQL with the pgvector extension, Rememberizer Vector Store allows you to work directly with text. The service handles chunking, vectorizing, and storing the text data, making it easier for you to focus on your core application logic.

For a deeper understanding of the theoretical concepts behind vector embeddings and vector databases, see What are Vector Embeddings and Vector Databases? in the Background section.

Technical Overview

How Vector Stores Work

Rememberizer Vector Stores convert text into high-dimensional vector representations (embeddings) that capture semantic meaning. This enables:

  1. Semantic Search: Find documents based on meaning rather than just keywords

  2. Similarity Matching: Identify conceptually related content

  3. Efficient Retrieval: Quickly locate relevant information from large datasets
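
To make this concrete, here is a minimal, illustrative sketch of how semantic search ranks documents by comparing embedding vectors. The vectors below are toy values rather than real model outputs; Rememberizer computes and stores actual embeddings for you.

import numpy as np

def cosine_similarity(a, b):
    # 1.0 means the vectors point in the same direction; values near 0 mean unrelated content
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy embeddings (real embeddings have hundreds to thousands of dimensions)
query_vector = np.array([0.9, 0.1, 0.3])
documents = {
    "refund-policy.txt": np.array([0.8, 0.2, 0.4]),    # close in meaning to the query
    "holiday-schedule.txt": np.array([0.1, 0.9, 0.2]), # different topic
}

# Rank documents by similarity to the query, best match first
ranked = sorted(documents.items(), key=lambda kv: cosine_similarity(query_vector, kv[1]), reverse=True)
for name, vector in ranked:
    print(name, round(cosine_similarity(query_vector, vector), 3))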

Key Components

  • Document Processing: Text is split into optimally sized chunks with overlapping boundaries for context preservation

  • Vectorization: Chunks are converted to embeddings using state-of-the-art models

  • Indexing: Specialized algorithms organize vectors for efficient similarity search

  • Query Processing: Search queries are vectorized and compared against stored embeddings
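
The outline below sketches how these four components fit together. It is an illustrative simplification, not Rememberizer's internal implementation; the service performs all of these steps for you when you upload and search documents.

def split_into_chunks(text, chunk_size=1024, overlap=200):
    # Document Processing: overlapping chunks preserve context at the boundaries
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

def embed(chunk):
    # Vectorization: stands in for a real embedding model, which returns a high-dimensional vector
    return [float(ord(c)) for c in chunk[:16]]

def build_index(vectors):
    # Indexing: a real store uses IVFFLAT or HNSW; a plain list is enough for illustration
    return list(enumerate(vectors))

def search(index, query_text, top_k=3):
    # Query Processing: embed the query and rank stored vectors by a similarity score
    q = embed(query_text)
    score = lambda v: sum(a * b for a, b in zip(q, v))
    return sorted(index, key=lambda item: score(item[1]), reverse=True)[:top_k]

chunks = split_into_chunks("Replace this with your document text. " * 200)
vector_index = build_index([embed(c) for c in chunks])
print([chunk_id for chunk_id, _ in search(vector_index, "document text")])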

Architecture

Rememberizer implements vector stores using:

  • PostgreSQL with pgvector extension: For efficient vector storage and search

  • Collection-based organization: Each vector store has its own isolated collection

  • API-driven access: Simple RESTful endpoints for all operations

Getting Started

Creating a Vector Store

  1. Navigate to the Vector Stores Section in your dashboard

  2. Click on "Create new Vector Store":

    • A form will appear prompting you to enter details.

  3. Fill in the Details:

    • Name: Provide a unique name for your vector store.

    • Description: Write a brief description of the vector store.

    • Embedding Model: Select the model that converts text to vectors.

    • Indexing Algorithm: Choose how vectors will be organized for search.

    • Search Metric: Define how similarity between vectors is calculated.

    • Vector Dimension: The size of the vector embeddings (typically 768-1536).

  4. Submit the Form:

    • Click on the "Create" button. You will receive a success notification, and the new store will appear in your vector store list.

Configuration Options

Embedding Models

| Model | Dimensions | Description | Best For |
| --- | --- | --- | --- |
| openai/text-embedding-3-large | 1536 | High-accuracy embedding model from OpenAI | Production applications requiring maximum accuracy |
| openai/text-embedding-3-small | 1536 | Smaller, faster embedding model from OpenAI | Applications with higher throughput requirements |

Indexing Algorithms

| Algorithm | Description | Tradeoffs |
| --- | --- | --- |
| IVFFLAT (default) | Inverted file with flat compression | Good balance of speed and accuracy; works well for most datasets |
| HNSW | Hierarchical Navigable Small World | Better accuracy for large datasets; higher memory requirements |

Search Metrics

| Metric | Description | Best For |
| --- | --- | --- |
| cosine (default) | Measures angle between vectors | General-purpose similarity matching |
| inner product (ip) | Dot product between vectors | When vector magnitude is important |
| L2 (Euclidean) | Straight-line distance between vectors | When spatial relationships matter |
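
To see how the metrics behave differently, the snippet below evaluates the same pair of vectors with cosine similarity, inner product, and L2 distance. The vectors are arbitrary illustrative values; note that cosine ignores magnitude while inner product does not.

import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([2.0, 4.0, 6.0])  # same direction as a, twice the magnitude

cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
inner_product = np.dot(a, b)
l2_distance = np.linalg.norm(a - b)

print(f"cosine: {cosine:.3f}")                 # 1.000 -> identical direction, magnitude ignored
print(f"inner product: {inner_product:.1f}")   # grows with vector magnitude
print(f"L2 distance: {l2_distance:.3f}")       # straight-line distance in vector space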

Managing Vector Stores

  1. View and Edit Vector Stores:

    • Access the management dashboard to view, edit, or delete vector stores.

  2. Viewing Documents:

    • Browse individual documents and their associated metadata within a specific vector store.

  3. Statistics:

    • View detailed statistics such as the number of vectors stored, query performance, and operational metrics.

API Key Management

API keys are used to authenticate and authorize access to the Rememberizer Vector Store's API endpoints. Proper management of API keys is essential for maintaining the security and integrity of your vector stores.

Creating API Keys

  1. Head over to your Vector Store details page

  2. Navigate to the API Key Management Section:

    • It can be found within the "Configuration" tab

  3. Click on "Add API Key":

    • A form will appear prompting you to enter details.

  4. Fill in the Details:

    • Name: Provide a name for the API key to help you identify its use case.

  5. Submit the Form:

    • Click on the "Create" button. The new API key will be generated and displayed. Make sure to copy and store it securely. This key is used to authenticate requests to that specific vector store.

Revoking API Keys

If an API key is no longer needed, you can delete it to prevent any potential misuse.

For security reasons, you may want to rotate your API keys periodically. This involves generating a new key and revoking the old one.
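
As a minimal sketch of handling keys safely, the example below reads the key from environment variables rather than hard-coding it, then sends it in the x-api-key header used throughout this guide. The environment variable names are illustrative choices, not required names.

import os
import requests

# Read credentials from the environment instead of committing them to source control
API_KEY = os.environ["REMEMBERIZER_API_KEY"]                  # example variable name
VECTOR_STORE_ID = os.environ["REMEMBERIZER_VECTOR_STORE_ID"]  # example variable name

response = requests.get(
    f"https://api.rememberizer.ai/api/v1/vector-stores/{VECTOR_STORE_ID}",
    headers={"x-api-key": API_KEY},
)
response.raise_for_status()
print(response.json())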

Using the Vector Store API

After creating a Vector Store and generating an API key, you can interact with it using the REST API.

Code Examples

Python

import requests
import json

API_KEY = "your_api_key_here"
VECTOR_STORE_ID = "vs_abc123"  # Replace with your vector store ID
BASE_URL = "https://api.rememberizer.ai/api/v1"

# Upload a document to the vector store
def upload_document(file_path, document_name=None):
    if document_name is None:
        document_name = file_path.split("/")[-1]
    
    with open(file_path, "rb") as f:
        files = {"file": (document_name, f)}
        headers = {"x-api-key": API_KEY}
        
        response = requests.post(
            f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents",
            headers=headers,
            files=files
        )
        
        if response.status_code == 201:
            print(f"Document '{document_name}' uploaded successfully!")
            return response.json()
        else:
            print(f"Error uploading document: {response.text}")
            return None

# Upload text content to the vector store
def upload_text(content, document_name):
    headers = {
        "x-api-key": API_KEY,
        "Content-Type": "application/json"
    }
    
    data = {
        "name": document_name,
        "content": content
    }
    
    response = requests.post(
        f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/text",
        headers=headers,
        json=data
    )
    
    if response.status_code == 201:
        print(f"Text document '{document_name}' uploaded successfully!")
        return response.json()
    else:
        print(f"Error uploading text: {response.text}")
        return None

# Search the vector store
def search_vector_store(query, num_results=5, prev_chunks=1, next_chunks=1):
    headers = {"x-api-key": API_KEY}
    
    params = {
        "q": query,
        "n": num_results,
        "prev_chunks": prev_chunks,
        "next_chunks": next_chunks
    }
    
    response = requests.get(
        f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/search",
        headers=headers,
        params=params
    )
    
    if response.status_code == 200:
        results = response.json()
        print(f"Found {len(results['matched_chunks'])} matches for '{query}'")
        
        # Print the top result
        if results['matched_chunks']:
            top_match = results['matched_chunks'][0]
            print(f"Top match (distance: {top_match['distance']}):")
            print(f"Document: {top_match['document']['name']}")
            print(f"Content: {top_match['matched_content']}")
        
        return results
    else:
        print(f"Error searching: {response.text}")
        return None

# Example usage
# upload_document("path/to/document.pdf")
# upload_text("This is a sample text to be vectorized", "sample-document.txt")
# search_vector_store("How does vector similarity work?")
JavaScript

// Vector Store API Client
class VectorStoreClient {
  constructor(apiKey, vectorStoreId) {
    this.apiKey = apiKey;
    this.vectorStoreId = vectorStoreId;
    this.baseUrl = 'https://api.rememberizer.ai/api/v1';
  }

  // Get vector store information
  async getVectorStoreInfo() {
    const response = await fetch(`${this.baseUrl}/vector-stores/${this.vectorStoreId}`, {
      method: 'GET',
      headers: {
        'x-api-key': this.apiKey
      }
    });
    
    if (!response.ok) {
      throw new Error(`Failed to get vector store info: ${response.statusText}`);
    }
    
    return response.json();
  }

  // Upload a text document
  async uploadTextDocument(name, content) {
    const response = await fetch(`${this.baseUrl}/vector-stores/${this.vectorStoreId}/documents/text`, {
      method: 'POST',
      headers: {
        'x-api-key': this.apiKey,
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        name,
        content
      })
    });
    
    if (!response.ok) {
      throw new Error(`Failed to upload text document: ${response.statusText}`);
    }
    
    return response.json();
  }

  // Upload a file
  async uploadFile(file, onProgress) {
    const formData = new FormData();
    formData.append('file', file);
    
    const xhr = new XMLHttpRequest();
    
    return new Promise((resolve, reject) => {
      xhr.open('POST', `${this.baseUrl}/vector-stores/${this.vectorStoreId}/documents`);
      xhr.setRequestHeader('x-api-key', this.apiKey);
      
      xhr.upload.onprogress = (event) => {
        if (event.lengthComputable && onProgress) {
          const percentComplete = (event.loaded / event.total) * 100;
          onProgress(percentComplete);
        }
      };
      
      xhr.onload = () => {
        if (xhr.status === 201) {
          resolve(JSON.parse(xhr.responseText));
        } else {
          reject(new Error(`Failed to upload file: ${xhr.statusText}`));
        }
      };
      
      xhr.onerror = () => {
        reject(new Error('Network error during file upload'));
      };
      
      xhr.send(formData);
    });
  }

  // Search documents in the vector store
  async searchDocuments(query, options = {}) {
    const params = new URLSearchParams({
      q: query,
      n: options.numResults || 10,
      prev_chunks: options.prevChunks || 1,
      next_chunks: options.nextChunks || 1
    });
    
    if (options.threshold) {
      params.append('t', options.threshold);
    }
    
    const response = await fetch(
      `${this.baseUrl}/vector-stores/${this.vectorStoreId}/documents/search?${params}`,
      {
        method: 'GET',
        headers: {
          'x-api-key': this.apiKey
        }
      }
    );
    
    if (!response.ok) {
      throw new Error(`Search failed: ${response.statusText}`);
    }
    
    return response.json();
  }

  // List all documents in the vector store
  async listDocuments() {
    const response = await fetch(
      `${this.baseUrl}/vector-stores/${this.vectorStoreId}/documents`,
      {
        method: 'GET',
        headers: {
          'x-api-key': this.apiKey
        }
      }
    );
    
    if (!response.ok) {
      throw new Error(`Failed to list documents: ${response.statusText}`);
    }
    
    return response.json();
  }

  // Delete a document
  async deleteDocument(documentId) {
    const response = await fetch(
      `${this.baseUrl}/vector-stores/${this.vectorStoreId}/documents/${documentId}`,
      {
        method: 'DELETE',
        headers: {
          'x-api-key': this.apiKey
        }
      }
    );
    
    if (!response.ok) {
      throw new Error(`Failed to delete document: ${response.statusText}`);
    }
    
    return true;
  }
}

// Example usage
/*
const client = new VectorStoreClient('your_api_key', 'vs_abc123');

// Search documents
client.searchDocuments('How does semantic search work?')
  .then(results => {
    console.log(`Found ${results.matched_chunks.length} matches`);
    results.matched_chunks.forEach(match => {
      console.log(`Document: ${match.document.name}`);
      console.log(`Score: ${match.distance}`);
      console.log(`Content: ${match.matched_content}`);
      console.log('---');
    });
  })
  .catch(error => console.error(error));
*/
Ruby

require 'net/http'
require 'uri'
require 'json'

class VectorStoreClient
  def initialize(api_key, vector_store_id)
    @api_key = api_key
    @vector_store_id = vector_store_id
    @base_url = 'https://api.rememberizer.ai/api/v1'
  end

  # Get vector store details
  def get_vector_store_info
    uri = URI("#{@base_url}/vector-stores/#{@vector_store_id}")
    request = Net::HTTP::Get.new(uri)
    request['x-api-key'] = @api_key
    
    response = send_request(uri, request)
    JSON.parse(response.body)
  end

  # Upload text content
  def upload_text(name, content)
    uri = URI("#{@base_url}/vector-stores/#{@vector_store_id}/documents/text")
    request = Net::HTTP::Post.new(uri)
    request['Content-Type'] = 'application/json'
    request['x-api-key'] = @api_key
    
    request.body = {
      name: name,
      content: content
    }.to_json
    
    response = send_request(uri, request)
    JSON.parse(response.body)
  end

  # Search documents
  def search(query, num_results: 5, prev_chunks: 1, next_chunks: 1, threshold: nil)
    uri = URI("#{@base_url}/vector-stores/#{@vector_store_id}/documents/search")
    params = {
      q: query,
      n: num_results,
      prev_chunks: prev_chunks,
      next_chunks: next_chunks
    }
    
    params[:t] = threshold if threshold
    
    uri.query = URI.encode_www_form(params)
    request = Net::HTTP::Get.new(uri)
    request['x-api-key'] = @api_key
    
    response = send_request(uri, request)
    JSON.parse(response.body)
  end

  # List documents
  def list_documents
    uri = URI("#{@base_url}/vector-stores/#{@vector_store_id}/documents")
    request = Net::HTTP::Get.new(uri)
    request['x-api-key'] = @api_key
    
    response = send_request(uri, request)
    JSON.parse(response.body)
  end

  # Upload file (multipart form)
  def upload_file(file_path)
    uri = URI("#{@base_url}/vector-stores/#{@vector_store_id}/documents")
    
    file_name = File.basename(file_path)
    file_content = File.binread(file_path)
    
    boundary = "RememberizerBoundary#{rand(1000000)}"
    
    request = Net::HTTP::Post.new(uri)
    request['Content-Type'] = "multipart/form-data; boundary=#{boundary}"
    request['x-api-key'] = @api_key
    
    post_body = []
    post_body << "--#{boundary}\r\n"
    post_body << "Content-Disposition: form-data; name=\"file\"; filename=\"#{file_name}\"\r\n"
    post_body << "Content-Type: application/octet-stream\r\n\r\n"
    post_body << file_content
    post_body << "\r\n--#{boundary}--\r\n"
    
    request.body = post_body.join
    
    response = send_request(uri, request)
    JSON.parse(response.body)
  end

  private

  def send_request(uri, request)
    http = Net::HTTP.new(uri.host, uri.port)
    http.use_ssl = (uri.scheme == 'https')
    
    response = http.request(request)
    
    unless response.is_a?(Net::HTTPSuccess)
      raise "API request failed: #{response.code} #{response.message}\n#{response.body}"
    end
    
    response
  end
end

# Example usage
=begin
client = VectorStoreClient.new('your_api_key', 'vs_abc123')

# Search for documents
results = client.search('What are the best practices for data security?')
puts "Found #{results['matched_chunks'].length} results"

# Display top result
if results['matched_chunks'].any?
  top_match = results['matched_chunks'].first
  puts "Top match (distance: #{top_match['distance']}):"
  puts "Document: #{top_match['document']['name']}"
  puts "Content: #{top_match['matched_content']}"
end
=end
cURL

# Set your API key and Vector Store ID
API_KEY="your_api_key_here"
VECTOR_STORE_ID="vs_abc123"
BASE_URL="https://api.rememberizer.ai/api/v1"

# Get vector store information
curl -X GET "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}" \
  -H "x-api-key: ${API_KEY}"

# Upload a text document
curl -X POST "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}/documents/text" \
  -H "x-api-key: ${API_KEY}" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "example-document.txt",
    "content": "This is a sample document that will be vectorized and stored in the vector database for semantic search."
  }'

# Upload a file
curl -X POST "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}/documents" \
  -H "x-api-key: ${API_KEY}" \
  -F "file=@/path/to/your/document.pdf"

# Search for documents
curl -X GET "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}/documents/search?q=semantic%20search&n=5&prev_chunks=1&next_chunks=1" \
  -H "x-api-key: ${API_KEY}"

# List all documents
curl -X GET "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}/documents" \
  -H "x-api-key: ${API_KEY}"

# Delete a document
curl -X DELETE "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}/documents/123" \
  -H "x-api-key: ${API_KEY}"

Performance Considerations

Coming soon: Vector Store Architecture Diagram

This technical architecture diagram will illustrate:

  • The PostgreSQL + pgvector foundation architecture

  • Indexing algorithm structures (IVFFLAT vs. HNSW)

  • How search metrics work in vector space (visual comparison)

  • Document chunking process with overlap visualization

  • Performance considerations visualized across different scales

Optimizing for Different Data Volumes

| Data Volume | Recommended Configuration | Notes |
| --- | --- | --- |
| Small (<10k documents) | IVFFLAT, cosine similarity | Simple configuration provides good performance |
| Medium (10k-100k documents) | IVFFLAT, ensure regular reindexing | Balance between search speed and index maintenance |
| Large (>100k documents) | HNSW, consider increasing vector dimensions | Higher memory usage but maintains performance at scale |

Chunking Strategies

The chunking process significantly impacts search quality:

  • Chunk Size: Rememberizer uses a default chunk size of 1024 bytes with a 200-byte overlap

  • Smaller Chunks (512-1024 bytes): More precise matches, better for specific questions

  • Larger Chunks (1500-2048 bytes): More context in each match, better for broader topics

  • Overlap: Ensures context is not lost at chunk boundaries
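
The sketch below illustrates how chunk size affects granularity, using the default 200-byte overlap described above. It is illustrative only; Rememberizer performs chunking for you when you upload content.

def chunk_text(data: bytes, chunk_size: int, overlap: int = 200):
    # Each new chunk starts (chunk_size - overlap) bytes after the previous one,
    # so adjacent chunks share `overlap` bytes of context.
    step = chunk_size - overlap
    return [data[i:i + chunk_size] for i in range(0, len(data), step)]

document = ("Lorem ipsum dolor sit amet, consectetur adipiscing elit. " * 200).encode("utf-8")

for size in (512, 1024, 2048):
    print(f"chunk_size={size}: {len(chunk_text(document, chunk_size=size))} chunks")
# Smaller chunks produce more, finer-grained matches; larger chunks give each match more context.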

Query Optimization

  • Context Windows: Use prev_chunks and next_chunks to retrieve surrounding content

  • Results Count: Start with 3-5 results (n parameter) and adjust based on precision needs

  • Threshold: Adjust the t parameter to filter results by similarity score
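
For example, a search request tuned with these parameters might look like the following. It uses the search endpoint shown in the Code Examples section; the threshold value here is an arbitrary illustration and should be adjusted to your data.

import requests

response = requests.get(
    "https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/search",
    headers={"x-api-key": "your_api_key_here"},
    params={
        "q": "How do I rotate API keys safely?",  # the search query
        "n": 3,              # start with a few results and adjust based on precision needs
        "prev_chunks": 1,    # include one chunk of context before each match
        "next_chunks": 1,    # ...and one after
        "t": 0.5,            # similarity threshold; tune per dataset
    },
)
print(response.json())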

Advanced Usage

Reindexing

Rememberizer automatically triggers reindexing when vector counts exceed predefined thresholds, but consider manual reindexing after:

  • Uploading a large number of documents

  • Changing the embedding model

  • Modifying the indexing algorithm

Query Enhancement

For better search results:

  1. Be specific in search queries

  2. Include context when possible

  3. Use natural language rather than keywords

  4. Adjust parameters based on result quality
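
As an illustration of these tips, compare a bare keyword query with a specific natural-language query, using the search_vector_store helper defined in the Code Examples section above:

# A vague keyword query
search_vector_store("api keys")

# A specific, natural-language query with context usually returns more relevant chunks
search_vector_store("How do I create and rotate API keys for a vector store?", num_results=3)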

Migrating from Other Vector Databases

If you're currently using other vector database solutions and want to migrate to Rememberizer Vector Store, the following guides will help you transition your data efficiently.

Migration Overview

Migrating vector data involves:

  1. Exporting data from your source vector database

  2. Converting the data to a format compatible with Rememberizer

  3. Importing the data into your Rememberizer Vector Store

  4. Verifying the migration was successful
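
Step 2, converting the data, mostly means mapping whatever your source database stores into the name, content, and optional metadata fields accepted by Rememberizer's text-upload endpoint. A minimal sketch of such a mapping follows; the source record fields here are hypothetical and will differ per database.

def to_rememberizer_payload(record):
    # `record` is a hypothetical row/point exported from your source vector database
    return {
        "name": record.get("filename") or f"migrated_doc_{record['id']}",
        "content": record["text"],               # raw text; Rememberizer re-chunks and re-embeds it
        "metadata": record.get("metadata", {}),  # optional; preserved alongside the document
    }

payload = to_rememberizer_payload({"id": 42, "text": "Document body...", "metadata": {"source": "legacy-db"}})
print(payload)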

Benefits of Migrating to Rememberizer

  • PostgreSQL Foundation: Built on mature database technology with built-in backup and recovery

  • Integrated Ecosystem: Seamless connection with other Rememberizer components

  • Simplified Management: Unified interface for vector operations

  • Advanced Security: Row-level security and fine-grained access controls

  • Scalable Architecture: Performance optimization as your data grows

Migrating from Pinecone

Python

import os
import pinecone
import requests
import json
import time

# Set up Pinecone client
pinecone.init(api_key="PINECONE_API_KEY", environment="PINECONE_ENV")
source_index = pinecone.Index("your-pinecone-index")

# Set up Rememberizer Vector Store client
REMEMBERIZER_API_KEY = "your_rememberizer_api_key"
VECTOR_STORE_ID = "vs_abc123"  # Your Rememberizer vector store ID
BASE_URL = "https://api.rememberizer.ai/api/v1"

# 1. Set up batch size for migration (adjust based on your data size)
BATCH_SIZE = 100

# 2. Function to get vectors from Pinecone
def fetch_vectors_from_pinecone(index_name, batch_size, cursor=None):
    # Use the list operation if available in your Pinecone version
    try:
        result = source_index.list(limit=batch_size, cursor=cursor)
        vectors = result.get("vectors", {})
        next_cursor = result.get("cursor")
        return vectors, next_cursor
    except AttributeError:
        # For older Pinecone versions without list operation
        # This is a simplified approach; actual implementation depends on your data access pattern
        query_response = source_index.query(
            vector=[0] * source_index.describe_index_stats()["dimension"],
            top_k=batch_size,
            include_metadata=True,
            include_values=True
        )
        return {item.id: {"id": item.id, "values": item.values, "metadata": item.metadata} 
                for item in query_response.matches}, None

# 3. Function to upload vectors to Rememberizer
def upload_to_rememberizer(vectors):
    headers = {
        "x-api-key": REMEMBERIZER_API_KEY,
        "Content-Type": "application/json"
    }
    
    for vector_id, vector_data in vectors.items():
        # Convert Pinecone vector data to Rememberizer format
        document_name = vector_data.get("metadata", {}).get("filename", f"pinecone_doc_{vector_id}")
        content = vector_data.get("metadata", {}).get("text", "")
        
        if not content:
            print(f"Skipping {vector_id} - no text content found in metadata")
            continue
            
        data = {
            "name": document_name,
            "content": content,
            # Optional: include additional metadata
            "metadata": vector_data.get("metadata", {})
        }
        
        response = requests.post(
            f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/text",
            headers=headers,
            json=data
        )
        
        if response.status_code == 201:
            print(f"Document '{document_name}' uploaded successfully!")
        else:
            print(f"Error uploading document {document_name}: {response.text}")
        
        # Add a small delay to prevent rate limiting
        time.sleep(0.1)

# 4. Main migration function
def migrate_pinecone_to_rememberizer():
    cursor = None
    total_migrated = 0
    
    print("Starting migration from Pinecone to Rememberizer...")
    
    while True:
        vectors, cursor = fetch_vectors_from_pinecone("your-pinecone-index", BATCH_SIZE, cursor)
        
        if not vectors:
            break
            
        print(f"Fetched {len(vectors)} vectors from Pinecone")
        upload_to_rememberizer(vectors)
        
        total_migrated += len(vectors)
        print(f"Progress: {total_migrated} vectors migrated")
        
        if not cursor:
            break
    
    print(f"Migration complete! {total_migrated} total vectors migrated to Rememberizer")

# Run the migration
# migrate_pinecone_to_rememberizer()
JavaScript

const { PineconeClient } = require('@pinecone-database/pinecone');
const axios = require('axios');

// Pinecone configuration
const pineconeApiKey = 'PINECONE_API_KEY';
const pineconeEnvironment = 'PINECONE_ENVIRONMENT';
const pineconeIndexName = 'YOUR_PINECONE_INDEX';

// Rememberizer configuration
const rememberizerApiKey = 'YOUR_REMEMBERIZER_API_KEY';
const vectorStoreId = 'vs_abc123';
const baseUrl = 'https://api.rememberizer.ai/api/v1';

// Batch size configuration
const BATCH_SIZE = 100;

// Initialize Pinecone client
async function initPinecone() {
  const pinecone = new PineconeClient();
  await pinecone.init({
    apiKey: pineconeApiKey,
    environment: pineconeEnvironment,
  });
  return pinecone;
}

// Fetch vectors from Pinecone
async function fetchVectorsFromPinecone(pinecone, batchSize, paginationToken = null) {
  const index = pinecone.Index(pineconeIndexName);
  
  try {
    // For newer Pinecone versions
    const listResponse = await index.list({
      limit: batchSize,
      paginationToken: paginationToken
    });
    
    return {
      vectors: listResponse.vectors || {},
      nextToken: listResponse.paginationToken
    };
  } catch (error) {
    // Fallback for older Pinecone versions
    // This is simplified; actual implementation depends on your data access pattern
    const stats = await index.describeIndexStats();
    const dimension = stats.dimension;
    
    const queryResponse = await index.query({
      vector: Array(dimension).fill(0),
      topK: batchSize,
      includeMetadata: true,
      includeValues: true
    });
    
    const vectors = {};
    queryResponse.matches.forEach(match => {
      vectors[match.id] = {
        id: match.id,
        values: match.values,
        metadata: match.metadata
      };
    });
    
    return { vectors, nextToken: null };
  }
}

// Upload vectors to Rememberizer
async function uploadToRememberizer(vectors) {
  const headers = {
    'x-api-key': rememberizerApiKey,
    'Content-Type': 'application/json'
  };
  
  const results = [];
  
  for (const [vectorId, vectorData] of Object.entries(vectors)) {
    const documentName = vectorData.metadata?.filename || `pinecone_doc_${vectorId}`;
    const content = vectorData.metadata?.text || '';
    
    if (!content) {
      console.log(`Skipping ${vectorId} - no text content found in metadata`);
      continue;
    }
    
    const data = {
      name: documentName,
      content: content,
      // Optional: include additional metadata
      metadata: vectorData.metadata || {}
    };
    
    try {
      const response = await axios.post(
        `${baseUrl}/vector-stores/${vectorStoreId}/documents/text`,
        data,
        { headers }
      );
      
      if (response.status === 201) {
        console.log(`Document '${documentName}' uploaded successfully!`);
        results.push({ id: vectorId, success: true });
      } else {
        console.error(`Error uploading document ${documentName}: ${response.statusText}`);
        results.push({ id: vectorId, success: false, error: response.statusText });
      }
    } catch (error) {
      console.error(`Error uploading document ${documentName}: ${error.message}`);
      results.push({ id: vectorId, success: false, error: error.message });
    }
    
    // Add a small delay to prevent rate limiting
    await new Promise(resolve => setTimeout(resolve, 100));
  }
  
  return results;
}

// Main migration function
async function migratePineconeToRememberizer() {
  try {
    console.log('Starting migration from Pinecone to Rememberizer...');
    
    const pinecone = await initPinecone();
    let nextToken = null;
    let totalMigrated = 0;
    
    do {
      const { vectors, nextToken: token } = await fetchVectorsFromPinecone(
        pinecone, 
        BATCH_SIZE, 
        nextToken
      );
      
      nextToken = token;
      
      if (Object.keys(vectors).length === 0) {
        break;
      }
      
      console.log(`Fetched ${Object.keys(vectors).length} vectors from Pinecone`);
      
      const results = await uploadToRememberizer(vectors);
      const successCount = results.filter(r => r.success).length;
      
      totalMigrated += successCount;
      console.log(`Progress: ${totalMigrated} vectors migrated successfully`);
      
    } while (nextToken);
    
    console.log(`Migration complete! ${totalMigrated} total vectors migrated to Rememberizer`);
    
  } catch (error) {
    console.error('Migration failed:', error);
  }
}

// Run the migration
// migratePineconeToRememberizer();

Migrating from Qdrant

Python

import requests
import json
import time
from qdrant_client import QdrantClient
from qdrant_client.http import models as rest

# Set up Qdrant client
QDRANT_URL = "http://localhost:6333"  # or your Qdrant cloud URL
QDRANT_API_KEY = "your_qdrant_api_key"  # if using Qdrant Cloud
QDRANT_COLLECTION_NAME = "your_collection"

qdrant_client = QdrantClient(
    url=QDRANT_URL,
    api_key=QDRANT_API_KEY  # Only for Qdrant Cloud
)

# Set up Rememberizer Vector Store client
REMEMBERIZER_API_KEY = "your_rememberizer_api_key"
VECTOR_STORE_ID = "vs_abc123"  # Your Rememberizer vector store ID
BASE_URL = "https://api.rememberizer.ai/api/v1"

# Batch size for processing
BATCH_SIZE = 100

# Function to fetch points from Qdrant
def fetch_points_from_qdrant(collection_name, batch_size, offset=0):
    try:
        # Get collection info to determine vector dimension
        collection_info = qdrant_client.get_collection(collection_name=collection_name)
        
        # Scroll through points
        scroll_result = qdrant_client.scroll(
            collection_name=collection_name,
            limit=batch_size,
            offset=offset,
            with_payload=True,
            with_vectors=True
        )
        
        points = scroll_result[0]  # Tuple of (points, next_offset)
        next_offset = scroll_result[1]
        
        return points, next_offset
    except Exception as e:
        print(f"Error fetching points from Qdrant: {e}")
        return [], None

# Function to upload vectors to Rememberizer
def upload_to_rememberizer(points):
    headers = {
        "x-api-key": REMEMBERIZER_API_KEY,
        "Content-Type": "application/json"
    }
    
    results = []
    
    for point in points:
        # Extract data from Qdrant point
        point_id = point.id
        metadata = point.payload
        text_content = metadata.get("text", "")
        document_name = metadata.get("filename", f"qdrant_doc_{point_id}")
        
        if not text_content:
            print(f"Skipping {point_id} - no text content found in payload")
            continue
            
        data = {
            "name": document_name,
            "content": text_content,
            # Optional: include additional metadata
            "metadata": metadata
        }
        
        try:
            response = requests.post(
                f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/text",
                headers=headers,
                json=data
            )
            
            if response.status_code == 201:
                print(f"Document '{document_name}' uploaded successfully!")
                results.append({"id": point_id, "success": True})
            else:
                print(f"Error uploading document {document_name}: {response.text}")
                results.append({"id": point_id, "success": False, "error": response.text})
        except Exception as e:
            print(f"Exception uploading document {document_name}: {str(e)}")
            results.append({"id": point_id, "success": False, "error": str(e)})
        
        # Add a small delay to prevent rate limiting
        time.sleep(0.1)
    
    return results

# Main migration function
def migrate_qdrant_to_rememberizer():
    offset = None
    total_migrated = 0
    
    print("Starting migration from Qdrant to Rememberizer...")
    
    while True:
        points, next_offset = fetch_points_from_qdrant(
            QDRANT_COLLECTION_NAME, 
            BATCH_SIZE,
            offset
        )
        
        if not points:
            break
            
        print(f"Fetched {len(points)} points from Qdrant")
        
        results = upload_to_rememberizer(points)
        success_count = sum(1 for r in results if r.get("success", False))
        
        total_migrated += success_count
        print(f"Progress: {total_migrated} points migrated successfully")
        
        if next_offset is None:
            break
            
        offset = next_offset
    
    print(f"Migration complete! {total_migrated} total points migrated to Rememberizer")

# Run the migration
# migrate_qdrant_to_rememberizer()
JavaScript

const { QdrantClient } = require('@qdrant/js-client-rest');
const axios = require('axios');

// Qdrant configuration
const qdrantUrl = 'http://localhost:6333'; // or your Qdrant cloud URL
const qdrantApiKey = 'your_qdrant_api_key'; // if using Qdrant Cloud
const qdrantCollectionName = 'your_collection';

// Rememberizer configuration
const rememberizerApiKey = 'YOUR_REMEMBERIZER_API_KEY';
const vectorStoreId = 'vs_abc123';
const baseUrl = 'https://api.rememberizer.ai/api/v1';

// Batch size configuration
const BATCH_SIZE = 100;

// Initialize Qdrant client
const qdrantClient = new QdrantClient({ 
  url: qdrantUrl,
  apiKey: qdrantApiKey // Only for Qdrant Cloud
});

// Fetch points from Qdrant
async function fetchPointsFromQdrant(collectionName, batchSize, offset = 0) {
  try {
    // Get collection info
    const collectionInfo = await qdrantClient.getCollection(collectionName);
    
    // Scroll through points
    const scrollResult = await qdrantClient.scroll(collectionName, {
      limit: batchSize,
      offset: offset,
      with_payload: true,
      with_vectors: true
    });
    
    return {
      points: scrollResult.points,
      nextOffset: scrollResult.next_page_offset
    };
  } catch (error) {
    console.error(`Error fetching points from Qdrant: ${error.message}`);
    return { points: [], nextOffset: null };
  }
}

// Upload vectors to Rememberizer
async function uploadToRememberizer(points) {
  const headers = {
    'x-api-key': rememberizerApiKey,
    'Content-Type': 'application/json'
  };
  
  const results = [];
  
  for (const point of points) {
    // Extract data from Qdrant point
    const pointId = point.id;
    const metadata = point.payload || {};
    const textContent = metadata.text || '';
    const documentName = metadata.filename || `qdrant_doc_${pointId}`;
    
    if (!textContent) {
      console.log(`Skipping ${pointId} - no text content found in payload`);
      continue;
    }
    
    const data = {
      name: documentName,
      content: textContent,
      // Optional: include additional metadata
      metadata: metadata
    };
    
    try {
      const response = await axios.post(
        `${baseUrl}/vector-stores/${vectorStoreId}/documents/text`,
        data,
        { headers }
      );
      
      if (response.status === 201) {
        console.log(`Document '${documentName}' uploaded successfully!`);
        results.push({ id: pointId, success: true });
      } else {
        console.error(`Error uploading document ${documentName}: ${response.statusText}`);
        results.push({ id: pointId, success: false, error: response.statusText });
      }
    } catch (error) {
      console.error(`Error uploading document ${documentName}: ${error.message}`);
      results.push({ id: pointId, success: false, error: error.message });
    }
    
    // Add a small delay to prevent rate limiting
    await new Promise(resolve => setTimeout(resolve, 100));
  }
  
  return results;
}

// Main migration function
async function migrateQdrantToRememberizer() {
  try {
    console.log('Starting migration from Qdrant to Rememberizer...');
    
    let offset = null;
    let totalMigrated = 0;
    
    do {
      const { points, nextOffset } = await fetchPointsFromQdrant(
        qdrantCollectionName, 
        BATCH_SIZE, 
        offset
      );
      
      offset = nextOffset;
      
      if (points.length === 0) {
        break;
      }
      
      console.log(`Fetched ${points.length} points from Qdrant`);
      
      const results = await uploadToRememberizer(points);
      const successCount = results.filter(r => r.success).length;
      
      totalMigrated += successCount;
      console.log(`Progress: ${totalMigrated} points migrated successfully`);
      
    } while (offset !== null);
    
    console.log(`Migration complete! ${totalMigrated} total points migrated to Rememberizer`);
    
  } catch (error) {
    console.error('Migration failed:', error);
  }
}

// Run the migration
// migrateQdrantToRememberizer();

Migrating from Supabase pgvector

If you're already using Supabase with pgvector, the migration to Rememberizer is particularly straightforward since both use PostgreSQL with the pgvector extension.

Python

import psycopg2
import requests
import json
import time
import os
from dotenv import load_dotenv

# Load environment variables
load_dotenv()

# Supabase PostgreSQL configuration
SUPABASE_DB_HOST = os.getenv("SUPABASE_DB_HOST")
SUPABASE_DB_PORT = os.getenv("SUPABASE_DB_PORT", "5432")
SUPABASE_DB_NAME = os.getenv("SUPABASE_DB_NAME")
SUPABASE_DB_USER = os.getenv("SUPABASE_DB_USER")
SUPABASE_DB_PASSWORD = os.getenv("SUPABASE_DB_PASSWORD")
SUPABASE_VECTOR_TABLE = os.getenv("SUPABASE_VECTOR_TABLE", "documents")

# Rememberizer configuration
REMEMBERIZER_API_KEY = os.getenv("REMEMBERIZER_API_KEY")
VECTOR_STORE_ID = os.getenv("VECTOR_STORE_ID")  # e.g., "vs_abc123"
BASE_URL = "https://api.rememberizer.ai/api/v1"

# Batch size for processing
BATCH_SIZE = 100

# Connect to Supabase PostgreSQL
def connect_to_supabase():
    try:
        conn = psycopg2.connect(
            host=SUPABASE_DB_HOST,
            port=SUPABASE_DB_PORT,
            dbname=SUPABASE_DB_NAME,
            user=SUPABASE_DB_USER,
            password=SUPABASE_DB_PASSWORD
        )
        return conn
    except Exception as e:
        print(f"Error connecting to Supabase PostgreSQL: {e}")
        return None

# Fetch documents from Supabase pgvector
def fetch_documents_from_supabase(conn, batch_size, offset=0):
    try:
        cursor = conn.cursor()
        
        # Adjust this query based on your table structure
        query = f"""
        SELECT id, content, metadata, embedding
        FROM {SUPABASE_VECTOR_TABLE}
        ORDER BY id
        LIMIT %s OFFSET %s
        """
        
        cursor.execute(query, (batch_size, offset))
        documents = cursor.fetchall()
        cursor.close()
        
        return documents
    except Exception as e:
        print(f"Error fetching documents from Supabase: {e}")
        return []

# Upload documents to Rememberizer
def upload_to_rememberizer(documents):
    headers = {
        "x-api-key": REMEMBERIZER_API_KEY,
        "Content-Type": "application/json"
    }
    
    results = []
    
    for doc in documents:
        doc_id, content, metadata, embedding = doc
        
        # Parse metadata if it's stored as JSON string
        if isinstance(metadata, str):
            try:
                metadata = json.loads(metadata)
            except:
                metadata = {}
        elif metadata is None:
            metadata = {}
        
        document_name = metadata.get("filename", f"supabase_doc_{doc_id}")
        
        if not content:
            print(f"Skipping {doc_id} - no content found")
            continue
            
        data = {
            "name": document_name,
            "content": content,
            "metadata": metadata
        }
        
        try:
            response = requests.post(
                f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/text",
                headers=headers,
                json=data
            )
            
            if response.status_code == 201:
                print(f"Document '{document_name}' uploaded successfully!")
                results.append({"id": doc_id, "success": True})
            else:
                print(f"Error uploading document {document_name}: {response.text}")
                results.append({"id": doc_id, "success": False, "error": response.text})
        except Exception as e:
            print(f"Exception uploading document {document_name}: {str(e)}")
            results.append({"id": doc_id, "success": False, "error": str(e)})
        
        # Add a small delay to prevent rate limiting
        time.sleep(0.1)
    
    return results

# Main migration function
def migrate_supabase_to_rememberizer():
    conn = connect_to_supabase()
    if not conn:
        print("Failed to connect to Supabase. Aborting migration.")
        return
    
    offset = 0
    total_migrated = 0
    
    print("Starting migration from Supabase pgvector to Rememberizer...")
    
    try:
        while True:
            documents = fetch_documents_from_supabase(conn, BATCH_SIZE, offset)
            
            if not documents:
                break
                
            print(f"Fetched {len(documents)} documents from Supabase")
            
            results = upload_to_rememberizer(documents)
            success_count = sum(1 for r in results if r.get("success", False))
            
            total_migrated += success_count
            print(f"Progress: {total_migrated} documents migrated successfully")
            
            offset += BATCH_SIZE
            
    finally:
        conn.close()
    
    print(f"Migration complete! {total_migrated} total documents migrated to Rememberizer")

# Run the migration
# migrate_supabase_to_rememberizer()
JavaScript

const { Pool } = require('pg');
const axios = require('axios');
require('dotenv').config();

// Supabase PostgreSQL configuration
const supabasePool = new Pool({
  host: process.env.SUPABASE_DB_HOST,
  port: process.env.SUPABASE_DB_PORT || 5432,
  database: process.env.SUPABASE_DB_NAME,
  user: process.env.SUPABASE_DB_USER,
  password: process.env.SUPABASE_DB_PASSWORD,
  ssl: {
    rejectUnauthorized: false
  }
});

const supabaseVectorTable = process.env.SUPABASE_VECTOR_TABLE || 'documents';

// Rememberizer configuration
const rememberizerApiKey = process.env.REMEMBERIZER_API_KEY;
const vectorStoreId = process.env.VECTOR_STORE_ID; // e.g., "vs_abc123"
const baseUrl = 'https://api.rememberizer.ai/api/v1';

// Batch size configuration
const BATCH_SIZE = 100;

// Fetch documents from Supabase pgvector
async function fetchDocumentsFromSupabase(batchSize, offset = 0) {
  try {
    // Adjust this query based on your table structure
    const query = `
      SELECT id, content, metadata, embedding
      FROM ${supabaseVectorTable}
      ORDER BY id
      LIMIT $1 OFFSET $2
    `;
    
    const result = await supabasePool.query(query, [batchSize, offset]);
    return result.rows;
  } catch (error) {
    console.error(`Error fetching documents from Supabase: ${error.message}`);
    return [];
  }
}

// Upload documents to Rememberizer
async function uploadToRememberizer(documents) {
  const headers = {
    'x-api-key': rememberizerApiKey,
    'Content-Type': 'application/json'
  };
  
  const results = [];
  
  for (const doc of documents) {
    // Parse metadata if it's stored as JSON string
    let metadata = doc.metadata;
    if (typeof metadata === 'string') {
      try {
        metadata = JSON.parse(metadata);
      } catch (e) {
        metadata = {};
      }
    } else if (metadata === null) {
      metadata = {};
    }
    
    const documentName = metadata.filename || `supabase_doc_${doc.id}`;
    
    if (!doc.content) {
      console.log(`Skipping ${doc.id} - no content found`);
      continue;
    }
    
    const data = {
      name: documentName,
      content: doc.content,
      metadata: metadata
    };
    
    try {
      const response = await axios.post(
        `${baseUrl}/vector-stores/${vectorStoreId}/documents/text`,
        data,
        { headers }
      );
      
      if (response.status === 201) {
        console.log(`Document '${documentName}' uploaded successfully!`);
        results.push({ id: doc.id, success: true });
      } else {
        console.error(`Error uploading document ${documentName}: ${response.statusText}`);
        results.push({ id: doc.id, success: false, error: response.statusText });
      }
    } catch (error) {
      console.error(`Error uploading document ${documentName}: ${error.message}`);
      results.push({ id: doc.id, success: false, error: error.message });
    }
    
    // Add a small delay to prevent rate limiting
    await new Promise(resolve => setTimeout(resolve, 100));
  }
  
  return results;
}

// Main migration function
async function migrateSupabaseToRememberizer() {
  try {
    console.log('Starting migration from Supabase pgvector to Rememberizer...');
    
    let offset = 0;
    let totalMigrated = 0;
    
    while (true) {
      const documents = await fetchDocumentsFromSupabase(BATCH_SIZE, offset);
      
      if (documents.length === 0) {
        break;
      }
      
      console.log(`Fetched ${documents.length} documents from Supabase`);
      
      const results = await uploadToRememberizer(documents);
      const successCount = results.filter(r => r.success).length;
      
      totalMigrated += successCount;
      console.log(`Progress: ${totalMigrated} documents migrated successfully`);
      
      offset += BATCH_SIZE;
    }
    
    console.log(`Migration complete! ${totalMigrated} total documents migrated to Rememberizer`);
    
  } catch (error) {
    console.error('Migration failed:', error);
  } finally {
    await supabasePool.end();
  }
}

// Run the migration
// migrateSupabaseToRememberizer();

Migration Best Practices

Follow these recommendations for a successful migration:

  1. Plan Ahead:

    • Estimate the data volume and time required for migration

    • Schedule migration during low-traffic periods

    • Increase disk space before starting large migrations

  2. Test First:

    • Create a test vector store in Rememberizer

    • Migrate a small subset of data (100-1000 vectors)

    • Verify search functionality with key queries

  3. Data Validation:

    • Compare document counts before and after migration

    • Run benchmark queries to ensure similar results

    • Validate that metadata is correctly preserved

  4. Optimize for Performance:

    • Use batch operations for efficiency

    • Consider geographic colocation of source and target databases

    • Monitor API rate limits and adjust batch sizes accordingly

  5. Post-Migration Steps:

    • Verify index creation in Rememberizer

    • Update application configurations to point to new vector store

    • Keep source database as backup until migration is verified
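
For the data-validation step, a simple spot check is to run a handful of benchmark queries against the new vector store and confirm that the expected documents come back. A minimal sketch, assuming the response shape shown in the Code Examples above; the queries and IDs are placeholders.

import requests

API_KEY = "your_rememberizer_api_key"
VECTOR_STORE_ID = "vs_abc123"
BASE_URL = "https://api.rememberizer.ai/api/v1"

benchmark_queries = [
    "What is our refund policy?",
    "How is customer data encrypted?",
]

for query in benchmark_queries:
    response = requests.get(
        f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/search",
        headers={"x-api-key": API_KEY},
        params={"q": query, "n": 3},
    )
    response.raise_for_status()
    matches = response.json().get("matched_chunks", [])
    print(f"'{query}' -> {len(matches)} matches")
    for match in matches:
        print(f"  {match['document']['name']} (distance: {match['distance']})")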

For detailed API reference and endpoint documentation, visit the Vector Store Documentation page.


Make sure to handle your API keys securely and follow best practices for API key management.
