Glossary

A comprehensive glossary of terms and concepts used in Rememberizer

This glossary provides definitions for key terms and concepts used throughout the Rememberizer documentation. Use it as a reference when you encounter unfamiliar terminology.

Note: This glossary represents the standardized terminology for Rememberizer. While you may encounter slight variations in the documentation, the terms and definitions provided here should be considered the canonical reference.

A

API Key: A secure authentication token used to access Rememberizer's API endpoints programmatically. API keys are primarily used for vector store access and common knowledge integration.

Authorized Request Origin: A security setting that specifies which domains can make API requests to Rememberizer, limiting potential cross-site request forgery attacks.

B

Batch Operations: Processing multiple items (searches, uploads, etc.) in a single request to improve efficiency. Rememberizer supports batch operations for high-volume workloads.

Batch Size: The number of items processed together during operations like migration, search, or document ingestion, affecting performance and resource usage.
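As a rough illustration of how a batch size splits a workload, the sketch below groups a list of queries into fixed-size batches. The helper name `batched` and the sample queries are hypothetical and are not part of the Rememberizer API; the code only shows the grouping step, not how a batch is sent.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")


def batched(items: Sequence[T], batch_size: int) -> Iterator[Sequence[T]]:
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical workload: five search queries processed two at a time.
queries = ["q1", "q2", "q3", "q4", "q5"]
batches = [list(batch) for batch in batched(queries, batch_size=2)]
```

A larger batch size means fewer round trips but more memory and work per request, which is the trade-off the definition above refers to.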

C

Chunking: The process of dividing documents into optimally sized pieces (typically 512-2048 bytes) with overlapping boundaries to preserve context during vector searches.
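The idea of overlapping chunks can be sketched as follows. This is a minimal illustration, not Rememberizer's actual chunker: it counts characters rather than bytes for simplicity, and the function name `chunk_text` and its default sizes are assumptions chosen for the example.

```python
def chunk_text(text: str, chunk_size: int = 1024, overlap: int = 128) -> list[str]:
    """Split `text` into chunks of up to `chunk_size` characters.

    Each chunk repeats the last `overlap` characters of the previous one,
    so content near a boundary appears in both neighbouring chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


# Small sizes make the overlap visible: "d" and "g" each appear twice.
example_chunks = chunk_text("abcdefghij", chunk_size=4, overlap=1)
```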

Client ID: A public identifier issued to third-party applications that enables OAuth2 authorization with Rememberizer.

Client Secret: A private key issued with a Client ID that must be kept secure and is used to authenticate the application during OAuth2 flows.

Collection-based Organization: The way vector stores are organized in Rememberizer, with each store having its own isolated collection for data management.

Common Knowledge: Information published by users that other users or applications can access, creating a shared knowledge resource. Common Knowledge is based on a Memento and can be accessed via API. Sometimes referred to as "Shared Knowledge" in the user interface.

Context Windows: The surrounding content included with matched chunks in search results, controlled by prev_chunks and next_chunks parameters.
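To make the meaning of prev_chunks and next_chunks concrete, the sketch below selects a matched chunk together with its neighbours from an in-memory list. It is a local illustration of the parameters' effect under the assumption that chunks are stored in document order; the function `with_context` is hypothetical and does not call the Rememberizer API.

```python
def with_context(chunks: list[str], match_index: int,
                 prev_chunks: int = 0, next_chunks: int = 0) -> list[str]:
    """Return the matched chunk plus up to `prev_chunks` before it
    and up to `next_chunks` after it, clipped at the document edges."""
    if not 0 <= match_index < len(chunks):
        raise IndexError("match_index is outside the chunk list")
    start = max(0, match_index - prev_chunks)
    end = min(len(chunks), match_index + next_chunks + 1)
    return chunks[start:end]


document_chunks = ["c0", "c1", "c2", "c3", "c4"]
# Match on "c2" with one chunk before and two after.
window = with_context(document_chunks, 2, prev_chunks=1, next_chunks=2)
```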

Cosine Similarity: A measure of similarity between vectors calculated by finding the cosine of the angle between them, used as the default search metric in Rememberizer.

E

Embedding Model: An AI model that generates vector embeddings from text. Rememberizer supports several embedding models, including OpenAI's text-embedding-3-large and text-embedding-3-small.

Enterprise Integration Patterns: Standardized approaches for implementing Rememberizer in large-scale enterprise environments, including architectural designs for security, scaling, and compliance.

G

Global Settings: System-wide configurations for controlling default permissions and behaviors across all connected apps in Rememberizer.

H

HNSW (Hierarchical Navigable Small World): An indexing algorithm offering better accuracy for large datasets at the cost of higher memory requirements, available as an indexing option in Rememberizer Vector Stores.

I

Indexing Algorithm: The method used to organize vectors for efficient retrieval. Rememberizer supports IVFFLAT (default) and HNSW algorithms.

IVFFLAT: An indexing algorithm that provides a good balance of search speed and accuracy for vector databases, used as the default in Rememberizer.

K

Data Source (also "Knowledge Source"): The origins of data in Rememberizer, such as integrations with platforms like Google Drive, Slack, Dropbox, and Gmail. Also referred to as "Integration" in some contexts.

L

LangChain Integration: Functionality that allows Rememberizer to be used as a retriever in LangChain applications, supporting RAG (Retrieval Augmented Generation) systems.

M

Memento: A filtering mechanism that controls which knowledge is shared with third-party applications, allowing users to selectively share specific files, documents, or content groups. Sometimes referred to as "Memento Filter" in the user interface.

Memory Integration: A feature enabling apps to store valuable information in Rememberizer for later retrieval, with configurable read/write permissions. Also referred to as "Shared Memory" in some contexts.

O

OAuth2 Authentication: The standard authorization protocol used for third-party apps to access Rememberizer data with user consent, providing secure delegated access. Sometimes shortened to "OAuth" in the documentation.

R

RAG (Retrieval Augmented Generation): A technique that combines retrieval systems (like Rememberizer) with generative models to provide more accurate, grounded responses based on specific knowledge.

Read Own/Write Own: A permission level where apps can only access and modify their own memory data in Rememberizer.

Read All/Write Own: A permission level where apps can read memory data from all apps but can only modify their own memory data.

Reindexing: The process of rebuilding vector indexes after significant changes to improve search performance in Rememberizer Vector Stores.

RememberizerRetriever: The specific LangChain retriever class that interfaces with Rememberizer's semantic search capabilities.

Rememberizer GPT: A custom GPT application that integrates with Rememberizer's API to provide access to personal knowledge within ChatGPT.

Rememberizer Vector Store: A PostgreSQL-based vector database service that uses the pgvector extension and handles chunking, vectorizing, and storing text data. The terms "Vector Store" and "Vector Database" are used interchangeably in Rememberizer documentation, with "Vector Store" being the preferred term.

S

Search Metric: The mathematical method used to calculate similarity between vectors. Rememberizer supports cosine similarity (default), inner product, and L2 (Euclidean) distance. The terms "distance," "similarity," and "matching" are sometimes used interchangeably to refer to how closely vectors relate to each other.
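The three metrics can be compared on small vectors with the plain-Python sketch below. The function names are chosen for this example only; the formulas are the standard definitions, and the code does not reflect how Rememberizer computes them internally.

```python
import math


def _check_lengths(a: list[float], b: list[float]) -> None:
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")


def inner_product(a: list[float], b: list[float]) -> float:
    """Sum of element-wise products; larger means more similar."""
    _check_lengths(a, b)
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a: list[float], b: list[float]) -> float:
    """Euclidean distance; smaller means more similar."""
    _check_lengths(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between the vectors; ignores their lengths."""
    norm = math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return inner_product(a, b) / norm


# Two vectors pointing in the same direction but with different lengths.
u, v = [3.0, 4.0], [6.0, 8.0]
cos_uv = cosine_similarity(u, v)
ip_uv = inner_product(u, v)
l2_uv = l2_distance(u, v)
```

In this example the cosine similarity treats the vectors as identical in direction, while the L2 distance is nonzero because their magnitudes differ. This is one reason the choice of metric matters when embeddings are not normalized.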

Semantic Search: Search functionality that finds content based on meaning rather than just keywords, allowing for conceptually related results even when terminology differs.

Shared Memory: A system that allows third-party apps to store and access data in a user's Rememberizer account, so that the data persists across multiple applications. See also Memory Integration.

V

Vector Database: A specialized database optimized for storing and retrieving vector embeddings efficiently, enabling semantic search capabilities.

Vector Dimension: The size of vector embeddings (typically 768-1536 numbers), affecting the detail and nuance captured in the semantic representation.

Vector Embeddings: Numerical representations (lists of numbers whose length is the vector dimension) that capture the semantic meaning of text, allowing for similarity comparisons beyond keyword matching. Often referred to simply as "Embeddings" in technical contexts.

API Header Conventions

When calling Rememberizer APIs, use the following header conventions:

  • Authorization Header: Authorization: Bearer YOUR_JWT_TOKEN

  • API Key Header: X-API-Key: YOUR_API_KEY (capitalized as shown)

  • Content-Type Header: Content-Type: application/json
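The conventions above can be collected into a small helper that builds a header dictionary. This is a sketch only: the function name `build_headers` is hypothetical, and the assumption that a request carries whichever credential is available (a JWT, an API key, or both) is made for illustration rather than taken from the API reference.

```python
from typing import Dict, Optional


def build_headers(jwt_token: Optional[str] = None,
                  api_key: Optional[str] = None) -> Dict[str, str]:
    """Build request headers using the header names listed above."""
    if jwt_token is None and api_key is None:
        raise ValueError("provide a JWT token, an API key, or both")

    headers = {"Content-Type": "application/json"}
    if jwt_token is not None:
        # Header value format: "Bearer <token>"
        headers["Authorization"] = f"Bearer {jwt_token}"
    if api_key is not None:
        # Header name capitalized exactly as documented.
        headers["X-API-Key"] = api_key
    return headers


api_key_headers = build_headers(api_key="YOUR_API_KEY")
jwt_headers = build_headers(jwt_token="YOUR_JWT_TOKEN")
```

The resulting dictionary can be passed to any HTTP client that accepts headers as a mapping.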

