
What are Vector Embeddings and Vector Databases?

Why Rememberizer is more than just a database or keyword search engine



Rememberizer uses vector embeddings in vector databases to enable searches for semantic similarity within user knowledge sources. This is a fundamentally more advanced and nuanced form of information retrieval than simply looking for keywords in content through a traditional search engine or database.
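
To make the contrast concrete, here is a minimal sketch that uses hand-made four-dimensional vectors in place of real embeddings (production models produce hundreds or thousands of dimensions; the phrases and numbers here are purely illustrative):

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hand-made stand-ins for embeddings; a real model would generate these.
embeddings = {
    "dog care":                  [0.9, 0.8, 0.1, 0.0],
    "canine health maintenance": [0.8, 0.9, 0.2, 0.1],
    "quarterly tax filing":      [0.0, 0.1, 0.9, 0.8],
}

query = "dog care"

# Keyword search: an exact-substring match misses the related document.
keyword_hits = [doc for doc in embeddings if doc != query and "dog" in doc]
print(keyword_hits)  # []

# Vector search: rank documents by cosine similarity to the query embedding.
ranked = sorted(
    (doc for doc in embeddings if doc != query),
    key=lambda doc: cosine(embeddings[query], embeddings[doc]),
    reverse=True,
)
print(ranked[0])  # 'canine health maintenance'
```

The keyword search returns nothing because "canine health maintenance" never contains the literal string "dog", while the similarity ranking surfaces it first because its vector points in nearly the same direction as the query's.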

How Rememberizer Uses Vector Embeddings

In their most advanced form (as used by Rememberizer), vector embeddings are created by language models with architectures similar to the Large Language Models (LLMs) that underpin OpenAI's GPT models and the ChatGPT service, as well as models and services from Google (Gemini), Anthropic (Claude), Meta (LLaMA), and others.

Understanding Vector Embeddings

What does a vector embedding look like? Consider a point (x, y) in two dimensions. An arrow drawn from the origin to that point has both a length and a direction—in other words, it is a vector in two dimensions.

In the context of Rememberizer, a vector embedding is typically a list of several hundred to a few thousand numbers (commonly 768, 1024, or 1536) representing a point in a high-dimensional space. This list of numbers is produced by a Transformer model and captures the meaning of a phrase such as "A bolt of lightning out of the blue." This is fundamentally the same underlying representation of meaning used inside models like GPT-4, which is why a good vector embedding supports the same sophisticated understanding we see in modern AI language models.
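
As a data structure, an embedding is nothing more than a long list of floats. The toy function below hashes character trigrams into a 256-dimensional vector. This captures only surface overlap, not meaning the way a Transformer does, but it makes the shape of the data—and the idea that "nearby vectors mean similar content"—concrete (the function, dimensions, and phrases are all illustrative, not how Rememberizer computes embeddings):

```python
import hashlib
import math

DIM = 256  # real embedding models commonly use 768, 1024, or 1536 dimensions

def crude_embedding(text: str) -> list[float]:
    """Toy embedding: hash each character trigram into one of DIM buckets,
    then normalize to unit length. Surface overlap only — no semantics."""
    vec = [0.0] * DIM
    t = text.lower()
    for i in range(len(t) - 2):
        h = int(hashlib.md5(t[i:i + 3].encode()).hexdigest(), 16)
        vec[h % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a, b):
    # Vectors are already unit-normalized, so cosine similarity is the dot product.
    return sum(x * y for x, y in zip(a, b))

v = crude_embedding("A bolt of lightning out of the blue")
print(len(v))  # 256 — an embedding is just a fixed-length list of floats

# Overlapping phrasing yields a nearer vector than an unrelated phrase.
print(cosine(v, crude_embedding("lightning struck out of a clear sky")))
print(cosine(v, crude_embedding("quarterly tax filing deadline")))
```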

Beyond Text: Multimodal Embeddings

Vector embeddings can represent more than just text—they can also encode other types of data such as images or sound. With properly trained models, you can compare across media types, allowing a vector embedding of text to be compared to an image, or vice versa.

Currently, Rememberizer enables searches within the text component of user documents and knowledge. Text-to-image and image-to-text search capabilities are on Rememberizer's roadmap for future development.
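
The cross-modal idea can be sketched with stand-in vectors. A multimodal model (CLIP is a well-known example, not necessarily what Rememberizer uses) maps text and images into one shared space, so comparing a caption to a photo is plain vector similarity. The vectors below are hand-made illustrations of that shared space:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Hand-made stand-ins for vectors a multimodal model would produce
# in one shared space: one from a caption, two from images.
text_vec      = [0.9, 0.1, 0.3]   # embedding of the text "a dog playing fetch"
dog_image_vec = [0.8, 0.2, 0.4]   # embedding of a photo of a dog
car_image_vec = [0.1, 0.9, 0.2]   # embedding of a photo of a car

# Because all three live in the same space, text-to-image comparison
# reduces to ordinary vector similarity:
print(cosine(text_vec, dog_image_vec) > cosine(text_vec, car_image_vec))  # True
```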

Real-World Applications

Major technology companies leverage vector embeddings in their products:

  • Google uses vector embeddings to power both its text search (text-to-text) and image search (text-to-image) capabilities

  • Meta (Facebook) has implemented embeddings for its social network search

  • Snapchat utilizes vector embeddings to understand context and serve targeted advertising

How Rememberizer's Vector Search Differs from Keyword Search

Keyword search finds exact matches or predetermined synonyms. In contrast, Rememberizer's vector search finds content that's conceptually related, even when different terminology is used. For example:

  • A keyword search for "dog care" might miss a relevant document about "canine health maintenance"

  • Rememberizer's vector search would recognize these concepts as semantically similar and return both

This capability makes Rememberizer particularly powerful for retrieving relevant information from diverse knowledge sources.

Coming soon: Vector Search Process Visualization

This diagram will illustrate the complete semantic search workflow in Rememberizer:

  • Document chunking and preprocessing

  • Vector embedding generation process

  • Storage in vector database

  • Search query embedding

  • Similarity matching calculation

  • Side-by-side comparison with traditional keyword search

Technical Resources

To deeply understand how vector embeddings and vector databases work:

  • Start with the overview from Hugging Face

  • Pinecone (a vector database service) offers a good introduction to vector embeddings

  • Meta's FAISS library: "FAISS: A Library for Efficient Similarity Search and Clustering of Dense Vectors" by Johnson, Douze, and Jégou (2017) provides comprehensive insights into efficient vector similarity search (see the GitHub repository)

For technical implementation details and developer-oriented guidance on using vector stores with Rememberizer, see Vector Stores.

The Foundation of Modern AI

The technologies behind vector embeddings have evolved significantly over time:

  • The 2017 paper "Attention Is All You Need" introduced the Transformer architecture that powers modern LLMs and advanced embedding models

  • "Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality" (1998, 2010) established the theory for efficient similarity search in high-dimensional spaces

  • BERT (2018) demonstrated the power of bidirectional training for language understanding tasks

  • Earlier methods like GloVe (2014) and Word2Vec (2013) laid the groundwork for neural word embeddings

Google researchers were behind the original Transformer architecture described in "Attention Is All You Need", though many organizations have since built upon and extended this foundational work.

One remarkable aspect of Transformer-based models is their scaling properties—as they train on more data with more parameters, their understanding and capabilities improve dramatically. This scaling behavior was observed with models like GPT-2 and has driven the rapid advancement of AI capabilities.

This makes vector embeddings a natural choice for discovering relevant knowledge to include in the context of AI model prompts. The two technologies are complementary and conceptually related, which is why most providers of LLMs as a service also offer vector embeddings as a service (for example, Together AI's embeddings endpoint or OpenAI's text and code embeddings).