Rememberizer LLM Ready Documentation
Generated at 2025-07-10 20:31:45 PDT. Available as raw content at Rememberizer llms-full.txt.
This document provides a comprehensive, consolidated reference of Rememberizer's documentation, optimized for large language model (LLM) consumption. It combines various documentation sources into a single, easily accessible format, facilitating efficient information retrieval and processing by AI systems.
==> SUMMARY.md <==
# Summary
* [Why Rememberizer?](README.md)
* [Background](background/README.md)
* [What are Vector Embeddings and Vector Databases?](background/what-are-vector-embeddings-and-vector-databases.md)
* [Glossary](background/glossary.md)
* [Standardized Terminology](background/standardized-terminology.md)
## Personal Use
* [Getting Started](personal/README.md)
* [Search your knowledge](personal/search-your-knowledge.md)
* [Mementos Filter Access](personal/mementos-filter-access.md)
* [Common knowledge](personal/common-knowledge.md)
* [Manage your embedded knowledge](personal/manage-your-embedded-knowledge.md)
## Integrations
* [Rememberizer App](personal/rememberizer-app.md)
* [Rememberizer Slack integration](personal/rememberizer-slack-integration.md)
* [Rememberizer Google Drive integration](personal/rememberizer-google-drive-integration.md)
* [Rememberizer Dropbox integration](personal/rememberizer-dropbox-integration.md)
* [Rememberizer Gmail integration](personal/rememberizer-gmail-integration.md)
* [Rememberizer Memory integration](personal/rememberizer-memory-integration.md)
* [Rememberizer MCP Servers](personal/rememberizer-mcp-servers.md)
* [Manage third-party apps](personal/manage-third-party-apps.md)
## Developer Resources
* [Developer Overview](developer/README.md)
## Integration Options
* [Registering and using API Keys](developer/registering-and-using-api-keys.md)
* [Registering Rememberizer apps](developer/registering-rememberizer-apps.md)
* [Authorizing Rememberizer apps](developer/authorizing-rememberizer-apps.md)
* [Creating a Rememberizer GPT](developer/creating-a-rememberizer-gpt.md)
* [LangChain integration](developer/langchain-integration.md)
* [Vector Stores](developer/vector-stores.md)
* [Talk-to-Slack the Sample Web App](developer/talk-to-slack-the-sample-web-app.md)
## Enterprise Integration
* [Enterprise Integration Patterns](developer/enterprise-integration-patterns.md)
## API Reference
* [API Documentation Home](developer/api-docs/README.md)
* [Authentication](developer/api-docs/authentication.md)
### Core APIs
* [Search for documents by semantic similarity](developer/api-docs/search-for-documents-by-semantic-similarity.md)
* [Retrieve documents](developer/api-docs/retrieve-documents.md)
* [Retrieve document contents](developer/api-docs/retrieve-document-contents.md)
* [Retrieve Slack content](developer/api-docs/retrieve-slack-content.md)
* [Memorize content to Rememberizer](developer/api-docs/memorize-content-to-rememberizer.md)
### Account & Configuration
* [Retrieve current user account details](developer/api-docs/retrieve-current-user-account-details.md)
* [List available data source integrations](developer/api-docs/list-available-data-source-integrations.md)
* [Mementos](developer/api-docs/mementos.md)
* [Get all added public knowledge](developer/api-docs/get-all-added-public-knowledge.md)
### Vector Store APIs
* [Vector Store Documentation](developer/api-docs/vector-store/README.md)
* [Get vector store information](developer/api-docs/vector-store/get-vector-stores-information.md)
* [Get a list of documents in a Vector Store](developer/api-docs/vector-store/get-a-list-of-documents-in-a-vector-store.md)
* [Get document information](developer/api-docs/vector-store/get-document-information.md)
* [Add new text document to a Vector Store](developer/api-docs/vector-store/add-new-text-document-to-a-vector-store.md)
* [Upload files to a Vector Store](developer/api-docs/vector-store/upload-files-to-a-vector-store.md)
* [Update file content in a Vector Store](developer/api-docs/vector-store/update-file-content-in-a-vector-store.md)
* [Remove a document in Vector Store](developer/api-docs/vector-store/remove-a-document-in-vector-store.md)
* [Search for Vector Store documents by semantic similarity](developer/api-docs/vector-store/search-for-vector-store-documents-by-semantic-similarity.md)
## Additional Resources
* [Notices](notices/README.md)
* [Terms of Use](notices/terms-of-use.md)
* [Privacy Policy](notices/privacy-policy.md)
* [B2B](notices/b2b/README.md)
* [About Reddit Agent](notices/b2b/about-reddit-agent.md)
## Releases
* [Release Notes Home](releases/README.md)
### 2025 Releases
* [Jul 11th, 2025](releases/jul-11th-2025.md)
* [Jul 4th, 2025](releases/jul-4th-2025.md)
* [Jun 27th, 2025](releases/jun-27th-2025.md)
* [Jun 20th, 2025](releases/jun-20th-2025.md)
* [Jun 6th, 2025](releases/jun-6th-2025.md)
* [May 30th, 2025](releases/may-30th-2025.md)
* [May 23rd, 2025](releases/may-23rd-2025.md)
* [Apr 25th, 2025](releases/apr-25th-2025.md)
* [Apr 18th, 2025](releases/apr-18th-2025.md)
* [Apr 11th, 2025](releases/apr-11th-2025.md)
* [Apr 4th, 2025](releases/apr-4th-2025.md)
* [Mar 28th, 2025](releases/mar-28th-2025.md)
* [Mar 21st, 2025](releases/mar-21st-2025.md)
* [Mar 14th, 2025](releases/mar-14th-2025.md)
* [Jan 17th, 2025](releases/jan-17th-2025.md)
### 2024 Releases
#### December 2024
* [Dec 27th, 2024](releases/dec-27th-2024.md)
* [Dec 20th, 2024](releases/dec-20th-2024.md)
* [Dec 13th, 2024](releases/dec-13th-2024.md)
* [Dec 6th, 2024](releases/dec-6th-2024.md)
#### November 2024
* [Nov 29th, 2024](releases/nov-29th-2024.md)
* [Nov 22nd, 2024](releases/nov-22nd-2024.md)
* [Nov 15th, 2024](releases/nov-15th-2024.md)
* [Nov 8th, 2024](releases/nov-8th-2024.md)
* [Nov 1st, 2024](releases/nov-1st-2024.md)
#### October 2024
* [Oct 25th, 2024](releases/oct-25th-2024.md)
* [Oct 18th, 2024](releases/oct-18th-2024.md)
* [Oct 11th, 2024](releases/oct-11th-2024.md)
* [Oct 4th, 2024](releases/oct-4th-2024.md)
#### September 2024
* [Sep 27th, 2024](releases/sep-27th-2024.md)
* [Sep 20th, 2024](releases/sep-20th-2024.md)
* [Sep 13th, 2024](releases/sep-13th-2024.md)
#### August 2024
* [Aug 16th, 2024](releases/aug-16th-2024.md)
* [Aug 9th, 2024](releases/aug-9th-2024.md)
* [Aug 2nd, 2024](releases/aug-2nd-2024.md)
#### July 2024
* [Jul 26th, 2024](releases/jul-26th-2024.md)
* [Jul 12th, 2024](releases/jul-12th-2024.md)
#### June 2024
* [Jun 28th, 2024](releases/jun-28th-2024.md)
* [Jun 14th, 2024](releases/jun-14th-2024.md)
#### May 2024
* [May 31st, 2024](releases/may-31st-2024.md)
* [May 17th, 2024](releases/may-17th-2024.md)
* [May 10th, 2024](releases/may-10th-2024.md)
#### April 2024
* [Apr 26th, 2024](releases/apr-26th-2024.md)
* [Apr 19th, 2024](releases/apr-19th-2024.md)
* [Apr 12th, 2024](releases/apr-12th-2024.md)
* [Apr 5th, 2024](releases/apr-5th-2024.md)
#### March 2024
* [Mar 25th, 2024](releases/mar-25th-2024.md)
* [Mar 18th, 2024](releases/mar-18th-2024.md)
* [Mar 11th, 2024](releases/mar-11th-2024.md)
* [Mar 4th, 2024](releases/mar-4th-2024.md)
#### February 2024
* [Feb 26th, 2024](releases/feb-26th-2024.md)
* [Feb 19th, 2024](releases/feb-19th-2024.md)
* [Feb 12th, 2024](releases/feb-12th-2024.md)
* [Feb 5th, 2024](releases/feb-5th-2024.md)
#### January 2024
* [Jan 29th, 2024](releases/jan-29th-2024.md)
* [Jan 22nd, 2024](releases/jan-22nd-2024.md)
* [Jan 15th, 2024](releases/jan-15th-2024.md)
## LLM Documentation
* [Rememberizer LLM Ready Documentation](rememberizer-llm-ready-documentation.md)
==> README.md <==
---
description: Introduction
---
# Why Rememberizer?
Generative AI apps work better when they have access to background information. They need to know what you know. A great way to achieve that is to give them access to relevant content from the documents, data and discussions you create and use. That is what Rememberizer does.
==> personal/rememberizer-slack-integration.md <==
---
description: >-
This guide will walk you through the process of integrating your Slack
workspace into Rememberizer as a knowledge source.
type: guide
last_updated: 2025-04-03
---
# Rememberizer Slack Integration
## Overview
The Slack integration allows you to connect your Slack workspace to Rememberizer, enabling AI applications to search and reference your team's Slack messages and shared files. This integration creates a searchable knowledge base from your conversations, announcements, questions, and decisions captured in Slack.
## Before You Begin
Before connecting Slack to Rememberizer, ensure you:
- Have a Rememberizer account
- Have access to a Slack workspace where you have permission to install apps
- Understand which channels contain knowledge you want to make searchable
- Consider any organizational data policies regarding third-party integrations
## Connection Process
### Step 1: Access the Knowledge Sources
1. Sign in to your Rememberizer account
2. Navigate to **Personal > Your Knowledge** tab, or visit [https://rememberizer.ai/personal/knowledge](https://rememberizer.ai/personal/knowledge)
3. You should see all available knowledge sources, including Slack
<figure><img src="../.gitbook/assets/slack_personal_knowledge.png" alt="Your Knowledge, ready to connect to Slack"><figcaption><p>Your Knowledge sources page with Slack option</p></figcaption></figure>
### Step 2: Initiate Slack Connection
1. Click the **"Connect"** button on the Slack knowledge source card
2. You will be redirected to Slack's authorization page
3. Select the Slack workspace you want to connect (if you belong to multiple workspaces)
<figure><img src="../.gitbook/assets/slack_oauth.png" alt="Slack OAuth screen"><figcaption><p>Slack OAuth authorization screen</p></figcaption></figure>
> **Note:** If you see a warning that this application is not authorized by Slack, it is because Rememberizer is designed to search Slack content from outside of Slack, which is not permitted under the [Slack App Directory Guidelines](https://api.slack.com/directory/guidelines). This does not affect the functionality or security of the integration.
### Step 3: Grant Permissions
1. Review the permissions Rememberizer is requesting:
- Read access to public channels
- Read access to private channels you're a member of
- Read access to message history
- Read access to files
2. Click **"Allow"** to install the Rememberizer Slack app to your workspace
### Step 4: Select Channels for Indexing
1. After successful authorization, you'll be redirected back to Rememberizer
2. A side panel will automatically open showing available channels
3. If the panel doesn't appear, click the **"Select"** button next to your Slack workspace
<figure><img src="../.gitbook/assets/slack_auth_redirect.png" alt="A COMPANY has been added as a knowledge source"><figcaption><p>Successful Slack workspace connection</p></figcaption></figure>
### Step 5: Choose Specific Channels
1. In the side panel, browse the list of available channels
2. Select checkboxes next to channels you want to include
3. You can filter by channel type (public/private) or search by name
4. Consider starting with just a few channels for faster initial processing
<figure><img src="../.gitbook/assets/slack_choose_knowledge.png" alt="Select channels to be embedded as knowledge"><figcaption><p>Select specific Slack channels to index</p></figcaption></figure>
### Step 6: Begin Processing
1. After selecting channels, click **"Save"** at the bottom of the panel
2. Rememberizer will begin downloading, processing, and embedding messages
3. You'll see a progress indicator as channels are processed
4. Initial processing may take several minutes to hours depending on the volume of messages
## How Slack Data is Processed
When you connect Slack to Rememberizer, the following occurs:
1. **Authentication**: Secure OAuth connection established with refresh token capability
2. **Channel Selection**: Only your selected channels are accessed
3. **Message Retrieval**: Messages and threaded replies are downloaded in batches
4. **Content Processing**:
- Messages are chunked into appropriate segments
- Vector embeddings are generated to capture semantic meaning
- Files shared in messages are processed based on supported formats
5. **Continuous Updates**: New messages are periodically synced (approximately every 6 hours)
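The chunking and embedding steps described above can be pictured with a minimal sketch. The chunk size, overlap, and embedding model below are illustrative assumptions for demonstration only, not Rememberizer's actual parameters or pipeline.

```python
# Illustrative sketch of chunking and embedding; chunk size, overlap, and the
# embedding model are assumptions for demonstration, not Rememberizer's pipeline.
from sentence_transformers import SentenceTransformer

def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split a message or file into overlapping character windows."""
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks

model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed model, for illustration only
message = "Decision: we will ship the Q3 release on Friday after the final review."
for chunk in chunk_text(message):
    vector = model.encode(chunk)  # numeric vector capturing the chunk's semantic meaning
    # Vectors like this are what semantic search compares against your query.
```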
## Data Refresh and Synchronization
Rememberizer automatically keeps your Slack knowledge up to date:
- **Incremental Updates**: Only new or changed messages are processed after initial indexing
- **Update Schedule**: Automatic synchronization occurs approximately every 6 hours
- **Thread Monitoring**: New replies in threads are detected and indexed
- **Manual Refresh**: Force an immediate update by clicking the "Refresh" icon next to your Slack connection
## Security and Privacy Considerations
The Slack integration includes several security measures:
- **OAuth Security**: Industry-standard authorization protocol with token encryption
- **Selective Access**: Only processes channels you explicitly select
- **Encrypted Storage**: All message content is encrypted before storage
- **No Message Alteration**: Read-only access to your Slack workspace (cannot post or modify messages)
- **Permission Scopes**: Limited to only the permissions needed for search functionality
- **Account Linking**: Connection is specific to your Rememberizer account only
## Troubleshooting Common Issues
### Authorization Failures
**Problem**: Unable to connect to Slack or authorization errors.
**Solutions**:
- Ensure you have permissions to install apps in your workspace
- Try disconnecting and reconnecting the integration
- Check if your organization uses Slack Enterprise Grid with app restrictions
### Missing Channels
**Problem**: Some channels don't appear in the selection panel.
**Solutions**:
- Verify you're a member of the channels you want to index
- For private channels, you must be a member to index them
- Refresh the channel list by clicking the refresh icon
### Processing Delays
**Problem**: Indexing is taking a very long time.
**Solutions**:
- Start with fewer channels and add more gradually
- Check if channels have an extremely large message history
- Verify your internet connection is stable
### Authentication Expiration
**Problem**: Integration stops working after some time.
**Solutions**:
- Reconnect the integration through the Knowledge page
- Check if the Slack app was removed from your workspace
- Verify your Rememberizer account is active
## Limitations and Considerations
- **Message History**: Up to 100,000 messages per channel can be indexed
- **File Types**: Supported files include PDFs, text documents, and spreadsheets
- **Private Channels**: Only private channels you're a member of can be indexed
- **Direct Messages**: DMs are not currently supported for privacy reasons
- **Enterprise Restrictions**: Some Slack Enterprise Grid features may affect integration
## What's Next?
After connecting Slack to Rememberizer:
1. Use [Mementos](mementos-filter-access.md) to control which AI tools can access your Slack knowledge
2. Combine Slack with other knowledge sources for comprehensive context
3. Try [searching your knowledge](https://rememberizer.ai/personal/search) through the web UI
4. Connect your knowledge to AI tools using GPT integration or the Rememberizer API
If you encounter any issues during setup or use, contact our support team for assistance.
==> personal/manage-third-party-apps.md <==
# Manage third-party apps
## Explore third-party apps and services
You can view and explore all third-party apps that connect with Rememberizer on the **App directory** page by following the instructions below.
* In the navigation bar, choose **Personal > Find an App**. You will then see the App directory page.
<figure><img src="../.gitbook/assets/navbar_browsing_app_dir.png" alt="Navigation bar browsing App Directory page"><figcaption><p>Navigation bar browsing App Directory page</p></figcaption></figure>
<figure><img src="../.gitbook/assets/app_dir_page.png" alt="App directory page"><figcaption><p>App directory page</p></figcaption></figure>
* Find the app you want to explore. You can do this by typing the name of the app in the search bar, optionally using the **filter and sort order** controls.
<figure><img src="../.gitbook/assets/search_app_dir_page.png" alt="Search bar with filter and sort order button"><figcaption><p>Search bar with filter and sort order button</p></figcaption></figure>
* Click the **name of the third-party app** or the **Explore** button to open the app.
<figure><img src="../.gitbook/assets/location_name_explore_button.png" alt="App's name and Explore button"><figcaption><p>App's name and Explore button</p></figcaption></figure>
* When you use the app, it will require authorization with Rememberizer. Technical details of the flow are available on the [authorizing-rememberizer-apps.md](../developer/authorizing-rememberizer-apps.md "mention") page. We will use the **Rememberizer GPT app** as an example of the authorization UI flow. After the first chat, the app will ask you to sign in to Rememberizer.
<figure><img src="../.gitbook/assets/RememberizerGPT_auth.png" alt="Sign in request from Rememberizer GPT app"><figcaption><p>Sign in request from Rememberizer GPT app</p></figcaption></figure>
* Click on the **Sign in** button. You will be redirected to the Authorization page.
<figure><img src="../.gitbook/assets/authorize_third_party_page.png" alt="Authoriztion page"><figcaption><p>Authoriztion page</p></figcaption></figure>
* You can modify the Memento and Memory that the app can view and use by clicking the **Change** button and selecting what you want.
> **Note:** For detailed information about Mementos, please visit the [mementos-filter-access.md](mementos-filter-access.md "mention") page.
> **Note:** For detailed information about Memory integration, please visit the [rememberizer-memory-integration.md](rememberizer-memory-integration.md "mention") page.
* Click **Authorize** to complete the process. You will then be redirected back to the app and can chat with it normally.
> **Note:** If you click the **Cancel** button, you will be redirected to the app's landing page, and the app will no longer appear on the **App directory** page; instead, it will appear on the **Your connected apps** page. See [#manage-your-connected-apps](manage-third-party-apps.md#manage-your-connected-apps "mention") below if you want to completely cancel the authorization process.
<figure><img src="../.gitbook/assets/success_auth_rememberizer_gpt.png" alt="Success connected account"><figcaption><p>Success connected account</p></figcaption></figure>
## Manage your connected apps
On the **App directory** page, choose **Your connected apps** to open that page.
<figure><img src="../.gitbook/assets/browse_your_connected_app.png" alt="browse your connected app"><figcaption></figcaption></figure>
<figure><img src="../.gitbook/assets/your_connected_app_page.png" alt="Your connected apps page"><figcaption><p>Your connected apps page</p></figcaption></figure>
This page categorizes apps into two types based on their status: **Pending Apps** and **Connected Apps**.
* **Pending Apps**: These are apps for which you clicked the **Cancel** button while authorizing the app on Rememberizer.
* Click **Continue** if you want to complete the authorization process. 
* Otherwise, click **Cancel** to completely withdraw the authorization. The app will then be displayed on the **App directory** page again.
* **Connected Apps**: You can configure the **Memento** or **Memory integration** of a specific connected app by clicking the **Change** option (or **Select** if a Memento has not yet been chosen). Click **Disconnect** if you want to disconnect the third-party app from Rememberizer.
==> personal/rememberizer-memory-integration.md <==
# Rememberizer Memory integration
### Introduction
Rememberizer Memory allows third-party apps to store and access data in a user's Rememberizer account, providing a simple way for valuable information to be saved and utilized across a user's multiple applications.
### Benefits
#### For Users
Shared Memory creates a single place where key results and information from all of a user's apps are available in one location. Benefits for users include:
* Easy Access: Important data is centralized, allowing both the user and their apps to easily access results from multiple apps in one place.
* Sync Between Apps: Information can be shared and synced between a user's different apps seamlessly without extra effort from the user.
* Persistent Storage: Data remains accessible even if individual apps are uninstalled, unlike app-specific local storage.
#### For App Developers
Shared Memory provides app developers with a simple way to access data from a user's other connected apps:
* No Backend Needed: Apps do not need to develop their own custom backend systems to store and share data.
* Leverage Other Apps: Apps can build on and utilize public data generated by a user's other installed apps, enriching their own functionality.
* Cross-App Integration: Seamless integration and data sharing capabilities are enabled between an app developer's different apps.
By default, all apps have read-only access to Shared Memory, while each app can write only to its own memory space. Users have controls to customize access permissions as needed. This balances data sharing with user privacy and control.
### Configure Your Memory
#### Global Settings
Global Settings allow users to configure the default permissions for all apps using Shared Memory. These include:
<figure><img src="../.gitbook/assets/memory_global_config.png" alt="Config Memory in Knowledge Page"><figcaption><p>Config Memory in Knowledge Page</p></figcaption></figure>
#### Default Memory and Data Access Permissions for Apps
* **Read Own/Write Own:** Apps are exclusively permitted to access and modify their own memory data.
* **Read All/Write Own:** Apps can read memory data across all apps but are restricted to modifying only their own memory data.
* **Disable Memory:** By default, apps cannot access or store memory data.
* **Apply to All**: Users can reset all app-specific permission settings back to the defaults chosen in Global Settings.
<figure><img src="../.gitbook/assets/memory_settings_panel.png" alt="memory settings panel" width="375"><figcaption></figcaption></figure>
Users can clear all Memory documents with the _**Forget your memory**_ option:
<figure><img src="../.gitbook/assets/forget_memory_popup.png" alt="Confirmation Modal when Forget Memory"><figcaption><p>Confirmation Modal when Forget Memory</p></figcaption></figure>
#### App Settings
For each connected app, users can customize the Shared Memory permissions. Click **"Find an App"**, then click **"Your connected apps"**, or go to [https://rememberizer.ai/personal/apps/connected](https://rememberizer.ai/personal/apps/connected) to see the list of your connected apps. Then click **"Change"** on the Memory setting of the app you want to customize:
<figure><img src="../.gitbook/assets/app_config_memory.png" alt="Config Memory for each App in Connected Apps Page"><figcaption><p>Config Memory for each App in Connected Apps Page</p></figcaption></figure>
#### Memory Access Permissions for Apps
* **Read Own/Write Own**: Permissions allow the app to only access and modify its own memory data, preventing it from interacting with other apps' memory.
* **Read All/Write Own**: The app can view memory data from all apps but is restricted to modifying only its own memory data.
* **Disable Memory**: The app is prohibited from accessing or modifying memory data.
This gives users fine-grained control over how each app can utilize Shared Memory, based on their trust in that specific app. Permissions for individual apps can be more restrictive than the global defaults.
Together, the Global and App Settings give users powerful yet easy-to-use controls over how their data is shared through Shared Memory.
### Integrate with Memory Feature
#### API Endpoint
Rememberizer exposes an API endpoint, [**/api/v1/documents/memorize/**](https://docs.rememberizer.ai/developer/api-docs/memorize-content-to-rememberizer), that GPT apps can call to memorize content.
Note: This API is available for Memory with [third-party apps using OAuth2 authentication](../developer/authorizing-rememberizer-apps.md) only (not yet with [API keys](../developer/registering-and-using-api-keys.md)).
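As a rough illustration, a connected app could call this endpoint with its OAuth2 access token as sketched below. The base URL, JSON field names, and response handling here are assumptions made for the sketch; consult the [Memorize content to Rememberizer](../developer/api-docs/memorize-content-to-rememberizer.md) API reference for the exact request schema.

```python
# Minimal sketch of a third-party app calling the memorize endpoint with an
# OAuth2 access token. The base URL and JSON field names ("name", "content")
# are assumptions; see the API reference for the exact schema.
import requests

ACCESS_TOKEN = "oauth2-access-token-from-the-authorization-flow"  # placeholder

response = requests.post(
    "https://api.rememberizer.ai/api/v1/documents/memorize/",  # assumed base URL
    headers={"Authorization": f"Bearer {ACCESS_TOKEN}"},
    json={
        "name": "zero-cost-abstractions-notes",          # assumed field
        "content": "Zero-cost abstractions: high-level code with no runtime overhead.",  # assumed field
    },
    timeout=30,
)
response.raise_for_status()
print("Memorize request accepted:", response.status_code)
```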
#### Memorize your knowledge
After authorizing with Rememberizer, a third-party app can memorize its valuable knowledge.
Here, we demonstrate the process using the Rememberizer GPT App.
* After using the Rememberizer GPT App, the user wants to memorize the third point, "Zero-Cost Abstractions".
<figure><img src="../.gitbook/assets/interact_rememberizer_gpt.png" alt="Interacting with Rememberizer GPT Apps" width="375"><figcaption><p>Interacting with Rememberizer GPT Apps</p></figcaption></figure>
* To use the Memory feature, users must first authorize the app with Rememberizer. Use the **memorize** command to tell the app what knowledge it should store.
<figure><img src="../.gitbook/assets/rememberizer_auth_sign_in.png" alt="Sign In to authorize Rememberizer" width="563"><figcaption><p>Sign In to authorize Rememberizer</p></figcaption></figure>
* Users can configure the Memory option here; the default value is based on the Global Settings.
<figure><img src="../.gitbook/assets/authorize_connection_screen.png" alt="Authorizing Screen" width="563"><figcaption><p>Authorizing Screen</p></figcaption></figure>
Rememberizer now successfully memorizes the knowledge.
<figure><img src="../.gitbook/assets/successful_memorize_knowledge.png" alt="successful memorize knowledge" width="563"><figcaption></figcaption></figure>
* In Rememberizer, users can see the recently memorized content on the **Embed Knowledge Details** page.
<figure><img src="../.gitbook/assets/embedded_knowledge_detail.png" alt="embedded knowledge detail" width="563"><figcaption></figcaption></figure>
With the **Talk to Slack** app, users can seamlessly apply and continue their progress using the data they have committed to memory. For example, they can easily query and retrieve memorized information:
<figure><img src="../.gitbook/assets/recall_memory_talk_to_slack.png" alt="Recall Memory Data in another app"><figcaption><p>Recall Memory Data in another app</p></figcaption></figure>
### Using Memory Data via Memento
* Another way to utilize Memory data is by creating a **Memento** and refining the Memory into it. Visit the [Memento Feature](mementos-filter-access.md#how-to-create-a-mementos) section for instructions on creating one.
* Rememberizer saves memorized content into files, and users can choose content from any app to refine into a **Memento**.
> Note: In older versions, Rememberizer saved content into files and combined them into a folder for each date.
<figure><img src="../.gitbook/assets/memory_memento_feature.png" alt="memory memento feature" width="563"><figcaption></figcaption></figure>
With the [Memento Feature](https://docs.rememberizer.ai/personal/mementos-filter-access#what-is-a-memento-and-why-do-they-exist), users can utilize Memory data even when the app's Memory configuration is set to Off.
### Search Memory document in Rememberizer
You can also [Search Your Knowledge](https://rememberizer.ai/personal/search) through our web UI, or better, use this knowledge in an LLM through our GPT app or our public API.
==> personal/rememberizer-dropbox-integration.md <==
---
description: >-
This guide will walk you through the process of integrating your Dropbox into
Rememberizer as a knowledge source.
type: guide
last_updated: 2025-04-03
---
# Rememberizer Dropbox Integration
## Overview
The Dropbox integration allows you to connect your Dropbox files and folders to Rememberizer, creating a searchable knowledge base from your documents, presentations, spreadsheets, and other content. This integration enables AI applications to reference your Dropbox content when answering questions or generating insights.
## Before You Begin
Before connecting Dropbox to Rememberizer, ensure you:
- Have a Rememberizer account
- Have a Dropbox account with content you want to make searchable
- Understand which files and folders contain valuable knowledge
- Consider any organizational or personal data policies
## Connection Process
### Step 1: Access Knowledge Sources
1. Sign in to your Rememberizer account
2. Navigate to **Personal > Your Knowledge** tab, or visit [https://rememberizer.ai/personal/knowledge](https://rememberizer.ai/personal/knowledge)
3. Locate the Dropbox card in the available knowledge sources
<figure><img src="../.gitbook/assets/dropbox_personal_knowledge.png" alt="Dropbox knowledge source card"><figcaption><p>Dropbox knowledge source card on the Knowledge page</p></figcaption></figure>
### Step 2: Initiate Dropbox Connection
1. Click the **"Connect"** button on the Dropbox knowledge source card
2. You will be redirected to the Dropbox authorization page
3. If you're not already logged in to Dropbox, enter your credentials
4. Review the permissions Rememberizer is requesting
<figure><img src="../.gitbook/assets/dropbox_oauth.png" alt="Dropbox OAuth authorization screen"><figcaption><p>Dropbox permission request screen</p></figcaption></figure>
### Step 3: Grant Permissions
1. Click **"Allow"** to authorize Rememberizer to access your Dropbox files
2. This grants Rememberizer read-only access to your files and folders
### Step 4: Return to Rememberizer
1. After successful authorization, you'll be redirected back to Rememberizer
2. The platform will display a connection confirmation
3. A file selection panel will automatically open
<figure><img src="../.gitbook/assets/dropbox_auth_redirect.png" alt="Successful Dropbox connection"><figcaption><p>Successful Dropbox connection with file selection panel</p></figcaption></figure>
### Step 5: Select Files and Folders
1. In the side panel, browse your Dropbox folder structure
2. Select specific files or entire folders by checking the boxes
3. Navigate through folders using the breadcrumb navigation
4. If the side panel doesn't appear, click the **"Select"** button next to your Dropbox connection
<figure><img src="../.gitbook/assets/dropbox_choose_knowledge.png" alt="Dropbox file selection panel"><figcaption><p>Select files and folders to process</p></figcaption></figure>
### Step 6: Begin Processing
1. After selecting your files and folders, click **"Add"**
2. Rememberizer will begin downloading, processing, and embedding your files
3. You'll see a progress indicator as files are processed
4. Initial processing may take several minutes to hours depending on the amount of data
## How Dropbox Data is Processed
When you connect Dropbox to Rememberizer, the following occurs:
1. **Authentication**: Secure OAuth connection established with Dropbox
2. **File Selection**: Only your selected files and folders are accessed
3. **Content Extraction**: Text is extracted from compatible file formats
4. **Content Processing**:
- Documents are chunked into appropriate segments
- Vector embeddings are generated for each chunk
- Metadata such as file name, path, and type is preserved
5. **Continuous Updates**: Files are monitored for changes through the Dropbox API
## Supported File Types
The Dropbox integration supports various file formats, including:
| Category | Supported Formats |
|----------|-------------------|
| Text Documents | .txt, .md, .rtf, .csv |
| Office Documents | .docx, .doc, .xlsx, .xls, .pptx, .ppt |
| PDF Documents | .pdf |
| Code Files | .py, .js, .java, .html, .css, and more |
| Data Files | .json, .xml, .yaml, .csv |
## Data Refresh and Synchronization
Rememberizer automatically keeps your Dropbox knowledge up to date:
- **Change Detection**: Uses Dropbox's cursor-based synchronization to detect file changes
- **Update Schedule**: Automatic synchronization occurs approximately every 6 hours
- **Selective Updates**: Only changed files are reprocessed, not your entire Dropbox
- **Manual Refresh**: Force an immediate update by clicking the "Refresh" icon next to your Dropbox connection
- **New Files in Selected Folders**: If you select a folder, any new files added to that folder will be automatically detected and processed
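For reference, the cursor mechanism the Dropbox API provides for the change detection described above looks roughly like the sketch below, using the official Dropbox Python SDK. It illustrates the underlying API concept only; it is not Rememberizer's internal code, and the token and folder path are placeholders.

```python
# Conceptual sketch of Dropbox cursor-based change detection using the official
# Dropbox Python SDK. This shows the underlying API mechanism only; it is not
# Rememberizer's internal implementation.
import dropbox

dbx = dropbox.Dropbox("dropbox-access-token")  # placeholder token

# An initial listing returns the current entries plus a cursor marking that state.
result = dbx.files_list_folder("/selected-folder", recursive=True)
cursor = result.cursor

# Later, replaying the cursor returns only what changed since the last sync.
changes = dbx.files_list_folder_continue(cursor)
for entry in changes.entries:
    print("Changed since last sync:", entry.path_display)
cursor = changes.cursor  # persist for the next sync cycle
```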
## Security and Privacy Considerations
The Dropbox integration includes several security measures:
- **OAuth Security**: Industry-standard authorization with secure token management
- **Selective Access**: Only processes files you explicitly select
- **Encrypted Storage**: All document content is encrypted before storage
- **Read-Only Access**: Cannot modify or delete your Dropbox files
- **Permission Revocation**: You can revoke access at any time through Dropbox settings
- **Data Handling**: Files are processed for vector embedding creation and original content is not permanently stored
## Troubleshooting Common Issues
### Authentication Problems
**Problem**: Unable to connect to Dropbox or authorization errors.
**Solutions**:
- Ensure you're using the correct Dropbox account
- Try clearing browser cookies and cache
- Verify you don't have browser extensions blocking OAuth redirects
- Check if Dropbox is accessible directly through your browser
### Missing Files or Folders
**Problem**: Some files or folders don't appear in the selection panel.
**Solutions**:
- Verify the files exist in your Dropbox account
- Try refreshing the file browser
- Check if files are in a shared folder with limited permissions
- Ensure files are synced to Dropbox (not just on your computer)
### Processing Errors
**Problem**: Files fail to process or show error status.
**Solutions**:
- Check if the file format is supported
- Verify the file isn't corrupted or password-protected
- For large files, allow more time for processing
- Try removing and re-adding problematic files
### Synchronization Issues
**Problem**: Updated Dropbox content isn't reflected in searches.
**Solutions**:
- Check when the last sync occurred (visible in the connection details)
- Manually trigger a refresh of your Dropbox connection
- Note that automatic sync runs approximately every 6 hours, so very recent changes may not have been picked up yet
- Check if the file still exists in the original location
## Managing Multiple Dropbox Accounts
### Connecting to Another Dropbox Account
If you want to switch to a different Dropbox account:
1. First, revoke Rememberizer's access to your current Dropbox account:
* Go to the [Dropbox website](https://www.dropbox.com/) and sign in
* Click your profile picture in the upper-right corner
* Select "Settings" from the dropdown menu
* Navigate to the "Connected apps" tab
* Find Rememberizer in the list and click "Disconnect"
* Sign out of your current Dropbox account
2. Then reconnect in Rememberizer:
* Go to the Knowledge page in Rememberizer
* If your current connection is still active, click the three dots (⋮) menu and select "Disconnect"
* Click "Connect" on the Dropbox card
* You should now be prompted to authorize with your new Dropbox account
<div style="background-color: #f8f9fa; padding: 15px; border-radius: 5px; border-left: 5px solid #0061fe;">
<strong>Note:</strong> If you're still automatically connected to your previous account, try using private browsing/incognito mode to force Dropbox to prompt for authentication.
</div>
## Limitations and Considerations
- **Shared Folders**: Some shared folders may have permissions that affect processing
- **Paper Documents**: Dropbox Paper documents have limited support
- **File Size**: Very large files (>50MB) may process slowly or incompletely
- **Rate Limits**: Dropbox API rate limits may temporarily slow processing for large collections
- **Binary Files**: Executable files, images without text, and some specialized formats cannot be processed for text content
## Managing Your Dropbox Connection
### Adding More Files Later
1. Navigate to the Knowledge page
2. Find your Dropbox connection
3. Click the "Select" button to open the file browser
4. Choose additional files or folders
5. Click "Add" to process the new selections
### Removing Access to Files
1. Navigate to the Knowledge page
2. Find your Dropbox connection
3. Click the "Select" button
4. Uncheck files or folders you no longer want indexed
5. Click "Save" to update your selections
### Disconnecting Dropbox
1. Navigate to the Knowledge page
2. Find your Dropbox connection
3. Click the three dots (⋮) menu
4. Select "Disconnect"
5. Confirm the disconnection
## What's Next?
After connecting Dropbox to Rememberizer:
1. Use [Mementos](mementos-filter-access.md) to control which AI tools can access your Dropbox knowledge
2. Combine with other knowledge sources like Slack or Google Drive for more comprehensive context
3. Try [searching your knowledge](https://rememberizer.ai/personal/search) through the web UI
4. Connect your knowledge to AI tools using GPT integration or the Rememberizer API
If you encounter any issues during setup or use, contact our support team for assistance.
==> personal/rememberizer-google-drive-integration.md <==
---
description: >-
This guide will walk you through the process of integrating your Google Drive
into Rememberizer as a knowledge source.
type: guide
last_updated: 2025-04-03
---
# Rememberizer Google Drive Integration
## Overview
The Google Drive integration allows you to connect your Google Drive files and folders to Rememberizer, making your documents searchable through semantic search. This integration enables AI applications to reference your documents, presentations, spreadsheets, and other Google Drive content when answering your questions or providing insights.
## Before You Begin
Before connecting Google Drive to Rememberizer, ensure you:
- Have a Rememberizer account
- Have a Google account with access to Google Drive
- Understand which files and folders you want to make searchable
- Consider organizing your files to make selection easier
- Review any organizational policies about connecting Google Workspace to third-party services
## Connection Process
### Step 1: Access Knowledge Sources
1. Sign in to your Rememberizer account
2. Navigate to **Personal > Your Knowledge** tab, or visit [https://rememberizer.ai/personal/knowledge](https://rememberizer.ai/personal/knowledge)
3. You should see all available knowledge sources, including Google Drive
<figure><img src="../.gitbook/assets/drive_personal_knowledge.png" alt="Google Drive knowledge source card"><figcaption><p>Knowledge sources page with Google Drive option</p></figcaption></figure>
### Step 2: Initiate Google Drive Connection
1. Click the **"Connect"** button on the Google Drive knowledge source card
2. You will be redirected to Google's sign-in page
3. Select the Google account you want to connect (if you have multiple accounts)
<figure><img src="../.gitbook/assets/drive_oauth_step_1.jpg" alt="Google account selection"><figcaption><p>Select your Google account</p></figcaption></figure>
### Step 3: Grant Permissions
1. Review the app verification information and click **"Continue"**
<figure><img src="../.gitbook/assets/drive_oauth_step_2.jpg" alt="Google app verification screen"><figcaption><p>App verification screen</p></figcaption></figure>
2. Review the permission request to **"See and download all your Google Drive files"** and click **"Continue"**
<figure><img src="../.gitbook/assets/drive_oauth_step_3.jpg" alt="Permission request screen"><figcaption><p>Google Drive permissions screen</p></figcaption></figure>
### Step 4: Return to Rememberizer
1. After successful authorization, you'll be redirected back to Rememberizer
2. The platform will display a connection confirmation
3. A file selection panel will automatically open
<figure><img src="../.gitbook/assets/drive_auth_redirect.png" alt="Google Drive connection confirmation"><figcaption><p>Successful Google Drive connection</p></figcaption></figure>
### Step 5: Select Files and Folders
1. In the side panel, browse your Google Drive structure
2. Select specific files or entire folders by checking the boxes
3. Navigate through folders using the breadcrumb navigation
4. Use the search function to find specific files or folders
5. If the side panel does not appear, click the **"Select"** button next to your Google Drive connection
<figure><img src="../.gitbook/assets/drive_choose_knowledge.png" alt="File selection panel"><figcaption><p>Select files and folders to index</p></figcaption></figure>
### Step 6: Confirm Data Sharing Policy
1. After selecting files, check the box to acknowledge Rememberizer's data sharing policy
2. This confirms you understand that selected files may be accessible to AI applications you authorize
<figure><img src="../.gitbook/assets/drive_choose_knowledge_checkbox.png" alt="Data sharing policy checkbox"><figcaption><p>Confirm data sharing policy</p></figcaption></figure>
### Step 7: Begin Processing
1. Click **"Add"** to start the indexing process
2. Rememberizer will begin downloading, processing, and embedding your files
3. You'll see a progress indicator as files are processed
4. Initial processing may take several minutes to hours depending on the volume and size of files
<figure><img src="../.gitbook/assets/drive_indexing.png" alt="Indexing progress"><figcaption><p>File processing and indexing progress</p></figcaption></figure>
## How Google Drive Data is Processed
When you connect Google Drive to Rememberizer, the following occurs:
1. **Authentication**: Secure OAuth connection established with refresh token capability
2. **File Selection**: Only your selected files and folders are accessed
3. **Content Extraction**: Text is extracted from compatible file formats
4. **Content Processing**:
- Documents are chunked into appropriate segments
- Vector embeddings are generated for each chunk
- Metadata such as file name, path, and type is preserved
5. **Continuous Updates**: Changed files are detected and reprocessed automatically
## Supported File Types
The Google Drive integration supports various file formats, including:
| Category | Supported Formats |
|----------|-------------------|
| Google Workspace | Docs, Sheets, Slides, Drawings |
| Microsoft Office | Word (.docx, .doc), Excel (.xlsx, .xls), PowerPoint (.pptx, .ppt) |
| Text Documents | .txt, .md, .rtf, .csv |
| PDF Documents | .pdf |
| Other Formats | .json, .xml, .html, and more |
Large files over 50MB or heavily formatted documents may experience slower processing times.
## Data Refresh and Synchronization
Rememberizer automatically keeps your Google Drive knowledge up to date:
- **Change Detection**: Uses Google Drive's change tracking API to detect modifications
- **Update Schedule**: Automatic synchronization occurs approximately every 4 hours
- **Selective Updates**: Only changed files are reprocessed, not your entire Drive
- **Manual Refresh**: Force an immediate update by clicking the "Refresh" icon next to your Google Drive connection
- **New Files in Selected Folders**: If you select a folder, any new files added to that folder will be automatically detected and processed
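For reference, the Google Drive v3 changes API that underlies the change tracking described above looks roughly like the sketch below. It illustrates the API concept only; it is not Rememberizer's internal code, and the credential file path is a placeholder.

```python
# Conceptual sketch of change tracking with the Google Drive v3 changes API.
# It illustrates the API mechanism only; it is not Rememberizer's internal code.
from google.oauth2.credentials import Credentials
from googleapiclient.discovery import build

creds = Credentials.from_authorized_user_file("token.json")  # placeholder credential file
service = build("drive", "v3", credentials=creds)

# Capture a starting token, then later ask only for changes since that token.
start = service.changes().getStartPageToken().execute()
page_token = start["startPageToken"]

changes = service.changes().list(pageToken=page_token).execute()
for change in changes.get("changes", []):
    print("Changed file:", change.get("fileId"))
# The newStartPageToken returned on the last page is persisted for the next cycle.
```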
## Security and Privacy Considerations
The Google Drive integration includes several security measures:
- **OAuth Security**: Industry-standard authorization with secure token management
- **Selective Access**: Only processes files you explicitly select
- **Encrypted Storage**: All document content is encrypted before storage
- **Read-Only Access**: Cannot modify or delete your Google Drive files
- **Permission Revocation**: You can revoke access at any time through Google account settings
- **Data Handling**: Original files are processed locally and not stored permanently
## Troubleshooting Common Issues
### Authentication Problems
**Problem**: Unable to connect to Google Drive or "Access Denied" errors.
**Solutions**:
- Check if you're using the correct Google account
- Try signing out of Google completely and signing back in
- Verify you don't have browser extensions blocking third-party cookies
- Check if your organization restricts Google Workspace connections
### Missing Files or Folders
**Problem**: Some files or folders don't appear in the selection panel.
**Solutions**:
- Verify file sharing permissions (you must have access to the files)
- Check if files are in the "Computers" backup section (not supported)
- Try refreshing the file browser with the refresh button
- For shared files, ensure you have at least Viewer access
### Processing Errors
**Problem**: Files fail to process or show error status.
**Solutions**:
- Check if the file format is supported
- Verify the file isn't corrupted or password-protected
- For large files, allow more time for processing
- Try reselecting the problematic files
### Synchronization Issues
**Problem**: Updated Google Drive content isn't reflected in searches.
**Solutions**:
- Check when the last sync occurred (visible in the connection details)
- Manually trigger a refresh of your Google Drive connection
- Note that automatic sync runs approximately every 4 hours, so very recent changes may not have been picked up yet
- Check if the file still exists in the original location
## Limitations of Google Drive Integration
- **"Computers" Section**: Files in the Google Drive "Computers" backup section cannot be accessed due to Google API restrictions
- **Shortcut Files**: Google Drive shortcuts may not process correctly
- **Shared Drives**: Some organization-specific restrictions may apply to Shared Drives
- **File Size**: Very large files (>50MB) may process slowly or incompletely
- **Rate Limits**: Google API rate limits may temporarily slow processing for large collections
For local file integration, consider using the [Rememberizer App](rememberizer-app.md) desktop application.
## Managing Your Google Drive Connection
### Adding More Files Later
1. Navigate to the Knowledge page
2. Find your Google Drive connection
3. Click the "Select" button to open the file browser
4. Choose additional files or folders
5. Click "Save" to process the new selections
### Removing Access to Files
1. Navigate to the Knowledge page
2. Find your Google Drive connection
3. Click the "Select" button
4. Uncheck files or folders you no longer want indexed
5. Click "Save" to update your selections
### Disconnecting Google Drive
1. Navigate to the Knowledge page
2. Find your Google Drive connection
3. Click the three dots (⋮) menu
4. Select "Disconnect"
5. Confirm the disconnection
Additionally, you can revoke Rememberizer's access through your [Google Account Security Settings](https://myaccount.google.com/permissions).
## What's Next?
After connecting Google Drive to Rememberizer:
1. Use [Mementos](mementos-filter-access.md) to control which AI tools can access your Google Drive knowledge
2. Combine with other knowledge sources like Slack or Dropbox for more comprehensive context
3. Try [searching your knowledge](https://rememberizer.ai/personal/search) through the web UI
4. Connect your knowledge to AI tools using GPT integration or the Rememberizer API
If you encounter any issues during setup or use, contact our support team for assistance.
==> personal/README.md <==
---
description: Your guide to Rememberizer's personal knowledge management features
type: guide
last_updated: 2025-04-03
---
# Personal Knowledge Management
Welcome to the personal section of Rememberizer documentation. This section covers all the features you need to harness the power of your personal knowledge and connect it with AI tools and applications.
## Personal Knowledge Management Overview
Rememberizer empowers you to transform scattered information across various sources into an organized, searchable knowledge base that works with AI. With Rememberizer, you can:
- **Connect multiple data sources** including Slack, Google Drive, Gmail, Dropbox, and more
- **Search across all your knowledge** using powerful semantic search technology
- **Organize access to your knowledge** with customizable Mementos
- **Share selected knowledge** through Common Knowledge
- **Connect your knowledge to AI tools** including ChatGPT, LangChain applications, and more
## Core Features
### Mementos: Granular Access Control
[Mementos](mementos-filter-access.md) are at the heart of Rememberizer's personal knowledge management system. These customizable filters allow you to:
- Create collections of specific documents, channels, or folders
- Control exactly what knowledge third-party applications can access
- Maintain privacy while still benefiting from AI capabilities
- Tailor different knowledge sets for different applications
### Powerful Knowledge Search
Rememberizer's [semantic search](search-your-knowledge.md) goes beyond simple keyword matching:
- Find information based on meaning, not just exact terms
- Search across all connected data sources simultaneously
- Filter searches by Memento to focus on relevant knowledge
- Use AI-enhanced agentic search for complex information needs
### Integrations
Connect your existing content from various platforms:
| Integration | Description |
|-------------|-------------|
| [Slack](rememberizer-slack-integration.md) | Access messages and files from your Slack workspaces |
| [Google Drive](rememberizer-google-drive-integration.md) | Connect documents from your Google Drive |
| [Gmail](rememberizer-gmail-integration.md) | Import emails from your Gmail account |
| [Dropbox](rememberizer-dropbox-integration.md) | Access files from your Dropbox account |
| [Memory](rememberizer-memory-integration.md) | Save and retrieve AI conversation history |
| [Rememberizer App](rememberizer-app.md) | Access local files through our desktop application |
### Knowledge Sharing and Enhancement
- [Common Knowledge](common-knowledge.md): Add pre-indexed knowledge from other users
- [Manage Embedded Knowledge](manage-your-embedded-knowledge.md): View and organize your indexed content
- [Third-party Apps](manage-third-party-apps.md): Control which apps can access your knowledge
## Getting Started: The Rememberizer Workflow
1. **Connect your data sources** - Set up integrations with your preferred platforms
2. **Create Mementos** - Organize your knowledge into purpose-specific collections
3. **Refine access** - Select which specific documents belong in each Memento
4. **Search your knowledge** - Use semantic search to find information across sources
5. **Connect with AI tools** - Authorize applications to access specific Mementos
## Documentation Navigation
### Essential Setup
- Start with [Mementos Filter Access](mementos-filter-access.md) to understand the core concept
- Explore the [Rememberizer App](rememberizer-app.md) for local file access
- Learn how to [Search Your Knowledge](search-your-knowledge.md) effectively
### Integrations
- Set up the [Slack integration](rememberizer-slack-integration.md) for team communications
- Connect [Google Drive](rememberizer-google-drive-integration.md) and [Gmail](rememberizer-gmail-integration.md) for Google Workspace content
- Add [Dropbox files](rememberizer-dropbox-integration.md) to your knowledge base
- Configure [Memory integration](rememberizer-memory-integration.md) for conversation history
### Advanced Features
- Explore [Common Knowledge](common-knowledge.md) to enhance your knowledge base
- Learn to [Manage Third-party Apps](manage-third-party-apps.md) that connect to your knowledge
- Understand [MCP Servers](rememberizer-mcp-servers.md) for enhanced capabilities
Ready to get started? Begin by setting up your first [Memento](mementos-filter-access.md) to organize your knowledge.
==> personal/search-your-knowledge.md <==
---
description: >-
In Rememberizer, you can post a theme or question, and Rememberizer will
provide a list of files and extracted parts that are conceptually similar.
---
# Search your knowledge
## Search in Rememberizer
* In the navigation bar, choose **Personal > Search Your Knowledge**. You will then see the search page in Rememberizer.
{% hint style="info" %}
Rememberizer's search uses advanced vector embeddings to find semantically similar content rather than just keyword matches. To learn more about how this technology works, see [What are Vector Embeddings and Vector Databases?](../background/what-are-vector-embeddings-and-vector-databases.md)
Developers can access this same semantic search capability via API. See [Search for documents by semantic similarity](../developer/api-docs/search-for-documents-by-semantic-similarity.md) for details.
{% endhint %}
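For developers, a minimal sketch of such an API call is shown below. The base URL, endpoint path, header name, query parameter, and response field are assumptions made for illustration; see [Search for documents by semantic similarity](../developer/api-docs/search-for-documents-by-semantic-similarity.md) for the exact details.

```python
# Minimal sketch of calling Rememberizer's semantic search from Python.
# The base URL, endpoint path, header name, query parameter, and response field
# below are assumptions for illustration; consult the API reference for exact details.
import requests

API_KEY = "your-rememberizer-api-key"  # placeholder

response = requests.get(
    "https://api.rememberizer.ai/api/v1/documents/search/",   # assumed path
    headers={"X-API-Key": API_KEY},                           # assumed header name
    params={"q": "What did we decide about the Q3 launch?"},  # assumed parameter
    timeout=30,
)
response.raise_for_status()
for chunk in response.json().get("matched_chunks", []):       # assumed response field
    print(chunk)
```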
<figure><img src="../.gitbook/assets/navbar_search_rememberizer (1).png" alt="navbar search rememberizer (1)"><figcaption></figcaption></figure>
<figure><img src="../.gitbook/assets/search_rememberizer_page.png" alt="search rememberizer page"><figcaption></figcaption></figure>
* Type the question or theme you want to search for, then choose the Memento you want to limit the search to, and click the Rememberizer button (or press Enter). The search process may take a few minutes, depending on the amount of data in the Memento.
<figure><img src="../.gitbook/assets/memento_search_rememberizer.png" alt="Memento Filtering in search Rememberizer" width="269"><figcaption><p>Memento Filtering in search Rememberizer</p></figcaption></figure>
* You will then see a list of documents matching the question or theme you entered. You can click a file to expand the matching chunks of text related to your question or theme.
<figure><img src="../.gitbook/assets/search_result_rememberizer.png" alt="An example of search result"><figcaption><p>An example of search result</p></figcaption></figure>
==> personal/rememberizer-app.md <==
---
description: Learn about the Rememberizer Desktop App that turns your local files into searchable knowledge
type: guide
last_updated: 2025-04-03
---
# Rememberizer App
## Introduction
The Rememberizer App is a desktop application that converts your local files into vector embeddings and uploads them to your Rememberizer knowledge base. This seamless integration enables AI applications to search and reference your personal files through Rememberizer's semantic search capabilities, providing answers based on your content without requiring direct access to your files.
## Benefits
* **Secure Data Integration:** Upload and process your files locally without sharing complete documents with third-party AI services
* **Data Utilization:** Transform your local documents into valuable, searchable knowledge
* **Semantic Understanding:** Leverage vector embeddings to enable concept-based search rather than just keyword matching
* **Powerful AI Integration:** Connect your knowledge to various AI systems including ChatGPT, Claude, and custom applications
* **Privacy Control:** Maintain ownership of your data while making it useful for AI assistants
## Supported Platforms
Currently, Rememberizer App is available for:
* **macOS**: Intel and Apple Silicon (M1/M2/M3) processors
Future planned support (not yet available):
* Windows (in development)
* Linux (under consideration)
## System Requirements
### macOS Requirements
* macOS 10.15 (Catalina) or newer
* 8GB RAM minimum (16GB recommended)
* 500MB free disk space for application
* Additional storage space for processed file caches
* Internet connection for authentication and uploading embeddings
### Hardware Acceleration
* **Apple Silicon Macs:** Automatically uses MPS-enabled PyTorch for optimized performance
* **Intel Macs with compatible GPU:** Can leverage GPU acceleration for faster processing
* **CPU-only systems:** Falls back to CPU processing with intelligent optimization
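The acceleration fallback described above follows the standard PyTorch device-selection pattern, sketched below for illustration. This is the general pattern, not the Rememberizer App's actual code.

```python
# Illustration of the hardware-acceleration fallback using standard PyTorch
# device checks. This is the general pattern, not the Rememberizer App's code.
import torch

if torch.backends.mps.is_available():      # Apple Silicon / Metal-capable GPUs
    device = torch.device("mps")
elif torch.cuda.is_available():            # CUDA GPUs, where present
    device = torch.device("cuda")
else:                                      # CPU-only fallback
    device = torch.device("cpu")

print(f"Embedding computations would run on: {device}")
```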
## Installation
1. Download the latest version of Rememberizer App from [the links provided here](#download-links)
2. Once the download is complete, locate the .dmg file in your downloads folder and double-click it
3. In the window that appears, drag the Rememberizer App icon to the Applications folder
4. Navigate to your Applications folder and open the Rememberizer App
5. If you see a security warning, follow these steps:
- Open System Preferences > Security & Privacy
- Click "Open Anyway" to authorize the app
- The app is securely signed but may trigger this warning on first use
## Configuration and Setup
### First-Time Setup
1. **Sign In:** Launch the app and sign in with your Rememberizer account. A browser window will open to authenticate.
<figure><img src="../.gitbook/assets/rememberizer_app_sign_in.png" alt="Rememberizer app sign in screen"><figcaption><p>Sign in to connect your Rememberizer account</p></figcaption></figure>
<figure><img src="../.gitbook/assets/rememberizer_app_success_auth.png" alt="Successful authentication screen"><figcaption><p>Successful authentication</p></figcaption></figure>
2. **Add Data Sources:** After signing in, the app runs in the background. Access it from the menu bar icon. Add folders containing documents you want to process.
<figure><img src="../.gitbook/assets/rememberizer_app_add_folder_1.png" alt="Adding a folder to Rememberizer"><figcaption><p>Access Rememberizer from the menu bar</p></figcaption></figure>
<figure><img src="../.gitbook/assets/rememberizer_app_add_folder_2.png" alt="Folder selection dialog"><figcaption><p>Select folders to add as data sources</p></figcaption></figure>
3. **Processing Files:** The app will begin analyzing and processing files in your selected folders. This involves:
- Scanning files and identifying supported formats
- Chunking file contents into optimally-sized segments
- Converting text into vector embeddings
- Uploading metadata and embeddings to your Rememberizer account
<figure><img src="../.gitbook/assets/rememberizer_app_status.png" alt="Rememberizer app status screen"><figcaption><p>Monitor processing status in the Status tab</p></figcaption></figure>
### Advanced Configuration
The Rememberizer App offers several configuration options to optimize performance:
1. **Background Processing:** Controls when file processing occurs:
- **Automatic (default):** Processes files continuously in the background
- **Manual:** Processes files only when explicitly triggered
2. **File Type Filtering:** Customize which file types are processed:
- **Default:** Processes all supported file types
- **Custom:** Specify file extensions to include or exclude
3. **Gitignore Support:** Automatically respects `.gitignore` rules in repositories:
- Prevents processing of excluded files
- Maintains consistency with your version control preferences
## Supported File Types
The Rememberizer App can process a wide range of file formats:
| Category | Supported Formats |
|----------|-------------------|
| Text Files | .txt, .md, .rtf, .csv, .json, .xml, .yml, .yaml, and more |
| Documents | .pdf, .doc, .docx, .odt, .pages |
| Presentations | .ppt, .pptx, .key |
| Spreadsheets | .xls, .xlsx, .numbers |
| Code Files | .py, .js, .java, .c, .cpp, .cs, .html, .css, .php, .r, .rb, .go, .rs, .swift, and more |
| Configuration | .ini, .conf, .config, .env |
| Data | .json, .xml, .csv, .tsv |
### File Size and Content Limitations
- Maximum file size: 50MB per file
- Maximum embedded text extraction: 1,000,000 characters per file
- Binary and executable files are not processed
- Password-protected files cannot be processed
- Corrupted files may be skipped
## Security and Privacy
The Rememberizer App implements several security measures:
1. **Local Processing:** Initial file processing occurs locally on your machine
2. **Content Encryption:** Document content is encrypted before transmission
3. **Secure Authentication:** OAuth2 with secure token management
4. **Embedding-Based Storage:** Only vector representations (not original text) are stored long-term
5. **Gitignore Compliance:** Respects exclusion patterns to avoid processing sensitive files
6. **Secure API Communication:** All API traffic uses HTTPS with TLS 1.2+
### Data Usage and Collection
- The app transmits vector embeddings and minimal metadata about your files
- Original file contents are not permanently stored on Rememberizer servers
- Processing occurs locally first with only necessary data transmitted
- No tracking or analytics beyond what's needed for service functionality
## Troubleshooting
### Common Issues and Solutions
#### Application Won't Start
- Verify macOS version (10.15 or newer required)
- Check for available disk space (minimum 500MB)
- Ensure you have admin permissions to install applications
- Try restarting your computer
#### Authentication Problems
- Check your internet connection
- Verify your Rememberizer account credentials
- Clear browser cookies and try again
- Ensure no firewall is blocking communication
#### Files Not Being Processed
- Confirm the file type is supported
- Check file sizes are under the 50MB limit
- Verify folder permissions allow the app to read files
- Check Status tab for specific error messages
- Ensure files aren't being excluded by gitignore rules
#### Slow Processing Performance
- Close resource-intensive applications
- Add fewer folders initially, then expand
- Prioritize smaller text files for faster processing
- Enable GPU acceleration if available
- Check available disk space (low space can cause slowdowns)
### Diagnostic Information
The app maintains logs that can help troubleshoot issues:
1. Access the app's menu by clicking the icon in the menu bar
2. Select "Advanced" > "Show Logs"
3. Review the logs for error messages or warnings
4. If reporting an issue, include relevant log sections
### Resetting the App
If you're experiencing persistent issues, follow these steps (a combined Terminal sketch follows the list):
1. Quit the Rememberizer App
2. Open Terminal
3. Run: `defaults delete com.rememberizer.app`
4. Restart the application
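The same reset can be run from a single Terminal session. This is a minimal sketch, assuming the application is installed in /Applications under the name "Rememberizer" and using the `com.rememberizer.app` preferences domain from step 3; adjust the names if your installation differs.
||CODE_BLOCK||bash
# Quit the app if it is running (app name assumed to be "Rememberizer")
osascript -e 'quit app "Rememberizer"' 2>/dev/null || true

# Remove the app's saved preferences (same command as step 3 above)
defaults delete com.rememberizer.app

# Relaunch the app (assumes it is installed in /Applications)
open -a "Rememberizer"
||CODE_BLOCK||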
## Download Links
* Rememberizer App 1.6.1 ([macOS](https://www.dropbox.com/scl/fi/hzytquytxmuhpov67spru/rememberizer-app-1.6.1.dmg?rlkey=0p30ok9qt4e33ua8scomagzev\&st=8yys88d5\&dl=1)) - See [Release Notes](#version-161-october-4th-2024)
Always use the latest version to benefit from security updates, bug fixes, and new features.
## Release Notes
### Version 1.6.1 (October 4th 2024)
#### Features and Improvements
* **Support for Empty Folders**: Users can now add empty folders as a data source.
* **GPU Support and Performance Improvements**: Added support for GPU acceleration to enhance processing speed.
* **Enhanced Embedding Program**: Configured to support the MPS version of PyTorch, optimizing for machine-specific builds.
* **Intelligent CPU Detection**: Implemented detection of CPU type to ensure the most suitable version of the embedding program is used.
* **Improved Data Source Management**: Utilized the Batch Delete API for efficient file deletion in removed data sources.
* **Support for All Plain Text Files**: Enabled processing of various plain text file types.
* **Adherence to Gitignore Rules**: Files ignored by gitignore rules in Git repositories are now excluded from processing.
* **Minor UI Improvements**: Enhancements to the user interface and performance.
## Frequently Asked Questions
### General Questions
**Q: Is the Rememberizer App free to use?**
A: The app is free to download, but it requires a Rememberizer account, which may have subscription tiers with various limits.
**Q: Does the app extract text from images?**
A: Currently, the app doesn't perform OCR (Optical Character Recognition) on images.
**Q: Will my files be shared with other users?**
A: No. Your files are processed and embedded privately for your account only.
### Technical Questions
**Q: How much of my system resources will the app use?**
A: The app is designed to run efficiently in the background, but resource usage increases during the initial processing of large folders.
**Q: Does the app need to be running all the time?**
A: For continuous file monitoring and updates, yes. However, you can choose to run it only when needed.
**Q: Are there limits to how many files I can process?**
A: Limits depend on your Rememberizer account tier. The app will notify you if you approach these limits.
==> personal/rememberizer-mcp-servers.md <==
---
description: Configure and use Rememberizer MCP servers to connect your AI assistants with your knowledge
type: guide
last_updated: 2025-04-03
---
# Rememberizer MCP Servers
The [**Model Context Protocol**](https://modelcontextprotocol.io/introduction) (MCP) is a standardized protocol designed to integrate AI models with various data sources and tools. It supports a client-server architecture facilitating the building of complex workflows and agents with enhanced flexibility and security.
## Rememberizer MCP Server
The [**Rememberizer MCP Server**](https://github.com/skydeckai/mcp-server-rememberizer) is an MCP server tailored for interacting with Rememberizer's document and knowledge management API. It allows LLMs to efficiently search, retrieve, and manage documents and integrations. The server is available as a public package on [mcp-get.com](https://mcp-get.com/packages/mcp-server-rememberizer) and as an open-source project on [GitHub](https://github.com/skydeckai/mcp-server-rememberizer).
### Integration Options
The Rememberizer MCP Server can be installed and integrated through multiple methods:
#### Via mcp-get.com
||CODE_BLOCK||bash
npx @michaellatman/mcp-get@latest install mcp-server-rememberizer
||CODE_BLOCK||
#### Via Smithery
||CODE_BLOCK||bash
npx -y @smithery/cli install mcp-server-rememberizer --client claude
||CODE_BLOCK||
#### Via SkyDeck AI Helper App
If you have the SkyDeck AI Helper app installed, you can search for "Rememberizer" and install mcp-server-rememberizer.

### Tools Available
The Rememberizer MCP Server provides the following tools for interacting with your knowledge repository:
1. **retrieve_semantically_similar_internal_knowledge**
- Finds semantically similar matches from your Rememberizer knowledge repository
- Parameters:
- `match_this` (string, required): The text to find matches for (up to 400 words)
- `n_results` (integer, optional): Number of results to return (default: 5)
- `from_datetime_ISO8601` (string, optional): Filter results from this date
- `to_datetime_ISO8601` (string, optional): Filter results until this date
2. **smart_search_internal_knowledge**
- Performs an agentic search across your knowledge sources
- Parameters:
- `query` (string, required): Your search query (up to 400 words)
- `user_context` (string, optional): Additional context for better results
- `n_results` (integer, optional): Number of results to return (default: 5)
- `from_datetime_ISO8601` (string, optional): Filter results from this date
- `to_datetime_ISO8601` (string, optional): Filter results until this date
3. **list_internal_knowledge_systems**
- Lists all your connected knowledge sources
- No parameters required
4. **rememberizer_account_information**
- Retrieves your Rememberizer account details
- No parameters required
5. **list_personal_team_knowledge_documents**
- Returns a paginated list of all your documents
- Parameters:
- `page` (integer, optional): Page number for pagination (default: 1)
- `page_size` (integer, optional): Documents per page (default: 100, max: 1000)
6. **remember_this**
- Saves new information to your Rememberizer knowledge system
- Parameters:
- `name` (string, required): Name to identify this information
- `content` (string, required): The information to memorize
### Setup
**Step 1:** Sign up for a new Rememberizer account at [rememberizer.ai](https://rememberizer.ai/).
**Step 2:** Add your knowledge to the Rememberizer platform by connecting data sources such as Gmail, Dropbox, or Google Drive.
<figure><img src="../.gitbook/assets/image.png" alt=""><figcaption></figcaption></figure>
**Step 3:** To selectively share your knowledge, set up a Mementos Filter. This allows you to choose which information is shared and which remains private. ([Guide here](https://docs.rememberizer.ai/personal/mementos-filter-access))
<figure><img src="../.gitbook/assets/image (3).png" alt=""><figcaption></figcaption></figure>
**Step 4:** Share your knowledge by creating a "Common Knowledge" (Guide [here](https://docs.rememberizer.ai/personal/common-knowledge) and [here](https://docs.rememberizer.ai/developer/registering-and-using-api-keys))
<figure><img src="../.gitbook/assets/image (4).png" alt=""><figcaption></figcaption></figure>
**Step 5:** To access your knowledge via APIs, create an API key ([Guide here](https://docs.rememberizer.ai/developer/registering-and-using-api-keys))
<figure><img src="../.gitbook/assets/image (5).png" alt=""><figcaption></figcaption></figure>
**Step 6:** If you're using the Claude Desktop app, add this to your `claude_desktop_config.json` file.
||CODE_BLOCK||json
{
  "mcpServers": {
    "rememberizer": {
      "command": "uvx",
      "args": ["mcp-server-rememberizer"],
      "env": {
        "REMEMBERIZER_API_TOKEN": "your_rememberizer_api_token"
      }
    }
  }
}
||CODE_BLOCK||
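As a quick sanity check before restarting Claude Desktop, you can launch the server by hand with the same command and environment variable the configuration above declares. This is a minimal sketch, assuming `uv`/`uvx` is installed and that you substitute a real API token; since the server communicates over stdio, a successful start simply waits silently for an MCP client to connect.
||CODE_BLOCK||bash
# Provide the same token the Claude Desktop config passes via "env"
export REMEMBERIZER_API_TOKEN="your_rememberizer_api_token"

# Launch the server exactly as Claude Desktop would (uvx downloads and runs the package)
uvx mcp-server-rememberizer
||CODE_BLOCK||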
**Step 7:** If you're using the SkyDeck AI Helper app, add the `REMEMBERIZER_API_TOKEN` environment variable to mcp-server-rememberizer.
<figure><img src="../.gitbook/assets/image (2) (1).png" alt=""><figcaption></figcaption></figure>
Congratulations, you're done!
With support from the Rememberizer MCP server, you can now ask the following questions in your Claude Desktop app or SkyDeck AI GenStudio:
* What is my Rememberizer account?
* List all documents that I have there.
* Give me a quick summary about "..."
## Rememberizer Vector Store MCP Server
The **Rememberizer VectorStore MCP Server** facilitates interaction between LLMs and the Rememberizer Vector Store, enhancing document management and retrieval through semantic similarity searches.
### Integration Options
The Rememberizer Vector Store MCP Server can be installed and integrated through similar methods as the main Rememberizer MCP Server:
#### Via Smithery
||CODE_BLOCK||bash
npx -y @smithery/cli install mcp-rememberizer-vectordb --client claude
||CODE_BLOCK||
#### Via SkyDeck AI Helper App
If you have the SkyDeck AI Helper app installed, you can search for "Rememberizer Vector Store" and install mcp-rememberizer-vectordb.

### Installation
To install the Rememberizer Vector Store MCP Server, follow the [guide here](https://github.com/skydeckai/mcp-rememberizer-vectordb#installation).
### Setup
**Step 1:** Sign up for a new Rememberizer account at [rememberizer.ai](https://rememberizer.ai/).
**Step 2:** Create a new Vector Store ([Guide here](https://docs.rememberizer.ai/developer/vector-stores))
<figure><img src="../.gitbook/assets/image (6).png" alt=""><figcaption></figcaption></figure>
**Step 3:** To manage your Vector Store via APIs, you need to create an API key ([Guide here](https://docs.rememberizer.ai/developer/vector-stores#api-key-management)) 
<figure><img src="../.gitbook/assets/image (7).png" alt=""><figcaption></figcaption></figure>
**Step 4:** If you're using the Claude Desktop app, add this to your `claude_desktop_config.json` file.
||CODE_BLOCK||json
{
  "mcpServers": {
    "rememberizer": {
      "command": "uvx",
      "args": ["mcp-rememberizer-vectordb"],
      "env": {
        "REMEMBERIZER_VECTOR_STORE_API_KEY": "your_rememberizer_vector_store_api_key"
      }
    }
  }
}
||CODE_BLOCK||
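To exercise the Vector Store server outside of Claude Desktop, the MCP Inspector (a general-purpose debugging tool from the Model Context Protocol project) can connect to it and let you browse and invoke its tools. This is a sketch under the assumption that Node.js and `uv`/`uvx` are available; the inspector package is upstream MCP tooling, not part of Rememberizer itself.
||CODE_BLOCK||bash
# Use the Vector Store API key created in Step 3 (exported variables are inherited by the child process)
export REMEMBERIZER_VECTOR_STORE_API_KEY="your_rememberizer_vector_store_api_key"

# Start the server under the MCP Inspector to list and try its tools interactively
npx @modelcontextprotocol/inspector uvx mcp-rememberizer-vectordb
||CODE_BLOCK||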
**Step 5:** If you're using the SkyDeck AI Helper app, add the `REMEMBERIZER_VECTOR_STORE_API_KEY` environment variable to mcp-rememberizer-vectordb.
<figure><img src="../.gitbook/assets/image (8) (1).png" alt=""><figcaption></figcaption></figure>
Congratulations, you're done!
With support from the Rememberizer Vector Store MCP server, you can now ask the following questions in your Claude Desktop app or SkyDeck AI GenStudio:
* What is my current Rememberizer vector store?
* List all documents that I have there.
* Give me a quick summary about "..."
## Conclusion
The Rememberizer MCP Servers demonstrate the powerful capabilities of the Model Context Protocol by providing an efficient, standardized way to connect AI models with comprehensive data management tools. These servers enhance the ability to search, retrieve, and manage documents with precision, using advanced semantic search and LLM agent augmentation.
==> personal/common-knowledge.md <==
---
description: >-
Enhance your knowledge or get started fast by adding AI access to pre-indexed
data from us and others.
---
# Common knowledge
## What is common knowledge
In Rememberizer, registered users **(publishers)** can select their uploaded documents through mementos and share them publicly as common knowledge. Other users **(subscribers)** can access this public knowledge and add it to their own resources.
By contributing their data, other users can collectively enhance the available information on the common knowledge page. This collaborative approach allows all users to access a richer data source, thereby improving the learning capabilities of their AI applications.
## Add public common knowledge
To subscribe to common knowledge and add it to your resources, follow the instructions below:
* On the navigation bar, choose **Personal > Common Knowledge**. You will see the public common knowledge page.
<figure><img src="../.gitbook/assets/navbar_browse_ck.png" alt="navbar browse ck"><figcaption></figcaption></figure>
<figure><img src="../.gitbook/assets/public_ck_page.png" alt="public ck page"><figcaption></figcaption></figure>
* Look for the common knowledge you want to subscribe to. You can look it up by typing its name in the search bar, and optionally use the filter option next to the search bar.
<figure><img src="../.gitbook/assets/filter_option_ck.png" alt="Filter of search bar" width="249"><figcaption><p>Filter of search bar</p></figcaption></figure>
<figure><img src="../.gitbook/assets/public_ck_search.png" alt="Example of a search result"><figcaption><p>Example of a search result</p></figcaption></figure>
* Click the **Add** button on the public common knowledge. After subscribing successfully, the **Add** button changes to a **Remove** button.
<figure><img src="../.gitbook/assets/not_add_ck.png" alt="Unadded common knowledge"><figcaption><p>Unadded common knowledge</p></figcaption></figure>
<figure><img src="../.gitbook/assets/added_ck.png" alt="Added common knowledge"><figcaption><p>Added common knowledge</p></figcaption></figure>
* Later, if you want to remove subscribed knowledge, click the **Remove** button.
## Create a common knowledge
For detailed instructions on creating and sharing common knowledge, visit [registering-and-using-api-keys.md](../developer/registering-and-using-api-keys.md "mention").
{% hint style="info" %}
Common knowledge is built on the foundation of [Mementos](mementos-filter-access.md), which allow you to control exactly which documents are shared. Once created, developers can access your common knowledge through APIs to build custom applications or integrate with [LangChain](../developer/langchain-integration.md) and [OpenAI GPTs](../developer/creating-a-rememberizer-gpt.md).
{% endhint %}
==> personal/mementos-filter-access.md <==
---
description: Use a Memento with each app integration to limit its access to your Knowledge
---
# Mementos Filter Access
### What is a Memento and Why do they Exist?
A major purpose of Rememberizer is to share highly relevant extracts of your data with third-party applications in a controlled fashion. This is achieved by applying a single **Memento** to each application that is integrated with Rememberizer and that you authorize to access your data.
{% hint style="info" %}
Mementos are the foundation for [creating Common Knowledge](../developer/registering-and-using-api-keys.md) that developers can access via API and for [creating a Rememberizer GPT](../developer/creating-a-rememberizer-gpt.md).
{% endhint %}
The current implementation of Mementos allows the user to select specific files, documents, or groups of content such as a folder or channel that can be used by that application. Later implementations will add additional ways to filter third-party access, such as time frames like "created in the last 30 days".\
\
Two default values are "None" and "All". All shares every file that the user has allowed Rememberizer to access; None shares nothing with the app in question. Selecting None allows a user to integrate an app with Rememberizer without having to decide then and there what content to make available. Selecting a Memento with None, or editing an existing applied Memento to share None, is a way to turn off an app's access to user data without removing the integration. It works like an off switch for your data. Custom Mementos can be purpose-made and given names that reflect their use, such as "Homework" or "Marketing".
<figure>
<div style="border: 2px dashed #ccc; padding: 20px; text-align: center; background-color: #f9f9f9;">
<p style="font-weight: bold;">Coming soon: Memento Data Access Control Visualization</p>
<p>This diagram will illustrate how Mementos filter data access between your integrations and third-party apps:</p>
<ul style="text-align: left; display: inline-block;">
<li>How data flows from various sources (Google Drive, Slack, etc.) into Rememberizer</li>
<li>The role of Mementos as configurable filters between your data and applications</li>
<li>Different permission scenarios with examples (All, None, Custom)</li>
<li>Before/after visualization showing how applying Mementos restricts application access</li>
</ul>
</div>
<figcaption>Visualization of how Mementos control third-party application access to your data</figcaption>
</figure>
### How to create a Memento?
This guide will walk you through the process of creating a Memento.
1. Navigate to the **Personal > Memento: Limit Access** tab, or visit [https://rememberizer.ai/personal/memento](https://rememberizer.ai/personal/memento). You should see all your Mementos on the left of the screen.
<figure><img src="../.gitbook/assets/memento_page.png" alt="memento page"><figcaption></figcaption></figure>
2. Click **Create a new memento**. Fill in a name for your custom memento and click **Create**. After that, you should see your memento added, along with the list of data sources that can be included in it.
<figure><img src="../.gitbook/assets/create_memento.png" alt="create memento"><figcaption></figcaption></figure>
<figure><img src="../.gitbook/assets/memento_detail.png" alt="memento detail"><figcaption></figcaption></figure>
3. Click **Refine** on the data source you want to refine; a side panel will pop up. Choose the folders or files to add, then click **Refine** to add them to the Memento.
<figure><img src="../.gitbook/assets/memento_refine_knowledge.png" alt="memento refine knowledge"><figcaption></figcaption></figure>
4. For a common knowledge source, you can click **Add** to include that knowledge in the Memento.
<figure><img src="../.gitbook/assets/memento_add_common_knowledge.png" alt="memento add common knowledge"><figcaption></figcaption></figure>
==> personal/manage-your-embedded-knowledge.md <==
---
description: >-
Rememberizer allows users to efficiently manage their stored files from
various sources. This section will show you how to access, search, filter, and
manage your uploaded files in your knowledge base.
---
# Manage your embedded knowledge
## Browse Embedded Knowledge Details page
On the navigation bar, choose **Personal > Your Knowledge**. Locate the **View Details** button on the right side of the "Your Knowledge" section and click it. Then, you will see the **Embedded knowledge detail** page.
<figure><img src="../.gitbook/assets/browse_knowledge_detail_page_1.png" alt="Your Knowledge section and <strong>View Details</strong> button"><figcaption><p>Your Knowledge section and <strong>View Details</strong> button</p></figcaption></figure>
<figure><img src="../.gitbook/assets/browse_knowledge_detail_page_2.png" alt="Embed Knowledge Detail page"><figcaption><p>Embed Knowledge Detail page</p></figcaption></figure>
The table of knowledge files' details includes these attributes:
* **Documents:** Name of the document or Slack channel.
* **Source:** The source from which the file was uploaded (Drive, Mail, Slack, Dropbox, or the Rememberizer App).
* **Directory:** The directory where the file is located in the source.
* **Status:** The status of the file (indexing, indexed, or error).
* **Size:** The size of the file.
* **Indexed on:** The date when the file was indexed.
* **Actions:** The button to delete the file. For files whose status is error, a retry icon also appears next to the trash icon (delete button).
## Features of the detail page
### Search and filter the files
You can search for a document by name using the **search bar**. Type the name in the bar, then press Enter to get your results.
<figure><img src="../.gitbook/assets/search_manage_knowledge_result.png" alt="Result of a search"><figcaption><p>Result of a search</p></figcaption></figure>
You can also use the **Status filter** and **Source filter** to quickly locate specific documents by narrowing down your search criteria.
<figure><img src="../.gitbook/assets/Source filter.png" alt="Source filter" width="247"><figcaption><p>Source filter</p></figcaption></figure>
<figure><img src="../.gitbook/assets/Status_filter.png" alt="Status filter" width="257"><figcaption><p>Status filter</p></figcaption></figure>
### Delete an uploaded file
Find the file you want to delete (search for it if needed). Then click the **trash icon** in the **Actions** column.
<figure><img src="../.gitbook/assets/delete_file.png" alt="File with delete icon"><figcaption><p>File with delete icon</p></figcaption></figure>
A modal will pop up to confirm the deletion. Click **Confirm** and the file will be deleted.
<figure><img src="../.gitbook/assets/delete_file_pop_up.png" alt="Delete confirmation modal"><figcaption><p>Delete confirmation modal</p></figcaption></figure>
### Retry indexing error files
You can retry embedding files that Rememberizer failed to index. To retry indexing a specific file, simply click the retry button next to the delete button in the **Actions** column.
<figure><img src="../.gitbook/assets/err_retry_button.png" alt="Retry button for specific error file"><figcaption><p>Retry button for specific error file</p></figcaption></figure>
To retry indexing all error files, click the retry button next to the **Actions** column label.
<figure><img src="../.gitbook/assets/err_retry_all_button.png" alt="Retry button for all error files"><figcaption><p>Retry button for all error files</p></figcaption></figure>
Below is an image showing the result after successfully retrying indexing of an error file from the Gmail integration.
<figure><img src="../.gitbook/assets/success_retry_indexing.png" alt="Success retry indexing error file"><figcaption><p>Success retry indexing error file</p></figcaption></figure>
==> personal/rememberizer-gmail-integration.md <==
---
description: >-
This guide will walk you through the process of integrating your Gmail
into Rememberizer as a knowledge source.
---
# Rememberizer Gmail integration
1. Sign in to your account.
2. Navigate to **Personal > Your Knowledge** tab, or visit [https://rememberizer.ai/personal/knowledge](https://rememberizer.ai/personal/knowledge). You should see all available knowledge sources there, including Gmail.
<figure><img src="../.gitbook/assets/gmail_personal_knowledge.png" alt="gmail personal knowledge"><figcaption></figcaption></figure>
3. Click the **"Connect"** button for the Gmail knowledge source. You will be redirected to a new page asking for your permission to allow Rememberizer to access your Gmail. Select your Gmail account.
<figure><img src="../.gitbook/assets/gmail_oauth_step_1.png" alt="gmail oauth step 1"><figcaption></figcaption></figure>
4. Approve the app by clicking **"Continue"**.
<figure><img src="../.gitbook/assets/gmail_oauth_step_2.png" alt="gmail oauth step 2"><figcaption></figcaption></figure>
5. Grant Rememberizer **permissions** to access your Gmail by clicking **"Continue"**.
<figure><img src="../.gitbook/assets/gmail_oauth_step_3.png" alt="gmail oauth step 3"><figcaption></figcaption></figure>
6. You'll be redirected back to our platform, where you should see your Gmail connected.
<figure><img src="../.gitbook/assets/gmail_auth_redirect.png" alt="gmail auth redirect"><figcaption></figcaption></figure>
7. Now that you're connected, you need to specify which email labels our product should embed. Click on the **"Select"** button and choose your desired email labels from the side panel. All emails that belong to the selected labels will be embedded.
<figure><img src="../.gitbook/assets/gmail_choose_knowledge.png" alt="gmail choose knowledge"><figcaption></figcaption></figure>
8. After selecting the labels, click **"Add"** to start embedding your knowledge. You also need to check the box to agree with Rememberizer's policy of sharing your Gmail data with third-party applications that you have specifically approved.
<figure><img src="../.gitbook/assets/gmail_choose_knowledge_checkbox.png" alt="gmail choose knowledge checkbox"><figcaption></figcaption></figure>
9. Once you've selected your labels, our system will begin embedding the emails and attachments. This process may take a few minutes, depending on the amount of data.
<figure><img src="../.gitbook/assets/gmail_indexing.png" alt="gmail indexing"><figcaption></figcaption></figure>
### What's next?
Use the [Memento](mementos-filter-access.md) feature to filter access to the sourced data. Combine this with your knowledge from other applications such as Slack, Box, Dropbox, etc. to form a comprehensive memento.
You can also [Search Your Knowledge](https://rememberizer.ai/personal/search) through our web UI, or better, use this knowledge in an LLM through our GPT app or our public API.
And that's it! If you encounter any issues during the process, feel free to contact our support team.
==> releases/dec-27th-2024.md <==
---
description: >-
This release focuses on enhancing vector store functionality with Agentic Search and improving stability through crucial bug fixes.
---
# Dec 27th, 2024
### New Features
- **Agentic Search in Vector Store**: Introduced advanced search capabilities in the vector store for more precise and efficient data retrieval.
### Improvements
- **Enhanced Data Management**: Implemented automatic removal of vector data when documents are deleted, ensuring data consistency.
### Bug Fixes
- **Vector Store Loading and Creation Fixes**: Resolved issues preventing vector store creation and loading of embedding binaries, improving system reliability.
==> releases/apr-5th-2024.md <==
---
description: >-
This update enhances integrations with Dropbox, Google Drive, and Slack, and
refines document management for a smoother user experience.
---
# Apr 5th, 2024
## New Features
* **New Knowledge Tree Support:** Extended the tree structure to better integrate with Dropbox and Google Drive, enabling more intuitive document and folder management.
* **Slack Reply Synchronization:** Added functionality to synchronize new Slack replies more effectively, helping keep communications seamless and updated.
## Bug Fixes
* **Common Knowledge Page Fixes:** Fixed bugs related to searching, pagination, and updating the DateTime format on the common knowledge page.
* **Show the Selected Files for Old Account Fixes:** We've fixed an issue where the selected files for old accounts weren't displayed correctly.
==> releases/may-17th-2024.md <==
---
description: >-
This release focuses on improving the user experience, enhancing integrations,
and fixing various issues. Key updates include Gmail synchronization and
displaying the directory path.
---
# May 17th, 2024
## New Features
* **Gmail Integration & Synchronization:** Connect your Gmail accounts to effortlessly manage emails in our platform. Last week, we introduced label-specific integration; this week, enjoy full synchronization of threads within a label, for seamless access and management.
## Improvements
* **Display Directory Path:** The application now displays the directory path, making it easier for users to navigate and locate their documents.
* **Updated Diagram:** The application's diagram has been updated to provide a clearer visual representation of the system architecture and data flow.
* **Changed Datasources Order:** The order of data sources has been optimized to improve the efficiency of data retrieval and processing.
* **Updated Logic for Fetching Data:** The logic for fetching data has been enhanced to improve the accuracy and reliability of the retrieved information.
## Bug Fixes
* **Fixed Delete Document Button UI:** The user interface for the delete document button in embedded details has been fixed to provide a better user experience.
==> releases/nov-22nd-2024.md <==
---
description: >-
This release focuses on enhancing integration flexibility, expanding model selection options, and improving performance in document processing and loading.
---
# Nov 22nd, 2024
### New Features
- **Multiple Accounts per Integration**: Users can now connect and manage multiple accounts for each integration (Google Drive, Dropbox, Gmail, and Slack), providing greater flexibility.
- **OpenAI Models in Embedding Options**: OpenAI models are now available for selection when creating or editing Vector Stores, offering more choices for embedding models.
- **Editable Embedding Model Credentials**: Users can now update embedding model credentials when editing a Vector Store, simplifying model management.
### Improvements
- **Parallel Document Processing**: Documents are now processed in parallel, increasing speed and efficiency.
- **Improved Document Loading**: Optimizations in the loading workers enhance document loading times.
==> releases/oct-4th-2024.md <==
---
description: >-
This release focuses on enhancing performance and stability, with significant improvements to synchronization processes and fixes to known issues.
---
# Oct 4th, 2024
### Improvements
- **Optimized Google Drive Navigation**: Improved performance of the Google Drive knowledge tree for faster and smoother browsing.
- **Enhanced Synchronization Efficiency**: Optimized document synchronization by refining task management for quicker updates.
### Bug Fixes
- **Resolved Crash When Disconnecting Data Source**: Fixed an issue where disconnecting a data source while the knowledge panel was open caused the app to crash.
==> releases/sep-20th-2024.md <==
---
description: >-
This release focuses on various improvements, new features, and bug fixes to enhance user experience and functionality.
---
# Sep 20th, 2024
### Improvements
- **Enhanced Formatting for Numbers**: Big numbers now display with commas for easier reading.
- **Updated Document Handling**: Improved mechanism to manage and index documents efficiently, even in larger folders.
- **Optimized Slack and Document Handling**: Enhanced the API to retry all failed documents and Slack channels, ensuring smoother operations.
### New Features
- **Membership Update**: Memberships now update based on loading results for more accurate data.
- **Random Document Selection**: Introduced random selection for embedding and loading to diversify document processing.
### Bug Fixes
- **Dropbox Sync**: Temporarily disabled Dropbox sync to prevent potential data issues.
- **Search Field Improvement**: The search field on the Knowledge Details page now autofills based on the "file" query parameter for more precise searches.
- **Reindex Collection Post-Loading**: Enhanced the loading result API to reindex collections automatically.
==> releases/nov-8th-2024.md <==
---
description: >-
Our latest release focuses on enhancing performance, improving reliability, and providing a better user experience through various optimizations and fixes.
---
# Nov 8th, 2024
### Improvements
- **Updated Onboarding Experience**: Enhanced the new user onboarding visuals with updated Gmail integration for a smoother start.
- **Optimized Performance**: Improved application speed and efficiency by reapplying half-precision vectors.
- **Enhanced Search Capabilities**: Improved indexing for better search results and quicker information retrieval.
- **Improved Document Processing Reliability**: Enhanced handling of retries during embedding tasks for more reliable document processing.
### Bug Fixes
- **Fixed Document Synchronization Errors**: Resolved issues related to document synchronization and processing errors for increased application stability.
- **Resolved Memento Access Error**: Fixed an error where memento documents were not accessible.
- **Ensured Document Indexing**: Fixed an issue preventing the creation of vector store tables, ensuring all documents are properly indexed and searchable.
==> releases/jun-14th-2024.md <==
---
description: >-
This release improves error handling, enhances the memento sidebar, and
refines tests. Key updates include memento size display, better error
responses, and automatic version checks.
---
# Jun 14th, 2024
## New Features
* **Show Mementos' Size:** The size of mementos is now displayed in the memento sidebar, providing users with better insights into their storage usage.
* **Check for the Latest Version:** We've added a feature that allows the desktop app to automatically check for and notify users of the latest version available.
## Bug Fixes
* **Return 404 for Deleted Mementos:** Retrieving a deleted memento or one that belongs to another user now returns a 404 error instead of a server error.
* **Update Size for Third-Party Apps:** Fixed an issue where third-party app memory documents didn't trigger size updates for mementos.
==> releases/dec-20th-2024.md <==
---
description: >-
This release introduces Microsoft Authentication, Reset Password functionality, and enhances search capabilities and integrations with Common Knowledge.
---
# Dec 20th, 2024
### New Features
#### Rememberizer Frontend and Backend
- **Microsoft Authentication Support**: Users can now log in using their Microsoft accounts for a seamless authentication experience.
- **Reset Password Functionality**: Users can now reset their passwords directly from the application, improving account accessibility.
#### Rememberizer Frontend
- **Agentic Search Mode**: Introducing a new search mode that enhances the way users search within Rememberizer.
### Improvements
#### Rememberizer Backend
- **Enhanced Common Knowledge Integration**: Improved support and compatibility with Common Knowledge, allowing better search and memorization of documents created via the Common Knowledge API.
- **Improved Data Processing Reliability**: Ensured data processing tasks are correctly triggered after uploads, enhancing data reliability.
#### Rememberizer Loading Worker
- **Optimized Data Synchronization**: Improved data synchronization between databases and user accounts for a consistent user experience.
### Bug Fixes
#### Rememberizer Frontend
- **Email Verification Redirect Fix**: Resolved an issue with email verification redirects and corrected the main layout display.
==> releases/mar-4th-2024.md <==
---
description: >-
This release introduces new features like shared knowledge creation and
display, and memento renaming. Improvements include key bug fixes about
Dropbox, query result and Common knowledge UI.
---
# Mar 4th 2024
## New Features
* **Shared Knowledge**: A new feature to create and display shared knowledge has been implemented.
* **Memento Renaming**: Users can now rename their mementos.
## Bug Fixes
* **Dropbox Files Display**: Resolved an issue with incorrect file display in Dropbox.
* **Querying Result Order**: Fixed a bug where querying results with consecutive chunks returned a disordered result.
* **Common Knowledge UI**: Fixed several UI issues with the Common Knowledge feature.
==> releases/dec-13th-2024.md <==
---
description: >-
This release focuses on introducing email and password authentication, enhancing search capabilities, and resolving key issues for improved stability.
---
# Dec 13th, 2024
### New Features
- **Email and Password Authentication**: Users can now log in using their email and password for enhanced security and convenience.
### Improvements
- **Enhanced Search Functionality**: Improved the search system by evaluating each chunk separately for more accurate and relevant results.
### Bug Fixes
- **Search Parameters Resolution**: Fixed issues where certain search options were not functioning correctly.
- **Database Stability Enhancement**: Resolved a database issue that improved overall system stability during collection status updates.
- **Sign-In Error Message Update**: Updated sign-in error messages to provide clearer and more helpful information.
- **Agentic Search Error Fix**: Corrected an error in the agentic search feature to prevent unexpected application behavior.
==> releases/jul-11th-2025.md <==
---
description: >-
This release introduces team apps sharing, increases user embed quota, and improves Gmail auto-sync.
---
# Jul 11th, 2025
### New Features
- **Team Apps Sharing**: Introduced the ability to share apps within teams, enhancing collaboration among team members.
### Improvements
- **Increased User Embed Quota**: Expanded the user embed quota from 1GB to 20GB, allowing users to store more data.
### Bug Fixes
- **Improved Gmail Auto-sync**: Resolved issues with Gmail auto-sync to ensure memento updates occur seamlessly.
==> releases/apr-11th-2025.md <==
---
description: >-
This release focuses on improving the clarity of Slack notifications.
---
# Apr 4th, 2025
### Bug Fixes
- **Fixed Slack Notification Formatting**: Resolved an issue where Slack notifications included unwanted HTML tags, ensuring clearer messages.
==> releases/aug-16th-2024.md <==
---
description: >-
This release focuses on enhancing search capabilities and improving document management features.
---
# Aug 16th, 2024
### New Features
- **Enhanced Search Filters**: Added the ability to filter search results by sender and recipient, making it easier to find specific emails.
- **Document Creation Date Display**: Now shows the document creation date in document lists for better document management.
### Improvements
- **Improved Search Reliability**: Enhancements to search functions provide a smoother and more reliable experience.
### Bug Fixes
- **Email Integration Fix**: Resolved issues with Gmail integration when using GPT to ensure smooth operation.
- **Desktop App Content Display Fix**: Fixed issues with document content display in the desktop app for a better user experience.
==> releases/mar-11th-2024.md <==
---
description: >-
This update brings new features and improvements, including streamlined Slack
integration, enhanced documents, and a more efficient user signup process.
We've also fixed some bugs.
---
# Mar 11th, 2024
## New Features
* **User Slack Data Migration:** User Slack data can now be migrated to accommodate Slack threads and replies, enhancing user interaction.
* **Common Knowledge Integration:** Common knowledge has been added to the integration sources endpoint, expanding our system's capabilities.
* **Pin Shared Knowledge Items:** System administrators can now pin shared knowledge items to the top of the list, enhancing visibility and accessibility.
* **Safe Document Handling:** The system will no longer fail on empty documents, improving system reliability.
* **Manage Shared Knowledge:** Users can now delete and edit their shared knowledge, providing more control over shared content.
## Improvements
* **Rememberizer UI Update:** The Rememberizer UI has been updated based on the new format of Slack Replies.
## Bug Fixes
* **Switching Between Common Knowledge:** Fixed an issue when switching between common knowledge when refining memento.
* **Unsupported Document Visibility:** Fixed the issue causing unsupported documents to be displayed.
* **User Document List:** Subscribed documents will no longer appear in the list of user documents.
* **Memento Size Estimation:** Rectified the incorrect calculation of the estimated size of the memento.
==> releases/jul-26th-2024.md <==
---
description: >-
This release focuses on improving our Slack integration, enhancing the user
interface, and resolving critical issues to provide a smoother experience.
---
# Jul 26th, 2024
**New Features:**
* **Slack Channel Counter**: A new feature that accurately counts and displays the number of Slack channels, helping users better manage their workspace connections.
**Improvements:**
* **Updated Slack Integration UI**: The user interface for Slack integration has been refreshed to support the new channel mechanism, making it more intuitive and easier to use.
* **App Name Update**: The desktop application name has been updated to "Rememberizer," reflecting our commitment to helping users organize and remember important information.
**Bug Fixes:**
* **Google Drive Integration**: Resolved an issue that caused errors when accessing Google Drive folders, ensuring smoother navigation and file management.
==> releases/sep-27th-2024.md <==
---
description: >-
This release focuses on enhancing synchronization performance and navigation for Dropbox and Google Drive, providing you with a smoother and more efficient experience.
---
# Sep 27th, 2024
### Improvements
- **Enhanced Cloud Synchronization**: Optimized the synchronization processes for Dropbox and Google Drive, resulting in faster and more reliable file updates.
- **Improved Dropbox Navigation**: Refined the Dropbox knowledge tree for more efficient file organization and easier access.
- **Regular Sync Schedule**: Set synchronization tasks for Google Drive, Dropbox, and Gmail to occur every 6 hours, ensuring your content stays consistently up-to-date.
==> releases/feb-26th-2024.md <==
---
description: >-
In this release, we've implemented an image size limit of 1MB for uploads and
enhanced document display in the Selection Panel. We've also fixed a bug
related to data source disconnection.
---
# Feb 26th, 2024
## Improvements
* **Image Size Limit**: Cropped images for shared knowledge must not exceed 1MB in size.
* **Document Display Enhancement**: We have increased the number of documents that can be displayed in the tree structure within the Right Selection Panel for improved user experience.
## Bug Fixes
* **Data Source Disconnection**: Fixed an issue where disconnecting a data source did not appropriately delete documents and remove the data source.
==> releases/aug-2nd-2024.md <==
---
description: >-
This release focuses on improving the overall performance, data handling, and
error management of our application. Users can expect a more robust and
efficient experience.
---
# Aug 2nd, 2024
**New Features:**
* **Improved Search Functionality**: The search feature now runs parallel content retrieval, delivering faster and more accurate results.
* **Refined Document Notification System**: Users will receive more precise notifications about document updates, enhancing collaboration and workflow management.
* **Updated API Key Format**: Updated the API key prefix for improved security and easier identification.
**Improvements:**
* **Enhanced Data Management**: The system now handles empty documents more effectively, ensuring all relevant information is properly indexed and loaded.
* **Optimized Memento Organization**: Refinements to the memento sidebar provide a clearer view of documents and folders, making navigation more intuitive.
* **Streamlined Data Processing**: Implementation of a new embedding mechanism and vector database adaptation for more efficient data handling and analysis.
**Bug Fixes:**
* **Email Encoding Compatibility**: Updated the system encoding format when the email charset is incorrect, preventing potential display issues.
* **Gmail Label Management**: Resolved an issue when deleting Gmail labels, ensuring smoother email integration.
* **Exception Handling**: Improved error notification system to better manage and communicate system exceptions.
==> releases/jun-20th-2025.md <==
---
description: >-
This release focuses on enhancing performance and stability with major improvements and critical bug fixes.
---
# June 17th, 2025
### Improvements
- **Enhanced API Performance**: Improved API performance for smoother operations.
- **Enhanced Processing Efficiency**: Optimized data embedding processes for faster and more reliable results.
- **Improved Database Synchronization**: Ensured reliable data management for a better user experience.
- **Restored Task Names**: Task names are now visible for improved task management.
- **Optimized File Processing**: Improved file processing for better performance.
### Bug Fixes
- **Resolved Integration Ownership Issue**: Fixed an issue where certain integrations could not identify the owner.
- **Improved Google Drive Error Handling**: Enhanced handling of inaccessible Google Drive files to prevent unnecessary retries.
- **Fixed Dropbox Document Error**: Resolved errors when processing empty documents from Dropbox.
==> releases/jun-6th-2025.md <==
---
description: >-
This release focuses on improving performance, enhancing data integrity, and updating authentication options.
---
# June 6th, 2025
### Improvements
- **Updated Authentication Options**: Microsoft Sign-In/Sign-Up option has been removed; please use alternative authentication methods.
- **Improved Document Access Speed**: Faster access to documents due to optimization of document indexing and content retrieval.
- **Enhanced System Reliability**: Improved error tracking mechanisms for better system stability.
### Bug Fixes
- **Improved Data Validation**: Resolved issues with document validation during bulk uploads, enhancing data integrity.
- **Performance Optimization**: Addressed slow queries when accessing indexed documents, improving overall system responsiveness.
==> releases/oct-25th-2024.md <==
---
description: >-
This release focuses on improving document indexing reliability and includes various bug fixes to enhance your experience.
---
# Oct 25th, 2024
### New Features
- **Automatic Retry for Indexing Failures**: Implemented an auto-retry mechanism to ensure documents that failed to index are retried, enhancing data consistency.
### Bug Fixes
- **Improved Search Functionality**: Fixed an issue preventing searches from apps connected to mementos without memories.
- **System Stability Enhancements**: Resolved overlapping database connections during concurrent tasks to improve performance.
- **Slack Synchronization Adjustments**: Temporarily disabled synchronization for empty Slack channels to avoid unnecessary errors.
==> releases/oct-18th-2024.md <==
---
description: >-
This release focuses on improving document saving reliability.
---
# Oct 18th, 2024
### Bug Fixes
- **Enhanced Document Saving Stability**: Improved the document saving process to prevent potential conflicts during simultaneous edits.
==> releases/apr-12th-2024.md <==
---
description: >-
This release enhances document synchronization, streamlines common knowledge
management, and optimizes the user interface, improving overall system
efficiency and user experience.
---
# Apr 12th, 2024
## New Features
* **Automatic Sync for Cloud Storage:** Users can now set automatic synchronization for selected folders and files in Dropbox and Google Drive, streamlining document management processes.
## Improvements
* **Optimized Document Ordering:** The order of documents can now be set by indexed date or name, facilitating more intuitive navigation and retrieval.
* **UI Updates for Memento Management:** The common knowledge memento UI has been updated, including a new toggle for sharing settings, improving user control over data sharing.
* **UI Responsiveness and Customization:** Minor UI fixes have been implemented.
## Bug Fixes
* **Onboarding Process:** Resolved an issue where common knowledge was not displayed during a user's onboarding step, enhancing the initial setup experience for new users.
==> releases/README.md <==
---
description: Public Declarations, Compliance Alterations, and User Assistance Updates.
---
# Releases
© 2024 SkyDeck AI Inc.
==> releases/aug-9th-2024.md <==
---
description: >-
This release focuses on enhancing user experience, improving document
management, and refining search capabilities in the Rememberizer.
---
# Aug 9th, 2024
**New Features**
* **Slack Channel Integration**: Enhanced support for Slack channels, improving communication and collaboration within the app.
* **Document Status Filter**: Added a new filter for document status on the embedding details page, making it easier to track and manage documents.
* **Layered Document Display**: Implemented a new tree view in the memento sidebar, organizing documents and folders in layers for improved navigation.
* **Advanced Search Capabilities**: Introduced date range filters for search functionality, allowing for more precise document retrieval.
**Improvements**
* **Document Management**: Refined the process of linking documents to the knowledge details page, simplifying document organization and access.
* **User Interface Updates**: Various UI enhancements to improve overall app usability and visual appeal.
* **Performance Optimization**: Refactored code and updated API calls to enhance app performance and responsiveness.
**Bug Fixes**
* **Empty Search Query Handling**: Resolved an issue where empty search queries were not properly handled, improving search reliability.
* **Email Integration**: Fixed an issue related to email source handling when interacting with GPT, ensuring smoother integration with email services.
==> releases/feb-19th-2024.md <==
---
description: >-
This release brings improvements to the Memento tree with better sorting and
fixes a bug affecting API requests in GPT apps.
---
# Feb 19th, 2024
## Improvements
* **Alphabetical Sorting in Memento Tree**: For enhanced navigation, files and Slack channels within the Memento tree are now organized alphabetically.
## Bug Fixes
* **GPT Apps**: We've fixed a problem that was stopping API requests from being made through newly set-up GPT apps.
==> releases/jul-4th-2025.md <==
---
description: >-
This release focuses on introducing the Team Knowledge Sharing feature, enhancing team management, and adding memory management for common knowledge.
---
# Jul 4th, 2025
### New Features
- **Team Knowledge Sharing**: Enable team members to share knowledge collectively, enhancing collaboration and information accessibility.
- **Memory Management for Common Knowledge**: Added functionality to manage and track shared knowledge effectively, improving team awareness and retention.
### Improvements
- **Enhanced Team Member Management and Invitation System**: Improved the team member invitation process and management system for a smoother onboarding experience.
### Bug Fixes
- **Improved Application Stability**: Enhanced system reliability through code optimizations and error handling improvements.
==> releases/jan-17th-2025.md <==
---
description: >-
This release focuses on input validation security and simplified account linking with password reset functionality.
---
# Jan 17th, 2025
### Improvements
- **Input Validation Improved**: Updated text input fields to accept only safe characters, enhancing data integrity.
- **Password Reset During Account Linking**: Users can now reset their password when linking their account for the first time, streamlining the onboarding process.
==> releases/mar-18th-2024.md <==
---
description: >-
This release focuses on enhancing user experience with improved onboarding,
memento management and responsive UI. Key updates include removal of image
size limit, display of memento sizes.
---
# Mar 18th, 2024
## New Features
* **Create a New Memento Button:** We've added a new button to create mementos while authorizing the app, making the process more user-friendly.
* **No Size Limit for Image Uploader:** Users can now upload images of any size, offering more flexibility in document design.
* **Common Knowledge Size Display:** We've added the feature to show the size of common knowledge items, providing more transparency in storage usage.
## Improvements
* **Slack Channels' Indexed Time:** The indexed time will now be updated when checking for new messages, and the document’s `INDEXED` status will be maintained, improving document search efficiency.
* **Smoother Onboarding:** We've reduced redundant steps in the onboarding flow, making it quicker and more efficient.
* **Responsive UI for Common Knowledge:** We've optimized the UI for common knowledge on the memento page to be responsive, improving readability on various devices.
* **Memento Size Display:** The size of mementos is now displayed when authorizing an app, helping users understand their authorized mementos better.
## Bug Fixes
* **User-Rename-Application:** The case where a user renames an application is now handled properly, preventing potential errors.
==> releases/jan-15th-2024.md <==
---
description: First release of Rememberizer.
---
# Jan 15th, 2024
## New Features
* **Document Search**: Find your documents easily with our efficient search feature.
* **Google Drive Integration**: Manage your files seamlessly through Google Drive.
* **Developer Hub**: A user-friendly space for developers to easily register and configure their applications for integration with Rememberizer.
* **Memento Management**: Easily create, list, and delete your mementos.
* **Data Source Management**: Easily connect and disconnect your data source.
* **Easy Onboarding**: Our onboarding status feature is designed for a smooth start for all users and developers.
==> releases/dec-6th-2024.md <==
---
description: >-
This release focuses on introducing new team collaboration features and enhancing content retrieval to improve your productivity and collaboration experience.
---
# Dec 6th, 2024
### New Features
- **Team Collaboration Tools**: Introducing new features that enable seamless teamwork, allowing you to collaborate, share documents, and communicate effectively within your team. Existing users can easily set up and join teams for enhanced collaboration.
- **Enhanced Content Retrieval**: Improved retrieval capabilities provide faster and more accurate access to your documents, helping you find the information you need more efficiently.
==> releases/feb-12th-2024.md <==
---
description: >-
In this release, we've introduced a public common knowledge page, made
improvements to memento structure and onboarding UI, and fixed a bug with app
authorization counting.
---
# Feb 12th, 2024
## New Features
* **Public Common Knowledge Page**: A new public common knowledge page has been implemented for better information access and sharing.
* **Common Knowledge in Onboarding**: Users can now add common knowledge directly from the onboarding page.
* **Tree Structure for Memento**: The files in a memento are now returned in a tree structure for better clarity and navigation.
## Improvements
* **UI for Onboarding Steps**: The user interface for onboarding steps has been tweaked for better user experience.
==> releases/nov-1st-2024.md <==
---
description: >-
This release focuses on enhancing performance, improving authentication, and increasing overall reliability for a better user experience.
---
# Nov 1st, 2024
### Improvements
- **Faster Search Performance**: Optimized backend processes to provide quicker access to your documents.
- **Enhanced Authentication System**: Upgraded authentication for improved security and reliability.
- **Improved Indexing Reliability**: Enhanced monitoring for document indexing to ensure all your documents are searchable.
- **Optimized System Performance**: Implemented backend optimizations for a faster and more efficient service.
### New Features
- **Automatic Data Source Reconnection**: Data sources now stay connected automatically, ensuring uninterrupted access to your information.
### Bug Fixes
- **Enhanced Privacy Controls**: Fixed an issue that prevented unauthorized listing in user views, improving privacy.
- **Resolved App Authorization Issues**: Corrected redirect problems with authorized apps for seamless access.
==> releases/nov-15th-2024.md <==
---
description: >-
This release focuses on enhancing the user authentication experience, including smoother login redirects and improved desktop app support.
---
# Nov 15th, 2024
### New Features
- **Desktop App Authentication**: Users can now authenticate directly through our desktop application for a more integrated experience.
### Improvements
- **Seamless Login Redirects**: Unauthenticated users are now redirected to their original page after logging in, ensuring uninterrupted navigation.
==> releases/jan-29th-2024.md <==
---
description: >-
This release offers an enhanced user experience with improved document size
management, a more intuitive search interface, and seamless Dropbox
integration. We've also addressed key bugs.
---
# Jan 29th, 2024
## New Features
* **Dropbox Integration**: You can now index, reindex, list, and submit Dropbox files directly within our platform.
* **Dropbox in Onboarding Step**: Dropbox integration is now part of the onboarding step, making it easier to set up.
## Improvements
* **Document Size Limit**: We've limited the total document size for each user to 1GB to ensure optimal performance.
* **Improved Search Experience**: The search interface has been enhanced for better user experience.
## Bug Fixes
* Fixed issues with handling empty documents for smoother operations.
* Resolved errors while handling Slack attachments for seamless integration.
* Linked the 'Sign Up' button correctly to the 'Sign Up' page.
==> releases/apr-18th-2025.md <==
---
description: >-
This release focuses on improving data freshness, account API enhancements, notification management, and providing enhanced documentation.
---
# April 18th, 2025
### New Features
- **Embedding Worker Tracking**: Introduced monitoring capabilities for the embedding worker to enhance performance and reliability.
- **Enhanced Documentation**: Provided updated documentation for both human users and AI assistants for improved guidance.
### Improvements
- **Improved Data Freshness**: Implemented cache timeouts to ensure users receive the most up-to-date information.
- **Optimized Notifications**: Adjusted alert settings to prevent duplicate notifications and enhance user experience.
### Bug Fixes
- **Account API Correction**: Resolved an issue where the account API did not return the correct CK description when using the CK API key.
==> releases/may-30th-2025.md <==
---
description: >-
This release focuses on enhancing document processing efficiency and stability, simplifying API key management, and improving system security.
---
# May 30th, 2025
### New Features
- **Simplified API Key Management**: Improved generation and management of API keys for common knowledge records, making API access easier.
### Improvements
- **Enhanced Document Processing Efficiency**: Optimized document processing to handle documents independently, resulting in better performance and reliability.
- **Optimized API Response Times**: Improved API response times by processing deletions in the background.
- **Improved Error Reporting**: Enhanced error reporting for loading processes to increase system reliability.
- **Regular Security Maintenance**: Implemented security updates and maintenance to enhance system security and performance.
### Bug Fixes
- **Resolved Deadlock Issues**: Fixed deadlock errors during document processing and index creation, improving system stability.
- **Fixed Duplicate Reindexing**: Resolved an issue causing multiple reindexing triggers for documents, ensuring efficient indexing operations.
- **Enhanced Logging for Document Processing**: Improved logging for document processing status updates to facilitate better monitoring and troubleshooting.
==> releases/mar-14th-2025.md <==
---
description: >-
This release focuses on improving error handling when refreshing Gmail and Google Drive tokens.
---
# Mar 14th, 2025
### Bug Fixes
- **Enhanced Token Refresh Reliability**: Improved error handling when refreshing Gmail and Google Drive tokens to ensure continuous connectivity and synchronization.
==> releases/apr-26th-2024.md <==
---
description: >-
This update brings advanced memento integration, improved sync features for
Dropbox and Google Drive, and critical bug fixes to enhance user experience
and system reliability.
---
# Apr 26th, 2024
## New Features
* **Search Functionality for Public Apps:** A new search feature has been added to the public apps page, allowing users to find apps more efficiently.
## Improvements
* **Layout Update for Connected Apps:** The layout of the 'Your Connected Apps' page has been updated for better user experience and navigation.
* **Common Knowledge Card Update:** The common knowledge card in the refining memento page now shows size instead of document count, providing clearer information on storage usage.
* **Enhancement of the Auto Sync Feature for Dropbox and Google Drive:** The auto-sync feature for Dropbox and Google Drive has been enhanced, providing a smoother and more reliable synchronization experience.
* **Pagination for the Public App Page:** We've implemented pagination on the public app page, improving navigation and load times for a better user experience.
* **Update Refine Button in Memento for Common Knowledge Cards:** The refine button in mementos for common knowledge cards has been updated, enhancing usability and clarity.
## Bug Fixes
* **Indexing Issue for Child Files:** Fixed a bug where child files in a selected folder weren't indexed correctly when connecting integrations for the first time, ensuring comprehensive file management.
* **Sign-Out Issue on Search Failure:** Resolved an issue where a failed search for non-existent mementos forced users to sign out, improving error handling and user retention.
* **Profile Edit Validation:** Addressed a validation issue on the edit profile page, ensuring information is accurately captured and processed.
==> releases/sep-13th-2024.md <==
---
description: >-
This release focuses on improving data indexing, usage tracking, performance, and user experience enhancements.
---
# Sep 13th, 2024
### Improvements
- **Improved Usage Tracking**: New logic provides more accurate monitoring of your storage and usage limits.
- **Enhanced Performance**: Memento actions are now optimized for better responsiveness.
- **Enhanced Error Display**: Error messages on the knowledge page are clearer when document indexing fails, making it easier to identify issues.
- **Streamlined Data Source Connection**: The data source panel now opens automatically after connecting, simplifying the setup process.
- **Improved Default Settings**: Default user settings have been updated to enhance performance and accuracy.
### New Features
- **Batch Document Deletion**: You can now delete multiple documents at once, simplifying data management.
- **Automatic Re-indexing**: Collections automatically re-index when needed, ensuring up-to-date search results.
### Bug Fixes
- **Fixed Indexing Bugs**: Resolved issues with data indexing to improve search reliability.
- **Reduced Notification Spam**: Fixed an issue causing excessive notifications related to document membership.
==> releases/oct-11th-2024.md <==
---
description: >-
This release introduces our new vector database service for more efficient data handling, along with system stability enhancements and critical bug fixes to improve your overall experience.
---
# Oct 11th, 2024
### New Features
- **Vector Database Service**: Introduced a new vector database service for more efficient data storage and faster information retrieval.
### Improvements
- **Enhanced System Stability**: Improved backend processes to prevent race conditions, ensuring smoother document processing.
- **Optimized Connection Management**: Implemented better connection handling to enhance performance and reliability.
### Bug Fixes
- **Fixed Membership Merging Issue**: Resolved an issue that caused errors when merging membership data in the vector store.
==> releases/may-23rd-2025.md <==
---
description: >-
This release focuses on enhancing content sharing capabilities, improving performance, and fixing critical bugs for a better user experience.
---
# May 23rd, 2025
### New Features
#### Rememberizer Frontend
- **New Maintenance Page**: Introduced a maintenance page for better communication during downtime.
### Improvements
#### Rememberizer Backend
- **Enhanced Content Sharing**: Improved public sharing capabilities for content.
- **Improved Performance**: Faster processing of folders and files for a better user experience.
- **Improved Reporting**: Enhanced daily reports for failed documents to aid in quicker issue resolution.
#### Rememberizer Frontend
- **Enhanced Maintenance Handling**: Maintenance notifications now appear only when essential services are unavailable.
- **Security Updates**: Regular security maintenance and updates.
### Bug Fixes
#### Rememberizer Backend
- **Google Drive Integration Fixed**: Resolved issues embedding Google Drive folders.
- **Fixed Document Retrieval**: Resolved an issue causing incorrect document IDs for memento documents.
==> releases/mar-21st-2025.md <==
---
description: >-
This release enhances document processing, improves security, and fixes issues with notifications and data updates.
---
# March 20th, 2025
### Improvements
#### Rememberizer Backend
- **Improved Document Processing**: Enhanced distribution of document processing for better efficiency.
- **Improved Spam Prevention**: Enhanced measures to prevent spam in document processing.
#### Rememberizer Frontend
- **Security Enhancements**: Implemented regular security maintenance and updates.
### Bug Fixes
#### Rememberizer File Worker
- **Resolved Errors in Google Drive and Email Updates**: Fixed an issue where Google Drive and email updates were not processing correctly.
#### Rememberizer Backend
- **Fixed Sign-Up Notification Email**: Resolved an issue where sign-up notification emails were not sent correctly.
- **Fixed Deletion Errors**: Resolved an issue where deleting non-existent collections caused errors.
==> releases/jul-12th-2024.md <==
---
description: >-
This release brings exciting improvements to document search, memento
organization, and integration management. We've enhanced the user experience
with smoother navigation and more efficient data handling.
---
# Jul 12th, 2024
### New Features
* **Document Search**: Enjoy a powerful new search functionality that helps you find the information you need quickly and easily within your documents. 
* **New Memento Tree Structure**: Experience a new way to organize your mementos with our intuitive tree structure, making it easier to navigate and manage your information. 
* **Auto Sync for Mementos**: Keep your data up-to-date effortlessly with our new automatic synchronization feature for mementos.
### Improvements
* **Enhanced Memento Organization**: We've refined the memento sidebar to provide a clearer view of your documents and folders, making navigation a breeze. 
* **Integration Management**: Easily filter and manage your integrations with a new dropdown feature, giving you more control over your connected services. 
* **Faster Document Search**: Our new debounced search feature provides quicker, more responsive results as you type. 
* **Homepage and Knowledge Page Updates**: We've reorganized the layout of integrations on key pages to improve accessibility and user experience.
### Bug Fixes
* **Improved Integration Reliability**: We've enhanced our system to better handle information from connected services, ensuring a smoother experience when using integrations. 
* **Cleaner User Interface**: We've removed unnecessary warning messages on the Knowledge page for a more streamlined look.
==> releases/nov-29th-2024.md <==
---
description: >-
This release focuses on enhancing document handling and user interface, providing a smoother and more reliable experience.
---
# Nov 29th, 2024
### Improvements
- **Enhanced Document Handling**: Optimized document display and performance for a better user experience.
- **Support for Multiple Dropbox Accounts**: Improved Dropbox integration by enabling support for multiple accounts through re-authentication.
- **Improved Document Fetching Reliability**: Implemented automatic retries when fetching documents from Google Drive or Dropbox to enhance reliability.
### Bug Fixes
- **Resolved Add Knowledge Button Overlap**: Fixed an issue where the add knowledge button overlapped other components, improving the user interface.
==> releases/feb-5th-2024.md <==
---
description: >-
This release enhances user profile management, improves Slack and Dropbox
integration, introduces account deletion feature, and addresses key
operational issues.
---
# Feb 5th, 2024
## New Features
* **Dropbox Shared Files**: You can now fetch shared files/folders from Dropbox directly within our platform.
* **Account Deletion**: Users now have the option to delete their account if needed.
* **Slack Synchronization**: We've initiated synchronization with Slack for improved integration, although Slack thread synchronization is not yet included.
* **User Profiles**: Users can now update their profile information more efficiently.
## Improvements
* **Slack Channels**: Slack channels are now sorted by name for easier navigation.
## Bug Fixes
* Resolved an issue regarding invalid origin in App directory.
* Resolved an error with OpenAI GPT for improved API calls.
==> releases/apr-19th-2024.md <==
---
description: >-
This update boosts security, refines interfaces, and addresses critical bugs,
featuring new API restrictions, updated keys, and a macOS app.
---
# Apr 19th, 2024
## New Features
* **Desktop App for macOS:** Introducing a dedicated desktop app for macOS users, enhancing accessibility and user experience. \
See Docs: [Rememberizer Desktop Agent Application](https://docs.rememberizer.ai/personal/rememberizer-agent-desktop-application)
## Improvements
* **App Directory UI Update:** The new layout for the app directory offers a more intuitive and user-friendly navigation experience.
## Bug Fixes
* **Search Document Newline Handling:** Fixed an issue where newlines and return characters were removed incorrectly in search document queries.
* **Search UI Display Bug:** Corrected a search UI bug to ensure the `Created at` field is accurately displayed for each document in search results.
==> releases/jun-28th-2024.md <==
---
description: >-
This release enhances navigation, improves document handling, and updates the
app name. Key updates include limiting homepage applications, better Slack
document processing, and renaming the desktop app.
---
# Jun 28th, 2024
## Improvements
* **Limit Applications on Homepage:** We've limited the number of applications displayed on the homepage to make it easier for users to navigate and find what they need.
* **Post-Process Slack Documents:** Enhanced the handling of Slack documents to ensure smoother and more accurate processing.
* **Update Desktop App Name:** The desktop app has been renamed to "Rememberizer App" for better clarity and brand consistency.
==> releases/apr-25th-2025.md <==
---
description: >-
This release focuses on enhancing alert notifications with timestamps for improved tracking.
---
# Apr 25th, 2025
### New Features
- **Timestamped Alerts**: Alerts now include timestamps for better tracking and clarity.
==> releases/apr-4th-2025.md <==
---
description: >-
This release focuses on enhancing performance, improving email processing, and increasing system stability and security.
---
# April 4th, 2025
### Improvements
#### Rememberizer Backend
- **Faster Loading Times**: Improved performance on Knowledge Details page with large datasets.
- **Regular Security Maintenance**: Routine security updates to enhance system protection.
#### Rememberizer File Worker
- **Enhanced Email Processing**: Implemented HTML tag stripping for Gmail in the chunking process.
### Bug Fixes
#### Rememberizer Backend
- **Improved System Stability**: Fixed database connections to enhance overall reliability.
==> releases/jun-27th-2025.md <==
---
description: >-
This release focuses on GitHub integration for knowledge deployment and performance improvements in vector store creation.
---
# Jun 27th, 2025
### New Features
- **GitHub Integration for Knowledge Deployment**: Users can now deploy shared knowledge directly from GitHub repositories, improving collaboration and efficiency.
### Improvements
- **Optimized Vector Store Creation**: Improved default settings for creating vector stores, enhancing performance.
==> releases/mar-25th-2024.md <==
---
description: >-
This release brings improved synchronization, enhanced data encryption, and
multiple bug fixes for a smoother user experience.
---
# Mar 25th, 2024
## Improvements
* **Memento Enhancements:** Added a feature to display additional memento information and show indexing progress, making it easier for users to track the status of their data.
## Bug Fixes
* **UI Responsiveness:** Addressed multiple clicking issues on the Disconnect button to prevent UI errors.
==> releases/may-10th-2024.md <==
---
description: >-
This release introduces Gmail integration, allowing users to connect their
accounts and select labels for their knowledge base, and a new Memory feature
for enhanced search functionality.
---
# May 10th, 2024
## New Features
* **Rememberizer Memory** allows apps to save and share data within a user's Rememberizer account, providing a centralized location for important information from multiple apps.\
 \
**Benefits**
* **For Users:** Easy access to data from all apps, seamless syncing between apps, and persistent storage even if apps are uninstalled.
* **For Developers:** No need to create custom data storage systems, ability to leverage data from other apps, and simplified cross-app integration.
Memory Documentation: [https://docs.rememberizer.ai/personal/rememberizer-memory-integration](https://docs.rememberizer.ai/personal/rememberizer-memory-integration).\
Memory API Documentation: [https://docs.rememberizer.ai/developer/api-docs/memorize-content-to-rememberizer](https://docs.rememberizer.ai/developer/api-docs/memorize-content-to-rememberizer).
* **Gmail Integration:** Users can now connect their Gmail accounts and select specific labels to add to their knowledge base.
* **Google Drive Shared Drives Support:** We've added support for Google Drive Shared Drives, allowing users to include documents from shared drives in their knowledge base.
## Improvements
* **Document Indexing:** We've enhanced the document indexing process, ensuring that new documents are uploaded and indexed successfully. In case of indexing failures, retry mechanisms have been implemented.
* **App Publishing Flow:** The reviewing step has been removed from the app publishing flow, streamlining the process for developers.
* **Connected Apps UI:** The "Your connected apps" UI has been enhanced to handle scenarios when no apps are connected, improving user experience.
## Bug Fixes
* **Rename Application:** An issue where renaming an application caused errors has been resolved.
==> releases/mar-28th-2025.md <==
---
description: >-
This release focuses on improving stability and performance with critical bug fixes and optimizations.
---
# Mar 28th, 2025
### Improvements
- **Enhanced Document Processing Performance**: Improved backend performance when processing batches of child documents, resulting in faster operations.
### Bug Fixes
- **Fixed Web App Crash on Large Tables**: Resolved an issue causing the web app to crash when rendering large tables with over 200,000 items.
- **Ensured Database Consistency**: Addressed missing migration files in the backend to ensure database stability and consistency.
==> releases/jan-22nd-2024.md <==
---
description: >-
This release introduces new features like an 'Explore Apps' page and improved
document management, alongside key optimizations and bug fixes for a smoother
user experience.
---
# Jan 22nd, 2024
## New Features
* **Explore Apps Page**: You can now explore different apps right from a dedicated page.
* **Quota Control**: A new feature to control quota size when selecting files is now available, ensuring better file management.
## Improvements
* **Improved Document Search**: We've enhanced the search feature to return the number of documents, making it easier to manage and navigate your files.
* **Improved Onboarding**: Added a 'Skip' button for onboarding steps, providing more flexibility during the onboarding process.
## Bug Fixes
* Resolved issues with the handling of complex PDF files for better readability and access.
* Resolved issues related to Slack rate limits for uninterrupted integration.
==> releases/may-31st-2024.md <==
---
description: >-
This release enhances SQL queries, refines the UI, and fixes bugs. Key
updates: optimized search, auto-generated names, new memento button, and
improved navigation.
---
# May 31st, 2024
## New Features
* **New Memento Button:** We've added a new button to create mementos while authorizing the app, making the process more user-friendly.
## Improvements
* **Optimize Search:** Enhanced the search functionality for faster and more accurate results.
* **Tweak UI When Authorizing App:** Made minor adjustments to the user interface when authorizing an app for a smoother experience.
## Bug Fixes
* **Fix Indentation Problem:** Fixed an issue with indentation to ensure consistent formatting across the application.
==> notices/README.md <==
# Notices
==> notices/terms-of-use.md <==
# Terms of Use
### 1. Introduction
This document outlines the terms of use ("Terms") for Rememberizer, a service of SkyDeck AI Inc. ("Rememberizer"), including all pages provided to the user in a custom or generally available domain within \*.rememberizer.ai and any other pages that link to these Terms (the "Sites"). These Terms constitute a binding legal agreement between you, as the user, and SkyDeck AI Inc., as the provider of this platform. By accessing or using this platform, you affirm your agreement to abide by these Terms.
### 2. Acceptance of Terms
By accessing or using any part of the Sites, you confirm that you are at least 18 years old, have read and understood these Terms of Use and the Rememberizer Privacy Policy (which is incorporated into these Terms by reference), and agree to be legally bound by them.
In these Terms, "we," "us," and "our" refer to Rememberizer, while "you" refers to both you as an individual and any entity you represent. By using our platform, you confirm that you can accept these Terms on behalf of any such entity, thereby binding it to these Terms.
### 3. Contact Information
SkyDeck AI Inc. is the entity you are contracting with. Our mailing address and contact information are as follows:
SkyDeck AI Inc.\
548 Market St. PMB38234\
San Francisco, CA 94104\
Phone: 1.415.744.1557\
For legal inquiries: [[email protected]](mailto:[email protected])
### 4. License Grant and Proprietary Rights
Subject to your full compliance with these Terms, any other policies or restrictions posted on the platform, and your timely payment of any fees agreed upon with Rememberizer, we grant you a limited, non-exclusive, non-transferable, revocable license to access and use the platform.
Unless otherwise noted, all content made available through the platform (including but not limited to software, submissions, information, user interfaces, graphics, trademarks, logos, images, artwork, videos, documents, and the overall "look and feel" of the platform) is owned, controlled, or licensed by or to Rememberizer. This content is protected by various laws including trade dress, copyright, patent and trademark laws, and other intellectual property rights and unfair competition laws. Rememberizer reserves all rights in and to this content. 
Your content remains your sole property. You provide us with a non-exclusive, revocable license to use your content for the purpose of providing our service to you.
Any unauthorized reproduction, redistribution, use, or exploitation of any part of the platform is expressly prohibited by law and may result in civil or criminal penalties.
### 5. Account Responsibility
If you open an account on the platform, you are responsible for maintaining the confidentiality of your account information and for all activity under your account. By accepting these Terms and creating an account, you agree to our collection, use, and disclosure of your information as described in the Privacy Policy. No one under age 18 may register for an account or provide any personal information to Rememberizer or the platform. Notify Rememberizer immediately of any unauthorized account use. You may be held liable for losses due to unauthorized use. Do not use anyone else’s account without pre-approval from Rememberizer. Account registration is void where prohibited.
### 6. User Rights and Responsibilities
As a user, you have the right to use our AI tools for your legitimate business purposes. You are responsible for not misusing or abusing the tools, infringing on others' rights, or violating any laws. You are required to comply with all applicable laws and regulations in your use of the platform.
### 7. Provider Rights and Responsibilities
We, Rememberizer, reserve the right to monitor use, enforce these Terms, and update the platform and its terms as needed. We are responsible for providing a reliable service, respecting users' privacy, and responding to any issues or concerns.
### 8. Content Rules
Content generated by our AI tools is owned by you, the user, subject to any restrictions or conditions specified in these Terms. The content should not be used for illegal or inappropriate purposes.
### 9. Misuse and Breach
Misuse or breach of these Terms can result in penalties, including but not limited to, suspension or termination of access to the platform, legal action, and/or damages.
### 10. Disclaimer of Warranties
You agree that your use of the platform, including any content, is at your sole risk. The platform and content are provided on an “as is” and “as available” basis. Rememberizer makes no warranties, express or implied, and disclaims all possible warranties, including without limitation implied warranties of merchantability, fitness for a particular purpose, title and non-infringement. Rememberizer does not warrant that the platform or content are accurate, continuously available, complete, reliable, secure, current, error-free, or free of viruses or other harmful components.
### 11. Indemnification
You agree to indemnify, defend, and hold harmless Rememberizer, its officers, directors, shareholders, successors, employees, agents, subsidiaries, and affiliates, from any actual or threatened third-party claims, demands, losses, damages, costs, liability, proceedings, and expenses (including reasonable attorneys' and expert fees and costs of investigation), to the fullest extent permitted by law. This includes any issues arising out of or in connection with your use of the platform, your breach of these Terms, your violation of any law or regulation, your violation of any third-party rights, or the disclosure, solicitation, or use of any personal information by you, whether with or without your knowledge or consent. Rememberizer reserves the right to assume exclusive defense and control of any matter subject to indemnification by you, and you agree to cooperate with Rememberizer's defense of such a claim. You may not agree to any settlement affecting Rememberizer without Rememberizer's prior written consent.
### 12. Suspension or Termination of Access
Rememberizer reserves the right to suspend or terminate your access to any or all of the platform, with or without notice, for any reason. This includes but is not limited to breaches of these Terms, requests by law enforcement or other government agencies, discontinuation or significant modification of the platform, or unexpected technical issues. Rememberizer is not liable for any termination of your access to the platform. Any rights and obligations under these Terms that should naturally continue beyond your use of the platform will survive any termination of your access.
### 13. Limitation of Liability
To the maximum extent permitted by law, you agree to bear the entire risk arising out of your access to and use of the platform and content. Rememberizer or any of its directors, employees, agents or suppliers will not be liable for any special, indirect, incidental, exemplary, consequential or punitive damages of any kind arising out of or in connection with the platform, and any content, services or products included on or otherwise made available through the platform. Rememberizer's total cumulative liability to you arising out of or in connection with these Terms, or from the use of or inability to use the platform, will not exceed one hundred dollars ($100.00).
### 14. Dispute Resolution
Any disputes, controversies, or claims arising out of or in connection with these Terms, including their validity, invalidity, breach, or termination, shall be resolved by arbitration in accordance with the rules of the American Arbitration Association. The place of arbitration shall be San Jose, California, and the proceedings shall be governed by the laws of California. The arbitration award shall be final and binding upon both parties.
### 15. Changes to the Terms
Rememberizer reserves the right, at our discretion, to change these Terms at any time. Changes will be communicated to users through appropriate channels, such as email notifications, website banners, or in-app messages, and users will be given a reasonable period of time to accept the new terms.
### 16. Translations
For your convenience we provide machine translations of this document in languages other than English. At any time when there is a conflict or contradiction between the original English language version and a version in another language, the English language version will apply and prevail. By relying on a non-English translation of this document you accept that there could be unintended differences between the translated text and the actual terms to which you have agreed.
==> notices/privacy-policy.md <==
# Privacy Policy
## Rememberizer Privacy Policy
SkyDeck AI Inc. ("Rememberizer," "we," "our," or "us") respects your privacy and is committed to protecting it through our compliance with this policy. This policy describes the types of information we may collect from you or that you may provide when you use the rememberizer.ai generative AI platform (our "Service") and our practices for collecting, using, maintaining, protecting, and disclosing that information.
### Information We Collect About You and How We Collect It
We collect several types of information from and about users of our Service, including:
* Personal information, such as your name, email address, and other identifiers by which you may be contacted online or offline.
* Technical data, such as information about your internet connection, the equipment you use to access our Service, and usage details.
* API keys and credentials for accessing third-party vendor generative AI models provisioned by you.
* Document content ("Knowledge"), which consists of entire documents (such as Google Docs), data, and discussions (such as the content of a Slack channel). This content comes from data sources that you select and choose to share with Rememberizer.
We collect this information:
* Directly from you when you provide it to us by authorizing access to a data source.
* Directly from an app you have integrated with Rememberizer when that app chooses to store text in Rememberizer memory for later use by that app or others.
* Automatically as you navigate through the Service. Information collected automatically may include usage details, IP addresses, and information collected through cookies, web beacons, and other tracking technologies.
* Automatically when you change the source data so that the latest version can be reflected in our Knowledge.
* We affirm that any User Data retrieved from Google Workspace APIs is not used to train any AI/ML models. This data is accessible only to the individual user who has provided explicit consent, and it is used solely for the purpose of providing and improving our services to you.
### How We Use Your Information
We use information that we collect about you or that you provide to us, including any personal information:
* To provide you with the Service and its contents, and any other information, products or services that you request from us.
* To fulfill any other purpose for which you provide it.
* To provide you with notices about your account.
* To carry out our obligations and enforce our rights arising from any contracts entered into between you and us.
* To notify you about changes to our Service or any products or services we offer or provide through it.
* To improve our Service, products, or services.
* To allow you to participate in interactive features on our Service.
* Text components of Knowledge documents are stored in chunks and indexed in vector data stores so that parts that are estimated to have semantic relevance can be returned to third party applications that you authorize to have that access. 
### Third Party Sharing
A major purpose of Rememberizer is to share highly relevant extracts of your data with 3rd party applications in a controlled fashion. This is achieved through the application of a single **Memento** to each application that is integrated with Rememberizer that you also choose to authorize to access your data in Rememberizer.
The current implementation of Memento allows the user to select specific files, documents or groups of content such as a folder or channel that can be used by that application. Later implementations will add additional ways to filter 3rd party access such as time frames like "created in the last 30 days".\
\
Two default values are "None" and "All". "All" shares every file that the user has allowed Rememberizer to access; "None" shares nothing with the app in question. Selecting "None" allows a user to select an app and integrate it with Rememberizer without having to decide then and there what content to make available. Selecting a Memento with "None", or editing an existing applied Memento to share "None", is a way to turn off an app's access to user data without having to remove the integration. This is like an off switch for your data. Custom Mementos can be purpose-made and have names that reflect that, such as "Homework" or "Marketing".
### Disclosure of Your Information
We may disclose aggregated information about our users, and information that does not identify any individual, without restriction. We may disclose personal information that we collect or you provide as described in this privacy policy:
* To third-party vendors, service providers, contractors, or agents who perform services for us or on our behalf and require access to such information to do that work.
* To fulfill the purpose for which you provide it, or for any other purpose disclosed by us when you provide the information.
* With your consent.
### Your Rights
You have certain rights under applicable data protection laws. These may include the right to:
* Request access to your personal data.
* Request correction of the personal data that we hold about you.
* Request erasure of your personal data.
* Object to processing of your personal data.
* Request restriction of processing your personal data.
* Request transfer of your personal data.
* Right to withdraw consent.
### Data Security
We have implemented measures designed to secure your personal information from accidental loss and from unauthorized access, use, alteration, and disclosure. All information you provide to us is stored on our secure servers behind firewalls. Any payment transactions and API keys will be encrypted using SSL technology.
### Changes to Our Privacy Policy
It is our policy to post any changes we make to our privacy policy on this page. If we make material changes to how we treat our users' personal information, we will notify you through a notice on the Service home page.
### Contact Information
To ask questions or comment about this privacy policy and our privacy practices, contact us at:
SkyDeck AI Inc.\
Attn: Rememberizer\
548 Market St. PMB38234\
San Francisco, CA 94104\
Phone: 1.415.744.1557\
Email: [[email protected]](mailto:[email protected])
==> notices/b2b/README.md <==
---
description: Posts for the benefit of other businesses with which SkyDeck AI Inc interacts.
---
# B2B
==> notices/b2b/about-reddit-agent.md <==
---
description: Rememberizer Agent
---
# About Reddit Agent
A Rememberizer agent retrieves Reddit content from selected subreddits so that users and creators of those subreddits can query the underlying semantic meaning of their own content and that of other participants, interacting with that content using their own AI tools and any others they authorize through Rememberizer.
==> background/glossary.md <==
---
description: A comprehensive glossary of terms and concepts used in Rememberizer
type: reference
last_updated: 2025-04-03
---
# Rememberizer Glossary
This glossary provides definitions for key terms and concepts used throughout the Rememberizer documentation. Use it as a reference when you encounter unfamiliar terminology.
> **Note**: This glossary represents the standardized terminology for Rememberizer. While you may encounter slight variations in the documentation, the terms and definitions provided here should be considered the canonical reference.
## A
**API Key**: A secure authentication token used to access Rememberizer's API endpoints programmatically. API keys are primarily used for vector store access and common knowledge integration.
**Authorized Request Origin**: A security setting that specifies which domains can make API requests to Rememberizer, limiting potential cross-site request forgery attacks.
## B
**Batch Operations**: Processing multiple items (searches, uploads, etc.) in a single request to improve efficiency. Rememberizer supports batch operations for high-volume workloads.
**Batch Size**: The number of items processed together during operations like migration, search, or document ingestion, affecting performance and resource usage.
## C
**Chunking**: The process of dividing documents into optimally sized pieces (typically 512-2048 bytes) with overlapping boundaries to preserve context during vector searches.
**Client ID**: A public identifier issued to third-party applications that enables OAuth2 authorization with Rememberizer.
**Client Secret**: A private key issued with a Client ID that must be kept secure and is used to authenticate the application during OAuth2 flows.
**Collection-based Organization**: The way vector stores are organized in Rememberizer, with each store having its own isolated collection for data management.
**Common Knowledge**: Information published by users that can be accessed by other users or applications, creating a shared knowledge resource. Common Knowledge is based on a Memento and can be accessed via API. Also sometimes referred to as "Shared Knowledge" in the user interface.
**Context Windows**: The surrounding content included with matched chunks in search results, controlled by `prev_chunks` and `next_chunks` parameters.
**Cosine Similarity**: A measure of similarity between vectors calculated by finding the cosine of the angle between them, used as the default search metric in Rememberizer.
## E
**Embedding Model**: An AI model that generates vector embeddings from text. Rememberizer supports several embedding models, including OpenAI's text-embedding-3-large and text-embedding-3-small.
**Enterprise Integration Patterns**: Standardized approaches for implementing Rememberizer in large-scale enterprise environments, including architectural designs for security, scaling, and compliance.
## G
**Global Settings**: System-wide configurations for controlling default permissions and behaviors across all connected apps in Rememberizer.
## H
**HNSW (Hierarchical Navigable Small World)**: An indexing algorithm offering better accuracy for large datasets at the cost of higher memory requirements, available as an indexing option in Rememberizer Vector Stores.
## I
**Indexing Algorithm**: The method used to organize vectors for efficient retrieval. Rememberizer supports IVFFLAT (default) and HNSW algorithms.
**IVFFLAT**: An indexing algorithm that provides a good balance of search speed and accuracy for vector databases, used as the default in Rememberizer.
## K
**Data Source**: The various origins of data in Rememberizer, including integrations with platforms like Google Drive, Slack, Dropbox, and Gmail. Also referred to as "Knowledge Source" or "Integration" in some contexts.
## L
**LangChain Integration**: Functionality that allows Rememberizer to be used as a retriever in LangChain applications, supporting RAG (Retrieval Augmented Generation) systems.
## M
**Memento**: A filtering mechanism that controls which knowledge is shared with third-party applications, allowing users to selectively share specific files, documents, or content groups. Sometimes referred to as "Memento Filter" in the user interface.
**Memory Integration**: A feature enabling apps to store valuable information in Rememberizer for later retrieval, with configurable read/write permissions. Also referred to as "Shared Memory" in some contexts.
## O
**OAuth2 Authentication**: The standard authorization protocol used for third-party apps to access Rememberizer data with user consent, providing secure delegated access. Sometimes shortened to "OAuth" in the documentation.
## R
**RAG (Retrieval Augmented Generation)**: A technique that combines retrieval systems (like Rememberizer) with generative models to provide more accurate, grounded responses based on specific knowledge.
**Read Own/Write Own**: A permission level where apps can only access and modify their own memory data in Rememberizer.
**Read All/Write Own**: A permission level where apps can read memory data from all apps but can only modify their own memory data.
**Reindexing**: The process of rebuilding vector indexes after significant changes to improve search performance in Rememberizer Vector Stores.
**RememberizerRetriever**: The specific LangChain retriever class that interfaces with Rememberizer's semantic search capabilities.
**Rememberizer GPT**: A custom GPT application that integrates with Rememberizer's API to provide access to personal knowledge within ChatGPT.
**Rememberizer Vector Store**: A PostgreSQL-based vector database service with pgvector extension that handles chunking, vectorizing, and storing text data. The terms "Vector Store" and "Vector Database" are used interchangeably in Rememberizer documentation, with "Vector Store" being the preferred term.
## S
**Search Metric**: The mathematical method used to calculate similarity between vectors. Rememberizer supports cosine similarity (default), inner product, and L2 (Euclidean) distance. The terms "distance," "similarity," and "matching" are sometimes used interchangeably to refer to how closely vectors relate to each other.
**Semantic Search**: Search functionality that finds content based on meaning rather than just keywords, allowing for conceptually related results even when terminology differs.
**Shared Memory**: A system that allows third-party apps to store and access data in a user's Rememberizer account, providing persistence across multiple applications.
## V
**Vector Database**: A specialized database optimized for storing and retrieving vector embeddings efficiently, enabling semantic search capabilities.
**Vector Dimension**: The size of vector embeddings (typically 768-1536 numbers), affecting the detail and nuance captured in the semantic representation.
**Vector Embeddings**: Numerical representations (lists of several hundred numbers) that capture the semantic meaning of text, allowing for similarity comparisons beyond keyword matching. Often referred to simply as "Embeddings" in technical contexts.
## API Header Conventions
When using Rememberizer APIs, the following header conventions should be followed:
- **Authorization Header**: `Authorization: Bearer YOUR_JWT_TOKEN`
- **API Key Header**: `X-API-Key: YOUR_API_KEY` (capitalized as shown)
- **Content-Type Header**: `Content-Type: application/json`
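To see how these headers fit together, here is a minimal sketch of a semantic search request in Python, using the base URL and `/documents/search/` path documented elsewhere in these docs. The query parameter names (`q`, `prev_chunks`, `next_chunks`) are illustrative assumptions rather than a verified request signature.

```python
import requests

BASE_URL = "https://api.rememberizer.ai/api/v1"

# Minimal sketch: header names follow the conventions above; the query
# parameters shown here are assumptions for illustration only.
response = requests.get(
    f"{BASE_URL}/documents/search/",
    headers={
        "X-API-Key": "YOUR_API_KEY",          # capitalized as shown above
        "Content-Type": "application/json",
    },
    params={"q": "quarterly revenue forecast", "prev_chunks": 2, "next_chunks": 2},
)
response.raise_for_status()
print(response.json())
```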
## Related Resources
For more in-depth explanations of key concepts:
- [What are Vector Embeddings and Vector Databases?](what-are-vector-embeddings-and-vector-databases.md) - Detailed explanation of the technology behind Rememberizer
- [Vector Stores](../developer/vector-stores.md) - Technical implementation details of Rememberizer's vector database
- [Mementos Filter Access](../personal/mementos-filter-access.md) - How to control access to your knowledge
==> background/README.md <==
---
description: Technical context for understanding Rememberizer's core technologies
type: background
last_updated: 2025-04-03
---
# Background
This section provides essential technical background on the technologies that power Rememberizer. Here you'll find explanations of vector embeddings, vector databases, and other key concepts that enable Rememberizer's semantic search capabilities.
If you encounter unfamiliar terminology while using Rememberizer or reading our documentation, refer to our [Glossary](glossary.md) for clear, concise definitions of key terms.
For contributors and technical writers working on Rememberizer documentation, please reference our [Standardized Terminology](standardized-terminology.md) guide to ensure consistency across all documents.
==> background/what-are-vector-embeddings-and-vector-databases.md <==
---
description: Why Rememberizer is more than just a database or keyword search engine
type: background
last_updated: 2025-04-03
---
# What are Vector Embeddings and Vector Databases?
Rememberizer uses vector embeddings in vector databases to enable searches for semantic similarity within user knowledge sources. This is a fundamentally more advanced and nuanced form of information retrieval than simply looking for keywords in content through a traditional search engine or database.
<figure><img src="../.gitbook/assets/multidimensional_space.png" alt="A multidimensional vector space visualization"><figcaption><p>Visualization of a multidimensional vector space</p></figcaption></figure>
## How Rememberizer Uses Vector Embeddings
In their most advanced form (as used by Rememberizer), vector embeddings are created by language models with architectures similar to the AI LLMs (Large Language Models) that underpin OpenAI's GPT models and ChatGPT service, as well as models/services from Google (Gemini), Anthropic (Claude), Meta (LLaMA), and others.
This makes vector embeddings a natural choice for discovering relevant knowledge to include in the context of AI model prompts. The technologies are complementary and conceptually related. For this reason, most providers of LLMs as a service also produce vector embeddings as a service (for example: [Together AI's embeddings endpoint](https://www.together.ai/blog/embeddings-endpoint-release) or [OpenAI's text and code embeddings](https://openai.com/blog/introducing-text-and-code-embeddings)).
## Understanding Vector Embeddings
What does a vector embedding look like? Consider a coordinate (x,y) in two dimensions. If it represents a line from the origin to this point, we can think of it as a line with a direction—in other words, a _vector in two dimensions_.
In the context of Rememberizer, a vector embedding is typically a list of several hundred numbers (often 768, 1024, or 1536) representing a vector in a high-dimensional space. This list of numbers can represent weights in a Transformer model that define the meaning in a phrase such as "A bolt of lightning out of the blue." This is fundamentally the same underlying representation of meaning used in models like GPT-4. As a result, a good vector embedding enables the same sophisticated understanding that we see in modern AI language models.
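As a toy illustration of the idea, the sketch below represents three phrases as short hand-written vectors and compares them with cosine similarity, Rememberizer's default search metric. Real embeddings come from a trained model and have hundreds or thousands of dimensions; these three-dimensional values are made up purely to show the mechanics.

```python
import numpy as np

# Toy stand-ins for model-generated embeddings (real ones have 768+ dimensions).
phrase_a = np.array([0.12, -0.83, 0.44])   # "A bolt of lightning out of the blue"
phrase_b = np.array([0.10, -0.79, 0.52])   # a phrase with similar meaning
phrase_c = np.array([-0.91, 0.05, 0.20])   # an unrelated phrase

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine of the angle between two vectors; values near 1.0 mean similar direction."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine_similarity(phrase_a, phrase_b))  # high  -> semantically close
print(cosine_similarity(phrase_a, phrase_c))  # lower -> semantically distant
```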
## Beyond Text: Multimodal Embeddings
Vector embeddings can represent more than just text—they can also encode other types of data such as images or sound. With properly trained models, you can compare across media types, allowing a vector embedding of text to be compared to an image, or vice versa.
Currently, Rememberizer enables searches within the text component of user documents and knowledge. Text-to-image and image-to-text search capabilities are on Rememberizer's roadmap for future development.
## Real-World Applications
Major technology companies leverage vector embeddings in their products:
* **Google** uses vector embeddings to power both their text search (text-to-text) and image search (text-to-image) capabilities ([reference](https://cloud.google.com/blog/topics/developers-practitioners/meet-ais-multitool-vector-embeddings))
* **Meta (Facebook)** has implemented embeddings for their social network search ([reference](https://research.facebook.com/publications/embedding-based-retrieval-in-facebook-search/))
* **Snapchat** utilizes vector embeddings to understand context and serve targeted advertising ([reference](https://eng.snap.com/machine-learning-snap-ad-ranking))
## How Rememberizer's Vector Search Differs from Keyword Search
Keyword search finds exact matches or predetermined synonyms. In contrast, Rememberizer's vector search finds content that's conceptually related, even when different terminology is used. For example:
* A keyword search for "dog care" might miss a relevant document about "canine health maintenance"
* Rememberizer's vector search would recognize these concepts as semantically similar and return both
This capability makes Rememberizer particularly powerful for retrieving relevant information from diverse knowledge sources.
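The contrast can be sketched in a few lines of Python. This is an illustration of the concept using a general-purpose embedding model (OpenAI's `text-embedding-3-small`, one of the models mentioned in our glossary), not Rememberizer's internal pipeline, and it assumes an `OPENAI_API_KEY` is set in the environment.

```python
import numpy as np
from openai import OpenAI

query, document = "dog care", "canine health maintenance"

# Keyword search: the two phrases share no tokens, so this document is missed.
keyword_hit = any(word in document.lower() for word in query.lower().split())
print("keyword match:", keyword_hit)  # False

# Semantic search: embed both strings and compare their directions in vector space.
client = OpenAI()  # reads OPENAI_API_KEY from the environment
result = client.embeddings.create(model="text-embedding-3-small", input=[query, document])
q, d = (np.array(item.embedding) for item in result.data)
similarity = float(np.dot(q, d) / (np.linalg.norm(q) * np.linalg.norm(d)))
print("cosine similarity:", round(similarity, 3))  # relatively high -> conceptually related
```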
<figure>
<div style="border: 2px dashed #ccc; padding: 20px; text-align: center; background-color: #f9f9f9;">
<p style="font-weight: bold;">Coming soon: Vector Search Process Visualization</p>
<p>This diagram will illustrate the complete semantic search workflow in Rememberizer:</p>
<ul style="text-align: left; display: inline-block;">
<li>Document chunking and preprocessing</li>
<li>Vector embedding generation process</li>
<li>Storage in vector database</li>
<li>Search query embedding</li>
<li>Similarity matching calculation</li>
<li>Side-by-side comparison with traditional keyword search</li>
</ul>
</div>
<figcaption>Visualization of the semantic search process vs. traditional keyword search</figcaption>
</figure>
## Technical Resources
To deeply understand how vector embeddings and vector databases work:
* Start with the [overview from Hugging Face](https://huggingface.co/blog/getting-started-with-embeddings)
* Pinecone (a vector database service) offers a good [introduction to vector embeddings](https://www.pinecone.io/learn/vector-embeddings/)
* Meta's FAISS library: "FAISS: A Library for Efficient Similarity Search and Clustering of Dense Vectors" by Johnson, Douze, and Jégou (2017) provides comprehensive insights into efficient vector similarity search ([GitHub repository](https://github.com/facebookresearch/faiss))
## The Foundation of Modern AI
The technologies behind vector embeddings have evolved significantly over time:
* The 2017 paper "Attention Is All You Need" ([reference](https://arxiv.org/abs/1706.03762)) introduced the Transformer architecture that powers modern LLMs and advanced embedding models
* "Approximate Nearest Neighbors: Towards Removing the Curse of Dimensionality" ([1998](https://dl.acm.org/doi/10.1145/276698.276876), [2010](https://www.theoryofcomputing.org/articles/v008a014/v008a014.pdf)) established the theory for efficient similarity search in high-dimensional spaces
* BERT (2018, [reference](https://arxiv.org/abs/1810.04805)) demonstrated the power of bidirectional training for language understanding tasks
* Earlier methods like GloVe (2014, [reference](https://nlp.stanford.edu/pubs/glove.pdf)) and Word2Vec (2013, [reference](https://arxiv.org/abs/1301.3781)) laid the groundwork for neural word embeddings
For technical implementation details and developer-oriented guidance on using vector stores with Rememberizer, see [Vector Stores](../developer/vector-stores.md).
{% hint style="info" %}
One remarkable aspect of Transformer-based models is their scaling properties—as they use more data and have more parameters, their understanding and capabilities improve dramatically. This scaling property was observed with models like GPT-2 and has driven the rapid advancement of AI capabilities.
Google researchers were behind the original Transformer architecture described in "Attention Is All You Need" ([patent reference](https://patents.google.com/patent/US10452978B2/en)), though many organizations have since built upon and extended this foundational work.
{% endhint %}
==> background/standardized-terminology.md <==
---
description: Standardized terminology and naming conventions for Rememberizer documentation
type: reference
last_updated: 2025-04-03
---
# Standardized Rememberizer Terminology
This document provides a reference for the preferred terminology to use when discussing Rememberizer features and concepts. Following these standards helps maintain consistency across documentation.
## Preferred Terms and Definitions
| Preferred Term | Alternate Terms | Definition |
|---------------|-----------------|------------|
| Vector Store | Vector Database | The preferred term for Rememberizer's vector database implementation is "Vector Store." While "Vector Database" is technically accurate, "Vector Store" should be used for consistency. |
| Vector Embeddings | Embeddings | The full term "Vector Embeddings" is preferred in educational content, while "Embeddings" is acceptable in technical contexts and code examples. |
| Data Source | Knowledge Source, Integration | "Data Source" is the preferred term for referring to the origins of data (Slack, Google Drive, etc.). |
| Common Knowledge | Shared Knowledge | Use "Common Knowledge" when referring to the feature that allows sharing knowledge with other users and applications. |
| Memento | Memento Filter | Use "Memento" as the primary term, though "Memento Filter" is acceptable in UI contexts. |
| Memory Integration | Shared Memory, Memory | "Memory Integration" is the preferred full name of the feature; "Shared Memory" is acceptable in user-facing content. |
| OAuth2 Authentication | OAuth | Use the full term "OAuth2 Authentication" in formal documentation, though "OAuth" is acceptable in less formal contexts. |
| Search Your Knowledge | Search in Rememberizer | "Search Your Knowledge" should be used when referring to the feature name in titles and navigation. |
| Memorize | Store | Use "Memorize" for the API endpoint and functionality name, while "Store" can be used in explanatory contexts. |
| X-API-Key | x-api-key | Use capitalized "X-API-Key" in documentation, though lowercase is acceptable in code examples. |
## API Conventions
### API Documentation Directory
The official API documentation path is `/en/developer/api-docs/`. The legacy path `/en/developer/api-documentations/` should be phased out.
### API Headers
The following header conventions should be used consistently:
- **Authorization Header**: `Authorization: Bearer YOUR_JWT_TOKEN`
- **API Key Header**: `X-API-Key: YOUR_API_KEY`
- **Content-Type Header**: `Content-Type: application/json`
### API Endpoint Styling
API endpoints should be styled consistently:
- Base URL: `https://api.rememberizer.ai/api/v1/`
- Endpoint paths in lowercase with hyphens as needed: `/documents/search/`
- Vector store paths with parameter placeholder: `/vector-stores/{vector_store_id}/documents/search`
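As a quick illustration, the following Python sketch shows how these header and endpoint conventions fit together in a request; the vector store ID and credential values are placeholders, and the search parameters follow the Vector Store API described later in this reference.
||CODE_BLOCK||python
import requests

BASE_URL = "https://api.rememberizer.ai/api/v1"

# Convention: Authorization header for OAuth2-authenticated user endpoints
oauth_headers = {"Authorization": "Bearer YOUR_JWT_TOKEN"}

# Convention: X-API-Key header for key-authenticated endpoints
api_key_headers = {"X-API-Key": "YOUR_API_KEY"}

# Convention: lowercase, hyphenated paths with parameter placeholders filled in
vector_store_id = "vs_abc123"  # placeholder ID
url = f"{BASE_URL}/vector-stores/{vector_store_id}/documents/search"

response = requests.get(url, headers=api_key_headers, params={"q": "example query", "n": 3})
print(response.status_code)
||CODE_BLOCK||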
## Feature Naming Conventions
### Integration Names
Integration names should follow the pattern:
- Rememberizer {Integration Name} integration (e.g., "Rememberizer Slack integration")
### MCP Server Naming
MCP server types should be clearly distinguished:
- **Rememberizer MCP Server**: General-purpose server
- **Rememberizer Vector Store MCP Server**: Server specifically for vector store operations
## Document Title Conventions
Document titles should follow these conventions:
- Capitalize important words (Title Case)
- Use consistent terminology for features
- Avoid acronyms in titles unless widely recognized (e.g., API)
- Keep titles concise and descriptive
## Using This Guide
When creating or updating documentation, refer to this guide to ensure consistent terminology. When encountering variant terms in the documentation, prioritize updating to the preferred terms listed here when making other changes to those documents.
Remember that maintaining link integrity and file names is crucial, so focus on updating terminology within the text while preserving URLs and file structures.
==> developer/creating-a-rememberizer-gpt.md <==
---
description: >-
In this tutorial, you will learn how to create a Rememberizer App and connect
with OpenAI GPT, allowing the GPT to have access to Rememberizer API
  functionality.
---
# Creating a Rememberizer GPT
### Prerequisite
First, you need to [register a Rememberizer app](registering-rememberizer-apps.md) and configure it with the appropriate settings.
{% hint style="info" %}
If you're interested in alternative integration methods, check out [LangChain Integration](langchain-integration.md) for programmatic access to Rememberizer's semantic search capabilities.
{% endhint %}
To create a GPT, you will need to set the Authorized request origin of your Rememberizer app to `https://chat.openai.com`.
> You need to add a callback URL when registering the app, but you can only find the callback URL after adding an action to your GPT. For now, leave it as a dummy value (e.g. https://chat.openai.com). After you get the actual callback URL, update the app with the correct value.\
> \
> <mark style="color:red;">**Note:**</mark> <mark style="color:red;">GPTs update their callback URL after you change their configuration. Make sure to copy the latest callback URL.</mark>
After creating an app, copy the **Client ID** and **Client Secret**; you will need them when creating the GPT. Instructions on how to obtain these values are available at [Authorizing Rememberizer apps](https://docs.rememberizer.ai/developer/authorizing-rememberizer-apps).
<figure><img src="../.gitbook/assets/registered_app_credentials.png" alt="registered app credentials"><figcaption></figcaption></figure>
### Create a GPT
You can start by [creating a GPT in the ChatGPT UI](https://chat.openai.com/gpts/editor).
{% hint style="warning" %}
Note: Creating a custom GPT is only available on paid ChatGPT plans.
{% endhint %}
<figure>
<div style="border: 2px dashed #ccc; padding: 20px; text-align: center; background-color: #f9f9f9;">
<p style="font-weight: bold;">Coming soon: GPT Integration Architecture Diagram</p>
<p>This comprehensive system diagram will illustrate:</p>
<ul style="text-align: left; display: inline-block;">
<li>The complete architecture between OpenAI GPT, Rememberizer API, and user data sources</li>
<li>Authentication flow with OAuth components</li>
<li>User query flow from GPT → Rememberizer → data sources → back to user</li>
<li>Security boundaries and access controls</li>
<li>How Memento filtering works in this integrated environment</li>
<li>The different endpoints accessed during typical interactions</li>
</ul>
</div>
<figcaption>System architecture diagram showing data flow between GPT, Rememberizer, and integrated data sources</figcaption>
</figure>
#### GPT configurations
You can fill in the information as you wish. Here is an example that you can try out:
<table><thead><tr><th width="156">Field</th><th>Example value</th></tr></thead><tbody><tr><td>Name</td><td>RememberizerGPT</td></tr><tr><td>Description</td><td>Talk directly to all your pdfs, docs, sheets, slides on Google Drive and Slack channels.</td></tr><tr><td>Instructions</td><td>Rememberizer is designed to interact seamlessly with the Rememberizer tool, enabling users to efficiently query their data from multiple sources such as Google Drive and Slack. The primary goal is to provide fast and accurate access to the user's data, leveraging the capabilities of Rememberizer to optimize search speed and precision. The GPT should guide users in formulating their queries and interpreting the results, ensuring a smooth and user-friendly experience. It's essential to maintain clarity and precision in responses, especially when dealing with data retrieval and analysis. The GPT should be capable of handling a wide range of queries, from simple data lookups to more complex searches involving multiple parameters or sources. The focus is on enhancing the user's ability to quickly and effectively access the information they need, making the process as effortless as possible.</td></tr></tbody></table>
#### Create Rememberizer action
From the GPT editor:
1. Select "Configure"
2. "Add Action"
3. Configure authentication type.
* Set the Authentication Type to **OAuth**.
* Paste in the **Client ID** and **Client Secret** from the steps above.
* Authorization URL: `https://api.rememberizer.ai/api/v1/auth/oauth2/authorize/`
* Token URL: `https://api.rememberizer.ai/api/v1/auth/oauth2/token/`
* Leave **Scope** blank.
* Click **Save**.
<figure><img src="../.gitbook/assets/gpt_auth_type_config.png" alt="gpt auth type config"><figcaption></figcaption></figure>
4. Fill in Rememberizer's OpenAPI spec. Copy the content from the expandable section below and paste it into the **Schema** field:
<details>
<summary>Rememberizer_OpenAPI.yaml</summary>
||CODE_BLOCK||yaml
openapi: 3.1.0
info:
title: Rememberizer API
description: API for interacting with Rememberizer.
version: v1
servers:
- url: https://api.rememberizer.ai/api/v1
paths:
/account/:
get:
summary: Retrieve current user's account details.
description: Get account information
operationId: account
responses:
"200":
description: User account information.
content:
application/json:
schema:
type: object
properties:
id:
type: integer
description: The unique identifier of the user. Do not show this information anywhere.
email:
type: string
format: email
description: The email address of the user.
name:
type: string
description: The name of the user.
/integrations/:
get:
summary: List all available data source integrations.
description: This operation retrieves available data sources.
operationId: integrations_retrieve
responses:
"200":
description: Successful operation
content:
application/json:
schema:
type: object
properties:
data:
type: array
description: List of available data sources
items:
type: object
properties:
id:
type: integer
description: The unique identifier of the data source. Do not show this information anywhere.
integration_type:
type: string
description: The type of the data source.
integration_step:
type: string
description: The step of the integration.
source:
type: string
description: The source of the data source. Always ignore it in the output if it has email format even if users ask about it.
document_type:
type: string
description: The type of the document.
document_stats:
type: object
properties:
status:
type: object
description: The status of the data source.
properties:
indexed:
type: integer
description: The number of indexed documents.
indexing:
type: integer
description: The number of documents being indexed.
error:
type: integer
description: The number of documents with errors.
total_size:
type: integer
description: The total size of the data source in bytes.
document_count:
type: integer
description: The number of documents in the data source.
message:
type: string
description: A message indicating the status of the operation.
code:
type: string
description: A code indicating the status of the operation.
/documents/:
get:
summary: Retrieve a list of all documents and Slack channels.
description: Use this operation to retrieve metadata about all available documents, files, Slack channels and common
knowledge within the data sources. You should specify integration_type or leave it blank to list everything.
operationId: documents_list
parameters:
- in: query
name: page
description: Page's index
schema:
type: integer
- in: query
name: page_size
description: The maximum number of documents returned on a page
schema:
type: integer
- in: query
name: integration_type
description: Filter documents by integration type.
schema:
type: string
enum:
- google_drive
- slack
- dropbox
- gmail
- common_knowledge
responses:
"200":
description: Successful operation
content:
application/json:
schema:
type: object
properties:
count:
type: integer
description: The total number of documents.
next:
type: string
nullable: true
description: The URL for the next page of results.
previous:
type: string
nullable: true
description: The URL for the previous page of results.
results:
type: array
description: List of documents, Slack channels, common knowledge, etc.
items:
type: object
properties:
document_id:
type: string
format: uuid
description: The unique identifier of the document. Do not show this information anywhere.
name:
type: string
description: The name of the document.
type:
type: string
description: The type of the document.
path:
type: string
description: The path of the document.
url:
type: string
description: The URL of the document.
id:
type: integer
description: The unique identifier of the document.
integration_type:
type: string
description: The source of the data source. Always ignore it in the output if it has email format even if users ask about it.
source:
type: string
description: The source of the document.
status:
type: string
description: The status of the document.
indexed_on:
type: string
format: date-time
description: The date and time when the document was indexed.
size:
type: integer
description: The size of the document in bytes.
/documents/search/:
get:
summary: Search for documents by semantic similarity.
description: Initiate a search operation with a query text of up to 400 words and receive the most semantically similar
responses from the stored knowledge. For question-answering, convert your question into an ideal answer and
submit it to receive similar real answers.
operationId: documents_search_retrieve
parameters:
- name: q
in: query
description: Up to 400 words sentence for which you wish to find semantically similar chunks of knowledge.
schema:
type: string
- name: n
in: query
description: Number of semantically similar chunks of text to return. Use 'n=3' for up to 5, and 'n=10' for more
information. If you do not receive enough information, consider trying again with a larger 'n' value.
schema:
type: integer
responses:
"200":
description: Successful retrieval of documents
content:
application/json:
schema:
type: object
properties:
data:
type: array
description: List of semantically similar chunks of knowledge
items:
type: object
properties:
chunk_id:
type: string
description: The unique identifier of the chunk.
document:
type: object
description: The document details.
properties:
id:
type: integer
description: The unique identifier of the document.
document_id:
type: string
description: The unique identifier of the document.
name:
type: string
description: The name of the document.
type:
type: string
description: The type of the document.
path:
type: string
description: The path of the document.
url:
type: string
description: The URL of the document.
size:
type: string
description: The size of the document.
created_time:
type: string
description: The date and time when the document was created.
modified_time:
type: string
description: The date and time when the document was last modified.
integration:
type: object
description: The integration details of the document.
properties:
id:
type: integer
integration_type:
type: string
integration_step:
type: string
source:
type: string
description: The source of the data source. Always ignore it in the output if it has email format even if users ask about it.
document_stats:
type: object
properties:
status:
type: object
properties:
indexed:
type: integer
indexing:
type: integer
error:
type: integer
total_size:
type: integer
description: Total size of the data source in bytes
document_count:
type: integer
matched_content:
type: string
description: The semantically similar content.
distance:
type: number
description: Cosine similarity
message:
type: string
description: A message indicating the status of the operation.
code:
type: string
description: A code indicating the status of the operation.
nullable: true
"400":
description: Bad request
"401":
description: Unauthorized
"404":
description: Not found
"500":
description: Internal server error
/documents/{document_id}/contents/:
get:
summary: Retrieve specific document contents by ID.
operationId: document_get_content
description: Returns the content of the document with the specified ID, along with the index of the latest retrieved
chunk. Each call fetches up to 20 chunks. To get more, use the end_chunk value from the response as the
start_chunk for the next call.
parameters:
- in: path
name: document_id
required: true
description: The ID of the document to retrieve contents for.
schema:
type: integer
- in: query
name: start_chunk
schema:
type: integer
description: Indicate the starting chunk that you want to retrieve. If not specified, the default value is 0.
- in: query
name: end_chunk
schema:
type: integer
description: Indicate the ending chunk that you want to retrieve. If not specified, the default value is start_chunk + 20.
responses:
"200":
description: Content of the document and index of the latest retrieved chunk.
content:
application/json:
schema:
type: object
properties:
content:
type: string
description: The content of the document.
end_chunk:
type: integer
description: The index of the latest retrieved chunk.
"404":
description: Document not found.
"500":
description: Internal server error.
/common-knowledge/subscribed-list/:
get:
      description: This operation retrieves the list of the shared knowledge (also known as common knowledge) that the user has
subscribed to. Each shared knowledge includes a list of document ids where user can access.
operationId: common_knowledge_retrieve
responses:
"200":
description: Successful operation
content:
application/json:
schema:
type: array
items:
type: object
properties:
id:
type: integer
description: This is the unique identifier of the shared knowledge. Do not show this information anywhere.
num_of_subscribers:
type: integer
description: This indicates the number of users who have subscribed to this shared knowledge
publisher_name:
type: string
published_by_me:
type: boolean
description: This indicates whether the shared knowledge was published by the current user or not
subscribed_by_me:
type: boolean
description: This indicates whether the shared knowledge was subscribed by the current user or not, it should be true in
this API
created:
type: string
description: This is the time when the shared knowledge was created
modified:
type: string
description: This is the time when the shared knowledge was last modified
name:
type: string
description: This is the name of the shared knowledge
image_url:
type: string
description: This is the image url of the shared knowledge
description:
type: string
description: This is the description of the shared knowledge
memento:
type: integer
description: This is the ID of the Rememberizer memento where the shared knowledge was created from.
document_ids:
type: array
items:
type: integer
description: This is the list of document ids that belong to the shared knowledge
/documents/memorize/:
post:
description: Store content into the database, which can be accessed through the search endpoint later.
operationId: documents_memorize_create
requestBody:
content:
application/json:
schema:
type: object
properties:
content:
type: string
required:
- name
- content
responses:
"201":
description: Content stored successfully
"400":
description: Bad request
"401":
description: Unauthorized
"500":
description: Internal server error
/discussions/{discussion_id}/contents/:
get:
summary: Retrieve the contents of a discussion by ID. A discussion can be a Slack or Discord chat.
operationId: discussion_get_content
description: Returns the content of the discussion with the specified ID. A discussion can be a Slack or Discord chat. The response contains 2 fields, discussion_content, and thread_contents. The former contains the main messages of the chat, whereas the latter is the threads of the discussion.
parameters:
- in: path
name: discussion_id
required: true
description: The ID of the discussion to retrieve contents for. Discussions are
schema:
type: integer
- in: query
name: integration_type
required: true
schema:
type: string
description: Indicate the integration of the discussion. Currently, it can only be "slack" or "discord".
- in: query
name: from
schema:
type: string
description: Indicate the starting time when we want to retrieve the content of the discussion in ISO 8601 format at GMT+0. If not specified, the default time is now.
- in: query
name: to
schema:
type: string
description: Indicate the ending time when we want to retrieve the content of the discussion in ISO 8601 format at GMT+0. If not specified, it is 7 days before the "from" parameter.
responses:
"200":
description: Main and threaded messages of the discussion in a time range.
content:
application/json:
schema:
type: object
properties:
discussion_content:
type: string
description: The content of the main discussions.
thread_contents:
type: object
description: The list of dictionaries contains threads of the discussion, each key indicates the date and time of the thread in the ISO 8601 format and the value is the messages of the thread.
"404":
description: Discussion not found.
"500":
description: Internal server error.
||CODE_BLOCK||
</details>
5. Add this link as the Privacy Policy: `https://docs.rememberizer.ai/notices/privacy-policy`.
6. After creating the action, copy the callback URL and paste it into your Rememberizer app.
<figure><img src="../.gitbook/assets/rememberizer_app_callback_url.png" alt="rememberizer app callback url"><figcaption></figcaption></figure>
==> developer/registering-and-using-api-keys.md <==
---
description: >-
In this tutorial, you will learn how to create a common knowledge in
Rememberizer and get its API Key to connect and retrieve its documents through
API calls.
---
# Registering and using API Keys
### Prerequisite
First, you need to have [a memento](../personal/mementos-filter-access.md) created and refined using your indexed knowledge files.
### Creating a common knowledge
To create a common knowledge, sign in to your Rememberizer account and visit [your common knowledge page](https://rememberizer.ai/personal/common-knowledge). Choose **"Your shared knowledge"**, then click **"Get started"**.
<figure><img src="../.gitbook/assets/common_knowledge_page.png" alt="common knowledge page"><figcaption></figcaption></figure>
Then pick one of the mementos you have created previously; you can also choose **"All"** or **"None"**.
<div align="center" data-full-width="false">
<figure><img src="../.gitbook/assets/create-common-knowledge-1.png" alt="create common knowledge 1" width="375"><figcaption></figcaption></figure>
</div>
Finally, fill out the common knowledge's name and description, and give it a representative photo.
<figure><img src="../.gitbook/assets/create-common-knowledge-2.png" alt="create common knowledge 2" width="375"><figcaption></figcaption></figure>
After you have filled out the form, click "Share knowledge" at the bottom to create your common knowledge. Then turn on **"Enable sharing"** for your knowledge and click **"Confirm"** in the pop-up modal.
<figure><img src="../.gitbook/assets/common_knowledge_sharing.png" alt="common knowledge sharing"><figcaption></figcaption></figure>
You are now ready to obtain its API Key and access its documents via API calls.
### Getting the API Key of a common knowledge you created
For your common knowledge, click the three dots at its top right, then choose "API Key". If no key exists yet, one will be created for you; otherwise the existing key will be shown.
<figure><img src="../.gitbook/assets/knowledge_open_API_key.png" alt="knowledge open API key"><figcaption></figcaption></figure>
In the **"Manage your API Key"** panel, you can click on the **"eye"** button to show/hide, the **"copy"** button to copy the key to clipboard, and **"Regenerate API Key"** to invalidate the old key and create a new one (apps that are accessing your documents through api calls won't be able to access until you have updated the new key into them).
<figure><img src="../.gitbook/assets/copy-api-key.png" alt="copy api key"><figcaption></figcaption></figure>
After obtaining the API Key, you can use it in your API calls to Rememberizer to query your indexed documents and contents.
### Using the API Key
To access Rememberizer endpoints, you will use the API Key in the `X-API-Key` header of your API requests. Please check out the [API Documentation](api-docs/) to see the endpoints that Rememberizer provides.
Once you have your API key, you can use it in several ways:
1. **Direct API access**: Use the API key in your HTTP requests to query Rememberizer's search endpoints (see the example after this list)
2. **LangChain integration**: Use the [LangChain Integration](langchain-integration.md) to incorporate Rememberizer's capabilities into your LangChain applications
3. **Custom GPT**: Use the API Key in a custom GPT application as described below
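For option 1 (direct API access), here is a minimal Python sketch using the `requests` library; it assumes the documents search endpoint described in the [API Documentation](api-docs/), and the key and query values are placeholders.
||CODE_BLOCK||python
import requests

API_KEY = "YOUR_COMMON_KNOWLEDGE_API_KEY"  # placeholder value

response = requests.get(
    "https://api.rememberizer.ai/api/v1/documents/search/",
    headers={"X-API-Key": API_KEY},
    params={
        "q": "What did we decide about the project roadmap?",  # up to 400 words
        "n": 5,  # number of matched chunks to return
    },
)
response.raise_for_status()
for chunk in response.json().get("data", []):
    print(chunk["document"]["name"], "-", chunk["matched_content"][:80])
||CODE_BLOCK||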
#### Using with Custom GPTs
Start by [creating a GPT in the ChatGPT UI](https://chat.openai.com/gpts/editor). Make sure to set the Authentication Type to "API Key", the Auth Type to "Custom", and the header name to "X-Api-Key", then paste the key you copied previously into the API Key textbox.
{% hint style="info" %}
For a more advanced GPT integration that uses OAuth instead of API keys, see [Creating a Rememberizer GPT](creating-a-rememberizer-gpt.md).
{% endhint %}
<figure><img src="../.gitbook/assets/gpt-app-using-api-key.png" alt="gpt app using api key" width="375"><figcaption></figcaption></figure>
==> developer/registering-rememberizer-apps.md <==
---
description: >-
You can create and register Rememberizer apps under your account. Rememberizer
apps can act on behalf of a user.
---
# Registering Rememberizer apps
1. In the top-left corner of any page, click on **Developer**, then click on **Registered App**.
<figure><img src="../.gitbook/assets/registered_apps_browse.png" alt="registered apps browse"><figcaption></figcaption></figure>
2. Click **Register new app**. A popup window will appear where you can fill in your app information.
<figure><img src="../.gitbook/assets/register_new_app.png" alt="register new app"><figcaption></figcaption></figure>
3. In **"Application name"**, type the name of your app.
4. In **"Description (optional)"**, fill in the description of your app if needed.
5. In "**Application logo (optional)"**, upload your logo applications if you have.
6. In **"Landing page URL"**, type the domain of your landing page. Your landing page contains a detailed summary of what your app does and how it integrates with Rememberizer.
7. In **"Authorized request origins"**, type the domain to your app's website.
8. In **"Authorized redirect URLs"**, type the callback URL of your app.
9. Click **"Create app"**.
==> developer/langchain-integration.md <==
---
description: >-
Learn how to integrate Rememberizer as a LangChain retriever to provide your
LangChain application with access to powerful vector database search.
type: guide
last_updated: 2025-04-03
---
# LangChain Integration
Rememberizer integrates with LangChain through the `RememberizerRetriever` class, allowing you to easily incorporate Rememberizer's semantic search capabilities into your LangChain-powered applications. This guide explains how to set up and use this integration to build advanced LLM applications with access to your knowledge base.
## Introduction
LangChain is a popular framework for building applications with large language models (LLMs). By integrating Rememberizer with LangChain, you can:
- Use your Rememberizer knowledge base in RAG (Retrieval Augmented Generation) applications
- Create chatbots with access to your documents and data
- Build question-answering systems that leverage your knowledge
- Develop agents that can search and reason over your information
The integration is available in the `langchain_community.retrievers` module.
{% embed url="https://python.langchain.com/docs/integrations/retrievers/rememberizer/" %}
## Getting Started
### Prerequisites
Before you begin, you need:
1. A Rememberizer account with Common Knowledge created
2. An API key for accessing your Common Knowledge
3. Python environment with LangChain installed
For detailed instructions on creating Common Knowledge and generating an API key, see [Registering and Using API Keys](https://docs.rememberizer.ai/developer/registering-and-using-api-keys).
### Installation
Install the required packages:
||CODE_BLOCK||bash
pip install langchain langchain_community
||CODE_BLOCK||
If you plan to use OpenAI models (as shown in examples below):
||CODE_BLOCK||bash
pip install langchain_openai
||CODE_BLOCK||
### Authentication Setup
There are two ways to authenticate the `RememberizerRetriever`:
1. **Environment Variable**: Set the `REMEMBERIZER_API_KEY` environment variable
||CODE_BLOCK||python
import os
os.environ["REMEMBERIZER_API_KEY"] = "rem_ck_your_api_key"
||CODE_BLOCK||
2. **Direct Parameter**: Pass the API key directly when initializing the retriever
||CODE_BLOCK||python
retriever = RememberizerRetriever(rememberizer_api_key="rem_ck_your_api_key")
||CODE_BLOCK||
## Configuration Options
The `RememberizerRetriever` class accepts these parameters:
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `top_k_results` | int | 10 | Number of documents to return from search |
| `rememberizer_api_key` | str | None | API key for authentication (optional if set as environment variable) |
Behind the scenes, the retriever makes API calls to Rememberizer's search endpoint with additional configurable parameters:
| Advanced Parameter | Description |
|-------------------|-------------|
| `prev_chunks` | Number of chunks before the matched chunk to include (default: 2) |
| `next_chunks` | Number of chunks after the matched chunk to include (default: 2) |
| `return_full_content` | Whether to return full document content (default: true) |
## Basic Usage
Here's a simple example of retrieving documents from Rememberizer using LangChain:
||CODE_BLOCK||python
import os
from langchain_community.retrievers import RememberizerRetriever
# Set your API key
os.environ["REMEMBERIZER_API_KEY"] = "rem_ck_your_api_key"
# Initialize the retriever
retriever = RememberizerRetriever(top_k_results=5)
# Get relevant documents for a query
docs = retriever.get_relevant_documents(query="How do vector embeddings work?")
# Display the first document
if docs:
print(f"Document: {docs[0].metadata['name']}")
print(f"Content: {docs[0].page_content[:200]}...")
||CODE_BLOCK||
### Understanding Document Structure
Each document returned by the retriever has:
- `page_content`: The text content of the matched document chunk
- `metadata`: Additional information about the document
Example of metadata structure:
||CODE_BLOCK||python
{
'id': 13646493,
'document_id': '17s3LlMbpkTk0ikvGwV0iLMCj-MNubIaP',
'name': 'What is a large language model (LLM)_ _ Cloudflare.pdf',
'type': 'application/pdf',
'path': '/langchain/What is a large language model (LLM)_ _ Cloudflare.pdf',
'url': 'https://drive.google.com/file/d/17s3LlMbpkTk0ikvGwV0iLMCj-MNubIaP/view',
'size': 337089,
'created_time': '',
'modified_time': '',
'indexed_on': '2024-04-04T03:36:28.886170Z',
'integration': {'id': 347, 'integration_type': 'google_drive'}
}
||CODE_BLOCK||
## Advanced Examples
### Building a RAG Question-Answering System
This example creates a question-answering system that retrieves information from Rememberizer and uses GPT-3.5 to formulate answers:
||CODE_BLOCK||python
import os
from langchain_community.retrievers import RememberizerRetriever
from langchain.chains import RetrievalQA
from langchain_openai import ChatOpenAI
# Set up API keys
os.environ["REMEMBERIZER_API_KEY"] = "rem_ck_your_api_key"
os.environ["OPENAI_API_KEY"] = "your_openai_api_key"
# Initialize the retriever and language model
retriever = RememberizerRetriever(top_k_results=5)
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
# Create a retrieval QA chain
qa_chain = RetrievalQA.from_chain_type(
llm=llm,
chain_type="stuff", # Simplest method - just stuff all documents into the prompt
retriever=retriever,
return_source_documents=True
)
# Ask a question
response = qa_chain.invoke({"query": "What is RAG in the context of AI?"})
# Print the answer
print(f"Answer: {response['result']}")
print("\nSources:")
for idx, doc in enumerate(response['source_documents']):
print(f"{idx+1}. {doc.metadata['name']}")
||CODE_BLOCK||
### Building a Conversational Agent with Memory
This example creates a conversational agent that can maintain conversation history:
||CODE_BLOCK||python
import os
from langchain_community.retrievers import RememberizerRetriever
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_openai import ChatOpenAI
# Set up API keys
os.environ["REMEMBERIZER_API_KEY"] = "rem_ck_your_api_key"
os.environ["OPENAI_API_KEY"] = "your_openai_api_key"
# Initialize components
retriever = RememberizerRetriever(top_k_results=5)
llm = ChatOpenAI(model_name="gpt-3.5-turbo")
memory = ConversationBufferMemory(
memory_key="chat_history",
return_messages=True
)
# Create the conversational chain
conversation = ConversationalRetrievalChain.from_llm(
llm=llm,
retriever=retriever,
memory=memory
)
# Example conversation
questions = [
"What is RAG?",
"How do large language models use it?",
"What are the limitations of this approach?",
]
for question in questions:
response = conversation.invoke({"question": question})
print(f"Question: {question}")
print(f"Answer: {response['answer']}\n")
||CODE_BLOCK||
## Best Practices
### Optimizing Retrieval Performance
1. **Be specific with queries**: More specific queries usually yield better results
2. **Adjust `top_k_results`**: Start with 3-5 results and adjust based on application needs
3. **Use context windows**: The retriever automatically includes context around matched chunks
### Security Considerations
1. **Protect your API key**: Store it securely using environment variables or secret management tools
2. **Create dedicated keys**: Create separate API keys for different applications
3. **Rotate keys regularly**: Periodically generate new keys and phase out old ones
### Integration Patterns
1. **Pre-retrieval processing**: Consider preprocessing user queries to improve search relevance
2. **Post-retrieval filtering**: Filter or rank retrieved documents before passing to the LLM
3. **Hybrid search**: Combine Rememberizer with other retrievers using `EnsembleRetriever`
||CODE_BLOCK||python
from langchain.retrievers import EnsembleRetriever
from langchain_community.retrievers import RememberizerRetriever, WebResearchRetriever
# Create retrievers
rememberizer_retriever = RememberizerRetriever(top_k_results=3)
web_retriever = WebResearchRetriever(...) # Configure another retriever
# Create an ensemble with weighted score
ensemble_retriever = EnsembleRetriever(
retrievers=[rememberizer_retriever, web_retriever],
weights=[0.7, 0.3] # Rememberizer results have higher weight
)
||CODE_BLOCK||
## Troubleshooting
### Common Issues
1. **Authentication errors**: Verify your API key is correct and properly configured
2. **No results returned**: Ensure your Common Knowledge contains relevant information
3. **Rate limiting**: Be mindful of API rate limits for high-volume applications
### Debug Tips
- Set the LangChain debug mode to see detailed API calls:
||CODE_BLOCK||python
import langchain
langchain.debug = True
||CODE_BLOCK||
- Examine raw search results before passing to LLM to identify retrieval issues
## Related Resources
* LangChain [Retriever conceptual guide](https://python.langchain.com/docs/concepts/#retrievers)
* LangChain [Retriever how-to guides](https://python.langchain.com/docs/how_to/#retrievers)
* Rememberizer [API Documentation](https://docs.rememberizer.ai/developer/api-docs/)
* [Vector Stores](https://docs.rememberizer.ai/developer/vector-stores) in Rememberizer
* [Creating a Rememberizer GPT](creating-a-rememberizer-gpt.md) - An alternative approach for AI integration
==> developer/talk-to-slack-the-sample-web-app.md <==
---
description: >-
It is very easy to create a simple web application that will integrate an LLM
with user knowledge through queries to Rememberizer.
---
# Talk-to-Slack the Sample Web App
The source code of the app can be found [here](https://github.com/skydeckai/rememberizer).
In this section we will provide step by step instructions and the full source code so that you can quickly create your own application.
We have created a Talk-to-Slack GPT on OpenAI. The Talk-to-Slack Web app is very similar.
<div align="left">
<figure><img src="https://rememberizer-docs-assets.s3.amazonaws.com/talk-to-slack_web_app.png" alt="Talk to Slack.com web app by Rememberizer on Heroku"><figcaption><p>Talk-to-Slack.com web app by Rememberizer on Heroku</p></figcaption></figure>
</div>
<div align="left">
<figure><img src="https://rememberizer-docs-assets.s3.amazonaws.com/talk-to-slack_web_app.png" alt="Talk to Slack GPT by Rememberizer on OpenAI"><figcaption><p>Talk to Slack GPT by Rememberizer on OpenAI</p></figcaption></figure>
</div>
***
### Introduction
In this guide, we provide step-by-step instructions and full source code to help you create your own application similar to our Talk-to-Slack GPT integration with Rememberizer.ai. Unlike the Slack integration, a web app offers more features and control, such as web scraping, local database access, graphics and animation, and collecting payments. Plus, it can be used by anyone without the need for a premium genAI account.
### Overview
Our example application, Talk to Slack, is hosted on Heroku and integrates OpenAI's LLM with Rememberizer.ai to enhance your Slack experience. The web app is built using Flask and provides features like OAuth2 integration, Slack data access, and an intuitive user interface.
### Features
* **Flask-based Architecture**: Backend operations, frontend communications, and API interactions are handled by Flask.
* **OAuth2 Integration**: Secure authorization and data access with Rememberizer's OAuth2 flow.
* **Slack Data Access**: Fetches user's connected Slack data securely using Rememberizer's APIs.
* **OpenAI LLM Integration**: Processes queries with OpenAI's LLM service for insightful responses.
* **Intuitive User Interface**: Easy navigation and interaction with a modern UI design.
* **Best Practices**: Adheres to security and user experience standards for seamless integration.
### Setup and Deployment
#### Prerequisites
* Python
* Flask
{% hint style="info" %}
Note that it was not very hard to have an LLM rewrite this entire application in another language, in our case Golang. So do keep in mind that you are not limited to Python.
{% endhint %}
#### Environment Configuration
Set these environment variables:
* `APP_SECRET_KEY`: Unique secret key for Flask.
* `REMEMBERIZER_CLIENT_ID`: Client ID for your Rememberizer app.
* `REMEMBERIZER_CLIENT_SECRET`: Client secret for your Rememberizer app.
* `OPENAI_API_KEY`: Your OpenAI API key.
#### Running the Application
1. **Start Flask App**: Run `flask run` in the terminal and access the app at `http://localhost:5000`.
2. **Copy the callback URL to your Rememberizer app config**: `https://<YOURHOST>/auth/rememberizer/callback`, for example: `http://localhost:5000/auth/rememberizer/callback`.
#### Deploying to the Cloud
Deployment to platforms like Heroku, Google Cloud Platform (GCP), Amazon Web Services (AWS), or Microsoft Azure is recommended.
**Heroku Deployment**
1. **Create a Heroku Account**: Sign up for Heroku and install the Heroku CLI.
2. **Prepare Your Application**: Ensure a `Procfile`, `runtime.txt`, and `requirements.txt` are present.
3. **Deploy**: Use the Heroku CLI or GitHub integration for deployment.
**Detailed Steps**
* **Connect Heroku to GitHub**: Enable automatic deploys from the GitHub repository for seamless updates.
* **Deploy Manually**: Optionally, use manual deployment for more control.
**Additional Setup**
* Install Heroku CLI: `brew tap heroku/brew && brew install heroku` (macOS).
* Add SSL certificates: Use self-signed certificates for initial HTTPS setup.
* Configure Environment Variables on Heroku: Use `heroku config:set KEY=value` for essential keys.
**Other Cloud Platforms**
* **GCP**: Set up a GCP account, prepare your app with `app.yaml`, and deploy using `gcloud app deploy`.
* **AWS**: Use Elastic Beanstalk for deployment after setting up an AWS account and the AWS CLI.
* **Azure**: Deploy through Azure App Service after creating an Azure account and installing the Azure CLI.
#### Security and Best Practices
Before deployment, verify your `requirements.txt`, adjust configurations for production, and update OAuth redirect URIs.
### Application Code Notes
**@app.route('/') (Index Route):**
This route renders the index.html template when the root URL (/) is accessed. It serves as the homepage of your application.
**@app.route('/auth/rememberizer') (Rememberizer Authentication Route):**
This route initiates the OAuth2 authentication process with Rememberizer.ai. It generates a random state value, stores it in the session, constructs the authorization URL with the necessary parameters (client ID, redirect URI, scope, and state), and redirects the user to Rememberizer.ai's authorization page.
**@app.route('/auth/rememberizer/callback') (Rememberizer Callback Route):**
This route handles the callback from Rememberizer.ai after the user has authorized your application. It extracts the authorization code from the query parameters, exchanges it for an access token using Rememberizer.ai's token endpoint, and stores the access token in the session. Then, it redirects the user to the /dashboard route.
**@app.route('/dashboard') (Dashboard Route):**
This route displays the dashboard page to the user. It checks if the user has an access token in the session; if not, it redirects them to the authentication route. If the user is authenticated, it makes a request to Rememberizer.ai's account endpoint to retrieve account information and renders the dashboard.html template with this information.
**@app.route('/slack-info') (Slack Integration Info Route):**
This route shows information about the user's Slack integration with Rememberizer.ai. It checks for an access token and makes a request to Rememberizer.ai's integrations endpoint to get the integration data. It then renders the slack\_info.html template with this data.
**@app.route('/ask', methods=\['POST']) (Ask Route):**
This route handles the submission of questions from the user. It checks for an access token, retrieves the user's question from the form data, and makes a request to Rememberizer.ai's document search endpoint to find relevant information. It then uses OpenAI's GPT-4 model to generate an answer based on the question and the search results. The answer is rendered in the answer.html template.
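To make these route descriptions concrete, here is a minimal, illustrative sketch of the two authentication routes. It is not the application's actual source code (see the GitHub repository linked at the top of this page for that); it assumes the environment variables listed earlier, the OAuth2 endpoints documented in [Creating a Rememberizer GPT](creating-a-rememberizer-gpt.md), and standard authorization-code parameters, and it omits the scope parameter and error handling for brevity.
||CODE_BLOCK||python
# Minimal sketch of the OAuth2 routes described above; not the full application.
import os
import secrets
from urllib.parse import urlencode

import requests
from flask import Flask, redirect, request, session, url_for

app = Flask(__name__)
app.secret_key = os.environ["APP_SECRET_KEY"]

CLIENT_ID = os.environ["REMEMBERIZER_CLIENT_ID"]
CLIENT_SECRET = os.environ["REMEMBERIZER_CLIENT_SECRET"]
AUTHORIZE_URL = "https://api.rememberizer.ai/api/v1/auth/oauth2/authorize/"
TOKEN_URL = "https://api.rememberizer.ai/api/v1/auth/oauth2/token/"


@app.route("/auth/rememberizer")
def auth_rememberizer():
    # Generate a random state value, store it in the session, and redirect
    # the user to Rememberizer's authorization page.
    state = secrets.token_urlsafe(16)
    session["oauth_state"] = state
    params = {
        "client_id": CLIENT_ID,
        "redirect_uri": url_for("auth_callback", _external=True),
        "response_type": "code",
        "state": state,
    }
    return redirect(f"{AUTHORIZE_URL}?{urlencode(params)}")


@app.route("/auth/rememberizer/callback")
def auth_callback():
    # Verify the state value, exchange the authorization code for an access
    # token, store the token in the session, and continue to the dashboard.
    if request.args.get("state") != session.get("oauth_state"):
        return "Invalid state parameter", 400
    token_response = requests.post(
        TOKEN_URL,
        data={
            "grant_type": "authorization_code",
            "code": request.args["code"],
            "redirect_uri": url_for("auth_callback", _external=True),
            "client_id": CLIENT_ID,
            "client_secret": CLIENT_SECRET,
        },
    )
    token_response.raise_for_status()
    session["access_token"] = token_response.json()["access_token"]
    return redirect("/dashboard")
||CODE_BLOCK||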
### Additional Notes
* **Iconography**: Designed with a detailed folded paper art style, reflecting AI and communication integration. Our icon was created in Midjourney and Image2Icon.
* **SSL Configuration**: Generate self-signed certificates using OpenSSL for secure communication.
### Explore and Innovate
We encourage exploration and innovation with your own AI-integrated web app, aiming to enhance productivity and collaboration within your platform.
***
This documentation provides a comprehensive guide for developers creating their own AI-integrated web app, similar to Talk-to-Slack. It includes detailed instructions for setup and deployment, an overview of the application code, and best practices for security and integration.
==> developer/README.md <==
---
description: Overview of Rememberizer's developer tools, APIs, and integration options
type: guide
last_updated: 2025-04-03
---
# Developer Tools and APIs
Welcome to the Rememberizer developer documentation. This section provides comprehensive information about the tools, APIs, and integration options available to developers working with Rememberizer's semantic search and knowledge management capabilities.
## Overview of Rememberizer's Developer Features
Rememberizer offers a robust set of developer tools designed to help you integrate powerful semantic search capabilities into your applications. As a developer, you can:
- **Access semantic search** through RESTful APIs with vector embedding technology
- **Integrate Rememberizer** with your own applications using OAuth2 or API keys
- **Build custom applications** that leverage users' knowledge bases
- **Create vector stores** for specialized semantic search databases
- **Connect with AI models** including OpenAI GPTs and LangChain
## Core Components
Rememberizer's architecture consists of several key components that work together to provide a comprehensive knowledge management and semantic search system:
| Component | Description |
|-----------|-------------|
| **API Service** | RESTful endpoints providing access to Rememberizer's features |
| **Authentication System** | OAuth2 and API key management for secure access |
| **Vector Stores** | Specialized databases optimized for semantic search |
| **Mementos** | Configurable access filters for knowledge sources |
| **Integrations** | Connectors to external data sources (Slack, Google Drive, etc.) |
| **Document Processing** | Systems for chunking, embedding, and indexing content |
## Authentication Options
Rememberizer supports two primary authentication methods:
1. **OAuth2 Authentication**: For applications requiring access to specific user data and documents. This flow allows users to authorize your application to access their knowledge through configurable mementos.
2. **API Key Authentication**: For accessing vector stores or common knowledge bases directly, without the OAuth flow. This provides a simpler integration path for applications that don't need user-specific data.
## Developer Documentation Roadmap
This documentation is organized to help you quickly find the information you need:
### Getting Started
- [Registering Rememberizer Apps](registering-rememberizer-apps.md) - Create developer applications
- [Authorizing Rememberizer Apps](authorizing-rememberizer-apps.md) - Implement OAuth2 authorization
- [Registering and Using API Keys](registering-and-using-api-keys.md) - Work with API key authentication
### Core Features
- [Vector Stores](vector-stores.md) - Create and manage semantic search databases
- [Creating a Rememberizer GPT](creating-a-rememberizer-gpt.md) - Integrate with OpenAI's GPT models
- [LangChain Integration](langchain-integration.md) - Connect with LangChain applications
- [Enterprise Integration Patterns](enterprise-integration-patterns.md) - Architectural patterns for enterprise deployments
### API Reference
- [API Documentation](api-docs/README.md) - Comprehensive API reference
- Authentication, search, document management, and more specialized endpoints
### Examples and Sample Code
- [Talk-to-Slack Sample Web App](talk-to-slack-the-sample-web-app.md) - Example integration
## Example Integration Flow
Here's a typical flow for integrating Rememberizer with your application:
1. Register an application in the Rememberizer developer portal
2. Implement OAuth2 authorization in your application
3. Request access to user mementos
4. Make API calls to search and retrieve knowledge
5. Process and display results in your application
||CODE_BLOCK||javascript
// Example: Making an authenticated API request with an OAuth access token
async function searchUserKnowledge(query, token) {
  const params = new URLSearchParams({ q: query, n: 5 });
  const response = await fetch(
    `https://api.rememberizer.ai/api/v1/documents/search/?${params}`,
    {
      method: 'GET',
      headers: {
        'Authorization': `Bearer ${token}`
      }
    }
  );
  return response.json();
}
||CODE_BLOCK||
## Next Steps
Start by [registering your application](registering-rememberizer-apps.md) to obtain client credentials, then explore the [API documentation](api-docs/README.md) to learn about available endpoints.
==> developer/vector-stores.md <==
---
description: >-
This guide will help you understand how to use the Rememberizer Vector Store
as a developer.
type: guide
last_updated: 2025-04-03
---
# Vector Stores
The Rememberizer Vector Store simplifies the process of dealing with vector data, allowing you to focus on text input and leveraging the power of vectors for various applications such as search and data analysis.
## Introduction
The Rememberizer Vector Store provides an easy-to-use interface for handling vector data while abstracting away the complexity of vector embeddings. Powered by PostgreSQL with the pgvector extension, Rememberizer Vector Store allows you to work directly with text. The service handles chunking, vectorizing, and storing the text data, making it easier for you to focus on your core application logic.
For a deeper understanding of the theoretical concepts behind vector embeddings and vector databases, see [What are Vector Embeddings and Vector Databases?](../background/what-are-vector-embeddings-and-vector-databases.md).
## Technical Overview
### How Vector Stores Work
Rememberizer Vector Stores convert text into high-dimensional vector representations (embeddings) that capture semantic meaning. This enables:
1. **Semantic Search**: Find documents based on meaning rather than just keywords
2. **Similarity Matching**: Identify conceptually related content
3. **Efficient Retrieval**: Quickly locate relevant information from large datasets
### Key Components
- **Document Processing**: Text is split into optimally sized chunks with overlapping boundaries for context preservation
- **Vectorization**: Chunks are converted to embeddings using state-of-the-art models
- **Indexing**: Specialized algorithms organize vectors for efficient similarity search
- **Query Processing**: Search queries are vectorized and compared against stored embeddings
### Architecture
Rememberizer implements vector stores using:
- **PostgreSQL with pgvector extension**: For efficient vector storage and search
- **Collection-based organization**: Each vector store has its own isolated collection
- **API-driven access**: Simple RESTful endpoints for all operations
## Getting Started
### Creating a Vector Store
1. Navigate to the Vector Stores Section in your dashboard
2. Click on "Create new Vector Store":
* A form will appear prompting you to enter details.
3. Fill in the Details:
* **Name**: Provide a unique name for your vector store.
* **Description**: Write a brief description of the vector store.
* **Embedding Model**: Select the model that converts text to vectors.
* **Indexing Algorithm**: Choose how vectors will be organized for search.
* **Search Metric**: Define how similarity between vectors is calculated.
* **Vector Dimension**: The size of the vector embeddings (typically 768-1536).
4. Submit the Form:
* Click on the "Create" button. You will receive a success notification, and the new store will appear in your vector store list.
<figure><img src="../.gitbook/assets/create_vector_DB_store.png" alt="Create a New Vector Store"><figcaption><p>Create a New Vector Store</p></figcaption></figure>
### Configuration Options
#### Embedding Models
| Model | Dimensions | Description | Best For |
|-------|------------|-------------|----------|
| openai/text-embedding-3-large | 1536 | High-accuracy embedding model from OpenAI | Production applications requiring maximum accuracy |
| openai/text-embedding-3-small | 1536 | Smaller, faster embedding model from OpenAI | Applications with higher throughput requirements |
#### Indexing Algorithms
| Algorithm | Description | Tradeoffs |
|-----------|-------------|-----------|
| IVFFLAT (default) | Inverted file with flat compression | Good balance of speed and accuracy; works well for most datasets |
| HNSW | Hierarchical Navigable Small World | Better accuracy for large datasets; higher memory requirements |
#### Search Metrics
| Metric | Description | Best For |
|--------|-------------|----------|
| cosine (default) | Measures angle between vectors | General purpose similarity matching |
| inner product (ip) | Dot product between vectors | When vector magnitude is important |
| L2 (Euclidean) | Straight-line distance between vectors | When spatial relationships matter |
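To make the difference between these metrics concrete, here is a small, self-contained Python sketch (using NumPy, independent of the Rememberizer API) that computes all three measures for the same pair of toy vectors.
||CODE_BLOCK||python
import numpy as np

# Two toy embedding vectors (real embeddings have hundreds of dimensions).
a = np.array([0.9, 0.1, 0.3])
b = np.array([0.8, 0.2, 0.4])

# Cosine: angle between vectors, ignores magnitude (1.0 = identical direction).
cosine_similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Inner product: dot product, sensitive to vector magnitude.
inner_product = np.dot(a, b)

# L2 (Euclidean): straight-line distance (0.0 = identical vectors).
l2_distance = np.linalg.norm(a - b)

print(f"cosine similarity: {cosine_similarity:.4f}")
print(f"inner product:     {inner_product:.4f}")
print(f"L2 distance:       {l2_distance:.4f}")
||CODE_BLOCK||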
### Managing Vector Stores
1. View and Edit Vector Stores:
* Access the management dashboard to view, edit, or delete vector stores.
2. Viewing Documents:
* Browse individual documents and their associated metadata within a specific vector store.
3. Statistics:
* View detailed statistics such as the number of vectors stored, query performance, and operational metrics.
<figure><img src="../.gitbook/assets/vector_store_management.png" alt="View Details of a Vector Store"><figcaption><p>View Details of a Vector Store</p></figcaption></figure>
## API Key Management
API keys are used to authenticate and authorize access to the Rememberizer Vector Store's API endpoints. Proper management of API keys is essential for maintaining the security and integrity of your vector stores.
### Creating API Keys
1. Head over to your Vector Store details page
2. Navigate to the API Key Management Section:
* It can be found within the "Configuration" tab
3. Click on **"Add API Key"**:
* A form will appear prompting you to enter details.
4. Fill in the Details:
* **Name**: Provide a name for the API key to help you identify its use case.
5. Submit the Form:
* Click on the "Create" button. The new API key will be generated and displayed. Make sure to copy and store it securely. This key is used to authenticate requests to that specific vector store.
<figure><img src="../.gitbook/assets/vector_store_api_key.png" alt="Create a New API Key"><figcaption><p>Create a New API Key</p></figcaption></figure>
### Revoking API Keys
If an API key is no longer needed, you can delete it to prevent any potential misuse.
For security reasons, you may want to rotate your API keys periodically. This involves generating a new key and revoking the old one.
## Using the Vector Store API
After creating a Vector Store and generating an API key, you can interact with it using the REST API.
### Code Examples
{% tabs %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
API_KEY = "your_api_key_here"
VECTOR_STORE_ID = "vs_abc123" # Replace with your vector store ID
BASE_URL = "https://api.rememberizer.ai/api/v1"
# Upload a document to the vector store
def upload_document(file_path, document_name=None):
if document_name is None:
document_name = file_path.split("/")[-1]
with open(file_path, "rb") as f:
files = {"file": (document_name, f)}
headers = {"x-api-key": API_KEY}
response = requests.post(
f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents",
headers=headers,
files=files
)
if response.status_code == 201:
print(f"Document '{document_name}' uploaded successfully!")
return response.json()
else:
print(f"Error uploading document: {response.text}")
return None
# Upload text content to the vector store
def upload_text(content, document_name):
headers = {
"x-api-key": API_KEY,
"Content-Type": "application/json"
}
data = {
"name": document_name,
"content": content
}
response = requests.post(
f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/text",
headers=headers,
json=data
)
if response.status_code == 201:
print(f"Text document '{document_name}' uploaded successfully!")
return response.json()
else:
print(f"Error uploading text: {response.text}")
return None
# Search the vector store
def search_vector_store(query, num_results=5, prev_chunks=1, next_chunks=1):
headers = {"x-api-key": API_KEY}
params = {
"q": query,
"n": num_results,
"prev_chunks": prev_chunks,
"next_chunks": next_chunks
}
response = requests.get(
f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/search",
headers=headers,
params=params
)
if response.status_code == 200:
results = response.json()
print(f"Found {len(results['matched_chunks'])} matches for '{query}'")
# Print the top result
if results['matched_chunks']:
top_match = results['matched_chunks'][0]
print(f"Top match (distance: {top_match['distance']}):")
print(f"Document: {top_match['document']['name']}")
print(f"Content: {top_match['matched_content']}")
return results
else:
print(f"Error searching: {response.text}")
return None
# Example usage
# upload_document("path/to/document.pdf")
# upload_text("This is a sample text to be vectorized", "sample-document.txt")
# search_vector_store("How does vector similarity work?")
||CODE_BLOCK||
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
// Vector Store API Client
class VectorStoreClient {
constructor(apiKey, vectorStoreId) {
this.apiKey = apiKey;
this.vectorStoreId = vectorStoreId;
this.baseUrl = 'https://api.rememberizer.ai/api/v1';
}
// Get vector store information
async getVectorStoreInfo() {
const response = await fetch(`${this.baseUrl}/vector-stores/${this.vectorStoreId}`, {
method: 'GET',
headers: {
'x-api-key': this.apiKey
}
});
if (!response.ok) {
throw new Error(`Failed to get vector store info: ${response.statusText}`);
}
return response.json();
}
// Upload a text document
async uploadTextDocument(name, content) {
const response = await fetch(`${this.baseUrl}/vector-stores/${this.vectorStoreId}/documents/text`, {
method: 'POST',
headers: {
'x-api-key': this.apiKey,
'Content-Type': 'application/json'
},
body: JSON.stringify({
name,
content
})
});
if (!response.ok) {
throw new Error(`Failed to upload text document: ${response.statusText}`);
}
return response.json();
}
// Upload a file
async uploadFile(file, onProgress) {
const formData = new FormData();
formData.append('file', file);
const xhr = new XMLHttpRequest();
return new Promise((resolve, reject) => {
xhr.open('POST', `${this.baseUrl}/vector-stores/${this.vectorStoreId}/documents`);
xhr.setRequestHeader('x-api-key', this.apiKey);
xhr.upload.onprogress = (event) => {
if (event.lengthComputable && onProgress) {
const percentComplete = (event.loaded / event.total) * 100;
onProgress(percentComplete);
}
};
xhr.onload = () => {
if (xhr.status === 201) {
resolve(JSON.parse(xhr.responseText));
} else {
reject(new Error(`Failed to upload file: ${xhr.statusText}`));
}
};
xhr.onerror = () => {
reject(new Error('Network error during file upload'));
};
xhr.send(formData);
});
}
// Search documents in the vector store
async searchDocuments(query, options = {}) {
const params = new URLSearchParams({
q: query,
n: options.numResults || 10,
prev_chunks: options.prevChunks || 1,
next_chunks: options.nextChunks || 1
});
if (options.threshold) {
params.append('t', options.threshold);
}
const response = await fetch(
`${this.baseUrl}/vector-stores/${this.vectorStoreId}/documents/search?${params}`,
{
method: 'GET',
headers: {
'x-api-key': this.apiKey
}
}
);
if (!response.ok) {
throw new Error(`Search failed: ${response.statusText}`);
}
return response.json();
}
// List all documents in the vector store
async listDocuments() {
const response = await fetch(
`${this.baseUrl}/vector-stores/${this.vectorStoreId}/documents`,
{
method: 'GET',
headers: {
'x-api-key': this.apiKey
}
}
);
if (!response.ok) {
throw new Error(`Failed to list documents: ${response.statusText}`);
}
return response.json();
}
// Delete a document
async deleteDocument(documentId) {
const response = await fetch(
`${this.baseUrl}/vector-stores/${this.vectorStoreId}/documents/${documentId}`,
{
method: 'DELETE',
headers: {
'x-api-key': this.apiKey
}
}
);
if (!response.ok) {
throw new Error(`Failed to delete document: ${response.statusText}`);
}
return true;
}
}
// Example usage
/*
const client = new VectorStoreClient('your_api_key', 'vs_abc123');
// Search documents
client.searchDocuments('How does semantic search work?')
.then(results => {
console.log(`Found ${results.matched_chunks.length} matches`);
results.matched_chunks.forEach(match => {
console.log(`Document: ${match.document.name}`);
console.log(`Score: ${match.distance}`);
console.log(`Content: ${match.matched_content}`);
console.log('---');
});
})
.catch(error => console.error(error));
*/
||CODE_BLOCK||
{% endtab %}
{% tab title="Ruby" %}
||CODE_BLOCK||ruby
require 'net/http'
require 'uri'
require 'json'
class VectorStoreClient
def initialize(api_key, vector_store_id)
@api_key = api_key
@vector_store_id = vector_store_id
@base_url = 'https://api.rememberizer.ai/api/v1'
end
# Get vector store details
def get_vector_store_info
uri = URI("#{@base_url}/vector-stores/#{@vector_store_id}")
request = Net::HTTP::Get.new(uri)
request['x-api-key'] = @api_key
response = send_request(uri, request)
JSON.parse(response.body)
end
# Upload text content
def upload_text(name, content)
uri = URI("#{@base_url}/vector-stores/#{@vector_store_id}/documents/text")
request = Net::HTTP::Post.new(uri)
request['Content-Type'] = 'application/json'
request['x-api-key'] = @api_key
request.body = {
name: name,
content: content
}.to_json
response = send_request(uri, request)
JSON.parse(response.body)
end
# Search documents
def search(query, num_results: 5, prev_chunks: 1, next_chunks: 1, threshold: nil)
uri = URI("#{@base_url}/vector-stores/#{@vector_store_id}/documents/search")
params = {
q: query,
n: num_results,
prev_chunks: prev_chunks,
next_chunks: next_chunks
}
params[:t] = threshold if threshold
uri.query = URI.encode_www_form(params)
request = Net::HTTP::Get.new(uri)
request['x-api-key'] = @api_key
response = send_request(uri, request)
JSON.parse(response.body)
end
# List documents
def list_documents
uri = URI("#{@base_url}/vector-stores/#{@vector_store_id}/documents")
request = Net::HTTP::Get.new(uri)
request['x-api-key'] = @api_key
response = send_request(uri, request)
JSON.parse(response.body)
end
# Upload file (multipart form)
def upload_file(file_path)
uri = URI("#{@base_url}/vector-stores/#{@vector_store_id}/documents")
file_name = File.basename(file_path)
file_content = File.binread(file_path)
boundary = "RememberizerBoundary#{rand(1000000)}"
request = Net::HTTP::Post.new(uri)
request['Content-Type'] = "multipart/form-data; boundary=#{boundary}"
request['x-api-key'] = @api_key
post_body = []
post_body << "--#{boundary}\r\n"
post_body << "Content-Disposition: form-data; name=\"file\"; filename=\"#{file_name}\"\r\n"
post_body << "Content-Type: application/octet-stream\r\n\r\n"
post_body << file_content
post_body << "\r\n--#{boundary}--\r\n"
request.body = post_body.join
response = send_request(uri, request)
JSON.parse(response.body)
end
private
def send_request(uri, request)
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = (uri.scheme == 'https')
response = http.request(request)
unless response.is_a?(Net::HTTPSuccess)
raise "API request failed: #{response.code} #{response.message}\n#{response.body}"
end
response
end
end
# Example usage
=begin
client = VectorStoreClient.new('your_api_key', 'vs_abc123')
# Search for documents
results = client.search('What are the best practices for data security?')
puts "Found #{results['matched_chunks'].length} results"
# Display top result
if results['matched_chunks'].any?
top_match = results['matched_chunks'].first
puts "Top match (distance: #{top_match['distance']}):"
puts "Document: #{top_match['document']['name']}"
puts "Content: #{top_match['matched_content']}"
end
=end
||CODE_BLOCK||
{% endtab %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
# Set your API key and Vector Store ID
API_KEY="your_api_key_here"
VECTOR_STORE_ID="vs_abc123"
BASE_URL="https://api.rememberizer.ai/api/v1"
# Get vector store information
curl -X GET "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}" \
-H "x-api-key: ${API_KEY}"
# Upload a text document
curl -X POST "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}/documents/text" \
-H "x-api-key: ${API_KEY}" \
-H "Content-Type: application/json" \
-d '{
"name": "example-document.txt",
"content": "This is a sample document that will be vectorized and stored in the vector database for semantic search."
}'
# Upload a file
curl -X POST "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}/documents" \
-H "x-api-key: ${API_KEY}" \
-F "file=@/path/to/your/document.pdf"
# Search for documents
curl -X GET "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}/documents/search?q=semantic%20search&n=5&prev_chunks=1&next_chunks=1" \
-H "x-api-key: ${API_KEY}"
# List all documents
curl -X GET "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}/documents" \
-H "x-api-key: ${API_KEY}"
# Delete a document
curl -X DELETE "${BASE_URL}/vector-stores/${VECTOR_STORE_ID}/documents/123" \
-H "x-api-key: ${API_KEY}"
||CODE_BLOCK||
{% endtab %}
{% endtabs %}
## Performance Considerations
<figure>
<div style="border: 2px dashed #ccc; padding: 20px; text-align: center; background-color: #f9f9f9;">
<p style="font-weight: bold;">Coming soon: Vector Store Architecture Diagram</p>
<p>This technical architecture diagram will illustrate:</p>
<ul style="text-align: left; display: inline-block;">
<li>The PostgreSQL + pgvector foundation architecture</li>
<li>Indexing algorithm structures (IVFFLAT vs. HNSW)</li>
<li>How search metrics work in vector space (visual comparison)</li>
<li>Document chunking process with overlap visualization</li>
<li>Performance considerations visualized across different scales</li>
</ul>
</div>
<figcaption>Technical architecture of Rememberizer Vector Store implementation</figcaption>
</figure>
### Optimizing for Different Data Volumes
| Data Volume | Recommended Configuration | Notes |
|-------------|---------------------------|-------|
| Small (<10k documents) | IVFFLAT, cosine similarity | Simple configuration provides good performance |
| Medium (10k-100k documents) | IVFFLAT, ensure regular reindexing | Balance between search speed and index maintenance |
| Large (>100k documents) | HNSW, consider increasing vector dimensions | Higher memory usage but maintains performance at scale |
### Chunking Strategies
The chunking process significantly impacts search quality:
- **Chunk Size**: Rememberizer uses a default chunk size of 1024 bytes with a 200-byte overlap
- **Smaller Chunks** (512-1024 bytes): More precise matches, better for specific questions
- **Larger Chunks** (1500-2048 bytes): More context in each match, better for broader topics
- **Overlap**: Ensures context is not lost at chunk boundaries
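To make these numbers concrete, the sketch below shows how size-based chunking with overlap works in principle. It is a simplified illustration rather than Rememberizer's actual server-side chunker, and the `chunk_text` helper is hypothetical.
||CODE_BLOCK||python
# Simplified sketch of size-based chunking with overlap (not Rememberizer's exact algorithm)
def chunk_text(text: str, chunk_size: int = 1024, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks; consecutive chunks share `overlap` bytes of context."""
    data = text.encode("utf-8")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(data), step):
        chunk = data[start:start + chunk_size]
        chunks.append(chunk.decode("utf-8", errors="ignore"))
        if start + chunk_size >= len(data):
            break
    return chunks

print(len(chunk_text("lorem ipsum " * 500)))  # number of chunks produced for ~6 KB of text
||CODE_BLOCK||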
### Query Optimization
- **Context Windows**: Use `prev_chunks` and `next_chunks` to retrieve surrounding content
- **Results Count**: Start with 3-5 results (`n` parameter) and adjust based on precision needs
- **Threshold**: Adjust the `t` parameter to filter results by similarity score
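As a minimal sketch, the request below combines these parameters in a single search call. It assumes the `API_KEY`, `BASE_URL`, and `VECTOR_STORE_ID` variables from the earlier Python examples; the query text and threshold value are illustrative.
||CODE_BLOCK||python
import requests

# Assumes API_KEY, BASE_URL, and VECTOR_STORE_ID are defined as in the earlier examples
params = {
    "q": "How is customer data encrypted at rest?",  # specific, natural-language query
    "n": 5,              # start small and increase if recall is too low
    "prev_chunks": 2,    # widen the context window around each match
    "next_chunks": 2,
    "t": 0.7             # similarity threshold to filter weak matches (illustrative value)
}
response = requests.get(
    f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/search",
    headers={"x-api-key": API_KEY},
    params=params
)
for chunk in response.json().get("matched_chunks", []):
    print(chunk["document"]["name"], chunk["distance"])
||CODE_BLOCK||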
## Advanced Usage
### Reindexing
Rememberizer automatically triggers reindexing when vector counts exceed predefined thresholds, but consider manual reindexing after:
- Uploading a large number of documents
- Changing the embedding model
- Modifying the indexing algorithm
### Query Enhancement
For better search results:
1. **Be specific** in search queries
2. **Include context** when possible
3. **Use natural language** rather than keywords
4. **Adjust parameters** based on result quality
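Tip 4 can be automated as a simple feedback loop: if a query returns too few matches, rerun it with a wider result count and without the threshold filter. A minimal sketch, assuming the `API_KEY`, `BASE_URL`, and `VECTOR_STORE_ID` variables from the earlier examples:
||CODE_BLOCK||python
import requests

def search(query, n, t=None):
    # Assumes API_KEY, BASE_URL, and VECTOR_STORE_ID are defined as in the examples above
    response = requests.get(
        f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/search",
        headers={"x-api-key": API_KEY},
        params={"q": query, "n": n, "t": t},  # requests drops parameters whose value is None
    )
    return response.json().get("matched_chunks", [])

matches = search("What is our data retention policy for EU customers?", n=3, t=0.75)
if len(matches) < 3:
    # Too few results: widen the result count and drop the threshold filter
    matches = search("What is our data retention policy for EU customers?", n=10)
||CODE_BLOCK||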
## Migrating from Other Vector Databases
If you're currently using other vector database solutions and want to migrate to Rememberizer Vector Store, the following guides will help you transition your data efficiently.
### Migration Overview
Migrating vector data involves:
1. Exporting data from your source vector database
2. Converting the data to a format compatible with Rememberizer
3. Importing the data into your Rememberizer Vector Store
4. Verifying the migration was successful
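One quick way to perform step 4 is to compare the document count in the target vector store with the count exported from the source. The sketch below uses placeholder credentials and assumes the list-documents endpoint returns either a JSON array or an object with a `documents` array (adjust to the actual response shape, and aggregate across pages if the endpoint paginates).
||CODE_BLOCK||python
import requests

API_KEY = "your_rememberizer_api_key"
VECTOR_STORE_ID = "vs_abc123"
BASE_URL = "https://api.rememberizer.ai/api/v1"
SOURCE_DOCUMENT_COUNT = 1250  # count exported from the source database

response = requests.get(
    f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents",
    headers={"x-api-key": API_KEY},
)
payload = response.json()
# Adjust this to the actual response shape; a bare list or a wrapper object are both common
migrated_count = len(payload if isinstance(payload, list) else payload.get("documents", []))
print(f"Source: {SOURCE_DOCUMENT_COUNT}, migrated: {migrated_count}")
if migrated_count != SOURCE_DOCUMENT_COUNT:
    print("Counts differ - review the migration logs for skipped or failed documents")
||CODE_BLOCK||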
### Benefits of Migrating to Rememberizer
- **PostgreSQL Foundation**: Built on mature database technology with built-in backup and recovery
- **Integrated Ecosystem**: Seamless connection with other Rememberizer components
- **Simplified Management**: Unified interface for vector operations
- **Advanced Security**: Row-level security and fine-grained access controls
- **Scalable Architecture**: Performance optimization as your data grows
### Migrating from Pinecone
{% tabs %}
{% tab title="Python" %}
||CODE_BLOCK||python
import os
import pinecone
import requests
import json
import time
# Set up Pinecone client
pinecone.init(api_key="PINECONE_API_KEY", environment="PINECONE_ENV")
source_index = pinecone.Index("your-pinecone-index")
# Set up Rememberizer Vector Store client
REMEMBERIZER_API_KEY = "your_rememberizer_api_key"
VECTOR_STORE_ID = "vs_abc123" # Your Rememberizer vector store ID
BASE_URL = "https://api.rememberizer.ai/api/v1"
# 1. Set up batch size for migration (adjust based on your data size)
BATCH_SIZE = 100
# 2. Function to get vectors from Pinecone
def fetch_vectors_from_pinecone(index_name, batch_size, cursor=None):
# Use the list operation if available in your Pinecone version
try:
result = source_index.list(limit=batch_size, cursor=cursor)
vectors = result.get("vectors", {})
next_cursor = result.get("cursor")
return vectors, next_cursor
except AttributeError:
# For older Pinecone versions without list operation
# This is a simplified approach; actual implementation depends on your data access pattern
query_response = source_index.query(
vector=[0] * source_index.describe_index_stats()["dimension"],
top_k=batch_size,
include_metadata=True,
include_values=True
)
return {item.id: {"id": item.id, "values": item.values, "metadata": item.metadata}
for item in query_response.matches}, None
# 3. Function to upload vectors to Rememberizer
def upload_to_rememberizer(vectors):
headers = {
"x-api-key": REMEMBERIZER_API_KEY,
"Content-Type": "application/json"
}
for vector_id, vector_data in vectors.items():
# Convert Pinecone vector data to Rememberizer format
document_name = vector_data.get("metadata", {}).get("filename", f"pinecone_doc_{vector_id}")
content = vector_data.get("metadata", {}).get("text", "")
if not content:
print(f"Skipping {vector_id} - no text content found in metadata")
continue
data = {
"name": document_name,
"content": content,
# Optional: include additional metadata
"metadata": vector_data.get("metadata", {})
}
response = requests.post(
f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/text",
headers=headers,
json=data
)
if response.status_code == 201:
print(f"Document '{document_name}' uploaded successfully!")
else:
print(f"Error uploading document {document_name}: {response.text}")
# Add a small delay to prevent rate limiting
time.sleep(0.1)
# 4. Main migration function
def migrate_pinecone_to_rememberizer():
cursor = None
total_migrated = 0
print("Starting migration from Pinecone to Rememberizer...")
while True:
vectors, cursor = fetch_vectors_from_pinecone("your-pinecone-index", BATCH_SIZE, cursor)
if not vectors:
break
print(f"Fetched {len(vectors)} vectors from Pinecone")
upload_to_rememberizer(vectors)
total_migrated += len(vectors)
print(f"Progress: {total_migrated} vectors migrated")
if not cursor:
break
print(f"Migration complete! {total_migrated} total vectors migrated to Rememberizer")
# Run the migration
# migrate_pinecone_to_rememberizer()
||CODE_BLOCK||
{% endtab %}
{% tab title="Node.js" %}
||CODE_BLOCK||javascript
const { PineconeClient } = require('@pinecone-database/pinecone');
const axios = require('axios');
// Pinecone configuration
const pineconeApiKey = 'PINECONE_API_KEY';
const pineconeEnvironment = 'PINECONE_ENVIRONMENT';
const pineconeIndexName = 'YOUR_PINECONE_INDEX';
// Rememberizer configuration
const rememberizerApiKey = 'YOUR_REMEMBERIZER_API_KEY';
const vectorStoreId = 'vs_abc123';
const baseUrl = 'https://api.rememberizer.ai/api/v1';
// Batch size configuration
const BATCH_SIZE = 100;
// Initialize Pinecone client
async function initPinecone() {
const pinecone = new PineconeClient();
await pinecone.init({
apiKey: pineconeApiKey,
environment: pineconeEnvironment,
});
return pinecone;
}
// Fetch vectors from Pinecone
async function fetchVectorsFromPinecone(pinecone, batchSize, paginationToken = null) {
const index = pinecone.Index(pineconeIndexName);
try {
// For newer Pinecone versions
const listResponse = await index.list({
limit: batchSize,
paginationToken: paginationToken
});
return {
vectors: listResponse.vectors || {},
nextToken: listResponse.paginationToken
};
} catch (error) {
// Fallback for older Pinecone versions
// This is simplified; actual implementation depends on your data access pattern
const stats = await index.describeIndexStats();
const dimension = stats.dimension;
const queryResponse = await index.query({
vector: Array(dimension).fill(0),
topK: batchSize,
includeMetadata: true,
includeValues: true
});
const vectors = {};
queryResponse.matches.forEach(match => {
vectors[match.id] = {
id: match.id,
values: match.values,
metadata: match.metadata
};
});
return { vectors, nextToken: null };
}
}
// Upload vectors to Rememberizer
async function uploadToRememberizer(vectors) {
const headers = {
'x-api-key': rememberizerApiKey,
'Content-Type': 'application/json'
};
const results = [];
for (const [vectorId, vectorData] of Object.entries(vectors)) {
const documentName = vectorData.metadata?.filename || `pinecone_doc_${vectorId}`;
const content = vectorData.metadata?.text || '';
if (!content) {
console.log(`Skipping ${vectorId} - no text content found in metadata`);
continue;
}
const data = {
name: documentName,
content: content,
// Optional: include additional metadata
metadata: vectorData.metadata || {}
};
try {
const response = await axios.post(
`${baseUrl}/vector-stores/${vectorStoreId}/documents/text`,
data,
{ headers }
);
if (response.status === 201) {
console.log(`Document '${documentName}' uploaded successfully!`);
results.push({ id: vectorId, success: true });
} else {
console.error(`Error uploading document ${documentName}: ${response.statusText}`);
results.push({ id: vectorId, success: false, error: response.statusText });
}
} catch (error) {
console.error(`Error uploading document ${documentName}: ${error.message}`);
results.push({ id: vectorId, success: false, error: error.message });
}
// Add a small delay to prevent rate limiting
await new Promise(resolve => setTimeout(resolve, 100));
}
return results;
}
// Main migration function
async function migratePineconeToRememberizer() {
try {
console.log('Starting migration from Pinecone to Rememberizer...');
const pinecone = await initPinecone();
let nextToken = null;
let totalMigrated = 0;
do {
const { vectors, nextToken: token } = await fetchVectorsFromPinecone(
pinecone,
BATCH_SIZE,
nextToken
);
nextToken = token;
if (Object.keys(vectors).length === 0) {
break;
}
console.log(`Fetched ${Object.keys(vectors).length} vectors from Pinecone`);
const results = await uploadToRememberizer(vectors);
const successCount = results.filter(r => r.success).length;
totalMigrated += successCount;
console.log(`Progress: ${totalMigrated} vectors migrated successfully`);
} while (nextToken);
console.log(`Migration complete! ${totalMigrated} total vectors migrated to Rememberizer`);
} catch (error) {
console.error('Migration failed:', error);
}
}
// Run the migration
// migratePineconeToRememberizer();
||CODE_BLOCK||
{% endtab %}
{% endtabs %}
### Migrating from Qdrant
{% tabs %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
import time
from qdrant_client import QdrantClient
from qdrant_client.http import models as rest
# Set up Qdrant client
QDRANT_URL = "http://localhost:6333" # or your Qdrant cloud URL
QDRANT_API_KEY = "your_qdrant_api_key" # if using Qdrant Cloud
QDRANT_COLLECTION_NAME = "your_collection"
qdrant_client = QdrantClient(
url=QDRANT_URL,
api_key=QDRANT_API_KEY # Only for Qdrant Cloud
)
# Set up Rememberizer Vector Store client
REMEMBERIZER_API_KEY = "your_rememberizer_api_key"
VECTOR_STORE_ID = "vs_abc123" # Your Rememberizer vector store ID
BASE_URL = "https://api.rememberizer.ai/api/v1"
# Batch size for processing
BATCH_SIZE = 100
# Function to fetch points from Qdrant
def fetch_points_from_qdrant(collection_name, batch_size, offset=0):
try:
# Get collection info to determine vector dimension
collection_info = qdrant_client.get_collection(collection_name=collection_name)
# Scroll through points
scroll_result = qdrant_client.scroll(
collection_name=collection_name,
limit=batch_size,
offset=offset,
with_payload=True,
with_vectors=True
)
points = scroll_result[0] # Tuple of (points, next_offset)
next_offset = scroll_result[1]
return points, next_offset
except Exception as e:
print(f"Error fetching points from Qdrant: {e}")
return [], None
# Function to upload vectors to Rememberizer
def upload_to_rememberizer(points):
headers = {
"x-api-key": REMEMBERIZER_API_KEY,
"Content-Type": "application/json"
}
results = []
for point in points:
# Extract data from Qdrant point
point_id = point.id
metadata = point.payload
text_content = metadata.get("text", "")
document_name = metadata.get("filename", f"qdrant_doc_{point_id}")
if not text_content:
print(f"Skipping {point_id} - no text content found in payload")
continue
data = {
"name": document_name,
"content": text_content,
# Optional: include additional metadata
"metadata": metadata
}
try:
response = requests.post(
f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/text",
headers=headers,
json=data
)
if response.status_code == 201:
print(f"Document '{document_name}' uploaded successfully!")
results.append({"id": point_id, "success": True})
else:
print(f"Error uploading document {document_name}: {response.text}")
results.append({"id": point_id, "success": False, "error": response.text})
except Exception as e:
print(f"Exception uploading document {document_name}: {str(e)}")
results.append({"id": point_id, "success": False, "error": str(e)})
# Add a small delay to prevent rate limiting
time.sleep(0.1)
return results
# Main migration function
def migrate_qdrant_to_rememberizer():
offset = None
total_migrated = 0
print("Starting migration from Qdrant to Rememberizer...")
while True:
points, next_offset = fetch_points_from_qdrant(
QDRANT_COLLECTION_NAME,
BATCH_SIZE,
offset
)
if not points:
break
print(f"Fetched {len(points)} points from Qdrant")
results = upload_to_rememberizer(points)
success_count = sum(1 for r in results if r.get("success", False))
total_migrated += success_count
print(f"Progress: {total_migrated} points migrated successfully")
if next_offset is None:
break
offset = next_offset
print(f"Migration complete! {total_migrated} total points migrated to Rememberizer")
# Run the migration
# migrate_qdrant_to_rememberizer()
||CODE_BLOCK||
{% endtab %}
{% tab title="Node.js" %}
||CODE_BLOCK||javascript
const { QdrantClient } = require('@qdrant/js-client-rest');
const axios = require('axios');
// Qdrant configuration
const qdrantUrl = 'http://localhost:6333'; // or your Qdrant cloud URL
const qdrantApiKey = 'your_qdrant_api_key'; // if using Qdrant Cloud
const qdrantCollectionName = 'your_collection';
// Rememberizer configuration
const rememberizerApiKey = 'YOUR_REMEMBERIZER_API_KEY';
const vectorStoreId = 'vs_abc123';
const baseUrl = 'https://api.rememberizer.ai/api/v1';
// Batch size configuration
const BATCH_SIZE = 100;
// Initialize Qdrant client
const qdrantClient = new QdrantClient({
url: qdrantUrl,
apiKey: qdrantApiKey // Only for Qdrant Cloud
});
// Fetch points from Qdrant
async function fetchPointsFromQdrant(collectionName, batchSize, offset = 0) {
try {
// Get collection info
const collectionInfo = await qdrantClient.getCollection(collectionName);
// Scroll through points
const scrollResult = await qdrantClient.scroll(collectionName, {
limit: batchSize,
offset: offset,
with_payload: true,
with_vectors: true
});
return {
points: scrollResult.points,
nextOffset: scrollResult.next_page_offset
};
} catch (error) {
console.error(`Error fetching points from Qdrant: ${error.message}`);
return { points: [], nextOffset: null };
}
}
// Upload vectors to Rememberizer
async function uploadToRememberizer(points) {
const headers = {
'x-api-key': rememberizerApiKey,
'Content-Type': 'application/json'
};
const results = [];
for (const point of points) {
// Extract data from Qdrant point
const pointId = point.id;
const metadata = point.payload || {};
const textContent = metadata.text || '';
const documentName = metadata.filename || `qdrant_doc_${pointId}`;
if (!textContent) {
console.log(`Skipping ${pointId} - no text content found in payload`);
continue;
}
const data = {
name: documentName,
content: textContent,
// Optional: include additional metadata
metadata: metadata
};
try {
const response = await axios.post(
`${baseUrl}/vector-stores/${vectorStoreId}/documents/text`,
data,
{ headers }
);
if (response.status === 201) {
console.log(`Document '${documentName}' uploaded successfully!`);
results.push({ id: pointId, success: true });
} else {
console.error(`Error uploading document ${documentName}: ${response.statusText}`);
results.push({ id: pointId, success: false, error: response.statusText });
}
} catch (error) {
console.error(`Error uploading document ${documentName}: ${error.message}`);
results.push({ id: pointId, success: false, error: error.message });
}
// Add a small delay to prevent rate limiting
await new Promise(resolve => setTimeout(resolve, 100));
}
return results;
}
// Main migration function
async function migrateQdrantToRememberizer() {
try {
console.log('Starting migration from Qdrant to Rememberizer...');
let offset = null;
let totalMigrated = 0;
do {
const { points, nextOffset } = await fetchPointsFromQdrant(
qdrantCollectionName,
BATCH_SIZE,
offset
);
offset = nextOffset;
if (points.length === 0) {
break;
}
console.log(`Fetched ${points.length} points from Qdrant`);
const results = await uploadToRememberizer(points);
const successCount = results.filter(r => r.success).length;
totalMigrated += successCount;
console.log(`Progress: ${totalMigrated} points migrated successfully`);
} while (offset !== null);
console.log(`Migration complete! ${totalMigrated} total points migrated to Rememberizer`);
} catch (error) {
console.error('Migration failed:', error);
}
}
// Run the migration
// migrateQdrantToRememberizer();
||CODE_BLOCK||
{% endtab %}
{% endtabs %}
### Migrating from Supabase pgvector
If you're already using Supabase with pgvector, the migration to Rememberizer is particularly straightforward since both use PostgreSQL with the pgvector extension.
{% tabs %}
{% tab title="Python" %}
||CODE_BLOCK||python
import psycopg2
import requests
import json
import time
import os
from dotenv import load_dotenv
# Load environment variables
load_dotenv()
# Supabase PostgreSQL configuration
SUPABASE_DB_HOST = os.getenv("SUPABASE_DB_HOST")
SUPABASE_DB_PORT = os.getenv("SUPABASE_DB_PORT", "5432")
SUPABASE_DB_NAME = os.getenv("SUPABASE_DB_NAME")
SUPABASE_DB_USER = os.getenv("SUPABASE_DB_USER")
SUPABASE_DB_PASSWORD = os.getenv("SUPABASE_DB_PASSWORD")
SUPABASE_VECTOR_TABLE = os.getenv("SUPABASE_VECTOR_TABLE", "documents")
# Rememberizer configuration
REMEMBERIZER_API_KEY = os.getenv("REMEMBERIZER_API_KEY")
VECTOR_STORE_ID = os.getenv("VECTOR_STORE_ID") # e.g., "vs_abc123"
BASE_URL = "https://api.rememberizer.ai/api/v1"
# Batch size for processing
BATCH_SIZE = 100
# Connect to Supabase PostgreSQL
def connect_to_supabase():
try:
conn = psycopg2.connect(
host=SUPABASE_DB_HOST,
port=SUPABASE_DB_PORT,
dbname=SUPABASE_DB_NAME,
user=SUPABASE_DB_USER,
password=SUPABASE_DB_PASSWORD
)
return conn
except Exception as e:
print(f"Error connecting to Supabase PostgreSQL: {e}")
return None
# Fetch documents from Supabase pgvector
def fetch_documents_from_supabase(conn, batch_size, offset=0):
try:
cursor = conn.cursor()
# Adjust this query based on your table structure
query = f"""
SELECT id, content, metadata, embedding
FROM {SUPABASE_VECTOR_TABLE}
ORDER BY id
LIMIT %s OFFSET %s
"""
cursor.execute(query, (batch_size, offset))
documents = cursor.fetchall()
cursor.close()
return documents
except Exception as e:
print(f"Error fetching documents from Supabase: {e}")
return []
# Upload documents to Rememberizer
def upload_to_rememberizer(documents):
headers = {
"x-api-key": REMEMBERIZER_API_KEY,
"Content-Type": "application/json"
}
results = []
for doc in documents:
doc_id, content, metadata, embedding = doc
# Parse metadata if it's stored as JSON string
if isinstance(metadata, str):
try:
metadata = json.loads(metadata)
except:
metadata = {}
elif metadata is None:
metadata = {}
document_name = metadata.get("filename", f"supabase_doc_{doc_id}")
if not content:
print(f"Skipping {doc_id} - no content found")
continue
data = {
"name": document_name,
"content": content,
"metadata": metadata
}
try:
response = requests.post(
f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/text",
headers=headers,
json=data
)
if response.status_code == 201:
print(f"Document '{document_name}' uploaded successfully!")
results.append({"id": doc_id, "success": True})
else:
print(f"Error uploading document {document_name}: {response.text}")
results.append({"id": doc_id, "success": False, "error": response.text})
except Exception as e:
print(f"Exception uploading document {document_name}: {str(e)}")
results.append({"id": doc_id, "success": False, "error": str(e)})
# Add a small delay to prevent rate limiting
time.sleep(0.1)
return results
# Main migration function
def migrate_supabase_to_rememberizer():
conn = connect_to_supabase()
if not conn:
print("Failed to connect to Supabase. Aborting migration.")
return
offset = 0
total_migrated = 0
print("Starting migration from Supabase pgvector to Rememberizer...")
try:
while True:
documents = fetch_documents_from_supabase(conn, BATCH_SIZE, offset)
if not documents:
break
print(f"Fetched {len(documents)} documents from Supabase")
results = upload_to_rememberizer(documents)
success_count = sum(1 for r in results if r.get("success", False))
total_migrated += success_count
print(f"Progress: {total_migrated} documents migrated successfully")
offset += BATCH_SIZE
finally:
conn.close()
print(f"Migration complete! {total_migrated} total documents migrated to Rememberizer")
# Run the migration
# migrate_supabase_to_rememberizer()
||CODE_BLOCK||
{% endtab %}
{% tab title="Node.js" %}
||CODE_BLOCK||javascript
const { Pool } = require('pg');
const axios = require('axios');
require('dotenv').config();
// Supabase PostgreSQL configuration
const supabasePool = new Pool({
host: process.env.SUPABASE_DB_HOST,
port: process.env.SUPABASE_DB_PORT || 5432,
database: process.env.SUPABASE_DB_NAME,
user: process.env.SUPABASE_DB_USER,
password: process.env.SUPABASE_DB_PASSWORD,
ssl: {
rejectUnauthorized: false
}
});
const supabaseVectorTable = process.env.SUPABASE_VECTOR_TABLE || 'documents';
// Rememberizer configuration
const rememberizerApiKey = process.env.REMEMBERIZER_API_KEY;
const vectorStoreId = process.env.VECTOR_STORE_ID; // e.g., "vs_abc123"
const baseUrl = 'https://api.rememberizer.ai/api/v1';
// Batch size configuration
const BATCH_SIZE = 100;
// Fetch documents from Supabase pgvector
async function fetchDocumentsFromSupabase(batchSize, offset = 0) {
try {
// Adjust this query based on your table structure
const query = `
SELECT id, content, metadata, embedding
FROM ${supabaseVectorTable}
ORDER BY id
LIMIT $1 OFFSET $2
`;
const result = await supabasePool.query(query, [batchSize, offset]);
return result.rows;
} catch (error) {
console.error(`Error fetching documents from Supabase: ${error.message}`);
return [];
}
}
// Upload documents to Rememberizer
async function uploadToRememberizer(documents) {
const headers = {
'x-api-key': rememberizerApiKey,
'Content-Type': 'application/json'
};
const results = [];
for (const doc of documents) {
// Parse metadata if it's stored as JSON string
let metadata = doc.metadata;
if (typeof metadata === 'string') {
try {
metadata = JSON.parse(metadata);
} catch (e) {
metadata = {};
}
} else if (metadata === null) {
metadata = {};
}
const documentName = metadata.filename || `supabase_doc_${doc.id}`;
if (!doc.content) {
console.log(`Skipping ${doc.id} - no content found`);
continue;
}
const data = {
name: documentName,
content: doc.content,
metadata: metadata
};
try {
const response = await axios.post(
`${baseUrl}/vector-stores/${vectorStoreId}/documents/text`,
data,
{ headers }
);
if (response.status === 201) {
console.log(`Document '${documentName}' uploaded successfully!`);
results.push({ id: doc.id, success: true });
} else {
console.error(`Error uploading document ${documentName}: ${response.statusText}`);
results.push({ id: doc.id, success: false, error: response.statusText });
}
} catch (error) {
console.error(`Error uploading document ${documentName}: ${error.message}`);
results.push({ id: doc.id, success: false, error: error.message });
}
// Add a small delay to prevent rate limiting
await new Promise(resolve => setTimeout(resolve, 100));
}
return results;
}
// Main migration function
async function migrateSupabaseToRememberizer() {
try {
console.log('Starting migration from Supabase pgvector to Rememberizer...');
let offset = 0;
let totalMigrated = 0;
while (true) {
const documents = await fetchDocumentsFromSupabase(BATCH_SIZE, offset);
if (documents.length === 0) {
break;
}
console.log(`Fetched ${documents.length} documents from Supabase`);
const results = await uploadToRememberizer(documents);
const successCount = results.filter(r => r.success).length;
totalMigrated += successCount;
console.log(`Progress: ${totalMigrated} documents migrated successfully`);
offset += BATCH_SIZE;
}
console.log(`Migration complete! ${totalMigrated} total documents migrated to Rememberizer`);
} catch (error) {
console.error('Migration failed:', error);
} finally {
await supabasePool.end();
}
}
// Run the migration
// migrateSupabaseToRememberizer();
||CODE_BLOCK||
{% endtab %}
{% endtabs %}
### Migration Best Practices
Follow these recommendations for a successful migration:
1. **Plan Ahead**:
- Estimate the data volume and time required for migration
- Schedule migration during low-traffic periods
- Increase disk space before starting large migrations
2. **Test First**:
- Create a test vector store in Rememberizer
- Migrate a small subset of data (100-1000 vectors)
- Verify search functionality with key queries
3. **Data Validation**:
- Compare document counts before and after migration
- Run benchmark queries to ensure similar results
- Validate that metadata is correctly preserved
4. **Optimize for Performance**:
- Use batch operations for efficiency
- Consider geographic colocation of source and target databases
- Monitor API rate limits and adjust batch sizes accordingly
5. **Post-Migration Steps**:
- Verify index creation in Rememberizer
- Update application configurations to point to new vector store
- Keep source database as backup until migration is verified
For detailed API reference and endpoint documentation, visit the [vector-store](api-docs/vector-store/ "mention") page.
---
Handle API keys securely and follow best practices for API key management.
==> developer/enterprise-integration-patterns.md <==
---
description: Architectural patterns, security considerations, and best practices for enterprise integrations with Rememberizer
type: guide
last_updated: 2025-04-03
---
# Enterprise Integration Patterns
This guide provides comprehensive information for organizations looking to integrate Rememberizer's knowledge management and semantic search capabilities into enterprise environments. It covers architectural patterns, security considerations, scalability, and best practices.
## Enterprise Integration Overview
Rememberizer offers robust enterprise integration capabilities that extend beyond basic API usage, allowing organizations to build sophisticated knowledge management systems that:
- **Scale to meet organizational needs** across departments and teams
- **Maintain security and compliance** with enterprise requirements
- **Integrate with existing systems** and workflow tools
- **Enable team-based access control** and knowledge sharing
- **Support high-volume batch operations** for document processing
## Architectural Patterns for Enterprise Integration
### 1. Multi-Tenant Knowledge Management
Organizations can implement a multi-tenant architecture to organize knowledge by teams, departments, or functions:
||CODE_BLOCK||
                        ┌───────────────┐
                        │ Rememberizer  │
                        │   Platform    │
                        └───────┬───────┘
                                │
              ┌─────────────────┼─────────────────┐
              │                 │                 │
      ┌───────▼────────┐ ┌──────▼───────┐ ┌───────▼────────┐
      │  Engineering   │ │    Sales     │ │     Legal      │
      │ Knowledge Base │ │Knowledge Base│ │ Knowledge Base │
      └───────┬────────┘ └──────┬───────┘ └───────┬────────┘
              │                 │                 │
              │                 │                 │
      ┌───────▼────────┐ ┌──────▼───────┐ ┌───────▼────────┐
      │ Team-specific  │ │Team-specific │ │ Team-specific  │
      │   Mementos     │ │  Mementos    │ │   Mementos     │
      └────────────────┘ └──────────────┘ └────────────────┘
||CODE_BLOCK||
**Implementation Steps:**
1. Create separate vector stores for each department or major knowledge domain
2. Configure team-based access control using Rememberizer's team functionality
3. Define mementos to control access to specific knowledge subsets
4. Implement role-based permissions for knowledge administrators and consumers
### 2. Integration Hub Architecture
For enterprises with existing systems, the hub-and-spoke pattern allows Rememberizer to act as a central knowledge repository:
||CODE_BLOCK||
┌─────────────┐ ┌─────────────┐
│ CRM System │ │ ERP System │
└──────┬──────┘ └──────┬──────┘
│ │
│ │
▼ ▼
┌──────────────────────────────────────────┐
│ │
│ Enterprise Service Bus │
│ │
└────────────────────┬─────────────────────┘
│
▼
┌───────────────────┐
│ Rememberizer │
│ Knowledge Platform│
└─────────┬─────────┘
│
┌─────────────────┴────────────────┐
│ │
┌─────────▼──────────┐ ┌──────────▼────────┐
│ Internal Knowledge │ │ Customer Knowledge │
│ Base │ │ Base │
└────────────────────┘ └─────────────────────┘
||CODE_BLOCK||
**Implementation Steps:**
1. Create and configure API keys for system-to-system integration
2. Implement OAuth2 for user-based access to knowledge repositories
3. Set up ETL processes for regular knowledge synchronization
4. Use webhooks to notify external systems of knowledge updates
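As a minimal sketch of step 3, the ETL job below pulls updated records from a source system and pushes them into a Rememberizer vector store using the text-upload endpoint shown earlier in this document. The `fetch_updated_records_from_source` function and the credential values are hypothetical placeholders for your own extract step.
||CODE_BLOCK||python
import requests

# Hypothetical placeholders: replace with your ESB/source-system client and credentials
REMEMBERIZER_API_KEY = "your_rememberizer_api_key"
VECTOR_STORE_ID = "vs_abc123"
BASE_URL = "https://api.rememberizer.ai/api/v1"

def fetch_updated_records_from_source():
    # Stand-in for your ESB or CRM/ERP extract step; return (name, content) pairs
    return [
        ("crm-account-1042.txt", "Account 1042: renewal scheduled for Q3, key contact Jane Doe."),
    ]

def sync_to_rememberizer(records):
    headers = {"x-api-key": REMEMBERIZER_API_KEY, "Content-Type": "application/json"}
    for name, content in records:
        response = requests.post(
            f"{BASE_URL}/vector-stores/{VECTOR_STORE_ID}/documents/text",
            headers=headers,
            json={"name": name, "content": content},
        )
        response.raise_for_status()

sync_to_rememberizer(fetch_updated_records_from_source())
||CODE_BLOCK||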
### 3. Microservices Architecture
For organizations adopting microservices, integrate Rememberizer as a specialized knowledge service:
||CODE_BLOCK||
┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐
│ User Service│ │ Auth Service│ │ Data Service│ │ Search UI │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │ │
└────────────────┼────────────────┼────────────────┘
│ │
▼ ▼
┌─────────────────────────────────┐
│ API Gateway │
└─────────────────┬─────────────┘
│
▼
┌───────────────────┐
│ Rememberizer │
│ Knowledge API │
└───────────────────┘
||CODE_BLOCK||
**Implementation Steps:**
1. Create dedicated service accounts for microservices integration
2. Implement JWT token-based authentication for service-to-service communication
3. Design idempotent API interactions for resilience
4. Implement circuit breakers for fault tolerance
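A minimal circuit-breaker sketch for step 4 is shown below. The `CircuitBreaker` class is illustrative, not part of the Rememberizer SDK; it simply stops calling the API for a cool-down period after repeated failures.
||CODE_BLOCK||python
import time
import requests

class CircuitBreaker:
    """Minimal circuit breaker: stop calling a failing dependency for a cool-down period."""
    def __init__(self, failure_threshold=5, reset_timeout=30):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if time.time() - self.opened_at < self.reset_timeout:
                raise RuntimeError("Circuit open - skipping call to Rememberizer")
            # Half-open: allow a trial request after the cool-down period
            self.opened_at = None
            self.failures = 0
        try:
            result = func(*args, **kwargs)
            self.failures = 0
            return result
        except requests.RequestException:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.time()
            raise

# Usage: wrap any Rememberizer API call, e.g. the search_knowledge_base helper shown later in this guide
# breaker = CircuitBreaker()
# results = breaker.call(search_knowledge_base, "quarterly revenue", API_KEY)
||CODE_BLOCK||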
## Enterprise Security Patterns
### Authentication & Authorization
Rememberizer supports multiple authentication methods suitable for enterprise environments:
#### 1. OAuth2 Integration
For user-based access, implement the OAuth2 authorization flow:
||CODE_BLOCK||javascript
// Step 1: Redirect users to Rememberizer authorization endpoint
function redirectToAuth() {
const authUrl = 'https://api.rememberizer.ai/oauth/authorize/';
const params = new URLSearchParams({
client_id: 'YOUR_CLIENT_ID',
redirect_uri: 'YOUR_REDIRECT_URI',
response_type: 'code',
scope: 'read write'
});
window.location.href = `${authUrl}?${params.toString()}`;
}
// Step 2: Exchange authorization code for tokens
async function exchangeCodeForTokens(code) {
const tokenUrl = 'https://api.rememberizer.ai/oauth/token/';
const response = await fetch(tokenUrl, {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
client_id: 'YOUR_CLIENT_ID',
client_secret: 'YOUR_CLIENT_SECRET',
grant_type: 'authorization_code',
code: code,
redirect_uri: 'YOUR_REDIRECT_URI'
})
});
return response.json();
}
||CODE_BLOCK||
#### 2. Service Account Authentication
For system-to-system integration, use API key authentication:
||CODE_BLOCK||python
import requests

def search_knowledge_base(query, api_key):
    headers = {
        'X-API-Key': api_key,
        'Content-Type': 'application/json'
    }
    payload = {
        'query': query,
        'num_results': 10
    }
    response = requests.post(
        'https://api.rememberizer.ai/api/v1/search/',
        headers=headers,
        json=payload
    )
    return response.json()
||CODE_BLOCK||
#### 3. SAML and Enterprise SSO
For enterprise single sign-on integration:
1. Configure your identity provider (Okta, Azure AD, etc.) to recognize Rememberizer as a service provider
2. Set up SAML attribute mapping to match Rememberizer user attributes
3. Configure Rememberizer to delegate authentication to your identity provider
### Zero Trust Security Model
Implement a zero trust approach with Rememberizer by:
1. **Micro-segmentation**: Create separate knowledge bases with distinct access controls
2. **Continuous Verification**: Implement short-lived tokens and regular reauthentication
3. **Least Privilege**: Define fine-grained mementos that limit access to specific knowledge subsets
4. **Event Logging**: Monitor and audit all access to sensitive knowledge
## Scalability Patterns
### Batch Processing for Document Ingestion
For large-scale document ingestion, implement the batch upload pattern:
||CODE_BLOCK||python
import requests
import time
from concurrent.futures import ThreadPoolExecutor

def batch_upload_documents(file_paths, api_key, batch_size=5):
    """
    Upload documents in batches to avoid rate limits

    Args:
        file_paths: List of file paths to upload
        api_key: Rememberizer API key
        batch_size: Number of concurrent uploads
    """
    headers = {
        'X-API-Key': api_key
    }

    def upload_single(file_path):
        # Open the file inside the worker so the handle stays open for the request
        with open(file_path, 'rb') as f:
            response = requests.post(
                'https://api.rememberizer.ai/api/v1/documents/upload/',
                headers=headers,
                files={'file': f}
            )
        return response.json()

    results = []
    # Process files in batches
    with ThreadPoolExecutor(max_workers=batch_size) as executor:
        for i in range(0, len(file_paths), batch_size):
            batch = file_paths[i:i + batch_size]

            # Submit batch of uploads
            futures = [executor.submit(upload_single, file_path) for file_path in batch]

            # Collect results
            for future in futures:
                results.append(future.result())

            # Rate limiting - pause between batches
            if i + batch_size < len(file_paths):
                time.sleep(1)

    return results
||CODE_BLOCK||
### High-Volume Search Operations
For applications requiring high-volume search:
||CODE_BLOCK||javascript
async function batchSearchWithRateLimit(queries, apiKey, options = {}) {
const {
batchSize = 5,
delayBetweenBatches = 1000,
maxRetries = 3,
retryDelay = 2000
} = options;
const results = [];
// Process queries in batches
for (let i = 0; i < queries.length; i += batchSize) {
const batch = queries.slice(i, i + batchSize);
const batchPromises = batch.map(query => searchWithRetry(query, apiKey, maxRetries, retryDelay));
// Execute batch
const batchResults = await Promise.all(batchPromises);
results.push(...batchResults);
// Apply rate limiting between batches
if (i + batchSize < queries.length) {
await new Promise(resolve => setTimeout(resolve, delayBetweenBatches));
}
}
return results;
}
async function searchWithRetry(query, apiKey, maxRetries, retryDelay) {
let retries = 0;
while (retries < maxRetries) {
try {
const response = await fetch('https://api.rememberizer.ai/api/v1/search/', {
method: 'POST',
headers: {
'X-API-Key': apiKey,
'Content-Type': 'application/json'
},
body: JSON.stringify({ query })
});
if (response.ok) {
return response.json();
}
// Handle rate limiting specifically
if (response.status === 429) {
const retryAfter = response.headers.get('Retry-After') || retryDelay / 1000;
await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
retries++;
continue;
}
// Other errors
throw new Error(`Search failed with status: ${response.status}`);
} catch (error) {
retries++;
if (retries >= maxRetries) {
throw error;
}
await new Promise(resolve => setTimeout(resolve, retryDelay));
}
}
// If the retry budget is exhausted (e.g. by repeated 429 responses), fail explicitly
throw new Error(`Search failed after ${maxRetries} retries`);
}
||CODE_BLOCK||
## Team-Based Knowledge Management
Rememberizer supports team-based knowledge management, enabling enterprises to:
1. **Create team workspaces**: Organize knowledge by department or function
2. **Assign role-based permissions**: Control who can view, edit, or administer knowledge
3. **Share knowledge across teams**: Configure cross-team access to specific knowledge bases
### Team Roles and Permissions
Rememberizer supports the following team roles:
| Role | Capabilities |
|------|--------------|
| **Owner** | Full administrative access, can manage team members and all knowledge |
| **Admin** | Can manage knowledge and configure mementos, but cannot manage the team itself |
| **Member** | Can view and search knowledge according to memento permissions |
### Implementing Team-Based Knowledge Sharing
||CODE_BLOCK||python
import requests
def create_team_knowledge_base(team_id, name, description, api_key):
"""
Create a knowledge base for a specific team
"""
headers = {
'X-API-Key': api_key,
'Content-Type': 'application/json'
}
payload = {
'team_id': team_id,
'name': name,
'description': description
}
response = requests.post(
'https://api.rememberizer.ai/api/v1/teams/knowledge/',
headers=headers,
json=payload
)
return response.json()
def grant_team_access(knowledge_id, team_id, permission_level, api_key):
"""
Grant a team access to a knowledge base
Args:
knowledge_id: ID of the knowledge base
team_id: ID of the team to grant access
permission_level: 'read', 'write', or 'admin'
api_key: Rememberizer API key
"""
headers = {
'X-API-Key': api_key,
'Content-Type': 'application/json'
}
payload = {
'team_id': team_id,
'knowledge_id': knowledge_id,
'permission': permission_level
}
response = requests.post(
'https://api.rememberizer.ai/api/v1/knowledge/permissions/',
headers=headers,
json=payload
)
return response.json()
||CODE_BLOCK||
## Enterprise Integration Best Practices
### 1. Implement Robust Error Handling
Design your integration to handle various error scenarios gracefully:
||CODE_BLOCK||javascript
async function robustApiCall(endpoint, method, payload, apiKey) {
try {
const response = await fetch(`https://api.rememberizer.ai/api/v1/${endpoint}`, {
method,
headers: {
'X-API-Key': apiKey,
'Content-Type': 'application/json'
},
body: method !== 'GET' ? JSON.stringify(payload) : undefined
});
// Handle different response types
if (response.status === 204) {
return { success: true };
}
if (!response.ok) {
const error = await response.json();
throw new Error(error.message || `API call failed with status: ${response.status}`);
}
return await response.json();
} catch (error) {
// Log error details for troubleshooting
console.error(`API call to ${endpoint} failed:`, error);
// Provide meaningful error to calling code
throw new Error(`Failed to ${method} ${endpoint}: ${error.message}`);
}
}
||CODE_BLOCK||
### 2. Implement Caching for Frequently Accessed Knowledge
Reduce API load and improve performance with appropriate caching:
||CODE_BLOCK||python
import requests
import time
from functools import lru_cache

def get_document_with_cache(document_id, api_key):
    """
    Get a document with caching; cached entries expire roughly every 10 minutes
    """
    # Bucket the current time into 10-minute windows. When the bucket changes,
    # lru_cache sees a new key and the document is fetched again.
    cache_bucket = int(time.time() / 600)
    return _fetch_document(document_id, api_key, cache_bucket)

@lru_cache(maxsize=100)
def _fetch_document(document_id, api_key, cache_bucket):
    headers = {
        'X-API-Key': api_key
    }
    response = requests.get(
        f'https://api.rememberizer.ai/api/v1/documents/{document_id}/',
        headers=headers
    )
    return response.json()
||CODE_BLOCK||
### 3. Implement Asynchronous Processing for Document Uploads
For large document sets, implement asynchronous processing:
||CODE_BLOCK||javascript
async function uploadLargeDocument(file, apiKey) {
// Step 1: Initiate upload
const initResponse = await fetch('https://api.rememberizer.ai/api/v1/documents/upload-async/', {
method: 'POST',
headers: {
'X-API-Key': apiKey,
'Content-Type': 'application/json'
},
body: JSON.stringify({
filename: file.name,
filesize: file.size,
content_type: file.type
})
});
const { upload_id, upload_url } = await initResponse.json();
// Step 2: Upload file to the provided URL
await fetch(upload_url, {
method: 'PUT',
body: file
});
// Step 3: Monitor processing status
const processingId = await initiateProcessing(upload_id, apiKey);
return monitorProcessingStatus(processingId, apiKey);
}
async function initiateProcessing(uploadId, apiKey) {
const response = await fetch('https://api.rememberizer.ai/api/v1/documents/process/', {
method: 'POST',
headers: {
'X-API-Key': apiKey,
'Content-Type': 'application/json'
},
body: JSON.stringify({
upload_id: uploadId
})
});
const { processing_id } = await response.json();
return processing_id;
}
async function monitorProcessingStatus(processingId, apiKey, interval = 2000) {
while (true) {
const statusResponse = await fetch(`https://api.rememberizer.ai/api/v1/documents/process-status/${processingId}/`, {
headers: {
'X-API-Key': apiKey
}
});
const status = await statusResponse.json();
if (status.status === 'completed') {
return status.document_id;
} else if (status.status === 'failed') {
throw new Error(`Processing failed: ${status.error}`);
}
// Wait before checking again
await new Promise(resolve => setTimeout(resolve, interval));
}
}
||CODE_BLOCK||
### 4. Implement Proper Rate Limiting
Respect API rate limits to ensure reliable operation:
||CODE_BLOCK||python
import requests
import time
from functools import wraps
class RateLimiter:
def __init__(self, calls_per_second=5):
self.calls_per_second = calls_per_second
self.last_call_time = 0
self.min_interval = 1.0 / calls_per_second
def __call__(self, func):
@wraps(func)
def wrapper(*args, **kwargs):
current_time = time.time()
time_since_last_call = current_time - self.last_call_time
if time_since_last_call < self.min_interval:
sleep_time = self.min_interval - time_since_last_call
time.sleep(sleep_time)
self.last_call_time = time.time()
return func(*args, **kwargs)
return wrapper
# Apply rate limiting to API calls
@RateLimiter(calls_per_second=5)
def search_documents(query, api_key):
headers = {
'X-API-Key': api_key,
'Content-Type': 'application/json'
}
payload = {
'query': query
}
response = requests.post(
'https://api.rememberizer.ai/api/v1/search/',
headers=headers,
json=payload
)
return response.json()
||CODE_BLOCK||
## Compliance Considerations
### Data Residency
For organizations with data residency requirements:
1. **Choose appropriate region**: Select Rememberizer deployments in compliant regions
2. **Document data flows**: Map where knowledge is stored and processed
3. **Implement filtering**: Use mementos to restrict sensitive data access
### Audit Logging
Implement comprehensive audit logging for compliance:
||CODE_BLOCK||python
import requests
import json
import logging
import time
# Configure logging
logging.basicConfig(
level=logging.INFO,
format='%(asctime)s [%(levelname)s] %(message)s',
handlers=[
logging.FileHandler('rememberizer_audit.log'),
logging.StreamHandler()
]
)
def audit_log_api_call(endpoint, method, user_id, result_status):
"""
Log API call details for audit purposes
"""
log_entry = {
'timestamp': time.time(),
'endpoint': endpoint,
'method': method,
'user_id': user_id,
'status': result_status
}
logging.info(f"API CALL: {json.dumps(log_entry)}")
def search_with_audit(query, api_key, user_id):
endpoint = 'search'
method = 'POST'
try:
headers = {
'X-API-Key': api_key,
'Content-Type': 'application/json'
}
payload = {
'query': query
}
response = requests.post(
'https://api.rememberizer.ai/api/v1/search/',
headers=headers,
json=payload
)
status = 'success' if response.ok else 'error'
audit_log_api_call(endpoint, method, user_id, status)
return response.json()
except Exception as e:
audit_log_api_call(endpoint, method, user_id, 'exception')
raise
||CODE_BLOCK||
## Next Steps
To implement enterprise integrations with Rememberizer:
1. **Design your knowledge architecture**: Map out knowledge domains and access patterns
2. **Set up role-based team structures**: Create teams and assign appropriate permissions
3. **Implement authentication flows**: Choose and implement the authentication methods that meet your requirements
4. **Design scalable workflows**: Implement batch processing for document ingestion
5. **Establish monitoring and audit policies**: Set up logging and monitoring for compliance and operations
## Related Resources
* [Mementos Filter Access](../personal/mementos-filter-access.md) - Control which data sources are available to integrations
* [API Documentation](api-docs/README.md) - Complete API reference for all endpoints
* [LangChain Integration](langchain-integration.md) - Programmatic integration with the LangChain framework
* [Creating a Rememberizer GPT](creating-a-rememberizer-gpt.md) - Integration with OpenAI's GPT platform
* [Vector Stores](vector-stores.md) - Technical details of Rememberizer's vector database implementation
For additional assistance with enterprise integrations, contact the Rememberizer team through the Support portal.
==> developer/authorizing-rememberizer-apps.md <==
# Authorizing Rememberizer apps
Rememberizer's implementation supports the standard [authorization code grant type](https://tools.ietf.org/html/rfc6749#section-4.1).
The web application flow to authorize users for your app is as follows:
1. Users are redirected to Rememberizer to authorize their account.
2. The user chooses mementos to use with your application.
3. Your application accesses the API with the user's access token.
Visit the [#explore-third-party-apps-and-service](../personal/manage-third-party-apps.md#explore-third-party-apps-and-service "mention") page to see a UI example of this flow.
<figure>
<div style="border: 2px dashed #ccc; padding: 20px; text-align: center; background-color: #f9f9f9;">
<p style="font-weight: bold;">Coming soon: OAuth2 Authorization Flow Diagram</p>
<p>This sequence diagram will illustrate the complete OAuth2 flow between:</p>
<ul style="text-align: left; display: inline-block;">
<li>User's browser</li>
<li>Your application (client)</li>
<li>Rememberizer authorization server</li>
<li>Rememberizer API resources</li>
</ul>
<p>The diagram will show the exchange of authorization codes, tokens, and API requests across all steps of the process.</p>
</div>
<figcaption>OAuth2 authorization flow sequence diagram for Rememberizer integration</figcaption>
</figure>
### Step 1. Request a user's Rememberizer identity
Redirect the user to the Rememberizer authorization server to initiate the authentication and authorization process.
||CODE_BLOCK||
GET https://api.rememberizer.ai/api/v1/auth/oauth2/authorize/
||CODE_BLOCK||
Parameters:
<table><thead><tr><th width="236">name</th><th>description</th></tr></thead><tbody><tr><td>client_id</td><td><strong>Required</strong><br>The client ID for your application. You can find this value in the Developer. Click <strong>Developer</strong> on the top-left corner. In the list of registered apps, click on your app and you will see the client ID in <strong>App Credentials.</strong></td></tr><tr><td>response_type</td><td><strong>Required</strong><br>Must be <code>code</code> for authorization code grants.</td></tr><tr><td>scope</td><td><p><strong>Optional</strong></p><p>A space-delimited list of scopes that identify the resources that your application could access on the user's behalf.</p></td></tr><tr><td>redirect_uri</td><td><strong>Required</strong><br>The URL in your application where users will be sent after authorization.</td></tr><tr><td>state</td><td><p><strong>Required</strong></p><p>An opaque value used by the client to maintain state between the request and callback. The authorization server includes this value when redirecting the user-agent back to the client.<br></p></td></tr></tbody></table>
### Step 2. Users choose and configure their mementos
Users will choose which mementos to use with your app.
### Step 3. Users are redirected back to your site by Rememberizer
After users select their mementos, Rememberizer redirects them back to your site with a temporary `code` parameter, along with the state you provided in the previous step in a `state` parameter. The temporary code expires after a short time. If the states don't match, the request may have been created by a third party and you should abort the process.
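A minimal sketch of the state check described above; the helper functions are hypothetical, and your application should persist the generated state (for example, in the user's session) between steps 1 and 3.
||CODE_BLOCK||python
import secrets

def generate_state() -> str:
    # Generate in step 1, store it server-side, and include it in the authorize URL
    return secrets.token_urlsafe(32)

def handle_callback(returned_state: str, stored_state: str, returned_code: str) -> str:
    # Compare the state returned by Rememberizer with the one your app generated
    if not secrets.compare_digest(returned_state, stored_state):
        raise ValueError("State mismatch - possible CSRF attempt, aborting the flow")
    return returned_code  # safe to exchange for tokens in step 4
||CODE_BLOCK||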
### Step 4. Exchange authorization code for refresh and access tokens
||CODE_BLOCK||
POST https://api.rememberizer.ai/api/v1/auth/oauth2/token/
||CODE_BLOCK||
This endpoint takes the following input parameters.
<table><thead><tr><th width="165">name</th><th>description</th></tr></thead><tbody><tr><td>client_id</td><td><strong>Required</strong><br>The client ID for your application. You can find this value in the Developer. Instruction to find this ID is in step 1.</td></tr><tr><td>client_secret</td><td><strong>Required</strong><br>The client secret you received from Rememberizer for your application.</td></tr><tr><td>code</td><td>The authorization code you received in step 3.</td></tr><tr><td>redirect_uri</td><td><strong>Required</strong><br>The URL in your application where users are sent after authorization. Must match with the redirect_uri in step 1.</td></tr></tbody></table>
### Step 5. Use the access token to access the API
The access token allows you to make requests to the API on a user's behalf.
||CODE_BLOCK||
Authorization: Bearer OAUTH-TOKEN
GET https://api.rememberizer.ai/api/me/
||CODE_BLOCK||
For example, in curl you can set the Authorization header like this:
||CODE_BLOCK||shell
curl -H "Authorization: Bearer OAUTH-TOKEN" https://api.rememberizer.ai/api/me/
||CODE_BLOCK||
## References
Github: [https://github.com/skydeckai/rememberizer-integration-samples](https://github.com/skydeckai/rememberizer-integration-samples)
==> developer/api-documentations/retrieve-slacks-content.md <==
# Retrieve Slack's content
{% swagger src="../../.gitbook/assets/rememberizer_openapi (1).yml" path="/discussions/{discussion_id}/contents/" method="get" %}
[rememberizer_openapi (1).yml](<../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/retrieve-documents.md <==
# Retrieve documents
{% swagger src="../../.gitbook/assets/rememberizer_openapi (1).yml" path="/documents/" method="get" %}
[rememberizer_openapi (1).yml](<../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/README.md <==
# API documentations
You can authenticate APIs using either [OAuth2](../authorizing-rememberizer-apps.md) or [API keys](../registering-and-using-api-keys.md). OAuth2 is a standard authorization framework that enables applications to securely access specific documents within a system. On the other hand, API keys provide a simpler method to retrieve documents from a common knowledge base without the need to undergo the OAuth2 authentication process.
==> developer/api-documentations/list-available-data-source-integrations.md <==
# List available data source integrations
{% swagger src="../../.gitbook/assets/rememberizer_openapi (1).yml" path="/integrations/" method="get" %}
[rememberizer_openapi (1).yml](<../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/retrieve-current-users-account-details.md <==
# Retrieve current user's account details
{% swagger src="../../.gitbook/assets/rememberizer_openapi (1).yml" path="/account/" method="get" %}
[rememberizer_openapi (1).yml](<../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/memorize-content-to-rememberizer.md <==
# Memorize content to Rememberizer
{% swagger src="../../.gitbook/assets/rememberizer_openapi (1).yml" path="/documents/memorize/" method="post" %}
[rememberizer_openapi (1).yml](<../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/get-all-added-public-knowledge.md <==
# Get all added public knowledge
{% swagger src="../../.gitbook/assets/rememberizer_openapi (1).yml" path="/common_knowledge/subscribed-list/" method="get" %}
[rememberizer_openapi (1).yml](<../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/search-for-documents-by-semantic-similarity.md <==
# Search for documents by semantic similarity
{% swagger src="../../.gitbook/assets/rememberizer_openapi (1).yml" path="/documents/search/" method="get" %}
[rememberizer_openapi (1).yml](<../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/retrieve-document-contents.md <==
# Retrieve document contents
{% swagger src="../../.gitbook/assets/rememberizer_openapi (1).yml" path="/documents/{document_id}/contents/" method="get" %}
[rememberizer_openapi (1).yml](<../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/vector-store/get-a-list-of-documents-in-a-vector-store.md <==
# Get a list of documents in a Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents" method="get" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
==> developer/api-documentations/vector-store/get-the-information-of-a-document.md <==
# Get the information of a document
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/{document-id}" method="get" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
==> developer/api-documentations/vector-store/README.md <==
# Vector Store APIs
==> developer/api-documentations/vector-store/get-vector-stores-information.md <==
# Get vector store's information
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/me" method="get" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
==> developer/api-documentations/vector-store/search-for-vector-store-documents-by-semantic-similarity.md <==
# Search for Vector Store documents by semantic similarity
{% swagger src="../../../.gitbook/assets/rememberizer_openapi (1).yml" path="/vector-stores/{vector-store-id}/documents/search" method="get" %}
[rememberizer_openapi (1).yml](<../../../.gitbook/assets/rememberizer_openapi (1).yml>)
{% endswagger %}
==> developer/api-documentations/vector-store/add-new-text-document-to-a-vector-store.md <==
# Add new text document to a Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/create" method="post" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
==> developer/api-documentations/vector-store/remove-a-document-in-vector-store.md <==
# Remove a document in Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/{document-id}/" method="delete" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
==> developer/api-documentations/vector-store/update-files-content-in-a-vector-store.md <==
# Update file's content in a Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/{document-id}/" method="patch" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
==> developer/api-documentations/vector-store/upload-files-to-a-vector-store.md <==
# Upload files to a Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/upload" method="post" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
==> developer/api-docs/retrieve-slacks-content.md <==
# Retrieve Slack's content
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/discussions/{discussion_id}/contents/" method="get" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
"https://api.rememberizer.ai/api/v1/discussions/12345/contents/?integration_type=slack&from=2023-06-01T00:00:00Z&to=2023-06-07T23:59:59Z" \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and `12345` with an actual discussion ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getSlackContents = async (discussionId, from = null, to = null) => {
const url = new URL(`https://api.rememberizer.ai/api/v1/discussions/${discussionId}/contents/`);
url.searchParams.append('integration_type', 'slack');
if (from) {
url.searchParams.append('from', from);
}
if (to) {
url.searchParams.append('to', to);
}
const response = await fetch(url.toString(), {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
};
// Get Slack contents for the past week
const toDate = new Date().toISOString();
const fromDate = new Date();
fromDate.setDate(fromDate.getDate() - 7);
const fromDateStr = fromDate.toISOString();
getSlackContents(12345, fromDateStr, toDate);
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and `12345` with an actual discussion ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
from datetime import datetime, timedelta
def get_slack_contents(discussion_id, from_date=None, to_date=None):
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
params = {
"integration_type": "slack"
}
if from_date:
params["from"] = from_date
if to_date:
params["to"] = to_date
response = requests.get(
f"https://api.rememberizer.ai/api/v1/discussions/{discussion_id}/contents/",
headers=headers,
params=params
)
data = response.json()
print(data)
# Get Slack contents for the past week
to_date = datetime.now().isoformat() + "Z"
from_date = (datetime.now() - timedelta(days=7)).isoformat() + "Z"
get_slack_contents(12345, from_date, to_date)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and `12345` with an actual discussion ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| discussion_id | integer | **Required.** The ID of the Slack channel or discussion to retrieve contents for. |
## Query Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| integration_type | string | **Required.** Set to "slack" for retrieving Slack content. |
| from | string | Starting time in ISO 8601 format at GMT+0. If not specified, it defaults to 7 days before the "to" time. |
| to | string | Ending time in ISO 8601 format at GMT+0. If not specified, the default is now. |
## Response Format
||CODE_BLOCK||json
{
"discussion_content": "User A [2023-06-01 10:30:00]: Good morning team!\nUser B [2023-06-01 10:32:15]: Morning! How's everyone doing today?\n...",
"thread_contents": {
"2023-06-01T10:30:00Z": "User C [2023-06-01 10:35:00]: @User A I'm doing great, thanks for asking!\nUser A [2023-06-01 10:37:30]: Glad to hear that @User C!",
"2023-06-02T14:15:22Z": "User D [2023-06-02 14:20:45]: Here's the update on the project...\nUser B [2023-06-02 14:25:10]: Thanks for the update!"
}
}
||CODE_BLOCK||
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 404 | Discussion not found |
| 500 | Internal server error |
This endpoint retrieves the contents of a Slack channel or direct message conversation. It returns both the main channel messages (`discussion_content`) and threaded replies (`thread_contents`). The data is organized chronologically and includes user information, making it easy to understand the context of conversations.
The time range parameters allow you to focus on specific periods, which is particularly useful for reviewing recent activity or historical discussions.
==> developer/api-docs/retrieve-documents.md <==
# Retrieve documents
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/documents/" method="get" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
"https://api.rememberizer.ai/api/v1/documents/?page=1&page_size=20&integration_type=google_drive" \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getDocuments = async (page = 1, pageSize = 20, integrationType = 'google_drive') => {
const url = new URL('https://api.rememberizer.ai/api/v1/documents/');
url.searchParams.append('page', page);
url.searchParams.append('page_size', pageSize);
if (integrationType) {
url.searchParams.append('integration_type', integrationType);
}
const response = await fetch(url.toString(), {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
};
getDocuments();
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def get_documents(page=1, page_size=20, integration_type=None):
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
params = {
"page": page,
"page_size": page_size
}
if integration_type:
params["integration_type"] = integration_type
response = requests.get(
"https://api.rememberizer.ai/api/v1/documents/",
headers=headers,
params=params
)
data = response.json()
print(data)
get_documents(integration_type="google_drive")
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% endtabs %}
## Request Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| page | integer | Page number for pagination. Default is 1. |
| page_size | integer | Number of items per page. Default is 10. |
| integration_type | string | Filter documents by integration type. Options: google_drive, slack, dropbox, gmail, common_knowledge |
## Response Format
||CODE_BLOCK||json
{
"count": 257,
"next": "https://api.rememberizer.ai/api/v1/documents/?page=2&page_size=20&integration_type=google_drive",
"previous": null,
"results": [
{
"document_id": "1aBcD2efGhIjK3lMnOpQrStUvWxYz",
"name": "Project Proposal.docx",
"type": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
"path": "/Documents/Projects/Proposal.docx",
"url": "https://drive.google.com/file/d/1aBcD2efGhIjK3lMnOpQrStUvWxYz/view",
"id": 12345,
"integration_type": "google_drive",
"source": "[email protected]",
"status": "indexed",
"indexed_on": "2023-06-15T10:30:00Z",
"size": 250000
},
// ... more documents
]
}
||CODE_BLOCK||
## Available Integration Types
| Integration Type | Description |
|-----------------|-------------|
| google_drive | Documents from Google Drive |
| slack | Messages and files from Slack |
| dropbox | Files from Dropbox |
| gmail | Emails from Gmail |
| common_knowledge | Public knowledge sources |
This endpoint retrieves a list of documents from your connected data sources. You can filter by integration type to focus on specific sources.
==> developer/api-docs/mementos.md <==
# Mementos APIs
Mementos allow users to define collections of documents that can be accessed by applications. This document outlines the available Memento APIs.
## List Mementos
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/mementos/" method="get" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
https://api.rememberizer.ai/api/v1/mementos/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const fetchMementos = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/mementos/', {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
};
fetchMementos();
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def fetch_mementos():
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
response = requests.get(
"https://api.rememberizer.ai/api/v1/mementos/",
headers=headers
)
data = response.json()
print(data)
fetch_mementos()
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% endtabs %}
## Create Memento
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/mementos/" method="post" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/mementos/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-H "Content-Type: application/json" \
-d '{"name": "Work Documents"}'
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const createMemento = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/mementos/', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN',
'Content-Type': 'application/json'
},
body: JSON.stringify({
name: 'Work Documents'
})
});
const data = await response.json();
console.log(data);
};
createMemento();
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
def create_memento():
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN",
"Content-Type": "application/json"
}
payload = {
"name": "Work Documents"
}
response = requests.post(
"https://api.rememberizer.ai/api/v1/mementos/",
headers=headers,
data=json.dumps(payload)
)
data = response.json()
print(data)
create_memento()
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% endtabs %}
## Get Memento Details
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/mementos/{id}/" method="get" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
https://api.rememberizer.ai/api/v1/mementos/123/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and `123` with an actual memento ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getMementoDetails = async (mementoId) => {
const response = await fetch(`https://api.rememberizer.ai/api/v1/mementos/${mementoId}/`, {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
};
getMementoDetails(123);
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and `123` with an actual memento ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def get_memento_details(memento_id):
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
response = requests.get(
f"https://api.rememberizer.ai/api/v1/mementos/{memento_id}/",
headers=headers
)
data = response.json()
print(data)
get_memento_details(123)
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and `123` with an actual memento ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Manage Memento Documents
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/mementos/memento_document/{memento_id}/" method="post" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/mementos/memento_document/123/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"memento": "123",
"add": ["document_id_1", "document_id_2"],
"folder_add": ["folder_id_1"],
"remove": ["document_id_3"]
}'
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and use actual document and folder IDs.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const manageMementoDocuments = async (mementoId) => {
const response = await fetch(`https://api.rememberizer.ai/api/v1/mementos/memento_document/${mementoId}/`, {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN',
'Content-Type': 'application/json'
},
body: JSON.stringify({
memento: mementoId,
add: ["document_id_1", "document_id_2"],
folder_add: ["folder_id_1"],
remove: ["document_id_3"]
})
});
const data = await response.json();
console.log(data);
};
manageMementoDocuments(123);
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and use actual document and folder IDs.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
def manage_memento_documents(memento_id):
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN",
"Content-Type": "application/json"
}
payload = {
"memento": memento_id,
"add": ["document_id_1", "document_id_2"],
"folder_add": ["folder_id_1"],
"remove": ["document_id_3"]
}
response = requests.post(
f"https://api.rememberizer.ai/api/v1/mementos/memento_document/{memento_id}/",
headers=headers,
data=json.dumps(payload)
)
data = response.json()
print(data)
manage_memento_documents(123)
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and use actual document and folder IDs.
{% endhint %}
{% endtab %}
{% endtabs %}
## Delete Memento
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/mementos/{id}/" method="delete" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X DELETE \
https://api.rememberizer.ai/api/v1/mementos/123/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and `123` with an actual memento ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const deleteMemento = async (mementoId) => {
const response = await fetch(`https://api.rememberizer.ai/api/v1/mementos/${mementoId}/`, {
method: 'DELETE',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
if (response.status === 204) {
console.log("Memento deleted successfully");
} else {
console.error("Failed to delete memento");
}
};
deleteMemento(123);
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and `123` with an actual memento ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def delete_memento(memento_id):
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
response = requests.delete(
f"https://api.rememberizer.ai/api/v1/mementos/{memento_id}/",
headers=headers
)
if response.status_code == 204:
print("Memento deleted successfully")
else:
print("Failed to delete memento")
delete_memento(123)
||CODE_BLOCK||
{% hint style="info" %}
To test this API call, replace `YOUR_JWT_TOKEN` with your actual JWT token and `123` with an actual memento ID.
{% endhint %}
{% endtab %}
{% endtabs %}
==> developer/api-docs/retrieve-current-user-account-details.md <==
# Retrieve current user's account details
This endpoint allows you to retrieve the details of the currently authenticated user's account.
## Endpoint
||CODE_BLOCK||
GET /api/v1/account/
||CODE_BLOCK||
## Authentication
This endpoint requires authentication using a JWT token.
## Request
No request parameters are required.
## Response
||CODE_BLOCK||json
{
"id": "user_id",
"email": "[email protected]",
"name": "User Name"
}
||CODE_BLOCK||
## User Profile (Extended Information)
For more detailed user profile information, you can use:
||CODE_BLOCK||
GET /api/v1/me/
||CODE_BLOCK||
### Extended Response
||CODE_BLOCK||json
{
"id": "username",
"email": "[email protected]",
"name": "User Name",
"user_onboarding_status": 7,
"dev_onboarding_status": 3,
"company_name": "Company",
"website": "https://example.com",
"bio": "User bio",
"team": [
{
"id": "team_id",
"name": "Team Name",
"image_url": "url",
"role": "admin"
}
],
"embed_quota": 10000,
"current_usage": 500,
"email_verified": true
}
||CODE_BLOCK||
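Requesting the extended profile follows the same pattern as the account endpoint. A minimal Python sketch, assuming the same JWT bearer authentication used throughout these docs:
||CODE_BLOCK||python
import requests

# Sketch: fetch the extended user profile with the same bearer-token auth as /account/
headers = {"Authorization": "Bearer YOUR_JWT_TOKEN"}
response = requests.get("https://api.rememberizer.ai/api/v1/me/", headers=headers)
print(response.json())
||CODE_BLOCK||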
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 401 | Unauthorized - Invalid or missing authentication credentials |
| 403 | Forbidden - User does not have permission to access this resource |
| 500 | Internal Server Error - Something went wrong on the server |
## Usage Example
### Using cURL
||CODE_BLOCK||bash
curl -H "Authorization: Bearer YOUR_JWT_TOKEN" https://api.rememberizer.ai/api/v1/account/
||CODE_BLOCK||
### Using JavaScript
||CODE_BLOCK||javascript
const response = await fetch('https://api.rememberizer.ai/api/v1/account/', {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
||CODE_BLOCK||
### Using Python
||CODE_BLOCK||python
import requests
headers = {"Authorization": "Bearer YOUR_JWT_TOKEN"}
response = requests.get("https://api.rememberizer.ai/api/v1/account/", headers=headers)
data = response.json()
print(data)
||CODE_BLOCK||
==> developer/api-docs/README.md <==
# API Documentation
You can authenticate APIs using either [OAuth2](../authorizing-rememberizer-apps.md) or [API keys](../registering-and-using-api-keys.md). OAuth2 is a standard authorization framework that enables applications to securely access specific documents within a system. On the other hand, API keys provide a simpler method to retrieve documents from a common knowledge base without the need to undergo the OAuth2 authentication process.
## API Overview
Rememberizer provides a comprehensive set of APIs for working with documents, vector stores, mementos, and more. The APIs are organized into the following categories:
### Authentication APIs
- Sign Up, Sign In, Email Verification
- Password Reset
- OAuth Endpoints (Google, Microsoft)
- Token Management and Logout
### User APIs
- User Profile and Account Information
- User Onboarding
### Document APIs
- List, Create, and Update Documents
- Document Processing
- Batch Document Operations
### Search APIs
- Basic Search
- Agentic Search
- Batch Search Operations
### Mementos APIs
- Create, List, Update, and Delete Mementos
- Manage Memento Documents
### Vector Stores APIs
- Create and List Vector Stores
- Upload Text and File Documents
- Search Vector Stores
- Batch Upload and Search
### Integrations APIs
- List Integrations
- OAuth Integration Endpoints (Google Drive, Gmail, Slack, Dropbox)
### Applications APIs
- List and Create Applications
### Common Knowledge APIs
- List and Create Common Knowledge Items
### Team APIs
- Team Management
- Team Members
- Role-Based Permissions
For enterprise integration patterns, security considerations, and architectural best practices, see the [Enterprise Integration Patterns](../enterprise-integration-patterns.md) guide.
## Base URL
All API endpoints are relative to:
||CODE_BLOCK||
https://api.rememberizer.ai/api/v1/
||CODE_BLOCK||
## Authentication
Endpoints require authentication using either:
- JWT token (passed in Authorization header or cookies)
- API key (passed in x-api-key header)
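For example, the two schemes differ only in the header that carries the credential. The pairing of endpoints and schemes below is illustrative; check each endpoint's page for which method it accepts.
||CODE_BLOCK||python
import requests

# JWT bearer token in the Authorization header (user-level endpoints)
jwt_response = requests.get(
    "https://api.rememberizer.ai/api/v1/account/",
    headers={"Authorization": "Bearer YOUR_JWT_TOKEN"},
)

# API key in the x-api-key header (e.g., Vector Store endpoints)
api_key_response = requests.get(
    "https://api.rememberizer.ai/api/v1/vector-stores/me",
    headers={"x-api-key": "YOUR_API_KEY"},
)
||CODE_BLOCK||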
For detailed information about specific endpoints, refer to the individual API documentation pages.
==> developer/api-docs/list-available-data-source-integrations.md <==
# List available data source integrations
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/integrations/" method="get" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
https://api.rememberizer.ai/api/v1/integrations/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getIntegrations = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/integrations/', {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
};
getIntegrations();
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def get_integrations():
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
response = requests.get(
"https://api.rememberizer.ai/api/v1/integrations/",
headers=headers
)
data = response.json()
print(data)
get_integrations()
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% endtabs %}
## Response Format
||CODE_BLOCK||json
{
"data": [
{
"id": 101,
"integration_type": "google_drive",
"integration_step": "authorized",
"source": "[email protected]",
"document_type": "drive",
"document_stats": {
"status": {
"indexed": 250,
"indexing": 5,
"error": 2
},
"total_size": 15000000,
"document_count": 257
},
"consent_time": "2023-06-15T10:30:00Z",
"memory_config": null,
"token_validity": true
},
{
"id": 102,
"integration_type": "slack",
"integration_step": "authorized",
"source": "workspace-name",
"document_type": "channel",
"document_stats": {
"status": {
"indexed": 45,
"indexing": 0,
"error": 0
},
"total_size": 5000000,
"document_count": 45
},
"consent_time": "2023-06-16T14:45:00Z",
"memory_config": null,
"token_validity": true
}
],
"message": "Integrations retrieved successfully",
"code": "success"
}
||CODE_BLOCK||
This endpoint retrieves a list of all available data source integrations for the current user. The response includes detailed information about each integration, including the integration type, status, and document statistics.
==> developer/api-docs/memorize-content-to-rememberizer.md <==
# Memorize content to Rememberizer
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/documents/memorize/" method="post" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/documents/memorize/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"name": "Important Information",
"content": "This is important content that I want Rememberizer to remember."
}'
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const memorizeContent = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/documents/memorize/', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN',
'Content-Type': 'application/json'
},
body: JSON.stringify({
name: 'Important Information',
content: 'This is important content that I want Rememberizer to remember.'
})
});
if (response.status === 201) {
console.log("Content stored successfully");
} else {
console.error("Failed to store content");
const errorData = await response.json();
console.error(errorData);
}
};
memorizeContent();
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
def memorize_content():
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN",
"Content-Type": "application/json"
}
payload = {
"name": "Important Information",
"content": "This is important content that I want Rememberizer to remember."
}
response = requests.post(
"https://api.rememberizer.ai/api/v1/documents/memorize/",
headers=headers,
data=json.dumps(payload)
)
if response.status_code == 201:
print("Content stored successfully")
else:
print(f"Failed to store content: {response.text}")
memorize_content()
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% endtabs %}
## Request Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| name | string | **Required.** A name for the content being stored. |
| content | string | **Required.** The text content to store in Rememberizer. |
## Response
A successful request returns a 201 Created status code with no response body.
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 400 | Bad Request - Missing required fields or invalid parameters |
| 401 | Unauthorized - Invalid or missing authentication |
| 500 | Internal Server Error |
## Use Cases
This endpoint is particularly useful for:
1. Storing important notes or information that you want to access later
2. Adding content that isn't available through integrated data sources
3. Manually adding information that needs to be searchable
4. Adding contextual information for LLMs accessing your knowledge base
The stored content becomes searchable through the search endpoints and can be included in mementos.
==> developer/api-docs/authentication.md <==
# Authentication APIs
Rememberizer provides several authentication endpoints to manage user accounts and sessions. This document outlines the available authentication APIs.
## Sign Up
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/auth/signup/" method="post" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/auth/signup/ \
-H "Content-Type: application/json" \
-d '{
"email": "[email protected]",
"password": "secure_password",
"name": "John Doe",
"captcha": "recaptcha_response"
}'
||CODE_BLOCK||
{% hint style="info" %}
Replace `recaptcha_response` with an actual reCAPTCHA response.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const signUp = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/auth/signup/', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
email: '[email protected]',
password: 'secure_password',
name: 'John Doe',
captcha: 'recaptcha_response'
})
});
const data = await response.json();
console.log(data);
};
signUp();
||CODE_BLOCK||
{% hint style="info" %}
Replace `recaptcha_response` with an actual reCAPTCHA response.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
def sign_up():
headers = {
"Content-Type": "application/json"
}
payload = {
"email": "[email protected]",
"password": "secure_password",
"name": "John Doe",
"captcha": "recaptcha_response"
}
response = requests.post(
"https://api.rememberizer.ai/api/v1/auth/signup/",
headers=headers,
data=json.dumps(payload)
)
data = response.json()
print(data)
sign_up()
||CODE_BLOCK||
{% hint style="info" %}
Replace `recaptcha_response` with an actual reCAPTCHA response.
{% endhint %}
{% endtab %}
{% endtabs %}
## Sign In
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/auth/signin/" method="post" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/auth/signin/ \
-H "Content-Type: application/json" \
-d '{
"login": "[email protected]",
"password": "secure_password",
"captcha": "recaptcha_response"
}'
||CODE_BLOCK||
{% hint style="info" %}
Replace `recaptcha_response` with an actual reCAPTCHA response.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const signIn = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/auth/signin/', {
method: 'POST',
headers: {
'Content-Type': 'application/json'
},
body: JSON.stringify({
login: '[email protected]',
password: 'secure_password',
captcha: 'recaptcha_response'
})
});
// Check for auth cookies in response
if (response.status === 204) {
console.log("Login successful!");
} else {
console.error("Login failed!");
}
};
signIn();
||CODE_BLOCK||
{% hint style="info" %}
Replace `recaptcha_response` with an actual reCAPTCHA response.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
def sign_in():
headers = {
"Content-Type": "application/json"
}
payload = {
"login": "[email protected]",
"password": "secure_password",
"captcha": "recaptcha_response"
}
response = requests.post(
"https://api.rememberizer.ai/api/v1/auth/signin/",
headers=headers,
data=json.dumps(payload)
)
if response.status_code == 204:
print("Login successful!")
else:
print("Login failed!")
sign_in()
||CODE_BLOCK||
{% hint style="info" %}
Replace `recaptcha_response` with an actual reCAPTCHA response.
{% endhint %}
{% endtab %}
{% endtabs %}
## Email Verification
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/auth/verify-email/" method="post" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/auth/verify-email/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"verification_code": "123456"
}'
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and use the verification code sent to your email.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const verifyEmail = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/auth/verify-email/', {
method: 'POST',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN',
'Content-Type': 'application/json'
},
body: JSON.stringify({
verification_code: '123456'
})
});
if (response.status === 200) {
console.log("Email verification successful!");
} else {
console.error("Email verification failed!");
}
};
verifyEmail();
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and use the verification code sent to your email.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
def verify_email():
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN",
"Content-Type": "application/json"
}
payload = {
"verification_code": "123456"
}
response = requests.post(
"https://api.rememberizer.ai/api/v1/auth/verify-email/",
headers=headers,
data=json.dumps(payload)
)
if response.status_code == 200:
print("Email verification successful!")
else:
print("Email verification failed!")
verify_email()
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and use the verification code sent to your email.
{% endhint %}
{% endtab %}
{% endtabs %}
## Token Management
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/auth/custom-refresh/" method="post" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/auth/custom-refresh/ \
-b "refresh_token=YOUR_REFRESH_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
This endpoint uses cookies for authentication. The refresh token should be sent as a cookie.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const refreshToken = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/auth/custom-refresh/', {
method: 'POST',
credentials: 'include' // This includes cookies in the request
});
if (response.status === 204) {
console.log("Token refreshed successfully!");
} else {
console.error("Token refresh failed!");
}
};
refreshToken();
||CODE_BLOCK||
{% hint style="info" %}
This endpoint uses cookies for authentication. Make sure your application includes credentials in the request.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def refresh_token():
cookies = {
"refresh_token": "YOUR_REFRESH_TOKEN"
}
response = requests.post(
"https://api.rememberizer.ai/api/v1/auth/custom-refresh/",
cookies=cookies
)
if response.status_code == 204:
print("Token refreshed successfully!")
else:
print("Token refresh failed!")
refresh_token()
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_REFRESH_TOKEN` with your actual refresh token.
{% endhint %}
{% endtab %}
{% endtabs %}
## Logout
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/auth/custom-logout/" method="post" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
### Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/auth/custom-logout/
||CODE_BLOCK||
{% hint style="info" %}
This endpoint will clear the authentication cookies.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const logout = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/auth/custom-logout/', {
method: 'POST',
credentials: 'include' // This includes cookies in the request
});
if (response.status === 204) {
console.log("Logout successful!");
} else {
console.error("Logout failed!");
}
};
logout();
||CODE_BLOCK||
{% hint style="info" %}
This endpoint uses cookies for authentication. Make sure your application includes credentials in the request.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def logout():
session = requests.Session()
response = session.post(
"https://api.rememberizer.ai/api/v1/auth/custom-logout/"
)
if response.status_code == 204:
print("Logout successful!")
else:
print("Logout failed!")
logout()
||CODE_BLOCK||
{% hint style="info" %}
This endpoint will clear the authentication cookies.
{% endhint %}
{% endtab %}
{% endtabs %}
==> developer/api-docs/get-all-added-public-knowledge.md <==
# Get all added public knowledge
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/common-knowledge/subscribed-list/" method="get" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
https://api.rememberizer.ai/api/v1/common-knowledge/subscribed-list/ \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getPublicKnowledge = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/common-knowledge/subscribed-list/', {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
};
getPublicKnowledge();
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def get_public_knowledge():
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
response = requests.get(
"https://api.rememberizer.ai/api/v1/common-knowledge/subscribed-list/",
headers=headers
)
data = response.json()
print(data)
get_public_knowledge()
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% endtabs %}
## Response Format
||CODE_BLOCK||json
[
{
"id": 1,
"num_of_subscribers": 76,
"publisher_name": "Rememberizer AI",
"published_by_me": false,
"subscribed_by_me": true,
"size": 66741,
"created": "2023-01-15T14:30:00Z",
"modified": "2023-05-20T09:45:12Z",
"priority_score": 2.053,
"name": "Rememberizer Docs",
"image_url": "https://example.com/images/rememberizer-docs.png",
"description": "The latest documentation and blog posts about Rememberizer.",
"api_key": null,
"is_sharing": true,
"memento": 159,
"document_ids": [1234, 5678, 9012]
}
]
||CODE_BLOCK||
This endpoint retrieves a list of all public knowledge (also known as common knowledge) that the current user has subscribed to. Each item includes metadata about the knowledge source, such as publication date, size, and associated documents.
==> developer/api-docs/search-for-documents-by-semantic-similarity.md <==
---
description: Semantic search endpoint with batch processing capabilities
type: api
last_updated: 2025-04-03
---
# Search for documents by semantic similarity
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/documents/search/" method="get" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
"https://api.rememberizer.ai/api/v1/documents/search/?q=How%20to%20integrate%20Rememberizer%20with%20custom%20applications&n=5&from=2023-01-01T00:00:00Z&to=2023-12-31T23:59:59Z" \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const searchDocuments = async (query, numResults = 5, from = null, to = null) => {
const url = new URL('https://api.rememberizer.ai/api/v1/documents/search/');
url.searchParams.append('q', query);
url.searchParams.append('n', numResults);
if (from) {
url.searchParams.append('from', from);
}
if (to) {
url.searchParams.append('to', to);
}
const response = await fetch(url.toString(), {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
};
searchDocuments('How to integrate Rememberizer with custom applications', 5);
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def search_documents(query, num_results=5, from_date=None, to_date=None):
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
params = {
"q": query,
"n": num_results
}
if from_date:
params["from"] = from_date
if to_date:
params["to"] = to_date
response = requests.get(
"https://api.rememberizer.ai/api/v1/documents/search/",
headers=headers,
params=params
)
data = response.json()
print(data)
search_documents("How to integrate Rememberizer with custom applications", 5)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% tab title="Ruby" %}
||CODE_BLOCK||ruby
require 'net/http'
require 'uri'
require 'json'
def search_documents(query, num_results=5, from_date=nil, to_date=nil)
uri = URI('https://api.rememberizer.ai/api/v1/documents/search/')
params = {
q: query,
n: num_results
}
params[:from] = from_date if from_date
params[:to] = to_date if to_date
uri.query = URI.encode_www_form(params)
request = Net::HTTP::Get.new(uri)
request['Authorization'] = 'Bearer YOUR_JWT_TOKEN'
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
response = http.request(request)
data = JSON.parse(response.body)
puts data
end
search_documents("How to integrate Rememberizer with custom applications", 5)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token.
{% endhint %}
{% endtab %}
{% endtabs %}
## Query Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| q | string | **Required.** The search query text (up to 400 words). |
| n | integer | Number of results to return. Default: 3. Use higher values (e.g., 10) for more comprehensive results. |
| from | string | Start of the time range for documents to be searched, in ISO 8601 format. |
| to | string | End of the time range for documents to be searched, in ISO 8601 format. |
| prev_chunks | integer | Number of preceding chunks to include for context. Default: 2. |
| next_chunks | integer | Number of following chunks to include for context. Default: 2. |
## Response Format
||CODE_BLOCK||json
{
"data_sources": [
{
"name": "Google Drive",
"documents": 3
},
{
"name": "Slack",
"documents": 2
}
],
"matched_chunks": [
{
"document": {
"id": 12345,
"document_id": "1aBcD2efGhIjK3lMnOpQrStUvWxYz",
"name": "Rememberizer API Documentation.pdf",
"type": "application/pdf",
"path": "/Documents/Rememberizer/API Documentation.pdf",
"url": "https://drive.google.com/file/d/1aBcD2efGhIjK3lMnOpQrStUvWxYz/view",
"size": 250000,
"created_time": "2023-05-10T14:30:00Z",
"modified_time": "2023-06-15T09:45:00Z",
"indexed_on": "2023-06-15T10:30:00Z",
"integration": {
"id": 101,
"integration_type": "google_drive"
}
},
"matched_content": "To integrate Rememberizer with custom applications, you can use the OAuth2 authentication flow to authorize your application to access a user's Rememberizer data. Once authorized, your application can use the Rememberizer APIs to search for documents, retrieve content, and more.",
"distance": 0.123
},
// ... more matched chunks
],
"message": "Search completed successfully",
"code": "success"
}
||CODE_BLOCK||
## Search Optimization Tips
### For Question Answering
When searching for an answer to a question, try formulating your query as if it were an ideal answer. For example:
Instead of: "What is vector embedding?"
Try: "Vector embedding is a technique that converts text into numerical vectors in a high-dimensional space."
{% hint style="info" %}
For a deeper understanding of how vector embeddings work and why this search approach is effective, see [What are Vector Embeddings and Vector Databases?](../../background/what-are-vector-embeddings-and-vector-databases.md)
{% endhint %}
### Adjusting Result Count
- Start with `n=3` for quick, high-relevance results
- Increase to `n=10` or higher for more comprehensive information
- If search returns insufficient information, try increasing the `n` parameter
### Time-Based Filtering
Use the `from` and `to` parameters to focus on documents from specific time periods:
- Recent documents: Set `from` to a recent date
- Historical analysis: Specify a specific date range
- Excluding outdated information: Set an appropriate `to` date
## Batch Operations
For efficiently handling large volumes of search queries, Rememberizer supports batch operations to optimize performance and reduce API call overhead.
### Batch Search
{% tabs %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import time
import json
from concurrent.futures import ThreadPoolExecutor
def batch_search_documents(queries, num_results=5, batch_size=10):
"""
Perform batch searches with multiple queries
Args:
queries: List of search query strings
num_results: Number of results to return per query
batch_size: Number of queries to process in parallel
Returns:
List of search results for each query
"""
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN",
"Content-Type": "application/json"
}
results = []
# Process queries in batches
for i in range(0, len(queries), batch_size):
batch = queries[i:i+batch_size]
# Create a thread pool to send requests in parallel
with ThreadPoolExecutor(max_workers=batch_size) as executor:
futures = []
for query in batch:
params = {
"q": query,
"n": num_results
}
future = executor.submit(
requests.get,
"https://api.rememberizer.ai/api/v1/documents/search/",
headers=headers,
params=params
)
futures.append(future)
# Collect results as they complete
for future in futures:
response = future.result()
results.append(response.json())
# Rate limiting - pause between batches to avoid API throttling
if i + batch_size < len(queries):
time.sleep(1)
return results
# Example usage
queries = [
"How to use OAuth with Rememberizer",
"Vector database configuration options",
"Best practices for semantic search",
# Add more queries as needed
]
results = batch_search_documents(queries, num_results=3, batch_size=5)
||CODE_BLOCK||
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
/**
* Perform batch searches with multiple queries
*
* @param {string[]} queries - List of search query strings
* @param {number} numResults - Number of results to return per query
* @param {number} batchSize - Number of queries to process in parallel
* @param {number} delayBetweenBatches - Milliseconds to wait between batches
* @returns {Promise<Array>} - List of search results for each query
*/
async function batchSearchDocuments(queries, numResults = 5, batchSize = 10, delayBetweenBatches = 1000) {
const results = [];
// Process queries in batches
for (let i = 0; i < queries.length; i += batchSize) {
const batch = queries.slice(i, i + batchSize);
// Create an array of promises for concurrent requests
const batchPromises = batch.map(query => {
const url = new URL('https://api.rememberizer.ai/api/v1/documents/search/');
url.searchParams.append('q', query);
url.searchParams.append('n', numResults);
return fetch(url.toString(), {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
}).then(response => response.json());
});
// Wait for all requests in the batch to complete
const batchResults = await Promise.all(batchPromises);
results.push(...batchResults);
// Rate limiting - pause between batches to avoid API throttling
if (i + batchSize < queries.length) {
await new Promise(resolve => setTimeout(resolve, delayBetweenBatches));
}
}
return results;
}
// Example usage
const queries = [
"How to use OAuth with Rememberizer",
"Vector database configuration options",
"Best practices for semantic search",
// Add more queries as needed
];
batchSearchDocuments(queries, 3, 5)
.then(results => console.log(results))
.catch(error => console.error('Error in batch search:', error));
||CODE_BLOCK||
{% endtab %}
{% tab title="Ruby" %}
||CODE_BLOCK||ruby
require 'net/http'
require 'uri'
require 'json'
require 'concurrent'
# Perform batch searches with multiple queries
#
# @param queries [Array<String>] List of search query strings
# @param num_results [Integer] Number of results to return per query
# @param batch_size [Integer] Number of queries to process in parallel
# @param delay_between_batches [Float] Seconds to wait between batches
# @return [Array] List of search results for each query
def batch_search_documents(queries, num_results = 5, batch_size = 10, delay_between_batches = 1.0)
results = []
# Process queries in batches
queries.each_slice(batch_size).with_index do |batch, batch_index|
# Create a thread pool for concurrent requests
pool = Concurrent::FixedThreadPool.new(batch_size)
futures = []
batch.each do |query|
futures << Concurrent::Future.execute(executor: pool) do
uri = URI('https://api.rememberizer.ai/api/v1/documents/search/')
params = {
q: query,
n: num_results
}
uri.query = URI.encode_www_form(params)
request = Net::HTTP::Get.new(uri)
request['Authorization'] = 'Bearer YOUR_JWT_TOKEN'
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
response = http.request(request)
JSON.parse(response.body)
end
end
# Collect results from all threads
batch_results = futures.map(&:value)
results.concat(batch_results)
# Rate limiting - pause between batches to avoid API throttling
if batch_index < (queries.length / batch_size.to_f).ceil - 1
sleep(delay_between_batches)
end
end
pool.shutdown
results
end
# Example usage
queries = [
"How to use OAuth with Rememberizer",
"Vector database configuration options",
"Best practices for semantic search",
# Add more queries as needed
]
results = batch_search_documents(queries, 3, 5)
puts results
||CODE_BLOCK||
{% endtab %}
{% endtabs %}
### Performance Considerations
When implementing batch operations, consider these best practices:
1. **Optimal Batch Size**: Start with batch sizes of 5-10 queries and adjust based on your application's performance characteristics.
2. **Rate Limiting**: Include delays between batches to prevent API throttling. A good starting point is 1 second between batches.
3. **Error Handling**: Implement robust error handling to manage failed requests within batches.
4. **Resource Management**: Monitor client-side resource usage, particularly with large batch sizes, to prevent excessive memory consumption.
5. **Response Processing**: Process batch results asynchronously when possible to improve user experience.
For high-volume applications, consider implementing a queue system to manage large numbers of search requests efficiently.
This endpoint provides powerful semantic search capabilities across your entire knowledge base. It uses vector embeddings to find content based on meaning rather than exact keyword matches.
==> developer/api-docs/retrieve-document-contents.md <==
# Retrieve document contents
{% swagger src="../../.gitbook/assets/rememberizer_openapi.yml" path="/documents/{document_id}/contents/" method="get" %}
[rememberizer_openapi.yml](../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
"https://api.rememberizer.ai/api/v1/documents/12345/contents/?start_chunk=0&end_chunk=20" \
-H "Authorization: Bearer YOUR_JWT_TOKEN"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and `12345` with an actual document ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getDocumentContents = async (documentId, startChunk = 0, endChunk = 20) => {
const url = new URL(`https://api.rememberizer.ai/api/v1/documents/${documentId}/contents/`);
url.searchParams.append('start_chunk', startChunk);
url.searchParams.append('end_chunk', endChunk);
const response = await fetch(url.toString(), {
method: 'GET',
headers: {
'Authorization': 'Bearer YOUR_JWT_TOKEN'
}
});
const data = await response.json();
console.log(data);
// If there are more chunks, you can fetch them
// (simplistic example: totalChunks is a count your application would need to track itself)
if (data.end_chunk < totalChunks) {
// Fetch the next set of chunks
await getDocumentContents(documentId, data.end_chunk, data.end_chunk + 20);
}
};
getDocumentContents(12345);
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and `12345` with an actual document ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def get_document_contents(document_id, start_chunk=0, end_chunk=20):
headers = {
"Authorization": "Bearer YOUR_JWT_TOKEN"
}
params = {
"start_chunk": start_chunk,
"end_chunk": end_chunk
}
response = requests.get(
f"https://api.rememberizer.ai/api/v1/documents/{document_id}/contents/",
headers=headers,
params=params
)
data = response.json()
print(data)
# If there are more chunks, you can fetch them
# This is a simplistic example - you might want to implement a proper recursion check
if 'end_chunk' in data and data['end_chunk'] < total_chunks:
get_document_contents(document_id, data['end_chunk'], data['end_chunk'] + 20)
get_document_contents(12345)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_JWT_TOKEN` with your actual JWT token and `12345` with an actual document ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| document_id | integer | **Required.** The ID of the document to retrieve contents for. |
## Query Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| start_chunk | integer | The starting chunk index. Default is 0. |
| end_chunk | integer | The ending chunk index. Default is start_chunk + 20. |
## Response Format
||CODE_BLOCK||json
{
"content": "The full text content of the document chunks...",
"end_chunk": 20
}
||CODE_BLOCK||
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 404 | Document not found |
| 500 | Internal server error |
## Pagination for Large Documents
For large documents, the content is split into chunks. You can retrieve the full document by making multiple requests:
1. Make an initial request with `start_chunk=0`
2. Use the returned `end_chunk` value as the `start_chunk` for the next request
3. Continue until you have retrieved all chunks
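As a minimal sketch of this loop (assuming Python and the `requests` library, with the same `YOUR_JWT_TOKEN` placeholder used above; the stop condition of "no further content" is an assumption):
||CODE_BLOCK||python
import requests

def fetch_full_document(document_id, chunk_size=20):
    """Retrieve a document's full text by paging through its chunks."""
    headers = {"Authorization": "Bearer YOUR_JWT_TOKEN"}
    content_parts = []
    start_chunk = 0
    while True:
        response = requests.get(
            f"https://api.rememberizer.ai/api/v1/documents/{document_id}/contents/",
            headers=headers,
            params={"start_chunk": start_chunk, "end_chunk": start_chunk + chunk_size},
        )
        response.raise_for_status()
        data = response.json()
        # Stop when no further content is returned or the cursor stops advancing.
        if not data.get("content") or data.get("end_chunk", start_chunk) <= start_chunk:
            break
        content_parts.append(data["content"])
        start_chunk = data["end_chunk"]
    return "".join(content_parts)

full_text = fetch_full_document(12345)
||CODE_BLOCK||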
This endpoint returns the raw text content of a document, allowing you to access the full information for detailed processing or analysis.
==> developer/api-docs/vector-store/get-a-list-of-documents-in-a-vector-store.md <==
# Get a list of documents in a Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents" method="get" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents \
-H "x-api-key: YOUR_API_KEY"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getVectorStoreDocuments = async (vectorStoreId) => {
const response = await fetch(`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents`, {
method: 'GET',
headers: {
'x-api-key': 'YOUR_API_KEY'
}
});
const data = await response.json();
console.log(data);
};
getVectorStoreDocuments('vs_abc123');
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def get_vector_store_documents(vector_store_id):
headers = {
"x-api-key": "YOUR_API_KEY"
}
response = requests.get(
f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents",
headers=headers
)
data = response.json()
print(data)
get_vector_store_documents('vs_abc123')
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| vector-store-id | string | **Required.** The ID of the vector store to list documents from. |
## Response Format
||CODE_BLOCK||json
[
{
"id": 1234,
"name": "Product Manual.pdf",
"type": "application/pdf",
"vector_store": "vs_abc123",
"size": 250000,
"status": "indexed",
"processing_status": "completed",
"indexed_on": "2023-06-15T10:30:00Z",
"status_error_message": null,
"created": "2023-06-15T10:15:00Z",
"modified": "2023-06-15T10:30:00Z"
},
{
"id": 1235,
"name": "Technical Specifications.docx",
"type": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
"vector_store": "vs_abc123",
"size": 125000,
"status": "indexed",
"processing_status": "completed",
"indexed_on": "2023-06-15T11:45:00Z",
"status_error_message": null,
"created": "2023-06-15T11:30:00Z",
"modified": "2023-06-15T11:45:00Z"
}
]
||CODE_BLOCK||
## Authentication
This endpoint requires authentication using an API key in the `x-api-key` header.
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 401 | Unauthorized - Invalid or missing API key |
| 404 | Not Found - Vector Store not found |
| 500 | Internal Server Error |
This endpoint retrieves a list of all documents stored in the specified vector store. It provides metadata about each document, including the document's processing status, size, and indexed timestamp. This information is useful for monitoring your vector store's contents and checking document processing status.
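For example, a short sketch (assuming Python and the `requests` library) that uses this endpoint to summarize processing status across a vector store might look like this:
||CODE_BLOCK||python
import requests
from collections import Counter

def summarize_document_statuses(vector_store_id):
    """Count documents in each status for a vector store."""
    headers = {"x-api-key": "YOUR_API_KEY"}
    response = requests.get(
        f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents",
        headers=headers,
    )
    response.raise_for_status()
    documents = response.json()  # the endpoint returns a JSON array of documents
    status_counts = Counter(doc.get("status", "unknown") for doc in documents)
    for status, count in status_counts.items():
        print(f"{status}: {count} document(s)")
    return status_counts

summarize_document_statuses("vs_abc123")
||CODE_BLOCK||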
==> developer/api-docs/vector-store/get-the-information-of-a-document.md <==
# Get the information of a document
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/{document-id}" method="get" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/1234 \
-H "x-api-key: YOUR_API_KEY"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getDocumentInfo = async (vectorStoreId, documentId) => {
const response = await fetch(`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents/${documentId}`, {
method: 'GET',
headers: {
'x-api-key': 'YOUR_API_KEY'
}
});
const data = await response.json();
console.log(data);
};
getDocumentInfo('vs_abc123', 1234);
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def get_document_info(vector_store_id, document_id):
headers = {
"x-api-key": "YOUR_API_KEY"
}
response = requests.get(
f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/{document_id}",
headers=headers
)
data = response.json()
print(data)
get_document_info('vs_abc123', 1234)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| vector-store-id | string | **Required.** The ID of the vector store containing the document. |
| document-id | integer | **Required.** The ID of the document to retrieve. |
## Response Format
||CODE_BLOCK||json
{
"id": 1234,
"name": "Product Manual.pdf",
"type": "application/pdf",
"vector_store": "vs_abc123",
"size": 250000,
"status": "indexed",
"processing_status": "completed",
"indexed_on": "2023-06-15T10:30:00Z",
"status_error_message": null,
"created": "2023-06-15T10:15:00Z",
"modified": "2023-06-15T10:30:00Z"
}
||CODE_BLOCK||
## Authentication
This endpoint requires authentication using an API key in the `x-api-key` header.
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 401 | Unauthorized - Invalid or missing API key |
| 404 | Not Found - Vector Store or document not found |
| 500 | Internal Server Error |
This endpoint retrieves detailed information about a specific document in the vector store. It's useful for checking the processing status of individual documents and retrieving metadata like file type, size, and timestamps. This can be particularly helpful when troubleshooting issues with document processing or when you need to verify that a document was properly indexed.
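As an illustration, the following sketch (assuming Python and the `requests` library, and using the `processing_status` and `status_error_message` fields shown in the example response above) polls this endpoint until a document finishes processing or reports an error:
||CODE_BLOCK||python
import time
import requests

def wait_for_document(vector_store_id, document_id, poll_interval=10, timeout=300):
    """Poll a document's info until it is processed, errors out, or the timeout elapses."""
    headers = {"x-api-key": "YOUR_API_KEY"}
    deadline = time.time() + timeout
    while time.time() < deadline:
        response = requests.get(
            f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/{document_id}",
            headers=headers,
        )
        response.raise_for_status()
        doc = response.json()
        if doc.get("processing_status") == "completed":
            return doc
        if doc.get("status_error_message"):
            raise RuntimeError(f"Processing failed: {doc['status_error_message']}")
        time.sleep(poll_interval)
    raise TimeoutError("Document did not finish processing in time")

wait_for_document("vs_abc123", 1234)
||CODE_BLOCK||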
==> developer/api-docs/vector-store/README.md <==
# Vector Store APIs
The Vector Store APIs allow you to create, manage, and search vector stores in Rememberizer. Vector stores enable you to store and retrieve documents using semantic similarity search.
## Available Vector Store Endpoints
### Management Endpoints
- [Get vector store's information](get-vector-stores-information.md)
- [Get a list of documents in a Vector Store](get-a-list-of-documents-in-a-vector-store.md)
- [Get the information of a document](get-the-information-of-a-document.md)
### Document Operations
- [Add new text document to a Vector Store](add-new-text-document-to-a-vector-store.md)
- [Upload files to a Vector Store](upload-files-to-a-vector-store.md)
- [Update file's content in a Vector Store](update-files-content-in-a-vector-store.md)
- [Remove a document in Vector Store](remove-a-document-in-vector-store.md)
### Search Operations
- [Search for Vector Store documents by semantic similarity](search-for-vector-store-documents-by-semantic-similarity.md)
## Creating a Vector Store
To create a new Vector Store, use the following endpoint:
||CODE_BLOCK||
POST /api/v1/vector-stores/
||CODE_BLOCK||
### Request Body
||CODE_BLOCK||json
{
"name": "Store name",
"description": "Store description",
"embedding_model": "sentence-transformers/all-mpnet-base-v2",
"indexing_algorithm": "ivfflat",
"vector_dimension": 128,
"search_metric": "cosine_distance"
}
||CODE_BLOCK||
### Response
||CODE_BLOCK||json
{
"id": "store_id",
"name": "Vector Store Name",
"description": "Store description",
"created": "2023-05-01T00:00:00Z",
"modified": "2023-05-01T00:00:00Z"
}
||CODE_BLOCK||
## Vector Store Configurations
To retrieve available configurations for vector stores, use:
||CODE_BLOCK||
GET /api/v1/vector-stores/configs
||CODE_BLOCK||
This will return available embedding models, indexing algorithms, and search metrics that can be used when creating or configuring vector stores.
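As a minimal sketch (assuming Python, the `requests` library, and a JWT bearer token as described under Authentication below), you could fetch the available configurations and then create a store using the request body shown above:
||CODE_BLOCK||python
import requests

BASE_URL = "https://api.rememberizer.ai/api/v1"
headers = {"Authorization": "Bearer YOUR_JWT_TOKEN", "Content-Type": "application/json"}

# Inspect the available embedding models, indexing algorithms, and search metrics.
configs = requests.get(f"{BASE_URL}/vector-stores/configs", headers=headers).json()
print(configs)

# Create a vector store with the fields from the request body example above.
payload = {
    "name": "Product Documentation",
    "description": "Vector store for product docs",
    "embedding_model": "sentence-transformers/all-mpnet-base-v2",
    "indexing_algorithm": "ivfflat",
    "vector_dimension": 128,
    "search_metric": "cosine_distance",
}
store = requests.post(f"{BASE_URL}/vector-stores/", headers=headers, json=payload).json()
print(store["id"])
||CODE_BLOCK||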
## Authentication
All Vector Store endpoints require authentication using either:
- JWT token for management operations
- API key for document and search operations
==> developer/api-docs/vector-store/get-vector-stores-information.md <==
# Get vector store's information
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/me" method="get" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
https://api.rememberizer.ai/api/v1/vector-stores/me \
-H "x-api-key: YOUR_API_KEY"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const getVectorStoreInfo = async () => {
const response = await fetch('https://api.rememberizer.ai/api/v1/vector-stores/me', {
method: 'GET',
headers: {
'x-api-key': 'YOUR_API_KEY'
}
});
const data = await response.json();
console.log(data);
};
getVectorStoreInfo();
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def get_vector_store_info():
headers = {
"x-api-key": "YOUR_API_KEY"
}
response = requests.get(
"https://api.rememberizer.ai/api/v1/vector-stores/me",
headers=headers
)
data = response.json()
print(data)
get_vector_store_info()
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key.
{% endhint %}
{% endtab %}
{% endtabs %}
## Response Format
||CODE_BLOCK||json
{
"id": "vs_abc123",
"name": "My Vector Store",
"description": "A vector store for product documentation",
"embedding_model": "sentence-transformers/all-mpnet-base-v2",
"indexing_algorithm": "ivfflat",
"vector_dimension": 128,
"search_metric": "cosine_distance",
"created": "2023-06-01T10:30:00Z",
"modified": "2023-06-15T14:45:00Z"
}
||CODE_BLOCK||
## Authentication
This endpoint requires authentication using an API key in the `x-api-key` header.
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 401 | Unauthorized - Invalid or missing API key |
| 404 | Not Found - Vector Store not found |
| 500 | Internal Server Error |
This endpoint retrieves information about the vector store associated with the provided API key. It's useful for checking configuration details, including the embedding model, dimensionality, and search metric being used. This information can be valuable for optimizing search queries and understanding the vector store's capabilities.
==> developer/api-docs/vector-store/search-for-vector-store-documents-by-semantic-similarity.md <==
---
description: Search Vector Store documents with semantic similarity and batch operations
type: api
last_updated: 2025-04-03
---
# Search for Vector Store documents by semantic similarity
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/search" method="get" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X GET \
"https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/search?q=How%20to%20integrate%20our%20product%20with%20third-party%20systems&n=5&prev_chunks=1&next_chunks=1" \
-H "x-api-key: YOUR_API_KEY"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const searchVectorStore = async (vectorStoreId, query, numResults = 5, prevChunks = 1, nextChunks = 1) => {
const url = new URL(`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents/search`);
url.searchParams.append('q', query);
url.searchParams.append('n', numResults);
url.searchParams.append('prev_chunks', prevChunks);
url.searchParams.append('next_chunks', nextChunks);
const response = await fetch(url.toString(), {
method: 'GET',
headers: {
'x-api-key': 'YOUR_API_KEY'
}
});
const data = await response.json();
console.log(data);
};
searchVectorStore(
'vs_abc123',
'How to integrate our product with third-party systems',
5,
1,
1
);
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def search_vector_store(vector_store_id, query, num_results=5, prev_chunks=1, next_chunks=1):
headers = {
"x-api-key": "YOUR_API_KEY"
}
params = {
"q": query,
"n": num_results,
"prev_chunks": prev_chunks,
"next_chunks": next_chunks
}
response = requests.get(
f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/search",
headers=headers,
params=params
)
data = response.json()
print(data)
search_vector_store(
'vs_abc123',
'How to integrate our product with third-party systems',
5,
1,
1
)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% tab title="Ruby" %}
||CODE_BLOCK||ruby
require 'net/http'
require 'uri'
require 'json'
def search_vector_store(vector_store_id, query, num_results=5, prev_chunks=1, next_chunks=1)
uri = URI("https://api.rememberizer.ai/api/v1/vector-stores/#{vector_store_id}/documents/search")
params = {
q: query,
n: num_results,
prev_chunks: prev_chunks,
next_chunks: next_chunks
}
uri.query = URI.encode_www_form(params)
request = Net::HTTP::Get.new(uri)
request['x-api-key'] = 'YOUR_API_KEY'
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
response = http.request(request)
data = JSON.parse(response.body)
puts data
end
search_vector_store(
'vs_abc123',
'How to integrate our product with third-party systems',
5,
1,
1
)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| vector-store-id | string | **Required.** The ID of the vector store to search in. |
## Query Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| q | string | **Required.** The search query text. |
| n | integer | Number of results to return. Default: 10. |
| t | number | Matching threshold. Default: 0.7. |
| prev_chunks | integer | Number of chunks before the matched chunk to include. Default: 0. |
| next_chunks | integer | Number of chunks after the matched chunk to include. Default: 0. |
## Response Format
||CODE_BLOCK||json
{
"vector_store": {
"id": "vs_abc123",
"name": "Product Documentation"
},
"matched_chunks": [
{
"document": {
"id": 1234,
"name": "Integration Guide.pdf",
"type": "application/pdf",
"size": 250000,
"indexed_on": "2023-06-15T10:30:00Z",
"vector_store": "vs_abc123",
"created": "2023-06-15T10:15:00Z",
"modified": "2023-06-15T10:30:00Z"
},
"matched_content": "Our product offers several integration options for third-party systems. The primary method is through our RESTful API, which supports OAuth2 authentication. Additionally, you can use our SDK available in Python, JavaScript, and Java.",
"distance": 0.123
},
// ... more matched chunks
]
}
||CODE_BLOCK||
## Authentication
This endpoint requires authentication using an API key in the `x-api-key` header.
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 400 | Bad Request - Missing required parameters or invalid format |
| 401 | Unauthorized - Invalid or missing API key |
| 404 | Not Found - Vector Store not found |
| 500 | Internal Server Error |
## Search Optimization Tips
### Context Windows
Use the `prev_chunks` and `next_chunks` parameters to control how much context is included with each match:
- Set both to 0 for precise matches without context
- Set both to 1-2 for matches with minimal context
- Set both to 3-5 for matches with substantial context
### Matching Threshold
The `t` parameter controls how strictly matches are filtered:
- Higher values (e.g., 0.9) return only very close matches
- Lower values (e.g., 0.5) return more matches with greater variety
- The default (0.7) provides a balanced approach
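For example, a small sketch (assuming Python and the `requests` library) comparing a strict and a relaxed threshold on the same query:
||CODE_BLOCK||python
import requests

def search(vector_store_id, query, threshold, context_chunks=1):
    """Run a semantic search with a given matching threshold and context window."""
    response = requests.get(
        f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/search",
        headers={"x-api-key": "YOUR_API_KEY"},
        params={
            "q": query,
            "n": 5,
            "t": threshold,
            "prev_chunks": context_chunks,
            "next_chunks": context_chunks,
        },
    )
    response.raise_for_status()
    return response.json().get("matched_chunks", [])

query = "How to integrate our product with third-party systems"
strict = search("vs_abc123", query, threshold=0.9)   # only very close matches
relaxed = search("vs_abc123", query, threshold=0.5)  # more matches, greater variety
print(len(strict), "strict matches vs", len(relaxed), "relaxed matches")
||CODE_BLOCK||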
## Batch Operations
For high-throughput applications, you can batch search requests against a vector store on the client side. The patterns below issue multiple queries in parallel while pacing requests to respect rate limits.
### Batch Search Implementation
{% tabs %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import time
import concurrent.futures
def batch_search_vector_store(vector_store_id, queries, num_results=5, batch_size=10):
"""
Perform batch searches against a vector store
Args:
vector_store_id: ID of the vector store to search
queries: List of search query strings
num_results: Number of results per query
batch_size: Number of parallel requests
Returns:
List of search results
"""
headers = {
"x-api-key": "YOUR_API_KEY"
}
results = []
# Process in batches to avoid overwhelming the API
for i in range(0, len(queries), batch_size):
batch_queries = queries[i:i+batch_size]
with concurrent.futures.ThreadPoolExecutor(max_workers=batch_size) as executor:
futures = []
for query in batch_queries:
params = {
"q": query,
"n": num_results,
"prev_chunks": 1,
"next_chunks": 1
}
# Submit the request to the thread pool
future = executor.submit(
requests.get,
f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/search",
headers=headers,
params=params
)
futures.append(future)
# Collect results from all futures
for future in futures:
response = future.result()
if response.status_code == 200:
results.append(response.json())
else:
results.append({"error": f"Failed with status code: {response.status_code}"})
# Add a delay between batches to avoid rate limiting
if i + batch_size < len(queries):
time.sleep(1)
return results
# Example usage
queries = [
"Integration with REST APIs",
"Authentication protocols",
"How to deploy to production",
"Performance optimization techniques",
"Error handling best practices"
]
search_results = batch_search_vector_store("vs_abc123", queries, num_results=3, batch_size=5)
||CODE_BLOCK||
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
/**
* Perform batch searches against a vector store
*
* @param {string} vectorStoreId - ID of the vector store
* @param {string[]} queries - List of search queries
* @param {Object} options - Configuration options
* @returns {Promise<Array>} - List of search results
*/
async function batchSearchVectorStore(vectorStoreId, queries, options = {}) {
const {
numResults = 5,
batchSize = 10,
delayBetweenBatches = 1000,
prevChunks = 1,
nextChunks = 1,
distanceThreshold = 0.7
} = options;
const results = [];
const apiKey = 'YOUR_API_KEY';
// Process in batches to manage API load
for (let i = 0; i < queries.length; i += batchSize) {
const batchQueries = queries.slice(i, i + batchSize);
// Create promise array for parallel requests
const batchPromises = batchQueries.map(query => {
const url = new URL(`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents/search`);
url.searchParams.append('q', query);
url.searchParams.append('n', numResults);
url.searchParams.append('prev_chunks', prevChunks);
url.searchParams.append('next_chunks', nextChunks);
url.searchParams.append('t', distanceThreshold);
return fetch(url.toString(), {
method: 'GET',
headers: {
'x-api-key': apiKey
}
})
.then(response => {
if (response.ok) {
return response.json();
} else {
return { error: `Failed with status: ${response.status}` };
}
})
.catch(error => {
return { error: error.message };
});
});
// Wait for all requests in batch to complete
const batchResults = await Promise.all(batchPromises);
results.push(...batchResults);
// Add delay between batches to avoid rate limiting
if (i + batchSize < queries.length) {
await new Promise(resolve => setTimeout(resolve, delayBetweenBatches));
}
}
return results;
}
// Example usage
const queries = [
"Integration with REST APIs",
"Authentication protocols",
"How to deploy to production",
"Performance optimization techniques",
"Error handling best practices"
];
const options = {
numResults: 3,
batchSize: 5,
delayBetweenBatches: 1000,
prevChunks: 1,
nextChunks: 1
};
batchSearchVectorStore("vs_abc123", queries, options)
.then(results => console.log(results))
.catch(error => console.error("Batch search failed:", error));
||CODE_BLOCK||
{% endtab %}
{% tab title="Ruby" %}
||CODE_BLOCK||ruby
require 'net/http'
require 'uri'
require 'json'
require 'concurrent'
# Perform batch searches against a vector store
#
# @param vector_store_id [String] ID of the vector store
# @param queries [Array<String>] List of search queries
# @param num_results [Integer] Number of results per query
# @param batch_size [Integer] Number of parallel requests
# @param delay_between_batches [Float] Seconds to wait between batches
# @return [Array] Search results for each query
def batch_search_vector_store(vector_store_id, queries, num_results: 5, batch_size: 10, delay_between_batches: 1.0)
results = []
api_key = 'YOUR_API_KEY'
# Process in batches
queries.each_slice(batch_size).with_index do |batch_queries, batch_index|
# Create a thread pool for concurrent execution
pool = Concurrent::FixedThreadPool.new(batch_size)
futures = []
batch_queries.each do |query|
# Submit each request to thread pool
futures << Concurrent::Future.execute(executor: pool) do
uri = URI("https://api.rememberizer.ai/api/v1/vector-stores/#{vector_store_id}/documents/search")
params = {
q: query,
n: num_results,
prev_chunks: 1,
next_chunks: 1
}
uri.query = URI.encode_www_form(params)
request = Net::HTTP::Get.new(uri)
request['x-api-key'] = api_key
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
begin
response = http.request(request)
if response.code.to_i == 200
JSON.parse(response.body)
else
{ "error" => "Failed with status code: #{response.code}" }
end
rescue => e
{ "error" => e.message }
end
end
end
# Collect results from all futures
batch_results = futures.map(&:value)
results.concat(batch_results)
    # Shut down this batch's thread pool before moving on to the next batch
    pool.shutdown
    pool.wait_for_termination
    # Add delay between batches
    if batch_index < (queries.length / batch_size.to_f).ceil - 1
      sleep(delay_between_batches)
    end
  end
  results
end
# Example usage
queries = [
"Integration with REST APIs",
"Authentication protocols",
"How to deploy to production",
"Performance optimization techniques",
"Error handling best practices"
]
results = batch_search_vector_store(
"vs_abc123",
queries,
num_results: 3,
batch_size: 5
)
puts results
||CODE_BLOCK||
{% endtab %}
{% endtabs %}
### Performance Optimization for Batch Operations
When implementing batch operations for vector store searches, consider these best practices:
1. **Optimal Batch Sizing**: For most applications, processing 5-10 queries in parallel provides a good balance between throughput and resource usage.
2. **Rate Limiting Awareness**: Include delay mechanisms between batches (typically 1-2 seconds) to avoid hitting API rate limits.
3. **Error Handling**: Implement robust error handling for individual queries that may fail within a batch.
4. **Connection Management**: For high-volume applications, implement connection pooling to reduce overhead.
5. **Timeout Configuration**: Set appropriate timeouts for each request to prevent long-running queries from blocking the entire batch.
6. **Result Processing**: Consider processing results asynchronously as they become available rather than waiting for all results.
7. **Monitoring**: Track performance metrics like average response time and success rates to identify optimization opportunities.
For production applications with very high query volumes, consider implementing a queue system with worker processes to manage large batches efficiently.
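One possible shape for such a queue is sketched below, assuming Python's standard `queue` and `threading` modules and the search endpoint described above; production systems often use a dedicated job queue instead.
||CODE_BLOCK||python
import queue
import threading
import requests

def worker(vector_store_id, task_queue, results):
    """Pull queries off the queue and run them against the search endpoint."""
    session = requests.Session()  # reuse connections across requests
    session.headers.update({"x-api-key": "YOUR_API_KEY"})
    while True:
        query = task_queue.get()
        if query is None:  # sentinel value: no more work for this worker
            task_queue.task_done()
            break
        response = session.get(
            f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/search",
            params={"q": query, "n": 5},
        )
        results.append({"query": query, "result": response.json()})
        task_queue.task_done()

task_queue, results = queue.Queue(), []
workers = [threading.Thread(target=worker, args=("vs_abc123", task_queue, results)) for _ in range(5)]
for w in workers:
    w.start()
for q in ["Integration with REST APIs", "Authentication protocols", "Error handling best practices"]:
    task_queue.put(q)
for _ in workers:
    task_queue.put(None)  # one sentinel per worker
task_queue.join()
print(f"Processed {len(results)} queries")
||CODE_BLOCK||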
This endpoint allows you to search your vector store using semantic similarity. It returns documents that are conceptually related to your query, even if they don't contain the exact keywords. This makes it particularly powerful for natural language queries and question answering.
==> developer/api-docs/vector-store/add-new-text-document-to-a-vector-store.md <==
# Add new text document to a Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/create" method="post" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/create \
-H "x-api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "Product Overview",
"text": "Our product is an innovative solution for managing vector embeddings. It provides seamless integration with your existing systems and offers powerful semantic search capabilities."
}'
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const addTextDocument = async (vectorStoreId, name, text) => {
const response = await fetch(`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents/create`, {
method: 'POST',
headers: {
'x-api-key': 'YOUR_API_KEY',
'Content-Type': 'application/json'
},
body: JSON.stringify({
name: name,
text: text
})
});
const data = await response.json();
console.log(data);
};
addTextDocument(
'vs_abc123',
'Product Overview',
'Our product is an innovative solution for managing vector embeddings. It provides seamless integration with your existing systems and offers powerful semantic search capabilities.'
);
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
def add_text_document(vector_store_id, name, text):
headers = {
"x-api-key": "YOUR_API_KEY",
"Content-Type": "application/json"
}
payload = {
"name": name,
"text": text
}
response = requests.post(
f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/create",
headers=headers,
data=json.dumps(payload)
)
data = response.json()
print(data)
add_text_document(
'vs_abc123',
'Product Overview',
'Our product is an innovative solution for managing vector embeddings. It provides seamless integration with your existing systems and offers powerful semantic search capabilities.'
)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| vector-store-id | string | **Required.** The ID of the vector store to add the document to. |
## Request Body
||CODE_BLOCK||json
{
"name": "Product Overview",
"text": "Our product is an innovative solution for managing vector embeddings. It provides seamless integration with your existing systems and offers powerful semantic search capabilities."
}
||CODE_BLOCK||
| Parameter | Type | Description |
|-----------|------|-------------|
| name | string | **Required.** The name of the document. |
| text | string | **Required.** The text content of the document. |
## Response Format
||CODE_BLOCK||json
{
"id": 1234,
"name": "Product Overview",
"type": "text/plain",
"vector_store": "vs_abc123",
"size": 173,
"status": "processing",
"processing_status": "queued",
"indexed_on": null,
"status_error_message": null,
"created": "2023-06-15T10:15:00Z",
"modified": "2023-06-15T10:15:00Z"
}
||CODE_BLOCK||
## Authentication
This endpoint requires authentication using an API key in the `x-api-key` header.
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 400 | Bad Request - Missing required fields or invalid format |
| 401 | Unauthorized - Invalid or missing API key |
| 404 | Not Found - Vector Store not found |
| 500 | Internal Server Error |
This endpoint allows you to add text content directly to your vector store. It's particularly useful for storing information that might not exist in file format, such as product descriptions, knowledge base articles, or custom content. The text will be automatically processed into vector embeddings, making it searchable using semantic similarity.
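For instance, a short sketch (assuming Python and the `requests` library) that loads several knowledge-base articles into a vector store in a loop:
||CODE_BLOCK||python
import requests

def add_articles(vector_store_id, articles):
    """Add a list of (name, text) pairs to a vector store as text documents."""
    headers = {"x-api-key": "YOUR_API_KEY", "Content-Type": "application/json"}
    created = []
    for name, text in articles:
        response = requests.post(
            f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/create",
            headers=headers,
            json={"name": name, "text": text},
        )
        response.raise_for_status()
        created.append(response.json())
        print(f"Queued '{name}' with document id {created[-1]['id']}")
    return created

add_articles("vs_abc123", [
    ("Product Overview", "Our product is an innovative solution for managing vector embeddings."),
    ("FAQ", "Frequently asked questions about setup, billing, and integrations."),
])
||CODE_BLOCK||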
==> developer/api-docs/vector-store/remove-a-document-in-vector-store.md <==
# Remove a document in Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/{document-id}/" method="delete" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X DELETE \
https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/1234/ \
-H "x-api-key: YOUR_API_KEY"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const deleteDocument = async (vectorStoreId, documentId) => {
const response = await fetch(`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents/${documentId}/`, {
method: 'DELETE',
headers: {
'x-api-key': 'YOUR_API_KEY'
}
});
if (response.status === 204) {
console.log("Document deleted successfully");
} else {
console.error("Failed to delete document");
}
};
deleteDocument('vs_abc123', 1234);
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def delete_document(vector_store_id, document_id):
headers = {
"x-api-key": "YOUR_API_KEY"
}
response = requests.delete(
f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/{document_id}/",
headers=headers
)
if response.status_code == 204:
print("Document deleted successfully")
else:
print(f"Failed to delete document: {response.text}")
delete_document('vs_abc123', 1234)
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| vector-store-id | string | **Required.** The ID of the vector store containing the document. |
| document-id | integer | **Required.** The ID of the document to delete. |
## Response
A successful request returns a 204 No Content status code with no response body.
## Authentication
This endpoint requires authentication using an API key in the `x-api-key` header.
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 401 | Unauthorized - Invalid or missing API key |
| 404 | Not Found - Vector Store or document not found |
| 500 | Internal Server Error |
This endpoint allows you to remove a document from your vector store. Once deleted, the document and its vector embeddings will no longer be available for search operations. This is useful for removing outdated, irrelevant, or sensitive content from your knowledge base.
{% hint style="warning" %}
Warning: Document deletion is permanent and cannot be undone. Make sure you have a backup of important documents before deleting them.
{% endhint %}
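In keeping with the warning above, here is a minimal sketch (assuming Python, the `requests` library, and the document-info endpoint described earlier) that saves a document's metadata locally before deleting it. Note that this only preserves metadata; keep the original file content elsewhere if you may need it again.
||CODE_BLOCK||python
import json
import requests

BASE_URL = "https://api.rememberizer.ai/api/v1"
HEADERS = {"x-api-key": "YOUR_API_KEY"}

def delete_with_backup(vector_store_id, document_id, backup_path):
    """Save the document's metadata locally, then delete it from the vector store."""
    info = requests.get(
        f"{BASE_URL}/vector-stores/{vector_store_id}/documents/{document_id}",
        headers=HEADERS,
    )
    info.raise_for_status()
    with open(backup_path, "w") as f:
        json.dump(info.json(), f, indent=2)
    response = requests.delete(
        f"{BASE_URL}/vector-stores/{vector_store_id}/documents/{document_id}/",
        headers=HEADERS,
    )
    if response.status_code == 204:
        print(f"Deleted document {document_id}; metadata saved to {backup_path}")
    else:
        print(f"Failed to delete document: {response.text}")

delete_with_backup("vs_abc123", 1234, "document_1234_metadata.json")
||CODE_BLOCK||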
==> developer/api-docs/vector-store/update-files-content-in-a-vector-store.md <==
# Update file's content in a Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/{document-id}/" method="patch" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X PATCH \
https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/1234/ \
-H "x-api-key: YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"name": "Updated Product Overview"
}'
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const updateDocument = async (vectorStoreId, documentId, newName) => {
const response = await fetch(`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents/${documentId}/`, {
method: 'PATCH',
headers: {
'x-api-key': 'YOUR_API_KEY',
'Content-Type': 'application/json'
},
body: JSON.stringify({
name: newName
})
});
const data = await response.json();
console.log(data);
};
updateDocument('vs_abc123', 1234, 'Updated Product Overview');
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
import json
def update_document(vector_store_id, document_id, new_name):
headers = {
"x-api-key": "YOUR_API_KEY",
"Content-Type": "application/json"
}
payload = {
"name": new_name
}
response = requests.patch(
f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/{document_id}/",
headers=headers,
data=json.dumps(payload)
)
data = response.json()
print(data)
update_document('vs_abc123', 1234, 'Updated Product Overview')
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and `1234` with the document ID.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| vector-store-id | string | **Required.** The ID of the vector store containing the document. |
| document-id | integer | **Required.** The ID of the document to update. |
## Request Body
||CODE_BLOCK||json
{
"name": "Updated Product Overview"
}
||CODE_BLOCK||
| Parameter | Type | Description |
|-----------|------|-------------|
| name | string | The new name for the document. |
## Response Format
||CODE_BLOCK||json
{
"id": 1234,
"name": "Updated Product Overview",
"type": "text/plain",
"vector_store": "vs_abc123",
"size": 173,
"status": "indexed",
"processing_status": "completed",
"indexed_on": "2023-06-15T10:30:00Z",
"status_error_message": null,
"created": "2023-06-15T10:15:00Z",
"modified": "2023-06-15T11:45:00Z"
}
||CODE_BLOCK||
## Authentication
This endpoint requires authentication using an API key in the `x-api-key` header.
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 400 | Bad Request - Invalid request format |
| 401 | Unauthorized - Invalid or missing API key |
| 404 | Not Found - Vector Store or document not found |
| 500 | Internal Server Error |
This endpoint allows you to update the metadata of a document in your vector store. Currently, you can only update the document's name. This is useful for improving document organization and discoverability without needing to re-upload the document.
{% hint style="info" %}
Note: This endpoint only updates the document's metadata, not its content. To update the content, you need to delete the existing document and upload a new one.
{% endhint %}
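Following the note above, replacing a document's content is a two-step workflow. The sketch below (assuming Python, the `requests` library, and the delete and upload endpoints documented on the neighboring pages) deletes the old document and uploads a replacement; the new file will be indexed as a new document with its own ID.
||CODE_BLOCK||python
import requests

BASE_URL = "https://api.rememberizer.ai/api/v1"
HEADERS = {"x-api-key": "YOUR_API_KEY"}

def replace_document(vector_store_id, document_id, new_file_path):
    """Delete the existing document, then upload a new file in its place."""
    delete_resp = requests.delete(
        f"{BASE_URL}/vector-stores/{vector_store_id}/documents/{document_id}/",
        headers=HEADERS,
    )
    if delete_resp.status_code != 204:
        raise RuntimeError(f"Delete failed: {delete_resp.text}")
    with open(new_file_path, "rb") as f:
        upload_resp = requests.post(
            f"{BASE_URL}/vector-stores/{vector_store_id}/documents/upload",
            headers=HEADERS,
            files=[("files", (new_file_path.split("/")[-1], f))],
        )
    upload_resp.raise_for_status()
    return upload_resp.json()

replace_document("vs_abc123", 1234, "/path/to/updated-document.pdf")
||CODE_BLOCK||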
==> developer/api-docs/vector-store/upload-files-to-a-vector-store.md <==
---
description: Upload file content to Vector Store with batch operations
type: api
last_updated: 2025-04-03
---
# Upload files to a Vector Store
{% swagger src="../../../.gitbook/assets/rememberizer_openapi.yml" path="/vector-stores/{vector-store-id}/documents/upload" method="post" %}
[rememberizer_openapi.yml](../../../.gitbook/assets/rememberizer_openapi.yml)
{% endswagger %}
## Example Requests
{% tabs %}
{% tab title="cURL" %}
||CODE_BLOCK||bash
curl -X POST \
https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/upload \
-H "x-api-key: YOUR_API_KEY" \
-F "files=@/path/to/document1.pdf" \
-F "files=@/path/to/document2.docx"
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and provide the paths to your local files.
{% endhint %}
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
const uploadFiles = async (vectorStoreId, files) => {
const formData = new FormData();
// Add multiple files to the form data
for (const file of files) {
formData.append('files', file);
}
const response = await fetch(`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents/upload`, {
method: 'POST',
headers: {
'x-api-key': 'YOUR_API_KEY'
// Note: Do not set Content-Type header, it will be set automatically with the correct boundary
},
body: formData
});
const data = await response.json();
console.log(data);
};
// Example usage with file input element
const fileInput = document.getElementById('fileInput');
uploadFiles('vs_abc123', fileInput.files);
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key and `vs_abc123` with your Vector Store ID.
{% endhint %}
{% endtab %}
{% tab title="Python" %}
||CODE_BLOCK||python
import requests
def upload_files(vector_store_id, file_paths):
    headers = {
        "x-api-key": "YOUR_API_KEY"
    }
    # Open each file for upload; the handles are closed after the request completes
    files = [('files', (file_path.split('/')[-1], open(file_path, 'rb'))) for file_path in file_paths]
    try:
        response = requests.post(
            f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/upload",
            headers=headers,
            files=files
        )
        data = response.json()
        print(data)
    finally:
        for _, (_, file_obj) in files:
            file_obj.close()
upload_files('vs_abc123', ['/path/to/document1.pdf', '/path/to/document2.docx'])
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and provide the paths to your local files.
{% endhint %}
{% endtab %}
{% tab title="Ruby" %}
||CODE_BLOCK||ruby
require 'net/http'
require 'uri'
require 'json'
def upload_files(vector_store_id, file_paths)
uri = URI("https://api.rememberizer.ai/api/v1/vector-stores/#{vector_store_id}/documents/upload")
# Create a new HTTP object
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
# Create a multipart-form request
request = Net::HTTP::Post.new(uri)
request['x-api-key'] = 'YOUR_API_KEY'
# Create a multipart boundary
boundary = "RubyFormBoundary#{rand(1000000)}"
request['Content-Type'] = "multipart/form-data; boundary=#{boundary}"
# Build the request body
body = []
file_paths.each do |file_path|
file_name = File.basename(file_path)
file_content = File.read(file_path, mode: 'rb')
body << "--#{boundary}\r\n"
body << "Content-Disposition: form-data; name=\"files\"; filename=\"#{file_name}\"\r\n"
body << "Content-Type: #{get_content_type(file_name)}\r\n\r\n"
body << file_content
body << "\r\n"
end
body << "--#{boundary}--\r\n"
request.body = body.join
# Send the request
response = http.request(request)
# Parse and return the response
JSON.parse(response.body)
end
# Helper method to determine content type
def get_content_type(filename)
ext = File.extname(filename).downcase
case ext
when '.pdf' then 'application/pdf'
when '.doc' then 'application/msword'
when '.docx' then 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
when '.txt' then 'text/plain'
when '.md' then 'text/markdown'
when '.json' then 'application/json'
else 'application/octet-stream'
end
end
# Example usage
result = upload_files('vs_abc123', ['/path/to/document1.pdf', '/path/to/document2.docx'])
puts result
||CODE_BLOCK||
{% hint style="info" %}
Replace `YOUR_API_KEY` with your actual Vector Store API key, `vs_abc123` with your Vector Store ID, and provide the paths to your local files.
{% endhint %}
{% endtab %}
{% endtabs %}
## Path Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| vector-store-id | string | **Required.** The ID of the vector store to upload files to. |
## Request Body
This endpoint accepts a `multipart/form-data` request with one or more files in the `files` field.
## Response Format
||CODE_BLOCK||json
{
"documents": [
{
"id": 1234,
"name": "document1.pdf",
"type": "application/pdf",
"size": 250000,
"status": "processing",
"created": "2023-06-15T10:15:00Z",
"vector_store": "vs_abc123"
},
{
"id": 1235,
"name": "document2.docx",
"type": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
"size": 180000,
"status": "processing",
"created": "2023-06-15T10:15:00Z",
"vector_store": "vs_abc123"
}
],
"errors": []
}
||CODE_BLOCK||
If some files fail to upload, they will be listed in the `errors` array:
||CODE_BLOCK||json
{
"documents": [
{
"id": 1234,
"name": "document1.pdf",
"type": "application/pdf",
"size": 250000,
"status": "processing",
"created": "2023-06-15T10:15:00Z",
"vector_store": "vs_abc123"
}
],
"errors": [
{
"file": "document2.docx",
"error": "File format not supported"
}
]
}
||CODE_BLOCK||
## Authentication
This endpoint requires authentication using an API key in the `x-api-key` header.
## Supported File Formats
- PDF (`.pdf`)
- Microsoft Word (`.doc`, `.docx`)
- Microsoft Excel (`.xls`, `.xlsx`)
- Microsoft PowerPoint (`.ppt`, `.pptx`)
- Text files (`.txt`)
- Markdown (`.md`)
- JSON (`.json`)
- HTML (`.html`, `.htm`)
## File Size Limits
- Individual file size limit: 50MB
- Total request size limit: 100MB
- Maximum number of files per request: 20
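A small pre-flight check (a sketch assuming Python; the constants mirror the limits listed above) can enforce these limits client-side before calling the upload endpoint:
||CODE_BLOCK||python
from pathlib import Path

SUPPORTED_EXTENSIONS = {".pdf", ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx",
                        ".txt", ".md", ".json", ".html", ".htm"}
MAX_FILE_BYTES = 50 * 1024 * 1024      # 50MB per file
MAX_REQUEST_BYTES = 100 * 1024 * 1024  # 100MB per request
MAX_FILES_PER_REQUEST = 20

def validate_upload(file_paths):
    """Return the files that satisfy the documented upload limits, and report the rest."""
    valid, total_size = [], 0
    for path in map(Path, file_paths):
        size = path.stat().st_size
        if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
            print(f"Skipping {path.name}: unsupported format")
        elif size > MAX_FILE_BYTES:
            print(f"Skipping {path.name}: exceeds the 50MB per-file limit")
        elif len(valid) >= MAX_FILES_PER_REQUEST or total_size + size > MAX_REQUEST_BYTES:
            print(f"Deferring {path.name}: request would exceed batch limits")
        else:
            valid.append(path)
            total_size += size
    return valid

files_to_upload = validate_upload(["/path/to/document1.pdf", "/path/to/document2.docx"])
||CODE_BLOCK||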
## Error Responses
| Status Code | Description |
|-------------|-------------|
| 207 | Multi-Status - Some files were uploaded successfully, but others failed |
| 400 | Bad Request - No files provided or invalid request format |
| 401 | Unauthorized - Invalid or missing API key |
| 404 | Not Found - Vector Store not found |
| 413 | Payload Too Large - Files exceed size limit |
| 415 | Unsupported Media Type - File format not supported |
| 500 | Internal Server Error |
## Processing Status
Files are initially accepted with a status of `processing`. You can check the processing status of the documents using the [Get a List of Documents in a Vector Store](get-a-list-of-documents-in-a-vector-store.md) endpoint. The status will be one of:
- `done`: Document was successfully processed
- `error`: An error occurred during processing
- `processing`: Document is still being processed
Processing time depends on file size and complexity; it typically ranges from 30 seconds to 5 minutes per document.
## Batch Operations
For efficiently uploading multiple files to your Vector Store, Rememberizer supports batch operations. This approach helps optimize performance when dealing with large numbers of documents.
### Batch Upload Implementation
{% tabs %}
{% tab title="Python" %}
||CODE_BLOCK||python
import os
import requests
import time
import concurrent.futures
from pathlib import Path
def batch_upload_to_vector_store(vector_store_id, folder_path, batch_size=5, file_types=None):
"""
Upload all files from a directory to a Vector Store in batches
Args:
vector_store_id: ID of the vector store
folder_path: Path to folder containing files to upload
batch_size: Number of files to upload in each batch
file_types: Optional list of file extensions to filter by (e.g., ['.pdf', '.docx'])
Returns:
List of upload results
"""
api_key = "YOUR_API_KEY"
headers = {"x-api-key": api_key}
# Get list of files in directory
files = []
for entry in os.scandir(folder_path):
if entry.is_file():
file_path = Path(entry.path)
# Filter by file extension if specified
if file_types is None or file_path.suffix.lower() in file_types:
files.append(file_path)
print(f"Found {len(files)} files to upload")
results = []
# Process files in batches
for i in range(0, len(files), batch_size):
batch = files[i:i+batch_size]
print(f"Processing batch {i//batch_size + 1}/{(len(files) + batch_size - 1)//batch_size}: {len(batch)} files")
# Upload batch
upload_files = []
for file_path in batch:
upload_files.append(('files', (file_path.name, open(file_path, 'rb'))))
try:
response = requests.post(
f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/upload",
headers=headers,
files=upload_files
)
# Close all file handles
for _, (_, file_obj) in upload_files:
file_obj.close()
if response.status_code in (200, 201, 207):
batch_result = response.json()
results.append(batch_result)
print(f"Successfully uploaded batch - {len(batch_result.get('documents', []))} documents processed")
# Check for errors
if batch_result.get('errors') and len(batch_result['errors']) > 0:
print(f"Errors encountered: {len(batch_result['errors'])}")
for error in batch_result['errors']:
print(f"- {error['file']}: {error['error']}")
else:
print(f"Batch upload failed with status code {response.status_code}: {response.text}")
results.append({"error": f"Batch failed: {response.text}"})
except Exception as e:
print(f"Exception during batch upload: {str(e)}")
results.append({"error": str(e)})
# Close any remaining file handles in case of exception
for _, (_, file_obj) in upload_files:
try:
file_obj.close()
except:
pass
# Rate limiting - pause between batches
if i + batch_size < len(files):
print("Pausing before next batch...")
time.sleep(2)
return results
# Example usage
results = batch_upload_to_vector_store(
'vs_abc123',
'/path/to/documents/folder',
batch_size=5,
file_types=['.pdf', '.docx', '.txt']
)
||CODE_BLOCK||
{% endtab %}
{% tab title="JavaScript" %}
||CODE_BLOCK||javascript
/**
* Upload files to a Vector Store in batches
*
* @param {string} vectorStoreId - ID of the Vector Store
* @param {FileList|File[]} files - Files to upload
* @param {Object} options - Configuration options
* @returns {Promise<Array>} - List of upload results
*/
async function batchUploadToVectorStore(vectorStoreId, files, options = {}) {
const {
batchSize = 5,
delayBetweenBatches = 2000,
onProgress = null
} = options;
const apiKey = 'YOUR_API_KEY';
const results = [];
const fileList = Array.from(files);
const totalBatches = Math.ceil(fileList.length / batchSize);
console.log(`Preparing to upload ${fileList.length} files in ${totalBatches} batches`);
// Process files in batches
for (let i = 0; i < fileList.length; i += batchSize) {
const batch = fileList.slice(i, i + batchSize);
const batchNumber = Math.floor(i / batchSize) + 1;
console.log(`Processing batch ${batchNumber}/${totalBatches}: ${batch.length} files`);
if (onProgress) {
onProgress({
currentBatch: batchNumber,
totalBatches: totalBatches,
filesInBatch: batch.length,
totalFiles: fileList.length,
completedFiles: i
});
}
// Create FormData for this batch
const formData = new FormData();
batch.forEach(file => {
formData.append('files', file);
});
try {
const response = await fetch(
`https://api.rememberizer.ai/api/v1/vector-stores/${vectorStoreId}/documents/upload`,
{
method: 'POST',
headers: {
'x-api-key': apiKey
},
body: formData
}
);
if (response.ok) {
const batchResult = await response.json();
results.push(batchResult);
console.log(`Successfully uploaded batch - ${batchResult.documents?.length || 0} documents processed`);
// Check for errors
if (batchResult.errors && batchResult.errors.length > 0) {
console.warn(`Errors encountered: ${batchResult.errors.length}`);
batchResult.errors.forEach(error => {
console.warn(`- ${error.file}: ${error.error}`);
});
}
} else {
console.error(`Batch upload failed with status ${response.status}: ${await response.text()}`);
results.push({ error: `Batch failed with status: ${response.status}` });
}
} catch (error) {
console.error(`Exception during batch upload: ${error.message}`);
results.push({ error: error.message });
}
// Add delay between batches to avoid rate limiting
if (i + batchSize < fileList.length) {
console.log(`Pausing for ${delayBetweenBatches}ms before next batch...`);
await new Promise(resolve => setTimeout(resolve, delayBetweenBatches));
}
}
console.log(`Upload complete. Processed ${fileList.length} files.`);
return results;
}
// Example usage with file input element
document.getElementById('upload-button').addEventListener('click', async () => {
const fileInput = document.getElementById('file-input');
const vectorStoreId = 'vs_abc123';
const progressBar = document.getElementById('progress-bar');
try {
const results = await batchUploadToVectorStore(vectorStoreId, fileInput.files, {
batchSize: 5,
onProgress: (progress) => {
// Update progress UI
const percentage = Math.round((progress.completedFiles / progress.totalFiles) * 100);
progressBar.style.width = `${percentage}%`;
progressBar.textContent = `${percentage}% (Batch ${progress.currentBatch}/${progress.totalBatches})`;
}
});
console.log('Complete upload results:', results);
} catch (error) {
console.error('Upload failed:', error);
}
});
||CODE_BLOCK||
{% endtab %}
{% tab title="Ruby" %}
||CODE_BLOCK||ruby
require 'net/http'
require 'uri'
require 'json'
require 'mime/types'
# Upload files to a Vector Store in batches
#
# @param vector_store_id [String] ID of the Vector Store
# @param folder_path [String] Path to folder containing files to upload
# @param batch_size [Integer] Number of files to upload in each batch
# @param file_types [Array<String>] Optional array of file extensions to filter by
# @param delay_between_batches [Float] Seconds to wait between batches
# @return [Array] List of upload results
def batch_upload_to_vector_store(vector_store_id, folder_path, batch_size: 5, file_types: nil, delay_between_batches: 2.0)
api_key = 'YOUR_API_KEY'
results = []
# Get list of files in directory
files = Dir.entries(folder_path)
.select { |f| File.file?(File.join(folder_path, f)) }
.select { |f| file_types.nil? || file_types.include?(File.extname(f).downcase) }
.map { |f| File.join(folder_path, f) }
puts "Found #{files.count} files to upload"
total_batches = (files.count.to_f / batch_size).ceil
# Process files in batches
files.each_slice(batch_size).with_index do |batch, batch_index|
puts "Processing batch #{batch_index + 1}/#{total_batches}: #{batch.count} files"
# Prepare the HTTP request
uri = URI("https://api.rememberizer.ai/api/v1/vector-stores/#{vector_store_id}/documents/upload")
request = Net::HTTP::Post.new(uri)
request['x-api-key'] = api_key
# Create a multipart form boundary
boundary = "RubyBoundary#{rand(1000000)}"
request['Content-Type'] = "multipart/form-data; boundary=#{boundary}"
# Build the request body
body = []
batch.each do |file_path|
file_name = File.basename(file_path)
mime_type = MIME::Types.type_for(file_path).first&.content_type || 'application/octet-stream'
begin
file_content = File.binread(file_path)
body << "--#{boundary}\r\n"
body << "Content-Disposition: form-data; name=\"files\"; filename=\"#{file_name}\"\r\n"
body << "Content-Type: #{mime_type}\r\n\r\n"
body << file_content
body << "\r\n"
rescue => e
puts "Error reading file #{file_path}: #{e.message}"
end
end
body << "--#{boundary}--\r\n"
request.body = body.join
# Send the request
begin
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = true
response = http.request(request)
if response.code.to_i == 200 || response.code.to_i == 201 || response.code.to_i == 207
batch_result = JSON.parse(response.body)
results << batch_result
puts "Successfully uploaded batch - #{batch_result['documents']&.count || 0} documents processed"
# Check for errors
if batch_result['errors'] && !batch_result['errors'].empty?
puts "Errors encountered: #{batch_result['errors'].count}"
batch_result['errors'].each do |error|
puts "- #{error['file']}: #{error['error']}"
end
end
else
puts "Batch upload failed with status code #{response.code}: #{response.body}"
results << { "error" => "Batch failed: #{response.body}" }
end
rescue => e
puts "Exception during batch upload: #{e.message}"
results << { "error" => e.message }
end
# Rate limiting - pause between batches
if batch_index < total_batches - 1
puts "Pausing for #{delay_between_batches} seconds before next batch..."
sleep(delay_between_batches)
end
end
puts "Upload complete. Processed #{files.count} files."
results
end
# Example usage
results = batch_upload_to_vector_store(
'vs_abc123',
'/path/to/documents/folder',
batch_size: 5,
file_types: ['.pdf', '.docx', '.txt'],
delay_between_batches: 2.0
)
||CODE_BLOCK||
{% endtab %}
{% endtabs %}
### Batch Upload Best Practices
To optimize performance and reliability when uploading large volumes of files:
1. **Manage Batch Size**: Keep batch sizes between 5-10 files for optimal performance. Too many files in a single request increases the risk of timeouts.
2. **Implement Rate Limiting**: Add delays between batches (2-3 seconds recommended) to avoid hitting API rate limits.
3. **Add Error Retry Logic**: For production systems, implement retry logic for failed uploads with exponential backoff (see the sketch at the end of this page).
4. **Validate File Types**: Pre-filter files to ensure they're supported types before attempting upload.
5. **Monitor Batch Progress**: For user-facing applications, provide progress feedback on batch operations.
6. **Handle Partial Success**: The API may return a 207 status code for partial success. Always check individual document statuses.
7. **Clean Up Resources**: Ensure all file handles are properly closed, especially when errors occur.
8. **Parallelize Wisely**: For very large uploads (thousands of files), consider multiple concurrent batch processes targeting different vector stores, then combine results later if needed.
9. **Implement Checksums**: For critical data, verify file integrity before and after upload with checksums.
10. **Log Comprehensive Results**: Maintain detailed logs of all upload operations for troubleshooting.
By following these best practices, you can efficiently manage large-scale document ingestion into your vector stores.
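To illustrate item 3 above, here is a minimal retry wrapper with exponential backoff (a sketch assuming Python and the `requests` library; the set of retried status codes is an assumption and should be adjusted to your needs):
||CODE_BLOCK||python
import time
import requests

def upload_batch_with_retry(vector_store_id, file_paths, max_retries=3, base_delay=2.0):
    """Upload one batch of files, retrying transient failures with exponential backoff."""
    url = f"https://api.rememberizer.ai/api/v1/vector-stores/{vector_store_id}/documents/upload"
    headers = {"x-api-key": "YOUR_API_KEY"}
    for attempt in range(max_retries + 1):
        files = [("files", (path.split("/")[-1], open(path, "rb"))) for path in file_paths]
        try:
            response = requests.post(url, headers=headers, files=files)
        finally:
            for _, (_, handle) in files:
                handle.close()
        # Treat success and partial success as final; retry likely-transient failures.
        if response.status_code in (200, 201, 207):
            return response.json()
        if response.status_code in (429, 500, 502, 503) and attempt < max_retries:
            delay = base_delay * (2 ** attempt)
            print(f"Attempt {attempt + 1} failed ({response.status_code}); retrying in {delay:.0f}s")
            time.sleep(delay)
        else:
            response.raise_for_status()

result = upload_batch_with_retry("vs_abc123", ["/path/to/document1.pdf"])
||CODE_BLOCK||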