Search for Vector Store documents by semantic similarity
Search Vector Store documents with semantic similarity and batch operations
vector-store-id (string): Required. The ID of the vector store to search in.
q (string): Required. The search query text.
n (integer): Number of results to return. Default: 10.
t (number): Matching threshold. Default: 0.7.
prev_chunks (integer): Number of chunks before the matched chunk to include. Default: 0.
next_chunks (integer): Number of chunks after the matched chunk to include. Default: 0.
This endpoint requires authentication using an API key in the x-api-key header.
400: Bad Request - Missing required parameters or invalid format
401: Unauthorized - Invalid or missing API key
404: Not Found - Vector Store not found
500: Internal Server Error
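Below is a minimal request sketch in Python. The base URL and the /vector-stores/{vector-store-id}/documents/search path are assumptions used for illustration; confirm the exact endpoint in the API reference before relying on it.

```python
import requests

API_KEY = "YOUR_API_KEY"        # replace with your API key
VECTOR_STORE_ID = "vs_abc123"   # replace with your vector store ID

# Assumed endpoint path for illustration; check the API reference for the exact URL.
url = f"https://api.rememberizer.ai/api/v1/vector-stores/{VECTOR_STORE_ID}/documents/search"

params = {
    "q": "How does semantic search work?",  # required search query text
    "n": 5,    # number of results to return (default 10)
    "t": 0.7,  # matching threshold (default 0.7)
}

response = requests.get(url, headers={"x-api-key": API_KEY}, params=params)

if response.status_code == 200:
    print(response.json())
elif response.status_code == 401:
    print("Unauthorized - check your API key")
elif response.status_code == 404:
    print("Vector Store not found")
else:
    print(f"Request failed: {response.status_code} {response.text}")
```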
Use the prev_chunks and next_chunks parameters to control how much context is included with each match:
Set both to 0 for precise matches without context
Set both to 1-2 for matches with minimal context
Set both to 3-5 for matches with substantial context
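For example, continuing the request sketch above (reusing its url and API_KEY), a hedged snippet that asks for two chunks of surrounding context with each match:

```python
# Include two chunks of context on each side of every matched chunk.
params = {
    "q": "What is the refund policy?",
    "n": 5,
    "prev_chunks": 2,   # chunks before each matched chunk
    "next_chunks": 2,   # chunks after each matched chunk
}
response = requests.get(url, headers={"x-api-key": API_KEY}, params=params)
```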
The t parameter controls how strictly matches are filtered:
Higher values (e.g., 0.9) return only very close matches
Lower values (e.g., 0.5) return more matches with greater variety
The default (0.7) provides a balanced approach
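As a sketch (again reusing url and API_KEY from the first example), the same query can be run at different thresholds to trade precision for recall:

```python
# Tighter threshold: only very close matches are returned.
strict = requests.get(url, headers={"x-api-key": API_KEY},
                      params={"q": "billing cycle", "t": 0.9})

# Looser threshold: more matches with greater variety.
broad = requests.get(url, headers={"x-api-key": API_KEY},
                     params={"q": "billing cycle", "t": 0.5})
```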
For high-throughput applications, Rememberizer supports efficient batch operations on vector stores. These methods optimize performance when processing multiple search queries.
When implementing batch operations for vector store searches, consider these best practices:
Optimal Batch Sizing: For most applications, processing 5-10 queries in parallel provides a good balance between throughput and resource usage.
Rate Limiting Awareness: Include delay mechanisms between batches (typically 1-2 seconds) to avoid hitting API rate limits.
Error Handling: Implement robust error handling for individual queries that may fail within a batch.
Connection Management: For high-volume applications, implement connection pooling to reduce overhead.
Timeout Configuration: Set appropriate timeouts for each request to prevent long-running queries from blocking the entire batch.
Result Processing: Consider processing results asynchronously as they become available rather than waiting for all results.
Monitoring: Track performance metrics like average response time and success rates to identify optimization opportunities.
For production applications with very high query volumes, consider implementing a queue system with worker processes to manage large batches efficiently.
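One way to apply these practices with Python's standard thread pool is sketched below. The endpoint path, batch size, and pause values are assumptions based on the guidance above, not a prescribed implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

import requests

API_KEY = "YOUR_API_KEY"
VECTOR_STORE_ID = "vs_abc123"
# Assumed endpoint path for illustration; confirm against the API reference.
URL = f"https://api.rememberizer.ai/api/v1/vector-stores/{VECTOR_STORE_ID}/documents/search"

def search_one(query, timeout=10):
    """Run a single search; failures are reported per query instead of aborting the batch."""
    try:
        resp = requests.get(
            URL,
            headers={"x-api-key": API_KEY},
            params={"q": query, "n": 5},
            timeout=timeout,  # keep a slow query from blocking the whole batch
        )
        resp.raise_for_status()
        return {"query": query, "results": resp.json()}
    except requests.RequestException as exc:
        return {"query": query, "error": str(exc)}

def search_batch(queries, batch_size=5, pause=1.5):
    """Process queries in small parallel batches with a pause between batches."""
    results = []
    for start in range(0, len(queries), batch_size):
        batch = queries[start:start + batch_size]
        with ThreadPoolExecutor(max_workers=batch_size) as pool:
            futures = [pool.submit(search_one, q) for q in batch]
            for future in as_completed(futures):  # handle results as they become available
                results.append(future.result())
        if start + batch_size < len(queries):
            time.sleep(pause)  # simple rate-limit buffer between batches
    return results

answers = search_batch([
    "What is the onboarding process?",
    "How are API keys rotated?",
    "Where are usage limits documented?",
])
for item in answers:
    print(item["query"], "->", "error" if "error" in item else "ok")
```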
This endpoint allows you to search your vector store using semantic similarity. It returns documents that are conceptually related to your query, even if they don't contain the exact keywords. This makes it particularly powerful for natural language queries and question answering.
Initiate a search operation with a query text and receive the most semantically similar responses from the vector store.
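As an illustration, a question phrased in natural language can be sent directly as the query text. The result handling below is a sketch only: the field names "matched_chunks" and "matched_content" are assumptions, so consult the response schema in the API reference.

```python
import requests

API_KEY = "YOUR_API_KEY"
# Assumed endpoint path for illustration; confirm against the API reference.
url = "https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/search"

response = requests.get(
    url,
    headers={"x-api-key": API_KEY},
    params={"q": "When does the enterprise plan renew?", "n": 3},
)
# NOTE: the field names below are assumptions for illustration only;
# check the actual response schema before use.
for chunk in response.json().get("matched_chunks", []):
    print(chunk.get("matched_content"))
```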