Upload files to a Vector Store

Upload file content to a Vector Store, with support for batch operations

Upload files to a vector store.

Path parameters
vector-store-id (string, required)

The ID of the vector store.

Header parameters
x-api-key (string, required)

The API key for authentication.

Body
files (string · binary[], optional)

The files to upload.

Endpoint

POST /vector-stores/{vector-store-id}/documents/upload

Example Requests

curl -X POST \
  https://api.rememberizer.ai/api/v1/vector-stores/vs_abc123/documents/upload \
  -H "x-api-key: YOUR_API_KEY" \
  -F "files=@/path/to/document1.pdf" \
  -F "files=@/path/to/document2.docx"

Replace YOUR_API_KEY with your actual Vector Store API key, vs_abc123 with your Vector Store ID, and provide the paths to your local files.

Path Parameters

Parameter         Type     Description
vector-store-id   string   Required. The ID of the vector store to upload files to.

Request Body

This endpoint accepts a multipart/form-data request with one or more files in the files field.
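For readers who are not using curl, the same request can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the function name build_upload_request is hypothetical, the base URL is copied from the curl example above, and the content type of each part is guessed from the file name.

```python
import mimetypes
import urllib.request
import uuid

BASE_URL = "https://api.rememberizer.ai/api/v1"  # taken from the curl example


def build_upload_request(vector_store_id, api_key, files):
    """Build (but do not send) a multipart/form-data POST for the upload endpoint.

    `files` is a list of (filename, bytes) pairs. Every part uses the form
    field name "files", mirroring the repeated -F "files=@..." curl flags.
    """
    boundary = uuid.uuid4().hex
    body = b""
    for filename, content in files:
        # Fall back to a generic type when the extension is not recognized.
        content_type = mimetypes.guess_type(filename)[0] or "application/octet-stream"
        part_header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="files"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        )
        body += part_header.encode("utf-8") + content + b"\r\n"
    body += f"--{boundary}--\r\n".encode("utf-8")  # closing boundary

    return urllib.request.Request(
        f"{BASE_URL}/vector-stores/{vector_store_id}/documents/upload",
        data=body,
        method="POST",
        headers={
            "x-api-key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

Passing the returned object to urllib.request.urlopen sends the request. Building and sending are kept separate here so the request can be inspected or logged before any network call is made.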

Response Format

If some files fail to upload, they are listed in the errors array of the response.
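One way to act on the errors array is to separate the files that were accepted from those that were rejected, so that only the rejected ones are retried or reported. The sketch below is an illustration under an assumption: only the existence of the errors array comes from this page, while the shape of each entry (an object whose "file" field holds the file name) is a guess and should be checked against a real response. The function name split_upload_result is hypothetical.

```python
def split_upload_result(submitted, response_body, name_key="file"):
    """Split submitted file names into (accepted, rejected) lists.

    `response_body` is the parsed JSON response. ASSUMPTION: each entry of
    its "errors" array is an object whose `name_key` field names the file.
    """
    errors = response_body.get("errors") or []
    rejected_names = {
        entry.get(name_key) for entry in errors if isinstance(entry, dict)
    }
    accepted = [name for name in submitted if name not in rejected_names]
    rejected = [name for name in submitted if name in rejected_names]
    return accepted, rejected
```

If the real entries use a different field name, passing it as name_key is enough to adapt the helper.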

Authentication

This endpoint requires authentication using an API key in the x-api-key header.

Supported File Formats

  • PDF (.pdf)

  • Microsoft Word (.doc, .docx)

  • Microsoft Excel (.xls, .xlsx)

  • Microsoft PowerPoint (.ppt, .pptx)

  • Text files (.txt)

  • Markdown (.md)

  • JSON (.json)

  • HTML (.html, .htm)
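A client can pre-filter files against this list before uploading, which avoids spending a request on files that would be rejected. The following is a minimal sketch that checks only the file extension, case-insensitively; whether the server inspects the extension, the content, or both is not stated here, so treat this as a convenience check rather than a guarantee. The names SUPPORTED_EXTENSIONS and partition_by_support are hypothetical.

```python
from pathlib import Path

# Extensions copied from the list of supported formats above.
SUPPORTED_EXTENSIONS = {
    ".pdf", ".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx",
    ".txt", ".md", ".json", ".html", ".htm",
}


def partition_by_support(paths):
    """Return (supported, unsupported) lists, preserving input order."""
    supported, unsupported = [], []
    for path in paths:
        # Path.suffix is the final extension only, e.g. ".gz" for "a.tar.gz".
        if Path(path).suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(path)
        else:
            unsupported.append(path)
    return supported, unsupported
```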

File Size Limits

  • Individual file size limit: 50MB

  • Total request size limit: 100MB

  • Maximum number of files per request: 20
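These three limits can be checked on the client before a request is sent. The sketch below assumes that 1 MB means 1024 * 1024 bytes, which this page does not specify, and it compares only the raw file sizes, so the multipart overhead of the request is not counted. The function name check_batch is hypothetical.

```python
MB = 1024 * 1024  # ASSUMPTION: the documented limits use binary megabytes
MAX_FILE_BYTES = 50 * MB
MAX_REQUEST_BYTES = 100 * MB
MAX_FILES_PER_REQUEST = 20


def check_batch(sizes):
    """Return a list of human-readable problems; an empty list means OK.

    `sizes` maps each file name to its size in bytes.
    """
    problems = []
    if len(sizes) > MAX_FILES_PER_REQUEST:
        problems.append(
            f"{len(sizes)} files exceeds the limit of {MAX_FILES_PER_REQUEST}"
        )
    for name, size in sizes.items():
        if size > MAX_FILE_BYTES:
            problems.append(f"{name} is larger than 50MB")
    if sum(sizes.values()) > MAX_REQUEST_BYTES:
        problems.append("total size is larger than 100MB")
    return problems
```

Sizes for local files can be obtained with os.path.getsize. Leaving some headroom below the total limit is prudent, because the encoded request is slightly larger than the sum of the files.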

Error Responses

Status Code   Description
400           Bad Request - No files provided or invalid request format
401           Unauthorized - Invalid or missing API key
404           Not Found - Vector Store not found
413           Payload Too Large - Files exceed size limit
415           Unsupported Media Type - File format not supported
500           Internal Server Error
207           Multi-Status - Some files were uploaded successfully, but others failed

Processing Status

Files are initially accepted with a status of processing. You can check the processing status of the documents using the Get a List of Documents in a Vector Store endpoint. Each document reports one of the following statuses:

  • done: Document was successfully processed

  • error: An error occurred during processing

  • processing: Document is still being processed

Processing time depends on file size and complexity. Typical processing time is between 30 seconds and 5 minutes per document.
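Because processing is asynchronous, a client typically polls until no document is still in the processing state. The sketch below shows one way to structure that loop. The response format of the list endpoint is not described on this page, so the lookup is left to a hypothetical callable, fetch_statuses, which is assumed to return a dictionary mapping document IDs to one of the three status strings. The default interval and timeout are illustrative values chosen with the typical processing time in mind.

```python
import time


def wait_for_processing(fetch_statuses, poll_interval=10.0, timeout=600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until no document is "processing", then return the statuses.

    `fetch_statuses` is a caller-supplied function (hypothetical) that
    queries the list-documents endpoint and returns {document_id: status}.
    Raises TimeoutError if documents are still processing after `timeout`
    seconds.
    """
    deadline = clock() + timeout
    while True:
        statuses = fetch_statuses()
        if all(status != "processing" for status in statuses.values()):
            return statuses  # every document is now "done" or "error"
        if clock() >= deadline:
            raise TimeoutError("documents still processing after timeout")
        sleep(poll_interval)
```

The caller should still inspect the returned dictionary, since a finished document may have ended in the error state.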

Batch Operations

For efficiently uploading multiple files to your Vector Store, Rememberizer supports batch operations. This approach helps optimize performance when dealing with large numbers of documents.

Batch Upload Implementation
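A batch uploader can be sketched as a loop that splits the file list into fixed-size groups, sends one request per group, and pauses between requests. The code below is an illustration rather than a reference implementation: upload_batch is a hypothetical caller-supplied function that sends one multipart request for a list of paths and returns the parsed response, and the defaults follow the batch size and delay recommended under Batch Upload Best Practices.

```python
import time


def batched(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def upload_in_batches(paths, upload_batch, batch_size=5, pause_seconds=2.0,
                      sleep=time.sleep):
    """Upload `paths` in batches and return one result record per batch.

    A failing batch is recorded and the loop continues, so one bad request
    does not abort the whole run.
    """
    batches = list(batched(list(paths), batch_size))
    results = []
    for index, batch in enumerate(batches):
        try:
            response = upload_batch(batch)
            results.append({"files": batch, "ok": True, "response": response})
        except Exception as exc:  # record the failure and move on
            results.append({"files": batch, "ok": False, "error": str(exc)})
        if index < len(batches) - 1:
            sleep(pause_seconds)  # spacing between requests, not after the last
    return results
```

Returning a record per batch makes it straightforward to log the outcome and to resubmit only the batches whose ok flag is False.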

Batch Upload Best Practices

To optimize performance and reliability when uploading large volumes of files:

  1. Manage Batch Size: Keep batch sizes between 5 and 10 files for optimal performance. Too many files in a single request increase the risk of timeouts.

  2. Implement Rate Limiting: Add delays between batches (2-3 seconds recommended) to avoid hitting API rate limits.

  3. Add Error Retry Logic: For production systems, implement retry logic for failed uploads with exponential backoff.

  4. Validate File Types: Pre-filter files to ensure they're supported types before attempting upload.

  5. Monitor Batch Progress: For user-facing applications, provide progress feedback on batch operations.

  6. Handle Partial Success: The API may return a 207 status code for partial success. Always check individual document statuses.

  7. Clean Up Resources: Ensure all file handles are properly closed, especially when errors occur.

  8. Parallelize Wisely: For very large uploads (thousands of files), consider multiple concurrent batch processes targeting different vector stores, then combine results later if needed.

  9. Implement Checksums: For critical data, verify file integrity before and after upload with checksums.

  10. Log Comprehensive Results: Maintain detailed logs of all upload operations for troubleshooting.

By following these best practices, you can efficiently manage large-scale document ingestion into your vector stores.
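The retry recommendation in item 3 can be sketched as a small wrapper that doubles the wait after each failed attempt. This is a generic illustration, not part of the Rememberizer API: the function name retry_with_backoff and its default values are assumptions, and in practice retry_on should be narrowed to errors that are worth retrying, such as timeouts or 5xx responses.

```python
import time


def retry_with_backoff(operation, max_attempts=4, base_delay=1.0,
                       max_delay=30.0, sleep=time.sleep,
                       retry_on=(Exception,)):
    """Call `operation()` until it succeeds or attempts run out.

    The wait before retry n is base_delay * 2**(n - 1), capped at
    max_delay. The last failure is re-raised unchanged.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay)
```

With the defaults, the waits are 1, 2, and 4 seconds. Adding a small random offset to each delay is a common refinement when many clients retry at the same time.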
