Retrieve document contents

Retrieve contents of a document by its ID.

get

Returns the content of the document with the specified ID, along with the index of the latest retrieved chunk. Each call fetches up to 20 chunks. To get more, use the end_chunk value from the response as the start_chunk for the next call.

Path parameters

document_id (integer, required)

The ID of the document to retrieve contents for.

Query parameters

start_chunk (integer, optional)

Index of the first chunk to retrieve. If not specified, the default value is 0.

end_chunk (integer, optional)

Index of the last chunk to retrieve. If not specified, the default value is start_chunk + 20.
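The defaults above can be worked through in a few lines. This is a minimal sketch, not part of the API: resolve_chunk_window and CHUNKS_PER_CALL are hypothetical names, and the range check is an assumption about what a sensible request looks like rather than documented server behavior.

```python
from typing import Optional, Tuple

CHUNKS_PER_CALL = 20  # per-call limit stated in the endpoint description


def resolve_chunk_window(start_chunk: Optional[int] = None,
                         end_chunk: Optional[int] = None) -> Tuple[int, int]:
    """Return the (start_chunk, end_chunk) pair implied by the documented defaults."""
    start = 0 if start_chunk is None else start_chunk
    end = start + CHUNKS_PER_CALL if end_chunk is None else end_chunk
    # Assumption: negative or inverted ranges are rejected client-side.
    if start < 0 or end < start:
        raise ValueError("expected 0 <= start_chunk <= end_chunk")
    return start, end


print(resolve_chunk_window())        # (0, 20)
print(resolve_chunk_window(40))      # (40, 60)
print(resolve_chunk_window(40, 45))  # (40, 45)
```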

Responses

200

Content of the document and index of the latest retrieved chunk.

application/json

404

Document not found.

500

Internal server error.
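One way a client might turn these status codes into errors is sketched below. The exception classes and the check_status helper are hypothetical names introduced for illustration. Only 200, 404, and 500 are documented here, so any other code is treated as unexpected.

```python
class DocumentNotFound(Exception):
    """Raised for a 404 response: the document ID does not exist."""


class ServerError(Exception):
    """Raised for a 500 response: internal server error."""


def check_status(status_code: int) -> None:
    """Return quietly on 200, raise a descriptive error otherwise."""
    if status_code == 200:
        return
    if status_code == 404:
        raise DocumentNotFound("Document not found.")
    if status_code == 500:
        raise ServerError("Internal server error.")
    # Not listed in this reference, so surface it rather than guess.
    raise RuntimeError(f"Unexpected status code: {status_code}")


try:
    check_status(404)
except DocumentNotFound as exc:
    print(exc)  # Document not found.
```

With the requests library, the same helper could be called as check_status(response.status_code) before reading response.json().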

get

GET /api/v1/documents/{document_id}/contents/ HTTP/1.1
Host: api.rememberizer.ai
Accept: */*

{
  "content": "text",
  "end_chunk": 20
}

Example requests

curl -X GET \
  "https://api.rememberizer.ai/api/v1/documents/12345/contents/?start_chunk=0&end_chunk=20" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

Replace YOUR_JWT_TOKEN with your actual JWT token and 12345 with an actual document ID.

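const getDocumentContents = async (documentId, startChunk = 0, endChunk = 20) => {
  const url = new URL(`https://api.rememberizer.ai/api/v1/documents/${documentId}/contents/`);
  url.searchParams.append('start_chunk', startChunk);
  url.searchParams.append('end_chunk', endChunk);

  const response = await fetch(url.toString(), {
    method: 'GET',
    headers: {
      'Authorization': 'Bearer YOUR_JWT_TOKEN'
    }
  });

  // Surface 404 (document not found) and 500 (internal server error)
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }

  const data = await response.json();
  console.log(data);

  // The response does not include a total chunk count, so this example assumes
  // the document is exhausted once content is empty or end_chunk stops advancing.
  if (data.content && data.end_chunk > startChunk) {
    // Fetch the next group of chunks
    await getDocumentContents(documentId, data.end_chunk, data.end_chunk + 20);
  }
};

getDocumentContents(12345).catch(console.error);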

Replace YOUR_JWT_TOKEN with your actual JWT token and 12345 with an actual document ID.

import requests

def get_document_contents(document_id, start_chunk=0, end_chunk=20):
    headers = {
        "Authorization": "Bearer YOUR_JWT_TOKEN"
    }

    params = {
        "start_chunk": start_chunk,
        "end_chunk": end_chunk
    }

    response = requests.get(
        f"https://api.rememberizer.ai/api/v1/documents/{document_id}/contents/",
        headers=headers,
        params=params
    )
    # Raise on 404 (document not found) and 500 (internal server error)
    response.raise_for_status()

    data = response.json()
    print(data)

    # The response does not include a total chunk count, so this simple example
    # assumes the document is exhausted once content is empty or end_chunk
    # stops advancing. You may want a more robust check in production code.
    if data.get('content') and data.get('end_chunk', start_chunk) > start_chunk:
        get_document_contents(document_id, data['end_chunk'], data['end_chunk'] + 20)

get_document_contents(12345)

Replace YOUR_JWT_TOKEN with your actual JWT token and 12345 with an actual document ID.

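Path parameters

| Parameter | Type | Description |
| --- | --- | --- |
| document_id | integer | Required. The ID of the document to retrieve contents for. |

Query parameters

| Parameter | Type | Description |
| --- | --- | --- |
| start_chunk | integer | Starting chunk index. Defaults to 0. |
| end_chunk | integer | Ending chunk index. Defaults to start_chunk + 20. |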

Response format

{
  "content": "Full text content of the document chunks...",
  "end_chunk": 20
}
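A client may prefer to read this body into a typed value instead of a raw dictionary. The sketch below assumes only the two documented fields. DocumentContents and parse_document_contents are hypothetical names, not part of any official SDK.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class DocumentContents:
    content: str    # text of the retrieved chunks
    end_chunk: int  # index to pass as start_chunk on the next call


def parse_document_contents(body: str) -> DocumentContents:
    """Parse a JSON response body; raises KeyError if a documented field is missing."""
    payload = json.loads(body)
    return DocumentContents(content=payload["content"],
                            end_chunk=int(payload["end_chunk"]))


parsed = parse_document_contents('{"content": "text", "end_chunk": 20}')
print(parsed.content)    # text
print(parsed.end_chunk)  # 20
```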

Error responses

| Status code | Description |
| --- | --- |
| 404 | Document not found |
| 500 | Internal server error |

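Pagination for large documents

For large documents, the content is split into multiple chunks. You can retrieve the full document by making multiple requests:

1. Make an initial request with start_chunk=0
2. Use the returned end_chunk value as the start_chunk for the next request
3. Continue until you have retrieved all chunks

This endpoint returns the raw text content of the document, giving you access to the full information for detailed processing or analysis.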
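The pagination steps can be sketched as a loop that feeds each end_chunk back in as the next start_chunk. The reference does not say how the end of a document is signaled, so the stop rule here (empty content, or an end_chunk that does not advance) is an assumption. iter_document_pages and fake_fetch are hypothetical names. fake_fetch stands in for the real HTTP call so that the example runs offline.

```python
from typing import Callable, Dict, Iterator

# Hypothetical callable: takes start_chunk, returns the parsed JSON response body.
FetchPage = Callable[[int], Dict]


def iter_document_pages(fetch_page: FetchPage, max_pages: int = 1000) -> Iterator[str]:
    """Yield the content of each page, starting from start_chunk=0."""
    start_chunk = 0
    for _ in range(max_pages):  # hard cap so a misbehaving server cannot loop forever
        page = fetch_page(start_chunk)
        content = page.get("content", "")
        end_chunk = page.get("end_chunk", start_chunk)
        if content:
            yield content
        # Assumed stop rule: nothing returned, or the chunk index did not advance.
        if not content or end_chunk <= start_chunk:
            return
        start_chunk = end_chunk


def fake_fetch(start_chunk: int) -> Dict:
    """Offline stand-in for the endpoint: a 45-chunk document, 20 chunks per call."""
    chunks = [f"chunk-{i} " for i in range(45)]
    end = min(start_chunk + 20, len(chunks))
    return {"content": "".join(chunks[start_chunk:end]), "end_chunk": end}


pages = list(iter_document_pages(fake_fetch))
print(len(pages))  # 3
```

In real use, fetch_page would wrap a GET request to the contents endpoint with the start_chunk query parameter and return response.json().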

