Retrieve document contents
Returns the content of the document with the specified ID, along with the index of the latest retrieved chunk. Each call fetches up to 20 chunks. To get more, use the end_chunk value from the response as the start_chunk for the next call.
The ID of the document to retrieve contents for.
Indicate the starting chunk that you want to retrieve. If not specified, the default value is 0.
Indicate the ending chunk that you want to retrieve. If not specified, the default value is start_chunk + 20.
Content of the document and index of the latest retrieved chunk.
Document not found.
Internal server error.
GET /api/v1/documents/{document_id}/contents/ HTTP/1.1
Host: api.rememberizer.ai
Accept: */*
{
  "content": "text",
  "end_chunk": 20
}
Example requests
curl -X GET \
"https://api.rememberizer.ai/api/v1/documents/12345/contents/?start_chunk=0&end_chunk=20" \
-H "Authorization: Bearer YOUR_JWT_TOKEN"

const getDocumentContents = async (documentId, startChunk = 0, endChunk = 20) => {
  const url = new URL(`https://api.rememberizer.ai/api/v1/documents/${documentId}/contents/`);
  url.searchParams.append('start_chunk', startChunk);
  url.searchParams.append('end_chunk', endChunk);
  const response = await fetch(url.toString(), {
    method: 'GET',
    headers: {
      'Authorization': 'Bearer YOUR_JWT_TOKEN'
    }
  });
  const data = await response.json();
  console.log(data);
  // Fetch the next batch while the API keeps returning content.
  // Assumption: the response carries no total chunk count, so an empty
  // "content" or an end_chunk that does not advance is treated as the end.
  if (data.content && data.end_chunk > startChunk) {
    await getDocumentContents(documentId, data.end_chunk, data.end_chunk + 20);
  }
};
getDocumentContents(12345);

import requests
def get_document_contents(document_id, start_chunk=0, end_chunk=20):
    headers = {
        "Authorization": "Bearer YOUR_JWT_TOKEN"
    }
    params = {
        "start_chunk": start_chunk,
        "end_chunk": end_chunk
    }
    response = requests.get(
        f"https://api.rememberizer.ai/api/v1/documents/{document_id}/contents/",
        headers=headers,
        params=params
    )
    data = response.json()
    print(data)
    # Fetch the next batch while the API keeps returning content.
    # Assumption: the response carries no total chunk count, so an empty
    # "content" or an end_chunk that does not advance is treated as the end.
    if data.get('content') and data.get('end_chunk', 0) > start_chunk:
        get_document_contents(document_id, data['end_chunk'], data['end_chunk'] + 20)

get_document_contents(12345)

Path parameters
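document_id
integer
Required. The ID of the document whose contents you want to retrieve.
Query parameters
start_chunk
integer
Index of the starting chunk. Defaults to 0.
end_chunk
integer
Index of the ending chunk. Defaults to start_chunk + 20.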
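As an illustration of how the path and query parameters combine, here is a minimal sketch that builds the request URL and applies the documented defaults. The helper name `build_contents_url` is hypothetical and not part of the API.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "https://api.rememberizer.ai/api/v1/documents"


def build_contents_url(document_id: int, start_chunk: int = 0,
                       end_chunk: Optional[int] = None) -> str:
    """Build the contents URL (hypothetical helper, not part of the API)."""
    if end_chunk is None:
        # Documented default: end_chunk = start_chunk + 20
        end_chunk = start_chunk + 20
    query = urlencode({"start_chunk": start_chunk, "end_chunk": end_chunk})
    return f"{BASE_URL}/{document_id}/contents/?{query}"


print(build_contents_url(12345))
```

Sending the parameters explicitly, as in this sketch, should be equivalent to omitting them and relying on the server-side defaults.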
Response format
{
  "content": "Full text content of the document chunks...",
  "end_chunk": 20
}
Error responses
404
Document not found
500
Internal server error
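One possible way to surface these errors in client code is to map each documented status code to an exception. This is a sketch only: the exception classes and the `check_status` helper are hypothetical names, and treating 200 as the only success code is an assumption.

```python
class DocumentNotFound(Exception):
    """Raised for a 404 response (hypothetical client-side class)."""


class ServerError(Exception):
    """Raised for a 500 response (hypothetical client-side class)."""


def check_status(status_code: int) -> None:
    """Raise an exception for the documented error codes."""
    if status_code == 404:
        raise DocumentNotFound("Document not found.")
    if status_code == 500:
        raise ServerError("Internal server error.")
    if status_code != 200:
        # Assumption: 200 is the only success status for this endpoint.
        raise RuntimeError(f"Unexpected status code: {status_code}")
```

With the `requests` example above, this could be called as `check_status(response.status_code)` before reading `response.json()`.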
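Pagination for large documents
For large documents, the content is split into multiple chunks. You can retrieve the full document by making several requests:
1. Make an initial request with start_chunk=0.
2. Use the returned end_chunk value as the start_chunk for the next request.
3. Continue until you have retrieved all chunks.
This endpoint returns the raw text content of the document, giving you access to the full information for detailed processing or analysis.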
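The pagination steps can be sketched as an iterative loop rather than a recursive call. This is a minimal illustration under stated assumptions: the response does not include a total chunk count, so the loop stops when `content` is empty or `end_chunk` stops advancing. The names `make_http_fetcher` and `iter_document_content` are hypothetical, and the fetcher uses only the standard library.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Dict, Iterator


def make_http_fetcher(document_id: int, token: str) -> Callable[[int, int], Dict]:
    """Return a function that fetches one page of chunks (hypothetical helper)."""
    base = f"https://api.rememberizer.ai/api/v1/documents/{document_id}/contents/"

    def fetch_page(start_chunk: int, end_chunk: int) -> Dict:
        query = urllib.parse.urlencode(
            {"start_chunk": start_chunk, "end_chunk": end_chunk}
        )
        request = urllib.request.Request(
            f"{base}?{query}", headers={"Authorization": f"Bearer {token}"}
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)

    return fetch_page


def iter_document_content(fetch_page: Callable[[int, int], Dict],
                          page_size: int = 20,
                          max_pages: int = 1000) -> Iterator[str]:
    """Yield the content of each page, starting at chunk 0."""
    start = 0
    for _ in range(max_pages):  # safety cap to avoid an endless loop
        data = fetch_page(start, start + page_size)
        content = data.get("content", "")
        end = data.get("end_chunk", start)
        if content:
            yield content
        # Assumption: empty content or a non-advancing end_chunk means
        # there is nothing left to retrieve.
        if not content or end <= start:
            break
        start = end  # the returned end_chunk becomes the next start_chunk
```

A caller could then assemble the full text with `"".join(iter_document_content(make_http_fetcher(12345, "YOUR_JWT_TOKEN")))`. Passing the fetcher in as a parameter keeps the loop testable without network access.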