Reading the Document
The AI Agent reads the document in chunks to handle documents of any size efficiently. This chunking mechanism is essential because:
- Large documents may exceed the context window of the LLM
- Editing chunks is more efficient for multi-turn conversations
- It allows the AI Agent to focus on specific parts of the document
Chunking mechanism
By default, the document is split into chunks of approximately 8000 tokens each, while preserving the HTML structure. The AI Agent maintains a pointer to the current chunk and can navigate through the document using specific tools. You can change the chunk size with the chunkSize option:
const provider = new AiAgentProvider({
  chunkSize: 2000, // Define a smaller chunk size
  // ...other options
})
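The pointer-and-navigation behavior described above can be sketched as follows. This is a conceptual illustration only, not the provider's actual internals; the class and method names are hypothetical:

```javascript
// Conceptual sketch (hypothetical names): a reader that keeps a pointer
// to the current chunk and moves through the document chunk by chunk.
class ChunkReader {
  constructor(chunks) {
    this.chunks = chunks // array of HTML strings
    this.index = 0 // pointer to the current chunk
  }

  current() {
    return this.chunks[this.index]
  }

  next() {
    // Advance the pointer, stopping at the last chunk.
    if (this.index < this.chunks.length - 1) this.index += 1
    return this.current()
  }

  previous() {
    // Move the pointer back, stopping at the first chunk.
    if (this.index > 0) this.index -= 1
    return this.current()
  }
}

const reader = new ChunkReader(['<p>one</p>', '<p>two</p>', '<p>three</p>'])
console.log(reader.next()) // '<p>two</p>'
```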
Custom chunking
You can customize how the document is chunked by providing a custom chunkHtml function:
const provider = new AiAgentProvider({
  // ...other options
  chunkHtml: ({ html, chunkSize }) => {
    // Custom logic to split HTML into chunks
    // Must return an array of HTML strings
    return customSplitFunction(html, chunkSize)
  },
})
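As one sketch of what such a function might look like, the example below splits the document at top-level closing tags so that every chunk remains well-formed HTML, using character length as a rough stand-in for token count. The tag list and size heuristic are illustrative assumptions, not the library's default behavior:

```javascript
// Illustrative chunkHtml implementation (an assumption, not the library's
// default): split at top-level closing tags so each chunk is well-formed,
// then pack blocks into chunks up to the size budget.
function chunkHtml({ html, chunkSize }) {
  // Split into top-level blocks, keeping each closing tag with its block.
  const blocks = html.split(/(?<=<\/(?:p|h[1-6]|ul|ol|table|div|blockquote)>)/)
  const chunks = []
  let current = ''

  for (const block of blocks) {
    // Start a new chunk when adding this block would exceed the budget.
    if (current && current.length + block.length > chunkSize) {
      chunks.push(current)
      current = ''
    }
    current += block
  }
  if (current) chunks.push(current)
  return chunks // an array of HTML strings, as the option expects
}
```

A single block larger than chunkSize still becomes its own chunk here; a production splitter would also need to handle nested structure and oversized elements.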