---
title: "Read large documents"
description: "Learn how the AI agent reads the document in Tiptap."
canonical_url: "https://tiptap.dev/docs/content-ai/capabilities/agent/features/large-documents"
---

# Read large documents

Learn how the AI agent reads the document in Tiptap.

The AI agent splits the document in chunks to handle documents of any size efficiently. This chunking mechanism is essential because:

1. Large documents may exceed the context window of the LLM
2. Editing chunks is more efficient for multi-turn conversations
3. It allows the AI agent to focus on specific parts of the document

## Chunking mechanism

By default, the document is split into chunks of approximately 32000 characters each, while preserving HTML structure. The AI agent reads one chunk at a time and can navigate to the next or previous chunk as needed.

Customize the amount of characters per chunk by setting the `chunkSize` option:

```tsx
const provider = new AiAgentProvider({
  chunkSize: 2000, // Define a smaller chunk size
  // ...other options
})
```

## Custom chunking

You can customize how the document is chunked by providing a custom `chunkHtml` function:

```tsx
const provider = new AiAgentProvider({
  // ...other options
  chunkHtml: ({ html, chunkSize }) => {
    // Custom logic to split HTML into chunks
    // Must return an array of HTML strings
    return customSplitFunction(html, chunkSize)
  },
})
```
