AI engineering with the AI Toolkit
Get the most out of the AI Toolkit by choosing the best AI model and prompt for every situation.
Choose an AI model
The AI Toolkit's tools are designed to work with any AI model that supports function calling. They've been tested with a wide variety of models from different AI providers.
A key distinction for choosing the right AI model is whether you need to build an agent or a workflow:
- AI agents are given a task and work towards it independently, deciding which action to take at each step. For example: an AI chatbot assistant.
- In AI workflows, the AI performs one or more pre-defined actions, but does not choose the next action to take. This is the case in simple AI content generation or autocompletion.
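The difference can be sketched in a few lines of TypeScript. This is a hypothetical illustration with a hard-coded stand-in for the model: in a workflow the sequence of actions is fixed in advance, while an agent asks the model for the next action at every step until it decides to stop.

```typescript
type Action = { tool: string; input: string }

// Stand-in for an LLM call: decides the next action from the history so far.
// Hard-coded here for illustration only.
function decideNextAction(history: Action[]): Action | null {
  if (history.length === 0) return { tool: 'readNodeRange', input: '0:' }
  if (history.length === 1) return { tool: 'insertContent', input: 'Hello' }
  return null // the agent decides it is done
}

// Workflow: the actions are pre-defined; the model only fills in content.
function runWorkflow(): Action[] {
  return [{ tool: 'insertContent', input: 'Hello' }]
}

// Agent: the model chooses the next action at each step until it stops.
function runAgent(): Action[] {
  const history: Action[] = []
  let next = decideNextAction(history)
  while (next) {
    history.push(next)
    next = decideNextAction(history)
  }
  return history
}
```

Because the agent loops until the model returns no further action, it needs a model that reliably decides *when* to call tools, which is why the model recommendations below differ between the two cases.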
Best models for agentic tasks
For complex agentic AI assistants like the one in the AI agent chatbot guide, we recommend using a frontier model with advanced tool-calling capabilities. In particular, the following models have shown positive results:
- Claude Sonnet 4.5 (Anthropic): This is the model we've had the best results with. It excels at applying small edits to the document precisely and efficiently.
- GPT-5 (OpenAI)
- Grok 4 Fast (xAI)
- GLM-4.6 (Z.ai)
- Gemini 2.5 Pro (Google)
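Switching between these models is a one-line change when using the AI SDK: swap the provider package and the model identifier. The identifiers below are assumptions; check each provider's model catalog for the exact ids available to you.

```typescript
// Hypothetical fragment: pass a different provider's model to streamText.
import { anthropic } from '@ai-sdk/anthropic'
import { toolDefinitions } from '@tiptap-pro/ai-toolkit-ai-sdk'
import { streamText } from 'ai'

const result = streamText({
  // Model id is an assumption — verify it in your provider's documentation
  model: anthropic('claude-sonnet-4-5'),
  system: 'You are an assistant that can edit rich text documents.',
  prompt: 'Fix the typos in the document.',
  tools: toolDefinitions(),
})
```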
Best budget models for agentic tasks
If you have an agentic document editing use case but keeping costs down is a priority, you can use a smaller model that still supports function calling. In particular, you can consider the following models:
- GPT-5 mini (OpenAI)
- Gemini 2.5 Flash (Google)
However, we've noticed that smaller models can sometimes fail to apply small edits to the document precisely and efficiently. To avoid tool call failures, disable the applyPatch tool:
```typescript
import { toolDefinitions } from '@tiptap-pro/ai-toolkit-ai-sdk'

const tools = toolDefinitions({
  tools: {
    applyPatch: false,
  },
})
```
Choose a model for non-agentic workflows
Not all uses of AI for document editing are agentic. Sometimes you just need the AI to generate content and insert it into the document. Many budget models are suitable for this purpose. To name a few:
- GPT-5 mini (OpenAI)
- Gemini 2.5 Flash (Google)
- Mistral Medium 3.1 (Mistral)
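In a non-agentic workflow you typically don't need tool definitions at all: a single generation call produces the content, and your client code inserts it into the editor. A minimal sketch, assuming a hypothetical route path and model id:

```typescript
// app/api/generate/route.ts — illustrative route, not part of the AI Toolkit
import { openai } from '@ai-sdk/openai'
import { generateText } from 'ai'

export async function POST(req: Request) {
  const { prompt }: { prompt: string } = await req.json()

  // One pre-defined action: generate content. The model never chooses a next step.
  const { text } = await generateText({
    model: openai('gpt-5-mini'),
    system: 'You write concise rich text content for a document editor.',
    prompt,
  })

  // The client can then insert `text` into the editor,
  // for example with editor.commands.insertContent(text)
  return Response.json({ text })
}
```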
Should you enable reasoning?
In our internal testing, we've seen no visible improvement in accuracy or performance when enabling reasoning for document editing tasks. Since reasoning increases token consumption and latency, we recommend disabling it or keeping it to a minimum, except when your AI agent uses reasoning for other purposes (for example, planning tasks or solving complex math problems).
For example, the recommended configuration of GPT-5 for the AI Toolkit is to set the reasoningEffort provider option to 'minimal'.
Craft the right prompt
You can adjust how the AI model acts and generates content by providing a custom system prompt.
```typescript
// app/api/chat/route.ts
import { openai } from '@ai-sdk/openai'
import { toolDefinitions } from '@tiptap-pro/ai-toolkit-ai-sdk'
import { convertToModelMessages, streamText, UIMessage } from 'ai'

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json()

  const result = streamText({
    model: openai('gpt-5'),
    system: `You are an assistant that edits rich text documents in the style of Shakespeare.
You should respond in the style of Shakespeare, and when editing the document,
the content you generate and add to the document should be written in the style
of Shakespeare's plays.`,
    messages: convertToModelMessages(messages),
    tools: toolDefinitions(),
    providerOptions: {
      openai: {
        reasoningEffort: 'minimal',
      },
    },
  })

  return result.toUIMessageStreamResponse()
}
```
In the system prompt, you don't need to mention which tools the AI model has available. The model is automatically aware of them, because they are included in the tool definitions.
However, in the system prompt, you can reference the available tools by their name. This way, you can instruct the AI model on how and when to use the tools, and guide the AI model to make it more verbose, creative, or thoughtful.
```typescript
// app/api/chat/route.ts
import { openai } from '@ai-sdk/openai'
import { toolDefinitions } from '@tiptap-pro/ai-toolkit-ai-sdk'
import { convertToModelMessages, streamText, UIMessage } from 'ai'

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json()

  const result = streamText({
    model: openai('gpt-5'),
    system: `You are an assistant that can edit rich text documents.
Before calling the document editing tools like insertContent or applyPatch,
you should first read the document to get a sense of the content and context.
Then, you should inform the user of the plan of action you will take to edit
the document, in a very detailed step-by-step description. Only after planning
in detail, you should call the document editing tools.`,
    messages: convertToModelMessages(messages),
    tools: toolDefinitions(),
    providerOptions: {
      openai: {
        reasoningEffort: 'minimal',
      },
    },
  })

  return result.toUIMessageStreamResponse()
}
```
We recommend these resources for learning more about prompt engineering and AI engineering:
- Anthropic prompt engineering guide
- GPT-5 prompting guide
- OpenAI prompt engineering guide
- A practical guide to building agents
Improve speed and latency
To improve the speed of the AI Toolkit, you can use the following strategies:
Implement response streaming
Streaming the content into the editor makes the response feel faster to the user. Follow the tool streaming guide to implement it.
Choose a faster model or provider
When choosing your model, consider its speed and latency. Refer to leaderboards like Artificial Analysis to compare models across different metrics.
Speed and latency depend not only on the model, but also on the provider hosting it. If output speed is your priority, some providers specialize in fast inference, such as Groq, SambaNova, or Cerebras.
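Fast-inference providers have their own AI SDK packages, so switching is a matter of changing the import and model id. A sketch using Groq; the model id is an assumption, so check the provider's catalog for what's currently offered:

```typescript
// Hypothetical fragment: route generation through a fast-inference provider.
import { groq } from '@ai-sdk/groq'
import { streamText } from 'ai'

const result = streamText({
  // Model id is an assumption — pick one from the provider's model list
  model: groq('llama-3.3-70b-versatile'),
  prompt: 'Summarize the document in one sentence.',
})
```

Note that the models served by these providers differ from the frontier models recommended above, so re-test your tool-calling accuracy after switching.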
Disable reasoning
Reasoning increases token consumption and latency. If you don't need it for your specific use case, disable it or set it to a minimum. For example, with GPT-5, we recommend setting the reasoningEffort provider option to 'minimal'.
```typescript
// app/api/chat/route.ts
import { openai } from '@ai-sdk/openai'
import { toolDefinitions } from '@tiptap-pro/ai-toolkit-ai-sdk'
import { convertToModelMessages, streamText, UIMessage } from 'ai'

export async function POST(req: Request) {
  const { messages }: { messages: UIMessage[] } = await req.json()

  const result = streamText({
    model: openai('gpt-5'),
    system: `You are an assistant that can edit rich text documents.`,
    messages: convertToModelMessages(messages),
    tools: toolDefinitions(),
    // Set the reasoning effort to 'minimal'
    providerOptions: {
      openai: {
        reasoningEffort: 'minimal',
      },
    },
  })

  return result.toUIMessageStreamResponse()
}
```
Provide the AI model with sufficient context
Before editing the document, the AI model needs to read it. You can speed up the process by including the content of the document in the user message or in the system prompt.
```typescript
const toolkit = getAiToolkit(editor)

// Before sending the user message, read the beginning of the document
const { output } = toolkit.executeTool({
  toolName: 'readNodeRange',
  input: {
    nodeRange: '0:',
  },
})

// Then, include the content of the document in the user message
let userMessage = `Replace the last paragraph with a short story`
userMessage += `
---
The user already called the 'readNodeRange' tool and it returned the following content:
${output}`

// Send the user message to the AI model
```
// Send the user message to the AI model
This way, the AI model doesn't need to call the readNodeRange tool to read the document, and can jump straight to editing.