Import DOCX via REST API

Available in Start plan | Beta | v2.12.0

The DOCX import API converts .docx files into Tiptap's JSON format.

Review the Postman collection

You can also experiment with the Document Conversion API by heading over to our Postman Collection.

Import DOCX

POST /v2/convert/import/docx

The /v2/convert/import/docx endpoint converts .docx files into Tiptap's JSON format. POST a document to this endpoint and use the parameters below to customize how individual document elements are handled during conversion.

Example (cURL)

curl -X POST "https://api.tiptap.dev/v2/convert/import/docx" \
    -H "Authorization: Bearer YOUR_TOKEN" \
    -H "X-App-Id: YOUR_APP_ID" \
    -F "file=@/path/to/your/file.docx" \
    -F 'imageUploadConfig={"url":"https://your-image-upload-endpoint.com","headers":{"Authorization":"Bearer your-upload-token"}}' \
    -F "prosemirrorNodes={\"nodeKey\":\"nodeValue\"}" \
    -F "prosemirrorMarks={\"markKey\":\"markValue\"}"

Subscription required

This endpoint requires a valid Tiptap subscription. For details, see our pricing page.

Required headers

Name           Description
Authorization  The JWT token to authenticate the request. Example: Bearer your-jwt-token
X-App-Id       The Convert App-ID from the Convert settings page: https://cloud.tiptap.dev/v2/cloud/convert

Body

Name               Type              Description
file               File              The .docx file to convert.
imageUploadConfig  JSON string       A JSON object configuring image uploads. Contains url (required) and, optionally, headers, method, and queryParams. See the editor extension docs for the full schema.
prosemirrorNodes   Object string     Custom node mapping for the conversion. See Custom node and mark mapping.
prosemirrorMarks   Object string     Custom mark mapping for the conversion. See Custom node and mark mapping.
verbose            string | number   Controls the level of diagnostic output during the import, which is useful for debugging or for more insight into the conversion. See Import verbose output.
extractCssStyles   "true" | "false"  Experimental. When "true", the response includes a cssStyles field with the DOCX style catalog translated to CSS-compatible key/value pairs. Defaults to "false". See CSS style extraction for the output shape and caveats.
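
As a rough illustration of the body parameters above, the following Python sketch serializes the optional settings into multipart text fields and posts them together with the file. The helper names build_form_fields and import_docx are hypothetical, and the upload itself assumes the third-party requests package is installed; only the field names, headers, and URL come from this page.

```python
import json
from typing import Any, Optional

API_URL = "https://api.tiptap.dev/v2/convert/import/docx"


def build_form_fields(
    image_upload_config: Optional[dict[str, Any]] = None,
    prosemirror_nodes: Optional[dict[str, str]] = None,
    prosemirror_marks: Optional[dict[str, str]] = None,
    verbose: Optional[int] = None,
    extract_css_styles: bool = False,
) -> dict[str, str]:
    """Serialize the optional body parameters into multipart text fields."""
    fields: dict[str, str] = {}
    if image_upload_config is not None:
        # The body table lists url as the only required key.
        if "url" not in image_upload_config:
            raise ValueError("imageUploadConfig requires a 'url' key")
        fields["imageUploadConfig"] = json.dumps(image_upload_config)
    if prosemirror_nodes:
        fields["prosemirrorNodes"] = json.dumps(prosemirror_nodes)
    if prosemirror_marks:
        fields["prosemirrorMarks"] = json.dumps(prosemirror_marks)
    if verbose is not None:
        fields["verbose"] = str(verbose)
    if extract_css_styles:
        fields["extractCssStyles"] = "true"
    return fields


def import_docx(path: str, token: str, app_id: str, fields: dict[str, str]) -> dict:
    """POST the file to the import endpoint (assumes `requests` is installed)."""
    import requests  # imported lazily so build_form_fields works without it

    headers = {"Authorization": f"Bearer {token}", "X-App-Id": app_id}
    with open(path, "rb") as fh:
        response = requests.post(
            API_URL, headers=headers, files={"file": fh}, data=fields, timeout=60
        )
    response.raise_for_status()
    return response.json()
```

A call such as import_docx("file.docx", "YOUR_TOKEN", "YOUR_APP_ID", build_form_fields(verbose=3)) would mirror the cURL example; the token, App-ID, and path are placeholders.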

Import verbose output

The DOCX import extension provides a verbose configuration property to help you control the level of diagnostic output during the import process. This is especially useful for debugging or for getting more insight into what happens during conversion.

The verbose property is a bitmask number that determines which types of log messages are emitted. The extension uses the following levels:

Value  Level  Description
1      log    General informational logs
2      warn   Warnings
4      error  Errors

Verbose bitmask

You can combine levels by adding their values together. For example, verbose: 3 will enable both log (1) and warn (2) messages.
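
The bitmask arithmetic can be sketched in Python with an IntFlag. The Verbose class and the enabled_levels helper are illustrative names, not part of the API; only the numeric values 1, 2, and 4 come from the table above.

```python
from enum import IntFlag


class Verbose(IntFlag):
    """Verbosity levels, using the values from the table above."""

    LOG = 1
    WARN = 2
    ERROR = 4


def enabled_levels(verbose: int) -> list[str]:
    """Return the names of the levels switched on in a verbose bitmask."""
    return [level.name.lower() for level in Verbose if verbose & level]


# Combining levels is a bitwise OR, which equals addition for distinct bits.
log_and_warn = int(Verbose.LOG | Verbose.WARN)  # 3
```

Sending verbose=3 therefore requests log and warn messages, while verbose=7 requests all three levels.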

With verbose enabled, the response includes a logs property alongside the data property. It contains info, warn, and error properties, each an array holding the entries recorded at that level.

{
  "data": {
    "content": {
      // Tiptap JSON
    }
  },
  "logs": {
    "info": [],
    "warn": [
      {
        "message": "Image file not found in media files",
        "fileName": "image1.gif",
        "availableMediaFiles": []
      }
    ],
    "error": [
      {
        "message": "Image upload failed: General error",
        "fileName": "image1.gif",
        "url": "https://your-image-upload-endpoint.com",
        "error": "Unable to connect. Is the computer able to access the url?",
        "context": "uploadImage general error"
      }
    ]
  }
}
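
One way to consume the logs property is to collect the message of each entry per level. This is a minimal sketch that assumes the response has already been parsed into a dict shaped like the example above; summarize_logs is a hypothetical helper name.

```python
from typing import Any


def summarize_logs(response: dict[str, Any]) -> dict[str, list[str]]:
    """Collect the `message` of every log entry, grouped by level."""
    # `logs` is treated as optional here, in case verbose was not set.
    logs = response.get("logs") or {}
    return {
        level: [entry.get("message", "") for entry in logs.get(level, [])]
        for level in ("info", "warn", "error")
    }
```

For the example response above, this would yield one warn message and one error message, with an empty info list.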

Headers & Footers in the response

The import API extracts headers and footers from .docx files. When the document contains header or footer content, additional fields are included in the data object of the response.

Response fields

Field            Type           Description
content          Object         The main document body as Tiptap JSON
header           Object | null  Default header content as Tiptap JSON
footer           Object | null  Default footer content as Tiptap JSON
headerFirstPage  Object | null  First page header (when "Different First Page" is enabled in Word)
footerFirstPage  Object | null  First page footer (when "Different First Page" is enabled in Word)
headerOdd        Object | null  Odd page header (when "Different Odd & Even Pages" is enabled in Word)
footerOdd        Object | null  Odd page footer (when "Different Odd & Even Pages" is enabled in Word)
headerEven       Object | null  Even page header (when "Different Odd & Even Pages" is enabled in Word)
footerEven       Object | null  Even page footer (when "Different Odd & Even Pages" is enabled in Word)

Example response

{
  "data": {
    "content": {
      "type": "doc",
      "content": [
        {
          "type": "paragraph",
          "content": [{ "type": "text", "text": "Document body content..." }]
        }
      ]
    },
    "header": {
      "type": "doc",
      "content": [
        {
          "type": "paragraph",
          "content": [{ "type": "text", "text": "Default Header" }]
        }
      ]
    },
    "footer": {
      "type": "doc",
      "content": [
        {
          "type": "paragraph",
          "content": [{ "type": "text", "text": "Page {page} of {total}" }]
        }
      ]
    },
    "headerFirstPage": {
      "type": "doc",
      "content": [
        {
          "type": "paragraph",
          "content": [
            {
              "type": "text",
              "text": "Title Page Header",
              "marks": [{ "type": "bold" }]
            }
          ]
        }
      ]
    },
    "footerFirstPage": null,
    "headerOdd": null,
    "footerOdd": null,
    "headerEven": null,
    "footerEven": null
  }
}

Fields that are null indicate the document does not define content for that specific header or footer slot.
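
If you render pages yourself, you might pick the slot for a given page with a small fallback chain. The sketch below is only an assumption about how a client could combine these fields (first-page slot, then odd/even slot, then the default); the API does not prescribe this order, and pick_slot is a hypothetical helper.

```python
from typing import Any, Optional


def pick_slot(data: dict[str, Any], kind: str, page_number: int) -> Optional[dict]:
    """Choose the header or footer document for a 1-based page number.

    Assumed fallback order (not defined by the API):
    first-page slot, then odd/even slot, then the default slot.
    """
    if kind not in ("header", "footer"):
        raise ValueError("kind must be 'header' or 'footer'")
    if page_number < 1:
        raise ValueError("page_number is 1-based")

    candidates = []
    if page_number == 1:
        candidates.append(f"{kind}FirstPage")
    candidates.append(f"{kind}Odd" if page_number % 2 else f"{kind}Even")
    candidates.append(kind)

    for key in candidates:
        doc = data.get(key)
        if doc is not None:  # null slots are skipped
            return doc
    return None
```

With the example response above, page 1 would resolve to headerFirstPage and every later page to the default header.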

Footnotes & Endnotes in the response

The import API extracts footnotes and endnotes from .docx files. When the document contains footnote or endnote content, additional fields are included in the data object of the response.

Response fields

Field      Type    Description
footnotes  Object  An object keyed by footnote ID, where each value is a Tiptap JSON document ({ type: "doc", content: [...] })
endnotes   Object  An object keyed by endnote ID, where each value is a Tiptap JSON document ({ type: "doc", content: [...] })

Documents without footnotes or endnotes return empty objects ({}), so existing integrations are unaffected.

Inline references in the document body are represented as footnoteReference and endnoteReference nodes, each carrying a noteId attribute that links to the corresponding entry in footnotes or endnotes.

Example response

{
  "data": {
    "content": {
      "type": "doc",
      "content": [
        {
          "type": "paragraph",
          "content": [
            { "type": "text", "text": "This sentence has a footnote" },
            { "type": "footnoteReference", "attrs": { "noteId": "1" } },
            { "type": "text", "text": " and an endnote" },
            { "type": "endnoteReference", "attrs": { "noteId": "1" } },
            { "type": "text", "text": "." }
          ]
        }
      ]
    },
    "footnotes": {
      "1": {
        "type": "doc",
        "content": [
          {
            "type": "paragraph",
            "content": [{ "type": "text", "text": "This is the footnote content." }]
          }
        ]
      }
    },
    "endnotes": {
      "1": {
        "type": "doc",
        "content": [
          {
            "type": "paragraph",
            "content": [{ "type": "text", "text": "This is the endnote content." }]
          }
        ]
      }
    },
    "header": null,
    "footer": null,
    "headerFirstPage": null,
    "footerFirstPage": null,
    "headerOdd": null,
    "footerOdd": null,
    "headerEven": null,
    "footerEven": null
  }
}
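
To connect inline references to their note bodies, a client can walk the content tree and look up each noteId. The following is a minimal sketch over a parsed data object shaped like the example above; iter_nodes, plain_text, and collect_notes are hypothetical helper names.

```python
from typing import Any, Iterator


def iter_nodes(node: dict[str, Any]) -> Iterator[dict[str, Any]]:
    """Yield a node and all of its descendants in document order."""
    yield node
    for child in node.get("content", []):
        yield from iter_nodes(child)


def plain_text(doc: dict[str, Any]) -> str:
    """Concatenate the text of every text node in a Tiptap JSON document."""
    return "".join(n.get("text", "") for n in iter_nodes(doc) if n.get("type") == "text")


def collect_notes(data: dict[str, Any]) -> list[tuple[str, str, str]]:
    """Return (kind, noteId, note text) for each reference in the body."""
    sources = {
        "footnoteReference": ("footnote", data.get("footnotes") or {}),
        "endnoteReference": ("endnote", data.get("endnotes") or {}),
    }
    result = []
    for node in iter_nodes(data["content"]):
        if node.get("type") in sources:
            kind, notes = sources[node["type"]]
            note_id = str(node.get("attrs", {}).get("noteId"))
            note_doc = notes.get(note_id)
            # A reference without a matching entry yields an empty string.
            result.append((kind, note_id, plain_text(note_doc) if note_doc else ""))
    return result
```

For the example response, this would return the footnote entry followed by the endnote entry, each paired with its text.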

CSS style extraction (experimental)

Experimental feature

extractCssStyles is opt-in and experimental. The shape of the response may change without notice. Pin exact client versions if you depend on it, and read the CSS injection documentation for the full list of supported selectors, CSS properties, and known limitations.

Pass extractCssStyles=true on the request to have the response include a cssStyles field. The field contains the DOCX default-style catalog translated to CSS-compatible key/value pairs, keyed by CSS selector.

Request

curl -X POST "https://api.tiptap.dev/v2/convert/import/docx" \
    -H "Authorization: Bearer YOUR_TOKEN" \
    -H "X-App-Id: YOUR_APP_ID" \
    -F "file=@/path/to/your/file.docx" \
    -F "extractCssStyles=true"

Response field

Field      Type           Description
cssStyles  Object | null  An object keyed by CSS selector, where each value is an object of CSS properties. null or empty ({}) when the document has no extractable default styles or when the feature was not requested.

Supported selectors

The extractor maps DOCX named styles to this fixed set of CSS selectors. Styles that don't match the map are dropped silently.

p, h1, h2, h3, h4, h5, h6, blockquote, ul li, ol li, strong, em, u, s, a, code

Supported CSS properties

Each selector above can carry any subset of the following, depending on what the DOCX style defined.

color, fontSize, fontFamily, fontWeight, fontStyle, textDecoration, backgroundColor, textAlign, marginTop, marginBottom, lineHeight

Example response

{
  "data": {
    "content": {
      "type": "doc",
      "content": [
        {
          "type": "paragraph",
          "content": [{ "type": "text", "text": "Document body..." }]
        }
      ]
    },
    "cssStyles": {
      "p": {
        "fontSize": "16px",
        "fontFamily": "Calibri",
        "lineHeight": 1.15,
        "marginBottom": "8pt"
      },
      "h1": {
        "fontSize": "28px",
        "fontWeight": "bold",
        "color": "#1F3864",
        "marginTop": "24pt",
        "marginBottom": "6pt"
      },
      "a": {
        "color": "#0563C1",
        "textDecoration": "underline"
      },
      "blockquote": {
        "fontStyle": "italic",
        "color": "#595959"
      }
    }
  }
}

Notes on the output:

  • Values are strings or numbers. Pixel-based properties (fontSize) are emitted as "Npx" strings; point-based spacing (marginTop, marginBottom) is emitted as "Npt"; unitless multipliers (lineHeight under the auto line rule) are emitted as plain numbers.
  • Style inheritance is resolved. Child DOCX styles include everything they inherit from their parents (via basedOn). The output is flattened, so you do not have to walk the inheritance chain.
  • Localized documents are supported. Named styles in non-English Word builds are mapped back to their canonical selector before emission.
  • Missing fields mean "not defined". A selector only appears when at least one property could be extracted. If the DOCX has no style catalog at all, cssStyles is null.
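
Because the keys are camelCase property names rather than CSS text, a client has to serialize them before injecting a stylesheet. The sketch below shows one possible conversion; css_styles_to_stylesheet and its scope parameter are hypothetical, and the code does no quoting or escaping of values, which a production version would likely need for font names containing spaces.

```python
import re
from typing import Any, Optional


def to_kebab(name: str) -> str:
    """Convert a camelCase property name (fontSize) to kebab-case (font-size)."""
    return re.sub(r"([A-Z])", r"-\1", name).lower()


def css_styles_to_stylesheet(
    css_styles: Optional[dict[str, dict[str, Any]]], scope: str = ""
) -> str:
    """Serialize a cssStyles object into CSS rules, one rule per line.

    `scope` is an optional ancestor selector (for example ".tiptap")
    prepended to every rule. null or empty input yields an empty string.
    """
    if not css_styles:
        return ""
    rules = []
    for selector, props in css_styles.items():
        # Numbers (unitless lineHeight) and strings are emitted as-is.
        body = " ".join(f"{to_kebab(key)}: {value};" for key, value in props.items())
        full_selector = f"{scope} {selector}".strip()
        rules.append(f"{full_selector} {{ {body} }}")
    return "\n".join(rules)
```

Passing a scope keeps the imported styles from leaking into the rest of the page, which is a design choice of this sketch rather than a requirement of the API.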

Custom node and mark mapping

You can override the default node and mark types used during import by passing prosemirrorNodes and prosemirrorMarks in the body of your request. Provide these when your editor uses custom nodes or marks and you want the imported JSON to use them.

For example, if your schema uses a custom node type called textBlock instead of the default paragraph, you can include "{\"textBlock\":\"paragraph\"}" in the request body.

You can similarly adjust headings, lists, marks like bold or italic, and more.

Default nodes

Name               Description
paragraph          Defines which ProseMirror type is used for paragraph conversion
heading            Defines which ProseMirror type is used for heading conversion
blockquote         Defines which ProseMirror type is used for blockquote conversion
codeblock          Defines which ProseMirror type is used for codeblock conversion
bulletlist         Defines which ProseMirror type is used for bulletList conversion
orderedlist        Defines which ProseMirror type is used for orderedList conversion
listitem           Defines which ProseMirror type is used for listItem conversion
hardbreak          Defines which ProseMirror type is used for hardbreak conversion
horizontalrule     Defines which ProseMirror type is used for horizontalRule conversion
table              Defines which ProseMirror type is used for table conversion
tablecell          Defines which ProseMirror type is used for tableCell conversion
tableheader        Defines which ProseMirror type is used for tableHeader conversion
tablerow           Defines which ProseMirror type is used for tableRow conversion
image              Defines which ProseMirror type is used for image conversion
footnoteReference  Defines which ProseMirror type is used for footnote reference conversion
endnoteReference   Defines which ProseMirror type is used for endnote reference conversion

Default marks

Name           Description
bold           Defines which ProseMirror mark is used for bold conversion
italic         Defines which ProseMirror mark is used for italic conversion
underline      Defines which ProseMirror mark is used for underline conversion
strikethrough  Defines which ProseMirror mark is used for strikethrough conversion
link           Defines which ProseMirror mark is used for link conversion
code           Defines which ProseMirror mark is used for code conversion

Support & Limitations

Currently supported features and known limitations for DOCX import:

Feature                       Support
Text content                  ✓ Basic text, spacing, punctuation
Text formatting               ✓ Bold, italic, underline, strikethrough, alignment, line height
Block elements                ✓ Paragraphs, headings (1–6), blockquotes, ordered and unordered lists
Tables                        ✓ Basic structure, header rows, colspan
Links                         ✓ Hyperlinks
Media (Images)                ✓ Embedded images, size preserved
Styles                        ✓ Font families*, font colors, font sizes, background colors, line heights
Headers & Footers             ✓ Supported (requires Pages extension)
Sections & Page Breaks        ✓ Page breaks supported (requires PageBreak extension)
Footnotes & Endnotes          ✓ Supported
Math                          ~ In development
Comments & Revisions
Table of Contents
Advanced Formatting           ✗ Columns, text direction, forms, macros, embedded scripts
Metadata
Text Boxes, Shapes, SmartArt
* Font families are supported as long as the target font is installed on the operating system when the .docx file is opened.