Create an Emoji Inline Node with Markdown Support

Beta

This guide shows how to add Markdown support for a small atomic inline node that renders emoji shortcodes (for example :smile:). We'll walk through four clear steps and include a full example at each step so you always have the complete context:

  1. Create the basic emoji node (no Markdown).
  2. Add a tokenizer to convert :name: into tokens.
  3. Add the parser to turn tokens into Tiptap JSON.
  4. Add the renderer to serialize Tiptap JSON back to Markdown.

Example shortcode syntax we'll support:

Hello :smile: world!

Step 1: Create the basic emoji node

Start by defining a small atomic inline node that stores a name attribute and renders an emoji in HTML.

import { Node } from '@tiptap/core'

const emojiMap = {
  smile: '😊',
  heart: '❤️',
  thumbsup: '👍',
  fire: '🔥',
  // add more mappings as needed
}

export const Emoji = Node.create({
  name: 'emoji',

  group: 'inline',
  inline: true,
  atom: true,

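  addAttributes() {
    return {
      name: {
        default: 'smile',
        // Read the name back from data-emoji so parsed HTML keeps it
        parseHTML: (element) => element.getAttribute('data-emoji'),
      },
    }
  },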

  parseHTML() {
    return [{ tag: 'span[data-emoji]' }]
  },

  renderHTML({ node }) {
    const emoji = emojiMap[node.attrs.name] || node.attrs.name || 'smile'
    return ['span', { 'data-emoji': node.attrs.name }, emoji]
  },
})

Notes:

  • The node is atom: true and inline so it behaves like an indivisible inline piece of content.
  • emojiMap is used to map short names to actual emoji characters for HTML output.
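
The lookup that renderHTML performs can be tried outside the editor. The sketch below is a standalone, slightly stricter variant of that fallback chain. resolveEmoji is a hypothetical helper name, not part of Tiptap, and the own-property check is an added assumption: it keeps names such as constructor, which exist on every plain object, from being treated as map entries.

```javascript
// Same mapping as in the extension above.
const emojiMap = {
  smile: '😊',
  heart: '❤️',
  thumbsup: '👍',
  fire: '🔥',
}

// Hypothetical helper (not a Tiptap API): returns the text to display
// for a shortcode name.
function resolveEmoji(name) {
  // Only accept keys defined directly on emojiMap, not inherited ones.
  const isKnown =
    typeof name === 'string' &&
    Object.prototype.hasOwnProperty.call(emojiMap, name)
  if (isKnown) return emojiMap[name]

  // Unknown names fall back to the raw name; an empty name falls back
  // to the smile character.
  return name || emojiMap.smile
}

console.log(resolveEmoji('fire'))        // known name
console.log(resolveEmoji('rocket'))      // unknown name, printed as-is
console.log(resolveEmoji('constructor')) // inherited key, printed as-is
```

Inside the extension, renderHTML could call such a helper instead of indexing emojiMap directly.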

Step 2: Add a custom Markdown tokenizer

The tokenizer recognizes :name: shortcodes in inline Markdown and returns a token with the emoji name. Below is the full extension including the tokenizer so you can see how it integrates with the base Node.

import { Node } from '@tiptap/core'

const emojiMap = {
  smile: '😊',
  heart: '❤️',
  thumbsup: '👍',
  fire: '🔥',
  // add more mappings as needed
}

export const Emoji = Node.create({
  name: 'emoji',

  group: 'inline',
  inline: true,
  atom: true,

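  addAttributes() {
    return {
      name: {
        default: 'smile',
        // Read the name back from data-emoji so parsed HTML keeps it
        parseHTML: (element) => element.getAttribute('data-emoji'),
      },
    }
  },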

  parseHTML() {
    return [{ tag: 'span[data-emoji]' }]
  },

  renderHTML({ node }) {
    const emoji = emojiMap[node.attrs.name] || node.attrs.name || 'smile'
    return ['span', { 'data-emoji': node.attrs.name }, emoji]
  },

  // Define a Markdown tokenizer to recognize :name: shortcodes
  markdownTokenizer: {
    name: 'emoji',
    level: 'inline',
    // Fast hint for the lexer
    start: (src) => src.indexOf(':'),
    tokenize: (src, tokens, lexer) => {
      // Match :name: where name can include letters, numbers, underscores, plus signs
      const match = /^:([a-z0-9_+]+):/i.exec(src)
      if (!match) return undefined

      return {
        type: 'emoji',
        raw: match[0],       // full match like ":smile:"
        emojiName: match[1], // captured name like "smile"
      }
    },
  },
})

Implementation notes:

  • start is an optimization used by the lexer to quickly find candidate positions.
  • The tokenizer returns a token object with type, raw, and emojiName fields that the parser will consume.
  • Keep the tokenizer inline-level (level: 'inline') so it integrates with inline parsing.
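
To see what the tokenizer produces, the two functions can be called directly on a sample string. This is only a sketch: it copies start and tokenize out of the extension and calls them by hand, which is an assumption about how to exercise them and not a description of how the lexer invokes them.

```javascript
// Standalone copies of the tokenizer functions from the extension above.
const start = (src) => src.indexOf(':')

const tokenize = (src) => {
  const match = /^:([a-z0-9_+]+):/i.exec(src)
  if (!match) return undefined
  return { type: 'emoji', raw: match[0], emojiName: match[1] }
}

const input = 'Hello :smile: world!'

// start() reports where a shortcode might begin.
const index = start(input)

// tokenize() only matches at the very beginning of the string it is given,
// so it is called on the remainder starting at that position.
const token = tokenize(input.slice(index))

console.log(index) // 6
console.log(token) // { type: 'emoji', raw: ':smile:', emojiName: 'smile' }
console.log(tokenize('no shortcode here')) // undefined
```

Because the regex is anchored with ^, tokenize returns undefined for any input that does not begin with a shortcode, which is why the start hint matters.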

Step 3: Add the parser

The parseMarkdown function converts the token produced by the tokenizer into a Tiptap node. For an atomic inline node, it should return a node object with type and attrs and no content. Here is the full extension, now including both the tokenizer and parseMarkdown.

import { Node } from '@tiptap/core'

const emojiMap = {
  smile: '😊',
  heart: '❤️',
  thumbsup: '👍',
  fire: '🔥',
  // add more mappings as needed
}

export const Emoji = Node.create({
  name: 'emoji',

  group: 'inline',
  inline: true,
  atom: true,

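  addAttributes() {
    return {
      name: {
        default: 'smile',
        // Read the name back from data-emoji so parsed HTML keeps it
        parseHTML: (element) => element.getAttribute('data-emoji'),
      },
    }
  },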

  parseHTML() {
    return [{ tag: 'span[data-emoji]' }]
  },

  renderHTML({ node }) {
    const emoji = emojiMap[node.attrs.name] || node.attrs.name || 'smile'
    return ['span', { 'data-emoji': node.attrs.name }, emoji]
  },

  markdownTokenizer: {
    name: 'emoji',
    level: 'inline',
    // Fast hint for the lexer
    start: (src) => src.indexOf(':'),
    tokenize: (src, tokens, lexer) => {
      // Match :name: where name can include letters, numbers, underscores, plus signs
      const match = /^:([a-z0-9_+]+):/i.exec(src)
      if (!match) return undefined

      return {
        type: 'emoji',
        raw: match[0],       // full match like ":smile:"
        emojiName: match[1], // captured name like "smile"
      }
    },
  },

  // Parse token into a Tiptap node
  parseMarkdown: (token, helpers) => {
    return {
      type: 'emoji',
      attrs: { name: token.emojiName },
    }
  },
})

Notes:

  • The parseMarkdown function should return an object where type matches the node name.
  • Atomic nodes do not supply content; they are standalone nodes with attributes.
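
As an illustration of the resulting JSON, the sketch below turns one line of text into the inline nodes a paragraph would contain. toInlineNodes is a hypothetical, heavily simplified stand-in for the real inline lexer (it knows only about emoji shortcodes and plain text); parseEmojiToken mirrors parseMarkdown from the extension.

```javascript
// Same return shape as parseMarkdown in the extension above.
const parseEmojiToken = (token) => ({
  type: 'emoji',
  attrs: { name: token.emojiName },
})

// Hypothetical stand-in for the inline lexer: walks the text, emits emoji
// nodes for :name: shortcodes and text nodes for everything in between.
function toInlineNodes(src) {
  const nodes = []
  const pattern = /:([a-z0-9_+]+):/gi
  let last = 0
  let match

  while ((match = pattern.exec(src)) !== null) {
    // Plain text before the shortcode becomes a text node.
    if (match.index > last) {
      nodes.push({ type: 'text', text: src.slice(last, match.index) })
    }
    nodes.push(
      parseEmojiToken({ type: 'emoji', raw: match[0], emojiName: match[1] }),
    )
    last = match.index + match[0].length
  }

  // Trailing text after the last shortcode.
  if (last < src.length) {
    nodes.push({ type: 'text', text: src.slice(last) })
  }
  return nodes
}

console.log(JSON.stringify(toInlineNodes('Hello :smile: world!')))
// [{"type":"text","text":"Hello "},{"type":"emoji","attrs":{"name":"smile"}},{"type":"text","text":" world!"}]
```

Note that the emoji node carries only type and attrs, with no content array, matching the atomic node definition.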

Step 4: Add the renderer

To serialize the editor state back to Markdown shortcodes, implement the renderMarkdown function. It receives a Tiptap node and should return a Markdown string representing that node. Below is the full extension with the tokenizer, parseMarkdown, and renderMarkdown included.

import { Node } from '@tiptap/core'

const emojiMap = {
  smile: '😊',
  heart: '❤️',
  thumbsup: '👍',
  fire: '🔥',
  // add more mappings as needed
}

export const Emoji = Node.create({
  name: 'emoji',

  group: 'inline',
  inline: true,
  atom: true,

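  addAttributes() {
    return {
      name: {
        default: 'smile',
        // Read the name back from data-emoji so parsed HTML keeps it
        parseHTML: (element) => element.getAttribute('data-emoji'),
      },
    }
  },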

  parseHTML() {
    return [{ tag: 'span[data-emoji]' }]
  },

  renderHTML({ node }) {
    const emoji = emojiMap[node.attrs.name] || node.attrs.name || 'smile'
    return ['span', { 'data-emoji': node.attrs.name }, emoji]
  },

  markdownTokenizer: {
    name: 'emoji',
    level: 'inline',
    // Fast hint for the lexer
    start: (src) => src.indexOf(':'),
    tokenize: (src, tokens, lexer) => {
      // Match :name: where name can include letters, numbers, underscores, plus signs
      const match = /^:([a-z0-9_+]+):/i.exec(src)
      if (!match) return undefined

      return {
        type: 'emoji',
        raw: match[0],       // full match like ":smile:"
        emojiName: match[1], // captured name like "smile"
      }
    },
  },

  // Parse token into a Tiptap node
  parseMarkdown: (token, helpers) => {
    return {
      type: 'emoji',
      attrs: { name: token.emojiName },
    }
  },

  // Render Tiptap node back to Markdown
  renderMarkdown: (node, helpers) => {
    // Serialize back to :name: shortcode. Use the stored name attribute.
    return `:${node.attrs?.name || 'unknown'}:`
  },
})
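
A quick way to check that the three pieces agree is a round trip: tokenize a shortcode, parse the token into a node, and render the node back to Markdown. The sketch below does this with standalone copies of the three functions; roundTrip is a hypothetical helper used only for this check.

```javascript
// Standalone copies of the three Markdown functions from the extension above.
const tokenize = (src) => {
  const match = /^:([a-z0-9_+]+):/i.exec(src)
  if (!match) return undefined
  return { type: 'emoji', raw: match[0], emojiName: match[1] }
}

const parseMarkdown = (token) => ({
  type: 'emoji',
  attrs: { name: token.emojiName },
})

const renderMarkdown = (node) => `:${node.attrs?.name || 'unknown'}:`

// Hypothetical helper: Markdown -> token -> node -> Markdown.
function roundTrip(shortcode) {
  const token = tokenize(shortcode)
  if (!token) return undefined
  return renderMarkdown(parseMarkdown(token))
}

console.log(roundTrip(':heart:')) // :heart:
console.log(roundTrip(':+1:'))    // :+1:

// A node without a name serializes to the fallback shortcode.
console.log(renderMarkdown({ type: 'emoji', attrs: {} })) // :unknown:
```

One consequence of the fallback: :unknown: is itself a valid shortcode, so a node without a name would come back as an emoji named unknown after the next parse.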

Usage

Set the editor content from Markdown that contains emoji shortcodes. Depending on your Markdown integration, pass contentType: 'markdown' or use the API your setup provides:

editor.commands.setContent('Hello :smile: :heart: :thumbsup:', { contentType: 'markdown' })

This will produce inline emoji nodes with corresponding name attributes, and HTML rendering will display the mapped emoji characters (via emojiMap).


Testing and edge cases

  • Unknown names: If a shortcode isn't in emojiMap, the node currently renders the raw name in the span (you may prefer to show a fallback or remove the node). Validate or normalize names in markdownTokenizer or parseMarkdown if needed.
  • Case sensitivity: The tokenizer uses the i flag, so :SMILE: and :smile: both match. Make sure your emojiMap keys follow one convention, or normalize names with .toLowerCase().
  • Inline parsing: Because this is an inline tokenizer, it is tried within paragraphs and other inline contexts. Make sure it doesn't conflict with other inline tokenizers (such as emphasis) or with ordinary text that contains colons (for example, :30: in 10:30:45 fits the pattern). Adjust the regex or add surrounding-whitespace checks if necessary.
  • Atomic behavior: The node is atomic, so it won't be editable as text. This is appropriate for emoji elements but might need different behavior for editable shortcodes.
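
Several of these points can be handled in the tokenizer itself. The following is one possible sketch, not the only approach: tokenizeStrict is a hypothetical replacement for tokenize, and knownNames is an assumed allow-list that would normally be derived from emojiMap. It lowercases the name, ignores unknown names, and uses a negative lookahead to skip shortcodes that run straight into a letter or digit.

```javascript
// Assumed allow-list; in the extension this could be built from emojiMap keys.
const knownNames = new Set(['smile', 'heart', 'thumbsup', 'fire'])

// Hypothetical stricter variant of tokenize from the extension above.
function tokenizeStrict(src) {
  // (?![a-z0-9]) rejects a closing colon that is immediately followed by a
  // letter or digit, so ":30:" at the start of ":30:45" is not matched.
  const match = /^:([a-z0-9_+]+):(?![a-z0-9])/i.exec(src)
  if (!match) return undefined

  // Normalize case so :SMILE: and :smile: yield the same name.
  const emojiName = match[1].toLowerCase()

  // Leave unknown shortcodes as plain text instead of creating a node.
  if (!knownNames.has(emojiName)) return undefined

  // raw keeps the exact source text so the lexer consumes the right length.
  return { type: 'emoji', raw: match[0], emojiName }
}

console.log(tokenizeStrict(':SMILE: hi'))  // token with emojiName "smile"
console.log(tokenizeStrict(':fire:works')) // undefined
console.log(tokenizeStrict(':30:45'))      // undefined
console.log(tokenizeStrict(':rocket:'))    // undefined
```

Returning undefined for unknown names is a design choice: the text stays as written in the document, at the cost of not round-tripping shortcodes the map doesn't cover yet.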