Hl

Yuio supports basic code highlighting; it is just enough to format help messages for CLI, and color tracebacks when an error occurs.

Yuio supports the following languages:

  • python,

  • traceback,

  • bash,

  • diff,

  • json.

Highlighters registry

yuio.hl.get_highlighter(syntax: str, /) tuple[SyntaxHighlighter, str]

Look up highlighter by a syntax name.

Parameters:

syntax – name of the syntax highlighter.

Returns:

a highlighter instance and a string with canonical syntax name. If highlighter with the given name can’t be found, returns a dummy highlighter that does nothing.

Example:
highlighter, syntax_name = yuio.hl.get_highlighter("python")

highlighted = highlighter.highlight(
    code,
    theme=theme,
    syntax=syntax_name,
)
yuio.hl.register_highlighter(
syntaxes: list[str],
highlighter: SyntaxHighlighter,
)

Register a highlighter in a global registry, and allow looking it up via the get_highlighter() method.

Parameters:
  • syntaxes – syntax names which correspond to this highlighter. The first syntax is considered canonical, meaning that it should be used to look up colors in a theme.

  • highlighter – a highlighter instance.

Highlighter base class

class yuio.hl.SyntaxHighlighter
abstractmethod highlight(
code: str,
/,
*,
theme: Theme,
syntax: str,
default_color: Color | str | None = None,
) ColorizedString

Highlight the given code using the given theme.

Parameters:
  • code – code to highlight.

  • syntax – canonical name of the syntax.

  • theme – theme that will be used to look up color tags.

  • default_color – color or color path to apply to the entire code.

class yuio.hl.ReSyntaxHighlighter(
patterns: list[tuple[str, str | SyntaxHighlighter | None | tuple[str | SyntaxHighlighter | None, ...]]],
*,
base_color: str | None = None,
)

A highlighter implementation that uses regular expressions to tokenize source code.

This highlighter accepts regular expressions for tokens, and corresponding token names.

Regular expressions are compiled with flag re.MULTILINE; they should not contain global flags or named groups. ReSyntaxHighlighter will combine all given regexps into a single regular expression, and run it using re.finditer() (similar to a tokenizer example from Python documentation.

Parameters:
  • patterns

    regular expressions and corresponding colors that will be used to tokenize code.

    Each pattern should be a tuple of two elements:

    • the first is a string with a regular expression, which will be combined with multiline flag;

    • the second is name of a token, or another SyntaxHighlighter.

      It can also be a tuple of token names and syntax highlighters, one for every capturing group in the regular expression.

      Token names will be converted to colors by looking up hl/{token}:{syntax} in a Theme.

  • base_color – color that will be added to the entire code regardless of tokens

Implementing regexp-based highlighter

Let’s implement a syntax highlighter for JSON.

We will start by creating regular expressions for JSON tokens. We will need:

  • built-in literals: \b(true|false|null)\b, token name "lit/builtin";

  • numbers: -?\d+(\.\d+)?([eE][+-]?\d+)?, token name lit/num;

  • strings: "(\\.|[^\\"])*", token name str;

  • punctuation: [{}\[\],:], token name punct.

Now that we know our tokens and regular expressions to parse them, we can pass them to ReSyntaxHighlighter:

json_highlighter = yuio.hl.ReSyntaxHighlighter(
    [
        (
            # Literals.
            r"\b(true|false|null)\b",
            "lit/builtin",
        ),
        (
            # Numbers.
            r"-?\d+(\.\d+)?([eE][+-]?\d+)?",
            "lit/num",
        ),
        (
            # Strings.
            r'"(\\.|[^\\"])*"',
            "str",
        ),
        (
            # Punctuation.
            r"[{}\[\],:]",
            "punct",
        ),
    ],
)

ReSyntaxHighlighter will scan source code and look for given regular expressions. If found, it will color matched part of the code depending on the associated token name.

We can also color different parts of matched code with different colors, and even pass them to nested syntax highlighters.

For example, let’s define a highlighter that searches for escape sequences in strings:

str_highlighter = yuio.hl.ReSyntaxHighlighter(
    [
        (
            # Escape sequence.
            r'\([\/"bfnrt]|u[0-9a-fA-F]{4})',
            "str/esc",
        )
    ],
    base_color="str",
)

We can now apply str_highlighter to strings when json_highlighter matches them:

json_highlighter = yuio.hl.ReSyntaxHighlighter(
    [
        ...,
        (
            # Strings.
            r'(")(\\.|[^\\"])*(")',
            ("str", str_highlighter, "str"),
        ),
        ...,
    ],
)

Our regular expression for strings contains three capturing groups:

    (")((?:\\.|[^\\"])*)(")
     │  └┬────────────┘  │
     │   │              #2, matches closing quote
     │   └ #1, matches string content
     └ #0, matches opening quote

And we’ve passed token names and highlighters for each of these groups:

    ("str", str_highlighter, "str")
     └┬──┘  └┬────────────┘  └┬──┘
      │      │                │
      │      │               highlights contents of group #2
      │      └ highlights contents of group #1
      └ highlights contents of group #0