Tokenizers documentation

Visualizer

Annotation

class tokenizers.tools.Annotation

( start: int, end: int, label: str )
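
As a minimal sketch of how an Annotation might be built from character offsets: the `annotate` helper and the sample sentence below are hypothetical, and the example assumes `end` is exclusive, in the style of a Python slice.

```python
from tokenizers.tools import Annotation

text = "Hugging Face is based in New York"


def annotate(text: str, phrase: str, label: str) -> Annotation:
    """Hypothetical helper: label the first occurrence of `phrase` in `text`."""
    start = text.find(phrase)
    if start < 0:
        raise ValueError(f"{phrase!r} not found in text")
    # Assumption: `end` is exclusive, so text[start:end] == phrase.
    return Annotation(start=start, end=start + len(phrase), label=label)


annotations = [
    annotate(text, "Hugging Face", "ORG"),
    annotate(text, "New York", "LOC"),
]

for annotation in annotations:
    # Print each label next to the span of text it covers.
    print(annotation.label, text[annotation.start:annotation.end])
```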

EncodingVisualizer

class tokenizers.tools.EncodingVisualizer

( tokenizer: Tokenizer, default_to_notebook: bool = True, annotation_converter: typing.Union[typing.Callable[[typing.Any], tokenizers.tools.visualizer.Annotation], NoneType] = None )

Parameters

  • tokenizer (Tokenizer) — A tokenizer instance
  • default_to_notebook (bool, optional, defaults to True) — Whether to render the HTML output in a notebook by default
  • annotation_converter (Callable, optional) — An optional function (for example, a lambda) that takes an annotation in any format and returns an Annotation object

Build an EncodingVisualizer
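
The sketch below shows one way the constructor might be used with a converter function. The toy word-level tokenizer, the dictionary annotation format, and the `dict_to_annotation` function are assumptions made for illustration. `default_to_notebook=False` is passed so the example does not depend on a notebook environment.

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.tools import Annotation, EncodingVisualizer

# Toy tokenizer built locally for illustration; any Tokenizer instance works.
vocab = {"[UNK]": 0, "hello": 1, "world": 2}
tokenizer = Tokenizer(WordLevel(vocab=vocab, unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()


def dict_to_annotation(item: dict) -> Annotation:
    """Hypothetical converter: map a dict with start/end/tag keys to an Annotation."""
    return Annotation(start=item["start"], end=item["end"], label=item["tag"])


visualizer = EncodingVisualizer(
    tokenizer=tokenizer,
    default_to_notebook=False,  # return HTML strings instead of rendering in a notebook
    annotation_converter=dict_to_annotation,
)
```

With a converter in place, annotations can be passed to the visualizer in the custom format (here, dictionaries) rather than as Annotation objects.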

__call__

( text: str, annotations: typing.List[tokenizers.tools.visualizer.Annotation] = [], default_to_notebook: typing.Optional[bool] = None )

Parameters

  • text (str) — The text to tokenize
  • annotations (List[Annotation], optional) — An optional list of annotations of the text. They can be Annotation objects, or objects in any other format if you instantiated the visualizer with a converter function
  • default_to_notebook (bool, optional, defaults to None) — If True, renders the HTML in a notebook. If False, returns an HTML string. If None, the value given when the visualizer was built is used.

Build a visualization of the given text
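
A call outside a notebook might look like the following sketch. The toy tokenizer, sample text, and label are assumptions chosen so the example runs without downloading a pretrained model.

```python
from tokenizers import Tokenizer
from tokenizers.models import WordLevel
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.tools import Annotation, EncodingVisualizer

# Toy tokenizer built locally for illustration.
vocab = {"[UNK]": 0, "hello": 1, "world": 2}
tokenizer = Tokenizer(WordLevel(vocab=vocab, unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()

# default_to_notebook=False makes the call return the HTML as a string.
visualizer = EncodingVisualizer(tokenizer, default_to_notebook=False)

text = "hello world"
annotations = [Annotation(start=0, end=5, label="GREETING")]

html = visualizer(text, annotations=annotations)
print(len(html), "characters of HTML generated")
```

The returned string can be written to a file and opened in a browser. In a notebook, passing `default_to_notebook=True` renders the output inline instead.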