faster_tokenizer

to_tensor(string_values, name='text')[source]

Creates a tensor that holds a list of strings. NOTICE: The value is held on the CPU place.

Parameters
  • string_values (list[string]) – The values to set on the tensor.

  • name (string) – The name of the tensor.

to_vocab_buffer(vocab_dict, name)[source]

Creates a tensor that holds a map whose keys are strings. NOTICE: The value is held on the CPU place.

Parameters
  • vocab_dict (dict) – The value to set on the tensor. Each key is a token and each value is that token's index.

  • name (string) – The name of the tensor.
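The token-to-index mapping described for vocab_dict can be sketched in plain Python. This is only an illustration of the expected dictionary shape: build_vocab_dict and the sample tokens are hypothetical and not part of the library, and the call to to_vocab_buffer is shown as a comment because it requires a Paddle installation.

```python
def build_vocab_dict(tokens):
    """Map each token to its position in `tokens` (hypothetical helper)."""
    vocab = {}
    for index, token in enumerate(tokens):
        # A token must map to exactly one index, so reject duplicates.
        if token in vocab:
            raise ValueError(f"duplicate token: {token!r}")
        vocab[token] = index
    return vocab


# Sample tokens chosen for illustration only.
vocab_dict = build_vocab_dict(["[PAD]", "[UNK]", "[CLS]", "[SEP]", "hello", "world"])

# With Paddle available, the dictionary could then be passed along as:
#     to_vocab_buffer(vocab_dict, "vocab")
```

Keys are tokens and values are integer indices, which matches the description of vocab_dict above.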

class FasterTokenizer(vocab, do_lower_case=False, is_split_into_words=False)[source]

Bases: paddle.fluid.dygraph.layers.Layer

forward(text, text_pair=None, max_seq_len=0, pad_to_max_seq_len=False)[source]

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters
  • *inputs (tuple) – unpacked tuple arguments

  • **kwargs (dict) – unpacked dict arguments
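The forward signature accepts max_seq_len and pad_to_max_seq_len, which this page does not describe. The following toy sketch shows one plausible reading of those two arguments on a list of token ids. It is not the library's implementation: fit_to_length, pad_id, and the sample ids are hypothetical, and treating max_seq_len=0 as "no limit" is an assumption.

```python
def fit_to_length(token_ids, max_seq_len=0, pad_to_max_seq_len=False, pad_id=0):
    """Truncate, then optionally pad, a list of token ids (hypothetical helper)."""
    # Assumption: max_seq_len == 0 disables both truncation and padding.
    if max_seq_len <= 0:
        return list(token_ids)
    # Keep at most max_seq_len ids. A real tokenizer would normally preserve
    # special tokens when truncating; this sketch simply cuts the tail.
    fitted = list(token_ids[:max_seq_len])
    if pad_to_max_seq_len:
        # Fill the remaining positions with the padding id.
        fitted += [pad_id] * (max_seq_len - len(fitted))
    return fitted


ids = [101, 7592, 2088, 102]  # sample ids for illustration only
truncated = fit_to_length(ids, max_seq_len=3)
padded = fit_to_length(ids, max_seq_len=6, pad_to_max_seq_len=True)
unchanged = fit_to_length(ids)
```

Under these assumptions, padding applies only when pad_to_max_seq_len is set, so sequences shorter than max_seq_len otherwise keep their original length.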