modeling

Modeling classes for FNet model.

class FNetPretrainedModel(name_scope=None, dtype='float32')[source]

Bases: paddlenlp.transformers.model_utils.PretrainedModel

An abstract class for pretrained FNet models. It provides FNet related model_config_file, pretrained_init_configuration, resource_files_names, pretrained_resource_files_map, base_model_prefix for downloading and loading pretrained models. See PretrainedModel for more details.

base_model_class

alias of paddlenlp.transformers.fnet.modeling.FNetModel

class FNetModel(vocab_size=32000, hidden_size=768, num_hidden_layers=12, intermediate_size=3072, hidden_act='gelu_new', hidden_dropout_prob=0.1, max_position_embeddings=512, type_vocab_size=4, initializer_range=0.02, layer_norm_eps=1e-12, pad_token_id=3, bos_token_id=1, eos_token_id=2, add_pooling_layer=True)[source]

Bases: paddlenlp.transformers.fnet.modeling.FNetPretrainedModel

The model can behave as an encoder, following the architecture described in FNet: Mixing Tokens with Fourier Transforms by James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, Santiago Ontanon.
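
In the FNet architecture, the self-attention sublayer is replaced by a parameter-free Fourier transform: a DFT along the hidden dimension, a DFT along the sequence dimension, and only the real part of the result is kept. The sketch below illustrates that mixing step for a single sequence using only the standard library. `dft` and `fourier_mix` are hypothetical helper names, not PaddleNLP functions, and an efficient implementation would use an FFT routine rather than this quadratic loop.

```python
import cmath

def dft(values):
    """Naive discrete Fourier transform of a 1-D list of numbers."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def fourier_mix(x):
    """DFT along the hidden axis, then along the sequence axis, keep the real part.

    x is a [sequence_length][hidden_size] nested list of floats.
    """
    seq_len, hidden = len(x), len(x[0])
    # DFT over the hidden dimension of every token.
    rows = [dft(token) for token in x]
    # DFT over the sequence dimension of every hidden channel.
    cols = [dft([rows[t][h] for t in range(seq_len)]) for h in range(hidden)]
    # Transpose back to [sequence_length][hidden_size] and drop the imaginary part.
    return [[cols[h][t].real for h in range(hidden)] for t in range(seq_len)]

x = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # 3 tokens, hidden size 2
mixed = fourier_mix(x)
```

The output has the same shape as the input, so the mixing step can sit where an attention sublayer would otherwise be.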

Parameters
  • vocab_size (int, optional) -- Vocabulary size of input_ids in FNetModel, which is also the vocabulary size of the token embedding matrix. Defines the number of different tokens that can be represented by the input_ids passed when calling FNetModel. Defaults to 32000.

  • hidden_size (int, optional) -- Dimensionality of the encoder layer and pooler layer. Defaults to 768.

  • num_hidden_layers (int, optional) -- Number of hidden layers in the Transformer encoder. Defaults to 12.

  • intermediate_size (int, optional) -- Dimensionality of the feed-forward (ff) layer in the encoder. Input tensors to ff layers are firstly projected from hidden_size to intermediate_size, and then projected back to hidden_size. Typically intermediate_size is larger than hidden_size. Defaults to 3072.

  • hidden_act (str, optional) -- The non-linear activation function in the feed-forward layer. "gelu", "relu" and any other activation function supported by Paddle can be used. Defaults to "gelu_new".

  • hidden_dropout_prob (float, optional) -- The dropout probability for all fully connected layers in the embeddings and encoder. Defaults to 0.1.

  • max_position_embeddings (int, optional) -- The maximum value of the dimensionality of position encoding, which dictates the maximum supported length of an input sequence. Defaults to 512.

  • type_vocab_size (int, optional) -- The vocabulary size of token_type_ids. Defaults to 4.

  • initializer_range (float, optional) --

    The standard deviation of the normal initializer. Defaults to 0.02.

    Note: A normal initializer draws the entries of each weight matrix from a normal distribution. See the weight initialization in FNetPretrainedModel for how weights are initialized in FNetModel.
    

  • layer_norm_eps (float, optional) -- The epsilon parameter used in paddle.nn.LayerNorm for initializing layer normalization layers. A small value to the variance added to the normalization layer to prevent division by zero. Defaults to 1e-12.

  • pad_token_id (int, optional) -- The index of padding token in the token vocabulary. Defaults to 3.

  • add_pooling_layer (bool, optional) -- Whether or not to add the pooling layer. Defaults to True.
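
The role of layer_norm_eps can be seen in a hand-written layer normalization. This is a plain-Python sketch of the standard formula, without the learned scale and bias that paddle.nn.LayerNorm applies by default; `layer_norm` is a hypothetical helper used only for illustration.

```python
import math

def layer_norm(vector, eps=1e-12):
    """Normalize a 1-D list to zero mean and unit variance; eps keeps the divisor non-zero."""
    mean = sum(vector) / len(vector)
    variance = sum((v - mean) ** 2 for v in vector) / len(vector)
    return [(v - mean) / math.sqrt(variance + eps) for v in vector]

# A constant vector has zero variance; without eps this would divide by zero.
constant = layer_norm([2.0, 2.0, 2.0, 2.0])
normalized = layer_norm([1.0, 2.0, 3.0, 4.0])
```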

forward(input_ids=None, token_type_ids=None, position_ids=None, inputs_embeds=None, output_hidden_states=None, return_dict=None)[source]

The FNetModel forward method.

Parameters
  • input_ids (Tensor) -- Indices of input sequence tokens in the vocabulary. They are numerical representations of tokens that build the input sequence. Its data type should be int64 and it has a shape of [batch_size, sequence_length].

  • token_type_ids (Tensor, optional) --

    Segment token indices to indicate different portions of the inputs. Selected in the range [0, type_vocab_size - 1]. If type_vocab_size is 2, the inputs have two portions and indices can be either 0 or 1:

    • 0 corresponds to a sentence A token,

    • 1 corresponds to a sentence B token.

    Its data type should be int64 and it has a shape of [batch_size, sequence_length]. Defaults to None, which means we don't add segment embeddings.

  • position_ids (Tensor, optional) -- Indices of the position of each input sequence token in the position embeddings. Selected in the range [0, max_position_embeddings - 1]. Its data type should be int64 and it has a shape of [batch_size, num_tokens]. Defaults to None.

  • inputs_embeds (Tensor, optional) -- An embedded representation to pass directly instead of input_ids, for cases where you want to control how input_ids indices are converted into their associated vectors. Defaults to None.
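
As an illustration of the expected layout, the sketch below builds token_type_ids and position_ids for one sentence pair in plain Python. The token ids are made-up placeholders and `build_pair_inputs` is a hypothetical helper; in practice the tokenizer produces these fields.

```python
def build_pair_inputs(ids_a, ids_b):
    """Concatenate two token-id lists and derive segment and position indices."""
    input_ids = ids_a + ids_b
    # 0 marks sentence A tokens, 1 marks sentence B tokens.
    token_type_ids = [0] * len(ids_a) + [1] * len(ids_b)
    # Positions count up from 0 and must stay below max_position_embeddings.
    position_ids = list(range(len(input_ids)))
    return {
        "input_ids": [input_ids],  # leading batch dimension of size 1
        "token_type_ids": [token_type_ids],
        "position_ids": [position_ids],
    }

# Placeholder ids, not taken from any real vocabulary.
inputs = build_pair_inputs([4, 100, 101, 5], [200, 201, 5])
```

Each value has shape [batch_size, sequence_length] with batch_size 1, matching the shapes described above once converted to int64 tensors.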

Returns

Returns a tuple (sequence_output, pooled_output, encoder_outputs[1:]) or a dict with the fields last_hidden_state, pooled_output and all_hidden_states.

With the fields:

  • sequence_output (Tensor):

    Sequence of hidden states at the last layer of the model. Its data type should be float32 and it has a shape of [batch_size, sequence_length, hidden_size].

  • pooled_output (Tensor):

    The output of the first token ([CLS]) in the sequence. We "pool" the model by simply taking the hidden state corresponding to the first token. Its data type should be float32 and it has a shape of [batch_size, hidden_size].

  • last_hidden_state (Tensor):

    The output of the last encoder layer, which is the same tensor as sequence_output. Its data type should be float32 and it has a shape of [batch_size, sequence_length, hidden_size].

  • all_hidden_states (tuple of Tensor):

    Hidden states of all layers in the Transformer encoder. The length of all_hidden_states is num_hidden_layers + 1. For each element in the tuple, the data type should be float32 and the shape is [batch_size, sequence_length, hidden_size].

Return type

tuple or Dict

Example

import paddle
from paddlenlp.transformers.fnet.modeling import FNetModel
from paddlenlp.transformers.fnet.tokenizer import FNetTokenizer

tokenizer = FNetTokenizer.from_pretrained('fnet-base')
model = FNetModel.from_pretrained('fnet-base')

inputs = tokenizer("Welcome to use PaddlePaddle and PaddleNLP!")
inputs = {k:paddle.to_tensor([v]) for (k, v) in inputs.items()}
output = model(**inputs)
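
To make the relation between sequence_output and pooled_output concrete, the sketch below mimics the first-token pooling described above on nested lists. It only selects the first hidden state of each sequence; any further projection the model's pooling layer may apply is not shown. `first_token_pool` is a hypothetical name.

```python
def first_token_pool(sequence_output):
    """Pick the hidden state of the first token of every sequence.

    sequence_output: [batch_size][sequence_length][hidden_size] nested list.
    Returns a [batch_size][hidden_size] nested list.
    """
    return [sequence[0] for sequence in sequence_output]

# batch_size=2, sequence_length=3, hidden_size=2
sequence_output = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[1.1, 1.2], [1.3, 1.4], [1.5, 1.6]],
]
pooled = first_token_pool(sequence_output)
```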
class FNetForSequenceClassification(fnet, num_classes=2)[source]

Bases: paddlenlp.transformers.fnet.modeling.FNetPretrainedModel

FNet Model with a linear layer on top of the output layer, designed for sequence classification/regression tasks such as the GLUE tasks.

Parameters
  • fnet (FNetModel) -- An instance of FNetModel.

  • num_classes (int, optional) -- The number of classes. Defaults to 2.

forward(input_ids=None, token_type_ids=None, position_ids=None, inputs_embeds=None, labels=None, output_hidden_states=None, return_dict=None)[source]

The FNetForSequenceClassification forward method.

Parameters
  • input_ids (Tensor) -- Indices of input sequence tokens in the vocabulary. They are numerical representations of tokens that build the input sequence. Its data type should be int64 and it has a shape of [batch_size, sequence_length].

  • token_type_ids (Tensor, optional) --

    Segment token indices to indicate different portions of the inputs. Selected in the range [0, type_vocab_size - 1]. If type_vocab_size is 2, the inputs have two portions and indices can be either 0 or 1:

    • 0 corresponds to a sentence A token,

    • 1 corresponds to a sentence B token.

    Its data type should be int64 and it has a shape of [batch_size, sequence_length]. Defaults to None, which means we don't add segment embeddings.

  • position_ids (Tensor, optional) -- Indices of the position of each input sequence token in the position embeddings. Selected in the range [0, max_position_embeddings - 1]. Its data type should be int64 and it has a shape of [batch_size, num_tokens]. Defaults to None.

  • inputs_embeds (Tensor, optional) -- An embedded representation to pass directly instead of input_ids, for cases where you want to control how input_ids indices are converted into their associated vectors. Defaults to None.

Returns

Returns the tensor logits, or a dict with the fields logits and hidden_states.

With the fields:

  • logits (Tensor):

    The classification logits for the input text. Its data type should be float32 and it has a shape of [batch_size, num_classes].

  • hidden_states (tuple of Tensor):

    Hidden states of all layers in the Transformer encoder. The length of hidden_states is num_hidden_layers + 1. For each element in the tuple, the data type should be float32 and the shape is [batch_size, sequence_length, hidden_size].

Return type

Tensor or Dict

Example

import paddle
from paddlenlp.transformers.fnet.modeling import FNetForSequenceClassification
from paddlenlp.transformers.fnet.tokenizer import FNetTokenizer

tokenizer = FNetTokenizer.from_pretrained('fnet-base')
model = FNetForSequenceClassification.from_pretrained('fnet-base')

inputs = tokenizer("Welcome to use PaddlePaddle and PaddleNLP!")
inputs = {k:paddle.to_tensor([v]) for (k, v) in inputs.items()}
output = model(**inputs)
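
The returned logits are unnormalized scores. A common way to turn them into class probabilities and a predicted label is a softmax followed by an argmax, sketched here in plain Python on made-up logits. `softmax` is a hypothetical helper; with Paddle tensors one would typically call paddle.nn.functional.softmax instead.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a 1-D list."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits with shape [batch_size=2, num_classes=2].
logits = [[1.5, -0.5], [-2.0, 0.3]]
probabilities = [softmax(row) for row in logits]
predictions = [row.index(max(row)) for row in probabilities]
```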
class FNetForPreTraining(fnet)[source]

Bases: paddlenlp.transformers.fnet.modeling.FNetPretrainedModel

FNet Model with two heads on top as done during the pretraining: a masked language modeling head and a next sentence prediction (classification) head.

forward(input_ids=None, token_type_ids=None, position_ids=None, inputs_embeds=None, labels=None, next_sentence_label=None, output_hidden_states=None, return_dict=None)[source]

The FNetForPreTraining forward method.

Parameters
  • input_ids (Tensor) -- See FNetModel.

  • token_type_ids (Tensor, optional) -- See FNetModel.

  • position_ids (Tensor, optional) -- See FNetModel.

  • labels (Tensor, optional) -- Labels for computing the masked language modeling loss. Its data type should be int64 and its shape is [batch_size, sequence_length].

  • inputs_embeds (Tensor, optional) -- See FNetModel.

  • next_sentence_label (Tensor, optional) -- The labels of the next sentence prediction task. Its dimensionality matches that of seq_relation_labels. Its data type should be int64 and its shape is [batch_size, 1].

  • output_hidden_states (bool, optional) -- See FNetModel.

  • return_dict (bool, optional) -- See FNetModel.

Returns

Returns a tuple (prediction_scores, seq_relationship_score) or a dict with the fields prediction_logits, seq_relationship_logits and hidden_states.

Return type

tuple or Dict

class FNetForMaskedLM(fnet)[source]

Bases: paddlenlp.transformers.fnet.modeling.FNetPretrainedModel

FNet Model with a masked language modeling head on top.

Parameters

fnet (FNetModel) -- An instance of FNetModel.

forward(input_ids=None, token_type_ids=None, position_ids=None, inputs_embeds=None, labels=None, next_sentence_label=None, output_hidden_states=None, return_dict=None)[source]

The FNetForMaskedLM forward method.

Returns

Returns the tensor prediction_scores, or a dict with the fields prediction_logits and hidden_states.

With the fields:

  • prediction_scores (Tensor):

    The scores of masked token prediction. Its data type should be float32 and its shape is [batch_size, sequence_length, vocab_size].

  • hidden_states (tuple of Tensor):

    Hidden states of all layers in the Transformer encoder. The length of hidden_states is num_hidden_layers + 1. For each element in the tuple, the data type should be float32 and the shape is [batch_size, sequence_length, hidden_size].

Return type

Tensor or Dict
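
Because prediction_scores holds one score per vocabulary entry for every position, the predicted token at a masked position is the argmax over the last axis. The sketch below uses a toy vocabulary of size 4; `predict_masked_token` is a hypothetical helper and the masked position is assumed to be known.

```python
def predict_masked_token(prediction_scores, batch_index, position):
    """Return the vocabulary index with the highest score at one position."""
    scores = prediction_scores[batch_index][position]
    return max(range(len(scores)), key=scores.__getitem__)

# Toy scores with shape [batch_size=1, sequence_length=3, vocab_size=4].
prediction_scores = [[
    [0.1, 0.2, 0.3, 0.4],
    [2.0, 0.5, -1.0, 0.0],
    [0.0, 0.0, 5.0, 1.0],
]]
masked_position = 1  # assumed position of the mask token in the input
predicted_id = predict_masked_token(prediction_scores, 0, masked_position)
```

The resulting id would then be mapped back to a token string with the tokenizer's vocabulary.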

class FNetForNextSentencePrediction(fnet)[source]

Bases: paddlenlp.transformers.fnet.modeling.FNetPretrainedModel

FNet Model with a next sentence prediction head on top.

Parameters

fnet (FNetModel) -- An instance of FNetModel.

forward(input_ids=None, token_type_ids=None, position_ids=None, inputs_embeds=None, labels=None, next_sentence_label=None, output_hidden_states=None, return_dict=None)[source]

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters
  • *inputs (tuple) -- unpacked tuple arguments

  • **kwargs (dict) -- unpacked dict arguments

class FNetForMultipleChoice(fnet)[source]

Bases: paddlenlp.transformers.fnet.modeling.FNetPretrainedModel

FNet Model with a linear layer on top of the hidden-states output layer, designed for multiple choice tasks such as SWAG.

Parameters

fnet (FNetModel) -- An instance of FNetModel.

forward(input_ids=None, token_type_ids=None, position_ids=None, inputs_embeds=None, labels=None, output_hidden_states=None, return_dict=None)[source]

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters
  • *inputs (tuple) -- unpacked tuple arguments

  • **kwargs (dict) -- unpacked dict arguments
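
Multiple-choice heads commonly take one candidate sequence per choice, flatten the choices into the batch dimension before the encoder, and regroup the per-sequence scores afterwards. Whether FNetForMultipleChoice expects exactly this input layout is an assumption here; the sketch only shows the reshaping idea on nested lists, with hypothetical helper names and placeholder token ids.

```python
def flatten_choices(batch):
    """[batch_size][num_choices][seq_len] -> [batch_size * num_choices][seq_len]."""
    return [choice for example in batch for choice in example]

def regroup_scores(flat_scores, num_choices):
    """[batch_size * num_choices] -> [batch_size][num_choices]."""
    return [
        flat_scores[i:i + num_choices]
        for i in range(0, len(flat_scores), num_choices)
    ]

# Two examples, three candidate sequences each, placeholder token ids.
batch = [
    [[4, 10, 5], [4, 11, 5], [4, 12, 5]],
    [[4, 20, 5], [4, 21, 5], [4, 22, 5]],
]
flat = flatten_choices(batch)
# One made-up score per flattened sequence, as a linear head would produce.
scores = regroup_scores([0.1, 0.7, 0.2, 0.9, 0.0, 0.1], num_choices=3)
best_choice = [row.index(max(row)) for row in scores]
```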

class FNetForTokenClassification(fnet, num_classes=2)[source]

Bases: paddlenlp.transformers.fnet.modeling.FNetPretrainedModel

FNet Model with a linear layer on top of the hidden-states output layer, designed for token classification tasks such as NER.

Parameters
  • fnet (FNetModel) -- An instance of FNetModel.

  • num_classes (int, optional) -- The number of classes. Defaults to 2.

forward(input_ids=None, token_type_ids=None, position_ids=None, inputs_embeds=None, labels=None, output_hidden_states=None, return_dict=None)[source]

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters
  • *inputs (tuple) -- unpacked tuple arguments

  • **kwargs (dict) -- unpacked dict arguments

class FNetForQuestionAnswering(fnet, num_labels)[source]

Bases: paddlenlp.transformers.fnet.modeling.FNetPretrainedModel

FNet Model with a linear layer on top of the hidden-states output to compute span_start_logits and span_end_logits, designed for question-answering tasks such as SQuAD.

Parameters
  • fnet (FNetModel) -- An instance of FNetModel.

  • num_labels (int) -- The number of labels.

forward(input_ids=None, token_type_ids=None, position_ids=None, inputs_embeds=None, start_positions=None, end_positions=None, output_hidden_states=None, return_dict=None)[source]

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters
  • *inputs (tuple) -- unpacked tuple arguments

  • **kwargs (dict) -- unpacked dict arguments
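
For extractive question answering, span_start_logits and span_end_logits are usually decoded by choosing the start/end pair with the highest combined score, subject to start <= end and a maximum answer length. The sketch below shows one simple decoding strategy on made-up logits; it is not the library's own post-processing, and `best_span` and `max_answer_len` are hypothetical names.

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end, score) maximizing start_logits[start] + end_logits[end]."""
    best = None
    for start, start_score in enumerate(start_logits):
        # Only consider ends at or after the start, within the length limit.
        last = min(len(end_logits), start + max_answer_len)
        for end in range(start, last):
            score = start_score + end_logits[end]
            if best is None or score > best[2]:
                best = (start, end, score)
    return best

# Made-up logits for a sequence of five tokens.
start_logits = [0.1, 2.0, 0.3, 0.0, -1.0]
end_logits = [0.0, 0.5, 1.0, 3.0, 0.2]
start, end, score = best_span(start_logits, end_logits)
```

The token indices start and end would then be mapped back to character offsets in the context to recover the answer text.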