modeling
Modeling classes for FNet model.

class FNetPretrainedModel(*args, **kwargs)[source]
Bases: paddlenlp.transformers.model_utils.PretrainedModel

An abstract class for pretrained FNet models. It provides FNet-related model_config_file, pretrained_init_configuration, resource_files_names, pretrained_resource_files_map and base_model_prefix for downloading and loading pretrained models. See PretrainedModel for more details.

base_model_class

class FNetModel(vocab_size=32000, hidden_size=768, num_hidden_layers=12, intermediate_size=3072, hidden_act='gelu_new', hidden_dropout_prob=0.1, max_position_embeddings=512, type_vocab_size=4, initializer_range=0.02, layer_norm_eps=1e-12, pad_token_id=3, bos_token_id=1, eos_token_id=2, add_pooling_layer=True)[source]
Bases: paddlenlp.transformers.fnet.modeling.FNetPretrainedModel

The model can behave as an encoder, following the architecture described in "FNet: Mixing Tokens with Fourier Transforms" by James Lee-Thorp, Joshua Ainslie, Ilya Eckstein, and Santiago Ontanon.
Parameters

- vocab_size (int, optional) -- Vocabulary size of inputs_ids in FNetModel. Also the vocab size of the token embedding matrix. Defines the number of different tokens that can be represented by the inputs_ids passed when calling FNetModel. Defaults to 32000.
- hidden_size (int, optional) -- Dimensionality of the encoder layers and the pooler layer. Defaults to 768.
- num_hidden_layers (int, optional) -- Number of hidden layers in the Transformer encoder. Defaults to 12.
- intermediate_size (int, optional) -- Dimensionality of the feed-forward (ff) layer in the encoder. Input tensors to ff layers are firstly projected from hidden_size to intermediate_size, and then projected back to hidden_size. Typically intermediate_size is larger than hidden_size. Defaults to 3072.
- hidden_act (str, optional) -- The non-linear activation function in the feed-forward layer. "gelu", "relu" and any other paddle-supported activation functions are supported. Defaults to "gelu_new".
- hidden_dropout_prob (float, optional) -- The dropout probability for all fully connected layers in the embeddings and encoder. Defaults to 0.1.
- max_position_embeddings (int, optional) -- The maximum value of the dimensionality of position encoding, which dictates the maximum supported length of an input sequence. Defaults to 512.
- type_vocab_size (int, optional) -- The vocabulary size of token_type_ids. Defaults to 4.
- initializer_range (float, optional) -- The standard deviation of the normal initializer. Defaults to 0.02. Note: a normal initializer initializes weight matrices as normal distributions. See FNetPretrainedModel.init_weights() for how weights are initialized in FNetModel.
- layer_norm_eps (float, optional) -- The epsilon parameter used in paddle.nn.LayerNorm for initializing layer normalization layers. A small value added to the variance of the normalization layer to prevent division by zero. Defaults to 1e-12.
- pad_token_id (int, optional) -- The index of the padding token in the token vocabulary. Defaults to 3.
- add_pooling_layer (bool, optional) -- Whether or not to add the pooling layer. Defaults to True.

set_input_embeddings(value)[source]
Sets a new input embedding for the model.

Parameters
value (Embedding) -- the new embedding of the model

Raises
NotImplementedError -- if the model has not implemented the set_input_embeddings method
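
A minimal sketch of swapping in a new input embedding, assuming the default fnet-base configuration (vocab_size=32000, hidden_size=768):

import paddle
from paddlenlp.transformers.fnet.modeling import FNetModel

model = FNetModel.from_pretrained('fnet-base')
# The replacement embedding must match the model configuration:
# 32000 vocabulary entries, each mapped to a 768-dimensional vector.
new_embedding = paddle.nn.Embedding(32000, 768)
model.set_input_embeddings(new_embedding)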

forward(input_ids=None, token_type_ids=None, position_ids=None, inputs_embeds=None, output_hidden_states=None, return_dict=None)[source]
The FNetModel forward method.
Parameters

- input_ids (Tensor) -- Indices of input sequence tokens in the vocabulary. They are numerical representations of tokens that build the input sequence. Its data type should be int64 and it has a shape of [batch_size, sequence_length].
- token_type_ids (Tensor, optional) -- Segment token indices to indicate different portions of the inputs. Selected in the range [0, type_vocab_size - 1]. If type_vocab_size is 2, the inputs have two portions and indices can be either 0 or 1: 0 corresponds to a sentence A token, 1 corresponds to a sentence B token. Its data type should be int64 and it has a shape of [batch_size, sequence_length]. Defaults to None, which means we don't add segment embeddings.
- position_ids (Tensor, optional) -- Indices of positions of each input sequence token in the position embeddings. Selected in the range [0, max_position_embeddings - 1]. Shape as (batch_size, num_tokens) and dtype as int64. Defaults to None.
- inputs_embeds (Tensor, optional) -- If you want to control how to convert inputs_ids indices into associated vectors, you can pass an embedded representation directly instead of passing inputs_ids.
Returns
Returns tuple (sequence_output, pooled_output, encoder_outputs[1:]) or a dict with last_hidden_state, pooled_output and all_hidden_states fields.

With the fields:

- sequence_output (Tensor): Sequence of hidden states at the last layer of the model. Its data type should be float32 and it has a shape of [batch_size, sequence_length, hidden_size].
- pooled_output (Tensor): The output of the first token ([CLS]) in the sequence. We "pool" the model by simply taking the hidden state corresponding to the first token. Its data type should be float32 and it has a shape of [batch_size, hidden_size].
- last_hidden_state (Tensor): The output of the last encoder layer; it is also the sequence_output. Its data type should be float32 and it has a shape of [batch_size, sequence_length, hidden_size].
- all_hidden_states (tuple of Tensor): Hidden states of all layers in the Transformer encoder. The length of all_hidden_states is num_hidden_layers + 1. For each element in the tuple, its data type should be float32 and its shape is [batch_size, sequence_length, hidden_size].

Return type
tuple or Dict
Example

import paddle
from paddlenlp.transformers.fnet.modeling import FNetModel
from paddlenlp.transformers.fnet.tokenizer import FNetTokenizer

tokenizer = FNetTokenizer.from_pretrained('fnet-base')
model = FNetModel.from_pretrained('fnet-base')
inputs = tokenizer("Welcome to use PaddlePaddle and PaddleNLP!")
inputs = {k: paddle.to_tensor([v]) for (k, v) in inputs.items()}
output = model(**inputs)

class FNetForSequenceClassification(fnet, num_classes=2)[source]
Bases: paddlenlp.transformers.fnet.modeling.FNetPretrainedModel

FNet Model with a linear layer on top of the output layer, designed for sequence classification/regression tasks like GLUE tasks.
Parameters

- fnet (FNetModel) -- An instance of FNetModel.
- num_classes (int, optional) -- The number of classes. Defaults to 2.

forward(input_ids=None, token_type_ids=None, position_ids=None, inputs_embeds=None, labels=None, output_hidden_states=None, return_dict=None)[source]
The FNetForSequenceClassification forward method.
Parameters

- input_ids (Tensor) -- Indices of input sequence tokens in the vocabulary. They are numerical representations of tokens that build the input sequence. Its data type should be int64 and it has a shape of [batch_size, sequence_length].
- token_type_ids (Tensor, optional) -- Segment token indices to indicate different portions of the inputs. Selected in the range [0, type_vocab_size - 1]. If type_vocab_size is 2, the inputs have two portions and indices can be either 0 or 1: 0 corresponds to a sentence A token, 1 corresponds to a sentence B token. Its data type should be int64 and it has a shape of [batch_size, sequence_length]. Defaults to None, which means we don't add segment embeddings.
- position_ids (Tensor, optional) -- Indices of positions of each input sequence token in the position embeddings. Selected in the range [0, max_position_embeddings - 1]. Shape as (batch_size, num_tokens) and dtype as int64. Defaults to None.
- inputs_embeds (Tensor, optional) -- If you want to control how to convert inputs_ids indices into associated vectors, you can pass an embedded representation directly instead of passing inputs_ids.
Returns
Returns tensor logits, or a dict with logits and hidden_states fields.

With the fields:

- logits (Tensor): A tensor of the input text classification logits. Shape as [batch_size, num_classes] and dtype as float32.
- hidden_states (tuple of Tensor): Hidden states of all layers in the Transformer encoder. The length of hidden_states is num_hidden_layers + 1. For each element in the tuple, its data type should be float32 and its shape is [batch_size, sequence_length, hidden_size].

Return type
Tensor or Dict
Example

import paddle
from paddlenlp.transformers.fnet.modeling import FNetForSequenceClassification
from paddlenlp.transformers.fnet.tokenizer import FNetTokenizer

tokenizer = FNetTokenizer.from_pretrained('fnet-base')
model = FNetForSequenceClassification.from_pretrained('fnet-base')
inputs = tokenizer("Welcome to use PaddlePaddle and PaddleNLP!")
inputs = {k: paddle.to_tensor([v]) for (k, v) in inputs.items()}
output = model(**inputs)

class FNetForPreTraining(fnet)[source]
Bases: paddlenlp.transformers.fnet.modeling.FNetPretrainedModel

FNet Model with two heads on top as done during pretraining: a masked language modeling head and a next sentence prediction (classification) head.
get_output_embeddings()[source]
To be overridden by models with output embeddings.

Returns
the output embedding of the model

Return type
Optional[Embedding]

forward(input_ids=None, token_type_ids=None, position_ids=None, inputs_embeds=None, labels=None, next_sentence_label=None, output_hidden_states=None, return_dict=None)[source]
The FNetForPreTraining forward method.
Parameters

- input_ids (Tensor) -- See FNetModel.
- token_type_ids (Tensor, optional) -- See FNetModel.
- position_ids (Tensor, optional) -- See FNetModel.
- inputs_embeds (Tensor, optional) -- See FNetModel.
- labels (Tensor of shape (batch_size, sequence_length), optional) -- Labels for computing the masked language modeling loss.
- next_sentence_label (Tensor, optional) -- The labels of the next sentence prediction task. Its data type should be int64 and its shape is [batch_size, 1].
- output_hidden_states (bool, optional) -- See FNetModel.
- return_dict (bool, optional) -- See FNetModel.
Returns
Returns tuple (prediction_scores, seq_relationship_score) or a dict with prediction_logits, seq_relationship_logits and hidden_states fields.

Return type
tuple or Dict
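
Example (a minimal usage sketch following the pattern of the FNetModel example above; by default the model returns the (prediction_scores, seq_relationship_score) tuple described under Returns):

import paddle
from paddlenlp.transformers.fnet.modeling import FNetForPreTraining
from paddlenlp.transformers.fnet.tokenizer import FNetTokenizer

tokenizer = FNetTokenizer.from_pretrained('fnet-base')
model = FNetForPreTraining.from_pretrained('fnet-base')
inputs = tokenizer("Welcome to use PaddlePaddle and PaddleNLP!")
inputs = {k: paddle.to_tensor([v]) for (k, v) in inputs.items()}
# prediction_scores scores every vocabulary token at each position;
# seq_relationship_score scores the two next-sentence classes.
prediction_scores, seq_relationship_score = model(**inputs)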

class FNetForMaskedLM(fnet)[source]
Bases: paddlenlp.transformers.fnet.modeling.FNetPretrainedModel

FNet Model with a masked language modeling head on top.
get_output_embeddings()[source]
To be overridden by models with output embeddings.

Returns
the output embedding of the model

Return type
Optional[Embedding]

forward(input_ids=None, token_type_ids=None, position_ids=None, inputs_embeds=None, labels=None, next_sentence_label=None, output_hidden_states=None, return_dict=None)[source]
The FNetForMaskedLM forward method.
Parameters

- input_ids (Tensor) -- See FNetModel.
- token_type_ids (Tensor, optional) -- See FNetModel.
- position_ids (Tensor, optional) -- See FNetModel.
- inputs_embeds (Tensor, optional) -- See FNetModel.
- labels (Tensor, optional) -- See FNetForPreTraining.
- next_sentence_label (Tensor, optional) -- See FNetForPreTraining.
- output_hidden_states (bool, optional) -- See FNetModel.
- return_dict (bool, optional) -- See FNetModel.
Returns
Returns tensor prediction_scores or a dict with prediction_logits and hidden_states fields.

With the fields:

- prediction_scores (Tensor): The scores of masked token prediction. Its data type should be float32 and its shape is [batch_size, sequence_length, vocab_size].
- hidden_states (tuple of Tensor): Hidden states of all layers in the Transformer encoder. The length of hidden_states is num_hidden_layers + 1. For each element in the tuple, its data type should be float32 and its shape is [batch_size, sequence_length, hidden_size].

Return type
Tensor or Dict
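
Example (a minimal usage sketch following the pattern of the FNetModel example above):

import paddle
from paddlenlp.transformers.fnet.modeling import FNetForMaskedLM
from paddlenlp.transformers.fnet.tokenizer import FNetTokenizer

tokenizer = FNetTokenizer.from_pretrained('fnet-base')
model = FNetForMaskedLM.from_pretrained('fnet-base')
inputs = tokenizer("Welcome to use PaddlePaddle and PaddleNLP!")
inputs = {k: paddle.to_tensor([v]) for (k, v) in inputs.items()}
# prediction_scores has shape [batch_size, sequence_length, vocab_size], per the Returns above.
prediction_scores = model(**inputs)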

class FNetForNextSentencePrediction(fnet)[source]
Bases: paddlenlp.transformers.fnet.modeling.FNetPretrainedModel

FNet Model with a next sentence prediction head on top.

get_output_embeddings()[source]
To be overridden by models with output embeddings.

Returns
the output embedding of the model

Return type
Optional[Embedding]

forward(input_ids=None, token_type_ids=None, position_ids=None, inputs_embeds=None, labels=None, next_sentence_label=None, output_hidden_states=None, return_dict=None)[source]
Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters

- *inputs (tuple) -- unpacked tuple arguments
- **kwargs (dict) -- unpacked dict arguments
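
Example (a minimal usage sketch following the pattern of the FNetModel example above; the return fields are not documented here, so the output is left unpacked):

import paddle
from paddlenlp.transformers.fnet.modeling import FNetForNextSentencePrediction
from paddlenlp.transformers.fnet.tokenizer import FNetTokenizer

tokenizer = FNetTokenizer.from_pretrained('fnet-base')
model = FNetForNextSentencePrediction.from_pretrained('fnet-base')
inputs = tokenizer("Welcome to use PaddlePaddle and PaddleNLP!")
inputs = {k: paddle.to_tensor([v]) for (k, v) in inputs.items()}
outputs = model(**inputs)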

class FNetForMultipleChoice(fnet)[source]
Bases: paddlenlp.transformers.fnet.modeling.FNetPretrainedModel

FNet Model with a linear layer on top of the hidden-states output layer, designed for multiple choice tasks like SWAG tasks.

Parameters
fnet (FNetModel) -- An instance of FNetModel.

forward(input_ids=None, token_type_ids=None, position_ids=None, inputs_embeds=None, labels=None, output_hidden_states=None, return_dict=None)[source]
Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters

- *inputs (tuple) -- unpacked tuple arguments
- **kwargs (dict) -- unpacked dict arguments
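
Example (a minimal sketch; the [batch_size, num_choices, sequence_length] input layout is the usual multiple-choice convention and is an assumption here, since this page does not document the expected shapes):

import paddle
from paddlenlp.transformers.fnet.modeling import FNetForMultipleChoice
from paddlenlp.transformers.fnet.tokenizer import FNetTokenizer

tokenizer = FNetTokenizer.from_pretrained('fnet-base')
model = FNetForMultipleChoice.from_pretrained('fnet-base')
choices = ["The sky is blue.", "The sky is green."]
encoded = [tokenizer(text)["input_ids"] for text in choices]
# Pad both choices to a common length before stacking them into one batch.
max_len = max(len(ids) for ids in encoded)
encoded = [ids + [tokenizer.pad_token_id] * (max_len - len(ids)) for ids in encoded]
input_ids = paddle.to_tensor([encoded])  # assumed shape: [1, num_choices, sequence_length]
outputs = model(input_ids=input_ids)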

class FNetForTokenClassification(fnet, num_classes=2)[source]
Bases: paddlenlp.transformers.fnet.modeling.FNetPretrainedModel

FNet Model with a linear layer on top of the hidden-states output layer, designed for token classification tasks like NER tasks.

Parameters

- fnet (FNetModel) -- An instance of FNetModel.
- num_classes (int, optional) -- The number of classes. Defaults to 2.

forward(input_ids=None, token_type_ids=None, position_ids=None, inputs_embeds=None, labels=None, output_hidden_states=None, return_dict=None)[source]
Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters

- *inputs (tuple) -- unpacked tuple arguments
- **kwargs (dict) -- unpacked dict arguments
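
Example (a minimal usage sketch following the pattern of the FNetModel example above; the per-token logits shape is inferred from num_classes and is an assumption):

import paddle
from paddlenlp.transformers.fnet.modeling import FNetForTokenClassification
from paddlenlp.transformers.fnet.tokenizer import FNetTokenizer

tokenizer = FNetTokenizer.from_pretrained('fnet-base')
model = FNetForTokenClassification.from_pretrained('fnet-base')
inputs = tokenizer("Welcome to use PaddlePaddle and PaddleNLP!")
inputs = {k: paddle.to_tensor([v]) for (k, v) in inputs.items()}
# Expected (assumed): per-token logits of shape [batch_size, sequence_length, num_classes].
logits = model(**inputs)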

class FNetForQuestionAnswering(fnet, num_labels)[source]
Bases: paddlenlp.transformers.fnet.modeling.FNetPretrainedModel

FNet Model with a linear layer on top of the hidden-states output to compute span_start_logits and span_end_logits, designed for question-answering tasks like SQuAD.

Parameters

- fnet (FNetModel) -- An instance of FNetModel.
- num_labels (int) -- The number of labels.

forward(input_ids=None, token_type_ids=None, position_ids=None, inputs_embeds=None, start_positions=None, end_positions=None, output_hidden_states=None, return_dict=None)[source]
Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters

- *inputs (tuple) -- unpacked tuple arguments
- **kwargs (dict) -- unpacked dict arguments
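
Example (a minimal sketch; because num_labels has no default, the model is built here by wrapping a pretrained FNetModel instead of calling from_pretrained on the head class, which is an assumption about the intended usage):

import paddle
from paddlenlp.transformers.fnet.modeling import FNetModel, FNetForQuestionAnswering
from paddlenlp.transformers.fnet.tokenizer import FNetTokenizer

tokenizer = FNetTokenizer.from_pretrained('fnet-base')
# num_labels=2 assumed, matching the span-start and span-end logits.
model = FNetForQuestionAnswering(FNetModel.from_pretrained('fnet-base'), num_labels=2)
inputs = tokenizer("Welcome to use PaddlePaddle and PaddleNLP!")
inputs = {k: paddle.to_tensor([v]) for (k, v) in inputs.items()}
outputs = model(**inputs)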