modeling#
- class SqueezeBertModel(config: SqueezeBertConfig)[source]#
-
- set_input_embeddings(new_embeddings)[source]#
Set a new input embedding for the model.
- Parameters:
new_embeddings (Embedding) -- The new input embedding of the model.
- Raises:
NotImplementedError -- The model does not implement the set_input_embeddings method.
- forward(input_ids=None, attention_mask=None, token_type_ids=None, position_ids=None, output_attentions=None, output_hidden_states=None)[source]#
The forward method, overrides the __call__() special method.
- Parameters:
input_ids (Tensor) -- Indices of input sequence tokens in the vocabulary. They are numerical representations of the tokens that build the input sequence. Its data type should be int64 and it has a shape of [batch_size, sequence_length].
attention_mask (Tensor, optional) -- Mask used in multi-head attention to avoid attending to unwanted positions, usually padding or subsequent positions. Its data type can be int, float, or bool. If its data type is int, the values should be either 0 or 1: 1 for tokens that are not masked, 0 for tokens that are masked. It is a tensor whose shape is broadcast to [batch_size, num_attention_heads, sequence_length, sequence_length]. Defaults to None, which means no positions are masked.
token_type_ids (Tensor, optional) -- Segment token indices that indicate different portions of the inputs. Selected in the range [0, type_vocab_size - 1]. If type_vocab_size is 2, the inputs have two portions and the indices can be either 0 or 1: 0 corresponds to a sentence A token, 1 corresponds to a sentence B token. Its data type should be int64 and it has a shape of [batch_size, sequence_length]. Defaults to None, which means no segment embeddings are added.
position_ids (Tensor, optional) -- Indices of the position of each input sequence token in the position embeddings. Selected in the range [0, max_position_embeddings - 1]. Shape as (batch_size, num_tokens) and dtype as int64. Defaults to None.
output_attentions (bool, optional) -- Whether to return the attention weights of each hidden layer. Defaults to False.
output_hidden_states (bool, optional) -- Whether to return the output of each hidden layer. Defaults to False.
- Returns:
A tuple (sequence_output, pooled_output), optionally together with (encoder_outputs, encoder_attentions). With the fields:
- sequence_output (Tensor): Sequence of hidden states at the last layer of the model. Its data type should be float32 and its shape is [batch_size, sequence_length, hidden_size].
- pooled_output (Tensor): The output of the first token ([CLS]) in the sequence. The model is "pooled" by simply taking the hidden state corresponding to the first token. Its data type should be float32 and its shape is [batch_size, hidden_size].
- encoder_outputs (List(Tensor)): A list of tensors containing the hidden states of the model at each hidden layer in the Transformer encoder. The length of the list is num_hidden_layers + 1 (the extra entry is the embedding layer output). Each tensor has a data type of float32 and its shape is [batch_size, sequence_length, hidden_size].
- Return type:
tuple
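The inputs described for forward can be sketched in plain Python. This is a minimal illustration, not PaddleNLP code: the helper build_inputs, the padding id PAD_ID, and the token ids are all hypothetical, and a real pipeline would take these values from a tokenizer and convert the lists to int64 tensors. The mask here is two-dimensional ([batch_size, sequence_length]); how it is expanded to the broadcast shape listed above is left to the caller or the model.

```python
from typing import Dict, List, Sequence, Tuple

PAD_ID = 0  # hypothetical padding token id, chosen for illustration only

Pair = Tuple[Sequence[int], Sequence[int]]


def build_inputs(pairs: List[Pair], pad_id: int = PAD_ID) -> Dict[str, List[List[int]]]:
    """Pad (sentence_a_ids, sentence_b_ids) pairs into rectangular model inputs."""
    max_len = max(len(a) + len(b) for a, b in pairs)
    batch: Dict[str, List[List[int]]] = {
        "input_ids": [],
        "attention_mask": [],
        "token_type_ids": [],
        "position_ids": [],
    }
    for a, b in pairs:
        ids = list(a) + list(b)
        n_pad = max_len - len(ids)
        batch["input_ids"].append(ids + [pad_id] * n_pad)
        # 1 marks a real token, 0 marks padding that attention should skip.
        batch["attention_mask"].append([1] * len(ids) + [0] * n_pad)
        # 0 for sentence A tokens, 1 for sentence B tokens; padding gets 0 here.
        batch["token_type_ids"].append([0] * len(a) + [1] * len(b) + [0] * n_pad)
        # Positions count from 0 up to sequence_length - 1.
        batch["position_ids"].append(list(range(max_len)))
    return batch


# Made-up token ids: one sentence pair and one single sentence.
example = build_inputs([([101, 7, 8, 102], [9, 102]), ([101, 5, 102], [])])
```

Every list in example has shape [batch_size, sequence_length] = [2, 6], matching the shapes documented for input_ids, token_type_ids, and position_ids.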
- class SqueezeBertPreTrainedModel(*args, **kwargs)[source]#
-
An abstract class for pretrained SqueezeBert models. It provides the SqueezeBert-related model_config_file, resource_files_names, pretrained_resource_files_map, pretrained_init_configuration, and base_model_prefix for downloading and loading pretrained models. See PretrainedModel for more details.
- config_class#
alias of SqueezeBertConfig
- base_model_class#
alias of SqueezeBertModel
- class SqueezeBertForSequenceClassification(config: SqueezeBertConfig)[source]#
-
SqueezeBert Model with a sequence classification/regression head on top (a linear layer on top of the pooled output), e.g. for GLUE tasks.
- Parameters:
config (SqueezeBertConfig) -- An instance of SqueezeBertConfig.
- forward(input_ids, token_type_ids=None, position_ids=None, attention_mask=None)[source]#
The SqueezeBertForSequenceClassification forward method, overrides the __call__() special method.
- Parameters:
input_ids (Tensor) -- See SqueezeBertModel.
token_type_ids (Tensor, optional) -- See SqueezeBertModel.
position_ids (Tensor, optional) -- See SqueezeBertModel.
attention_mask (list, optional) -- See SqueezeBertModel.
- Returns:
logits, a tensor of the input text classification logits. Shape as [batch_size, num_classes] and dtype as float32.
- Return type:
Tensor
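As a rough illustration of how [batch_size, num_classes] logits are usually turned into predictions, the sketch below applies a softmax to each row and picks the highest-scoring class. It uses plain Python lists in place of tensors; the helper names softmax and predict and the logit values are made up for this example and are not part of the documented API.

```python
import math
from typing import List, Tuple


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of logits."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits: List[List[float]]) -> List[Tuple[int, float]]:
    """Return (class_index, probability) for each row of [batch_size, num_classes] logits."""
    results = []
    for row in logits:
        probs = softmax(row)
        best = max(range(len(probs)), key=probs.__getitem__)
        results.append((best, probs[best]))
    return results


# Made-up logits for a batch of 2 examples and 3 classes.
predictions = predict([[0.2, 2.5, -1.0], [1.0, 1.0, 3.0]])
```

For a regression task the softmax step would not apply; the single logit per example would be used directly.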
- class SqueezeBertForTokenClassification(config: SqueezeBertConfig)[source]#
-
SqueezeBert Model with a token classification head on top (a linear layer on top of the hidden-states output), e.g. for Named-Entity-Recognition (NER) tasks.
- Parameters:
config (SqueezeBertConfig) -- An instance of SqueezeBertConfig.
- forward(input_ids, token_type_ids=None, position_ids=None, attention_mask=None)[source]#
The SqueezeBertForTokenClassification forward method, overrides the __call__() special method.
- Parameters:
input_ids (Tensor) -- See SqueezeBertModel.
token_type_ids (Tensor, optional) -- See SqueezeBertModel.
position_ids (Tensor, optional) -- See SqueezeBertModel.
attention_mask (list, optional) -- See SqueezeBertModel.
- Returns:
logits, a tensor of the input token classification logits. Shape as [batch_size, sequence_length, num_classes] and dtype as float32.
- Return type:
Tensor
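The [batch_size, sequence_length, num_classes] logits can be decoded token by token. The following is a minimal sketch under stated assumptions: the label set LABELS, the helper decode_tags, and the logit values are hypothetical, plain lists stand in for tensors, and padded positions are dropped using a 0/1 attention mask of shape [batch_size, sequence_length].

```python
from typing import List, Sequence

LABELS = ["O", "B-PER", "I-PER"]  # hypothetical label set for illustration


def decode_tags(
    logits: List[List[List[float]]],
    attention_mask: List[List[int]],
    labels: Sequence[str] = LABELS,
) -> List[List[str]]:
    """Pick the highest-scoring label for every non-padded token."""
    tagged = []
    for sequence_logits, mask in zip(logits, attention_mask):
        tags = []
        for scores, keep in zip(sequence_logits, mask):
            if not keep:
                continue  # skip padding positions
            best = max(range(len(scores)), key=scores.__getitem__)
            tags.append(labels[best])
        tagged.append(tags)
    return tagged


# Made-up logits: 1 sequence, 4 positions (the last one is padding), 3 classes.
token_logits = [[[2.0, 0.1, 0.1], [0.0, 3.0, 0.5], [0.1, 0.2, 1.5], [0.0, 0.0, 0.0]]]
tags = decode_tags(token_logits, [[1, 1, 1, 0]])
```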
- class SqueezeBertForQuestionAnswering(config: SqueezeBertConfig)[source]#
-
SqueezeBert Model with a span classification head on top for extractive question-answering tasks like SQuAD (a linear layer on top of the hidden-states output to compute span start logits and span end logits).
- Parameters:
config (SqueezeBertConfig) -- An instance of SqueezeBertConfig.
- forward(input_ids, token_type_ids=None)[source]#
The SqueezeBertForQuestionAnswering forward method, overrides the __call__() special method.
- Parameters:
input_ids (Tensor) -- See SqueezeBertModel.
token_type_ids (Tensor, optional) -- See SqueezeBertModel.
- Returns:
A tuple (start_logits, end_logits). With the fields:
- start_logits (Tensor): A tensor of the input token classification logits, indicating the start position of the labelled span. Its data type should be float32 and its shape is [batch_size, sequence_length].
- end_logits (Tensor): A tensor of the input token classification logits, indicating the end position of the labelled span. Its data type should be float32 and its shape is [batch_size, sequence_length].
- Return type:
tuple