modeling#
Modeling classes for the UNIMO model.
- class UNIMOPretrainedModel(*args, **kwargs)[source]#
Bases: PretrainedModel
An abstract class for pretrained UNIMO models. It provides the UNIMO-related model_config_file, pretrained_init_configuration, resource_files_names, pretrained_resource_files_map and base_model_prefix for downloading and loading pretrained models (see the usage sketch below). See PretrainedModel for more details.
- config_class#
alias of UNIMOConfig
- base_model_class#
alias of UNIMOModel
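As a usage sketch, these registered maps are what allow a concrete subclass to be loaded by name ('unimo-text-1.0' is one of the built-in pretrained weights used elsewhere on this page):
from paddlenlp.transformers import UNIMOModel

# from_pretrained() resolves 'unimo-text-1.0' through the
# pretrained_init_configuration and pretrained_resource_files_map
# registered on UNIMOPretrainedModel.
model = UNIMOModel.from_pretrained('unimo-text-1.0')
print(model.config_class)  # the UNIMOConfig alias documented above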
- class UNIMOModel(config: UNIMOConfig)[source]#
Bases: UNIMOPretrainedModel
The bare UNIMO Model outputting raw hidden-states.
This model inherits from PretrainedModel. Refer to the superclass documentation for the generic methods.
This model is also a paddle.nn.Layer subclass. Use it as a regular Paddle Layer and refer to the Paddle documentation for all matters related to general usage and behavior.
- Parameters:
config (UNIMOConfig) – An instance of UNIMOConfig used to construct UNIMOModel.
- get_input_embeddings()[source]#
Get the input embeddings of the model.
- Returns:
The input embeddings of the model.
- Return type:
nn.Embedding
- set_input_embeddings(value)[source]#
Set new input embeddings for the model (see the sketch following this entry).
- Parameters:
value (Embedding) – The new embeddings to be set for the model.
- Raises:
NotImplementedError – If the model has not implemented the set_input_embeddings method.
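A minimal sketch of reading and replacing the input embeddings; the replacement table here is hypothetical and keeps the original shape:
import paddle
from paddlenlp.transformers import UNIMOModel

model = UNIMOModel.from_pretrained('unimo-text-1.0')
emb = model.get_input_embeddings()          # an nn.Embedding
vocab_size, hidden_size = emb.weight.shape  # [vocab_size, hidden_size]

# Hypothetical replacement: a freshly initialized table of the same shape.
new_emb = paddle.nn.Embedding(vocab_size, hidden_size)
model.set_input_embeddings(new_emb)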
- forward(input_ids: Tensor | None = None, token_type_ids: Tensor | None = None, position_ids: Tensor | None = None, attention_mask: Tensor | None = None, use_cache: bool | None = None, cache: Tuple[Tensor] | None = None, inputs_embeds: Tensor | None = None, output_attentions: bool | None = None, output_hidden_states: bool | None = None, return_dict: bool | None = None)[source]#
The UNIMOModel forward method, overrides the special __call__() method.
- Parameters:
input_ids (Tensor, optional) – Indices of input sequence tokens in the vocabulary. They are numerical representations of tokens that build the input sequence. Its data type should be int64 and it has a shape of [batch_size, sequence_length].
token_type_ids (Tensor, optional) – Segment token indices to indicate first and second portions of the inputs. Indices can be either 0 or 1: 0 corresponds to a sentence A token, 1 corresponds to a sentence B token. Its data type should be int64 and it has a shape of [batch_size, sequence_length]. Defaults to None, which means no segment embeddings are added to the token embeddings.
position_ids (Tensor, optional) – Indices of positions of each input sequence token in the position embeddings. Selected in the range [0, max_position_embeddings - 1]. Its data type should be int64 and it has a shape of [batch_size, sequence_length]. Defaults to None.
attention_mask (Tensor, optional) – Mask used in multi-head attention to avoid performing attention on some unwanted positions, usually the paddings or the subsequent positions. Its data type can be int, float or bool. When the data type is bool, the masked tokens have False values and the others have True values. When the data type is int, the masked tokens have 0 values and the others have 1 values. When the data type is float, the masked tokens have -INF values and the others have 0 values. It is a tensor whose shape is broadcast to [batch_size, num_attention_heads, sequence_length, sequence_length]. For example, its shape can be [batch_size, sequence_length], [batch_size, sequence_length, sequence_length] or [batch_size, num_attention_heads, sequence_length, sequence_length]. Defaults to None, which means no position is masked out. See the mask-construction sketch after the example below.
use_cache (bool, optional) – Whether or not to use the model cache to speed up decoding. Defaults to False.
cache (list, optional) – A list whose elements are the incremental_cache objects produced by the paddle.nn.TransformerEncoderLayer.gen_cache() method. See the paddle.nn.TransformerEncoder.gen_cache() method for more details. It is only used for inference and should be None for training. Defaults to None.
inputs_embeds (Tensor, optional) – Optionally, instead of passing input_ids you can choose to directly pass an embedded representation of shape (batch_size, sequence_length, hidden_size). This is useful if you want more control over how to convert input_ids indices into associated vectors than the model's internal embedding lookup matrix. Defaults to None.
output_attentions (bool, optional) – Whether or not to return the attention tensors of all attention layers. See attentions under returned tensors for more detail. Defaults to False.
output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for more detail. Defaults to False.
return_dict (bool, optional) – Whether to return a BaseModelOutputWithPastAndCrossAttentions object. If False, the output will be a tuple of tensors. Defaults to False.
- Returns:
An instance of BaseModelOutputWithPastAndCrossAttentions if return_dict=True. Otherwise it returns a tuple of tensors corresponding to the ordered and not-None (depending on the input arguments) fields of BaseModelOutputWithPastAndCrossAttentions. In particular, when return_dict=output_hidden_states=output_attentions=False and cache=None, it returns a tensor sequence_output of shape [batch_size, sequence_length, hidden_size], which is the output at the last layer of the model.
Example
from paddlenlp.transformers import UNIMOModel
from paddlenlp.transformers import UNIMOTokenizer

model = UNIMOModel.from_pretrained('unimo-text-1.0')
tokenizer = UNIMOTokenizer.from_pretrained('unimo-text-1.0')

inputs = tokenizer.gen_encode("Welcome to use PaddlePaddle and PaddleNLP!", return_tensors=True)
outputs = model(**inputs)
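As referenced in the attention_mask description, the three mask encodings can be built as follows; this is a sketch with a hypothetical batch of 2 sequences of length 4 whose last position is padding:
import paddle

bool_mask = paddle.to_tensor([[True, True, True, False]] * 2)   # bool: False = masked
int_mask = bool_mask.astype('int64')                            # int: 0 = masked
float_mask = paddle.where(                                      # float: -INF = masked
    bool_mask,
    paddle.zeros(bool_mask.shape, dtype='float32'),
    paddle.full(bool_mask.shape, float('-inf'), dtype='float32'))

# Any of these [batch_size, sequence_length] masks is broadcast internally to
# [batch_size, num_attention_heads, sequence_length, sequence_length].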
- class UNIMOLMHeadModel(config: UNIMOConfig)[source]#
Bases: UNIMOPretrainedModel
The UNIMO Model with a language modeling head on top, designed for generation tasks.
- Parameters:
config (UNIMOConfig) – An instance of UNIMOConfig used to construct UNIMOLMHeadModel.
- forward(input_ids: Tensor | None = None, token_type_ids: Tensor | None = None, position_ids: Tensor | None = None, attention_mask: Tensor | None = None, masked_positions: Tensor | None = None, use_cache: bool | None = None, cache: Tuple[Tensor] | None = None, inputs_embeds: Tensor | None = None, labels: Tensor | None = None, output_attentions: bool | None = None, output_hidden_states: bool | None = None, return_dict: bool | None = None)[source]#
The UNIMOLMHeadModel forward method, overrides the special __call__() method.
- Parameters:
input_ids (Tensor, optional) – See UNIMOModel.
token_type_ids (Tensor, optional) – See UNIMOModel.
position_ids (Tensor, optional) – See UNIMOModel.
attention_mask (Tensor, optional) – See UNIMOModel.
use_cache (bool, optional) – See UNIMOModel.
cache (list, optional) – See UNIMOModel.
inputs_embeds (Tensor, optional) – See UNIMOModel.
labels (Tensor, optional) – Labels for computing the left-to-right language modeling loss. Indices should be in [-100, 0, ..., vocab_size] (see the input_ids docstring). Tokens with indices set to -100 are ignored (masked); the loss is only computed for the tokens with labels in [0, ..., vocab_size]. Defaults to None. See the loss-computation sketch after the example below.
output_attentions (bool, optional) – See UNIMOModel.
output_hidden_states (bool, optional) – See UNIMOModel.
return_dict (bool, optional) – Whether to return a CausalLMOutputWithPastAndCrossAttentions object. If False, the output will be a tuple of tensors. Defaults to False.
- Returns:
An instance of CausalLMOutputWithPastAndCrossAttentions if return_dict=True. Otherwise it returns a tuple of tensors corresponding to the ordered and not-None (depending on the input arguments) fields of CausalLMOutputWithPastAndCrossAttentions. In particular, when return_dict=output_hidden_states=output_attentions=False and cache=labels=None, it returns a tensor logits of shape [batch_size, sequence_length, vocab_size], which holds the prediction scores over the vocabulary at each position.
Example
from paddlenlp.transformers import UNIMOLMHeadModel
from paddlenlp.transformers import UNIMOTokenizer

model = UNIMOLMHeadModel.from_pretrained('unimo-text-1.0')
tokenizer = UNIMOTokenizer.from_pretrained('unimo-text-1.0')

inputs = tokenizer.gen_encode(
    "Welcome to use PaddlePaddle and PaddleNLP!",
    return_tensors=True,
    is_split_into_words=False)
logits = model(**inputs)
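As referenced in the labels description, a sketch of computing the language modeling loss; the pad-token handling is hypothetical, and it assumes the returned object exposes a loss field when labels are given, as PaddleNLP causal-LM outputs usually do:
import paddle
from paddlenlp.transformers import UNIMOLMHeadModel, UNIMOTokenizer

model = UNIMOLMHeadModel.from_pretrained('unimo-text-1.0')
tokenizer = UNIMOTokenizer.from_pretrained('unimo-text-1.0')
inputs = tokenizer.gen_encode("Welcome to use PaddlePaddle and PaddleNLP!", return_tensors=True)

# Ignore padding positions in the loss by mapping them to -100 (hypothetical:
# this toy input has no padding, but batched inputs typically do).
input_ids = inputs['input_ids']
labels = paddle.where(
    input_ids == tokenizer.pad_token_id,
    paddle.full_like(input_ids, -100),
    input_ids)

outputs = model(**inputs, labels=labels, return_dict=True)
print(outputs.loss)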
- UNIMOForMaskedLM#
alias of UNIMOLMHeadModel
- UNIMOForConditionalGeneration#
alias of UNIMOLMHeadModel
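Since both names are plain aliases, they resolve to the same class; a small sketch, assuming both aliases are exported from paddlenlp.transformers:
from paddlenlp.transformers import (
    UNIMOForConditionalGeneration,
    UNIMOForMaskedLM,
    UNIMOLMHeadModel,
)

assert UNIMOForMaskedLM is UNIMOLMHeadModel
assert UNIMOForConditionalGeneration is UNIMOLMHeadModel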