modeling#

class OPTModel(config: OPTConfig)[source]#

Bases: OPTPretrainedModel

The bare OPT Model transformer outputting raw hidden-states.

This model inherits from PretrainedModel. Refer to the superclass documentation for the generic methods.

This model is also a Paddle paddle.nn.Layer subclass. Use it as a regular Paddle Layer and refer to the Paddle documentation for all matters related to general usage and behavior.

Parameters:

config (OPTConfig) – An instance of OPTConfig used to construct OPTModel.

forward(input_ids=None, position_ids=None, attention_mask=None, inputs_embeds=None, use_cache=False, cache=None, output_attentions=None, output_hidden_states=None, return_dict=None)[source]#

The OPTModel forward method, overrides the __call__() special method.

Parameters:
  • input_ids (Tensor) – Indices of input sequence tokens in the vocabulary. They are numerical representations of tokens that build the input sequence. Its data type should be int64 and it has a shape of [batch_size, sequence_length].

  • position_ids (Tensor, optional) – Indices of the position of each input sequence token in the position embeddings. Selected in the range [0, max_position_embeddings - 1]. Its shape is (batch_size, num_tokens) and its data type is int64. Defaults to None.

  • attention_mask (Tensor, optional) – Mask used in self-attention to prevent attention to unwanted positions, usually the subsequent positions. Its shape is broadcast to [batch_size, num_attention_heads, sequence_length, sequence_length]. For example, its shape can be [batch_size, sequence_length], [batch_size, sequence_length, sequence_length], or [batch_size, num_attention_heads, sequence_length, sequence_length]. Its data type should be float32. Masked positions have the value -1e9 and unmasked positions have the value 0. Defaults to None, which means no positions are masked.

  • inputs_embeds (Tensor, optional) – Instead of passing input_ids, you can directly pass an embedded representation of shape (batch_size, sequence_length, hidden_size). This is useful if you want more control over how input_ids indices are converted into vectors than the model’s internal embedding lookup matrix provides. Defaults to None.

  • use_cache (bool, optional) – Whether or not to use cache. Defaults to False. If set to True, key value states will be returned and can be used to speed up decoding.

  • cache (list, optional) – A list in which each element is a tuple (incremental_cache, static_cache). See TransformerDecoder.gen_cache for more details. It is only used for inference and should be None for training. Defaults to None.

  • output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned tensors for more detail. Defaults to None.

  • output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for more detail. Defaults to None.

  • return_dict (bool, optional) – Whether to return a BaseModelOutputWithPastAndCrossAttentions object. If False, the output will be a tuple of tensors. Defaults to None.
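The attention_mask description above specifies an additive mask: 0 where attention is allowed and -1e9 where it is blocked. The following is a minimal sketch of building such a mask with shape [batch_size, sequence_length, sequence_length], combining a causal constraint with a padding constraint. The helper name build_additive_mask and the pad_flags input are hypothetical and used only for illustration; whether OPTModel also applies its own causal mask internally is not stated in this reference.

```python
MASKED = -1e9   # value the parameter description gives for masked positions
UNMASKED = 0.0  # value for positions that may be attended to


def build_additive_mask(pad_flags):
    """Combine a causal mask with a padding mask (hypothetical helper).

    pad_flags: one list per sequence, 1 for a real token and 0 for padding.
    Returns nested lists of shape [batch_size, seq_len, seq_len].
    """
    batch_mask = []
    for flags in pad_flags:
        seq_len = len(flags)
        rows = []
        for query in range(seq_len):
            row = []
            for key in range(seq_len):
                # A query may attend to a key only if the key is not in the
                # future (causal) and is not a padding token.
                visible = key <= query and flags[key] == 1
                row.append(UNMASKED if visible else MASKED)
            rows.append(row)
        batch_mask.append(rows)
    return batch_mask


# Two sequences of length 3; the second one ends with a padding token.
mask = build_additive_mask([[1, 1, 1], [1, 1, 0]])
for row in mask[1]:
    print(row)
```

The nested list can then be converted with paddle.to_tensor(mask, dtype="float32") before being passed as attention_mask.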

Returns:

Returns tensor encoder_output, which is the output at the last layer of the model. Its data type should be float32 and it has a shape of [batch_size, sequence_length, hidden_size].

Return type:

Tensor

Example

import paddle
from paddlenlp.transformers import OPTModel, GPTTokenizer

tokenizer = GPTTokenizer.from_pretrained('facebook/opt-125m')

model = OPTModel.from_pretrained('facebook/opt-125m')

inputs = tokenizer("Welcome to use PaddlePaddle and PaddleNLP!", return_token_type_ids=False)
inputs = {k:paddle.to_tensor([v]) for (k, v) in inputs.items()}
output = model(**inputs)

get_input_embeddings()[source]#

Get the OPT input word embedding.

Returns:

The input word embedding of the OPT model.

Return type:

nn.Embedding

set_input_embeddings(embedding: Embedding)[source]#

Set the OPT input word embedding.

Returns:

The instance of the new word embedding.

Return type:

nn.Embedding

class OPTPretrainedModel(*args, **kwargs)[source]#

Bases: PretrainedModel

An abstract class for pretrained OPT models. It provides OPT related model_config_file, resource_files_names, pretrained_resource_files_map, pretrained_init_configuration, base_model_prefix for downloading and loading pretrained models. See PretrainedModel for more details.

config_class#

alias of OPTConfig

base_model_class#

alias of OPTModel

class OPTForCausalLM(config: OPTConfig)[source]#

Bases: OPTPretrainedModel

The OPT Model with a language modeling head on top.

Parameters:

config (OPTConfig) – An instance of OPTConfig used to construct OPTForCausalLM.

forward(input_ids=None, attention_mask=None, inputs_embeds=None, labels=None, use_cache=False, cache=None, output_attentions=None, output_hidden_states=None, return_dict=None, **kwargs)[source]#
Parameters:
  • input_ids (Tensor) – See OPTModel.

  • attention_mask (Tensor, optional) – See OPTModel.

  • inputs_embeds (Tensor, optional) – See OPTModel.

  • use_cache (bool, optional) – See OPTModel.

  • cache (list, optional) – See OPTModel.

  • labels (paddle.Tensor, optional) – A Tensor of shape (batch_size, sequence_length) holding the labels for language modeling. The labels are shifted inside the model, so you can set labels = input_ids. Indices are selected in [-100, 0, ..., vocab_size]. All labels set to -100 are ignored (masked); the loss is only computed for labels in [0, ..., vocab_size]. Defaults to None.

  • output_attentions (bool, optional) – See OPTModel.

  • output_hidden_states (bool, optional) – See OPTModel.

  • return_dict (bool, optional) – See OPTModel.
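The labels description above says the labels are shifted inside the model and that entries equal to -100 are ignored. The following pure-Python sketch illustrates that convention for a single sequence: the logits at position t are scored against the label at position t + 1, and ignored labels contribute nothing. The function name shifted_lm_loss and the mean reduction over the remaining positions are assumptions for illustration, not the library's actual implementation.

```python
import math

IGNORE_INDEX = -100  # labels with this value are excluded from the loss


def shifted_lm_loss(logits, labels):
    """Mean next-token cross-entropy for one sequence (illustrative only).

    logits: list of seq_len rows, each a list of vocab_size scores.
    labels: list of seq_len token ids, or IGNORE_INDEX to skip a position.
    """
    total, count = 0.0, 0
    # Position t predicts the token at position t + 1, so drop the last
    # logit row and the first label.
    for row, target in zip(logits[:-1], labels[1:]):
        if target == IGNORE_INDEX:
            continue
        # Numerically stable log-sum-exp, then negative log-likelihood.
        peak = max(row)
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in row))
        total += log_norm - row[target]
        count += 1
    # Assumed reduction: mean over non-ignored positions.
    return total / count if count else 0.0


logits = [[0.0, 2.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
input_ids = [0, 1, 2]
loss_all = shifted_lm_loss(logits, input_ids)        # labels = input_ids
loss_masked = shifted_lm_loss(logits, [0, -100, 2])  # second label ignored
print(loss_all, loss_masked)
```

Because the shift happens inside the loss, the caller passes labels aligned with input_ids and only needs to overwrite positions such as padding with -100.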

Returns:

Returns tensor logits or tuple (logits, cached_kvs). If use_cache is True, the tuple (logits, cached_kvs) is returned; otherwise, the tensor logits is returned. logits is the output of the OPT model. cached_kvs is the cache output of the OPT model when use_cache is True.

Return type:

Tensor or tuple

Example

import paddle
from paddlenlp.transformers import OPTForCausalLM, GPTTokenizer

tokenizer = GPTTokenizer.from_pretrained('facebook/opt-125m')
model = OPTForCausalLM.from_pretrained('facebook/opt-125m')

inputs = tokenizer("Welcome to use PaddlePaddle and PaddleNLP!")
inputs = {k:paddle.to_tensor([v]) for (k, v) in inputs.items()}
output_ids, score = model.generate(input_ids=inputs['input_ids'])
print(tokenizer.batch_decode(output_ids[0]))
OPTForConditionalGeneration#

alias of OPTForCausalLM