modeling
class BlenderbotSmallModel(vocab_size, bos_token_id=1, pad_token_id=0, eos_token_id=2, decoder_start_token_id=1, d_model=512, num_encoder_layers=8, num_decoder_layers=8, encoder_attention_heads=16, decoder_attention_heads=16, encoder_ffn_dim=2048, decoder_ffn_dim=2048, dropout=0.1, activation_function='gelu', attention_dropout=0.0, activation_dropout=0.0, max_position_embeddings=512, init_std=0.02, scale_embedding=True, normalize_before=False)

Bases: paddlenlp.transformers.blenderbot_small.modeling.BlenderbotSmallPretrainedModel
Construct a bare BlenderbotSmall Model.

This model inherits from PretrainedModel. Check the superclass documentation for the generic methods the library implements for all its models. This model is also a Paddle paddle.nn.Layer subclass. Use it as a regular Paddle Layer and refer to the Paddle documentation for all matters related to general usage and behavior.
- Parameters
  - vocab_size (int) – Vocabulary size of the BlenderbotSmall model.
  - bos_token_id (int, optional) – The id for the beginning-of-sentence token. Defaults to 1.
  - pad_token_id (int, optional) – The id for the padding token. Defaults to 0.
  - eos_token_id (int, optional) – The id for the end-of-sentence token. Defaults to 2.
  - decoder_start_token_id (int, optional) – The id indicating the start of a decoding sentence. Defaults to 1.
  - d_model (int, optional) – Dimensionality of the layers and the pooler layer. Defaults to 512.
  - num_encoder_layers (int, optional) – Number of Transformer encoder layers in BlenderbotSmallEncoder. Defaults to 8.
  - num_decoder_layers (int, optional) – Number of Transformer decoder layers in BlenderbotSmallDecoder. Defaults to 8.
  - encoder_attention_heads (int, optional) – Number of attention heads for each Transformer encoder layer in BlenderbotSmallEncoder. Defaults to 16.
  - decoder_attention_heads (int, optional) – Number of attention heads for each Transformer decoder layer in BlenderbotSmallDecoder. Defaults to 16.
  - encoder_ffn_dim (int, optional) – Dimensionality of the feed-forward layer for each Transformer encoder layer in BlenderbotSmallEncoder. Defaults to 2048.
  - decoder_ffn_dim (int, optional) – Dimensionality of the feed-forward layer for each Transformer decoder layer in BlenderbotSmallDecoder. Defaults to 2048.
  - dropout (float, optional) – The dropout probability for all fully connected layers in the embeddings, encoder, and pooler. Defaults to 0.1.
  - activation_function (str, optional) – The non-linear activation function (function or string) in the encoder and pooler. "gelu", "relu" and any other Paddle-supported activation functions are supported. Defaults to "gelu".
  - attention_dropout (float, optional) – The dropout ratio for the attention probabilities. Defaults to 0.0.
  - activation_dropout (float, optional) – The dropout ratio for activations inside the fully connected layer. Defaults to 0.0.
  - max_position_embeddings (int, optional) – The maximum position index of an input sequence. Defaults to 512.
  - init_std (float, optional) – The standard deviation of the truncated_normal_initializer for initializing all weight matrices. Defaults to 0.02.
  - scale_embedding (bool, optional) – Indicate whether to scale embeddings by dividing by sqrt(d_model). Defaults to True.
  - normalize_before (bool, optional) – Indicate whether to put layer normalization into the preprocessing of MHA and FFN sub-layers. If True, pre-processing is layer normalization and post-processing includes dropout and residual connection. Otherwise, there is no pre-processing and post-processing includes dropout, residual connection, and layer normalization. Defaults to False.
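For illustration, the model can also be constructed directly from these hyperparameters rather than loaded with from_pretrained. A minimal construction sketch, restating the defaults from the signature above; the vocab_size value here is an arbitrary placeholder, not the real blenderbot_small-90M vocabulary size:

from paddlenlp.transformers import BlenderbotSmallModel

# A construction sketch using the defaults listed above.
# vocab_size=54944 is only an illustrative placeholder value.
model = BlenderbotSmallModel(
    vocab_size=54944,
    bos_token_id=1,
    pad_token_id=0,
    eos_token_id=2,
    decoder_start_token_id=1,
    d_model=512,
    num_encoder_layers=8,
    num_decoder_layers=8,
    encoder_attention_heads=16,
    decoder_attention_heads=16,
    encoder_ffn_dim=2048,
    decoder_ffn_dim=2048,
    dropout=0.1,
    activation_function='gelu',
    max_position_embeddings=512,
)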
forward(input_ids=None, attention_mask=None, decoder_input_ids=None, decoder_attention_mask=None, encoder_output=None, use_cache=False, cache=None, **kwargs)

- Parameters
  - input_ids (Tensor) – Indices of input sequence tokens in the vocabulary. They are numerical representations of tokens that build the input sequence. Its data type should be int64 and it has a shape of [batch_size, sequence_length].
  - attention_mask (Tensor, optional) – Mask to indicate whether to perform attention on each input token or not. The values should be either 0 or 1. The attention scores will be set to -infinity for any positions in the mask that are 0, and will be unchanged for positions that are 1. 1 for tokens that are not masked, 0 for tokens that are masked. Its data type should be float32 and it has a shape of [batch_size, sequence_length]. Defaults to None.
  - decoder_input_ids (Tensor, optional) – If not provided, decoder_input_ids will be automatically generated based on decoder_start_token_id and input_ids.
  - decoder_attention_mask (Tensor, optional) – If not provided, the default decoder_attention_mask will be a tensor with its upper triangular part set to -np.inf; its shape will be (decoder_length, decoder_length).
  - encoder_output (Tensor, optional) – The output of the encoder. If not provided, a new encoder_output will be generated from BlenderbotSmallEncoder. Defaults to None.
  - use_cache (bool, optional) – Indicates whether to use cache to speed up decoding. Defaults to False.
  - cache (list, optional) – A list in which each element is a tuple ((incremental_cache, static_cache)). See TransformerDecoder.gen_cache for more details. It is only used for inference and should be None for training. Defaults to None.
- Returns
  If use_cache=False, the return will be the last hidden state of the decoder, with shape [batch_size, seq_lens, hidden_size], where seq_lens corresponds to the length of the input sequence. Otherwise, the return will be a tuple of (decoder_output, cache). Please refer to the class paddle.nn.TransformerDecoder for more information regarding cache.
- Return type
  Tensor|tuple
Example

import paddle
from paddlenlp.transformers import BlenderbotSmallTokenizer, BlenderbotSmallModel

# "blenderbot_small-90M" is a pretrained weight of BlenderbotSmallForConditionalGeneration,
# therefore some weights of the additional layers in BlenderbotSmallForConditionalGeneration
# might not be loaded and used.
pretrained_model_name = "blenderbot_small-90M"
tokenizer = BlenderbotSmallTokenizer.from_pretrained(pretrained_model_name)
model = BlenderbotSmallModel.from_pretrained(pretrained_model_name)

sample_text = "My friends are cool but they eat too many carbs."
inputs = tokenizer(sample_text, return_attention_mask=True, return_token_type_ids=False)
inputs = {k: paddle.to_tensor([v]) for (k, v) in inputs.items()}
decoder_output = model(**inputs)
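As noted in the Returns section, passing use_cache=True makes forward return a (decoder_output, cache) tuple instead of a single tensor. A minimal sketch of that call path; whether an initial cache is generated automatically when cache is None is an assumption here (see TransformerDecoder.gen_cache for the cache structure):

import paddle
from paddlenlp.transformers import BlenderbotSmallTokenizer, BlenderbotSmallModel

tokenizer = BlenderbotSmallTokenizer.from_pretrained("blenderbot_small-90M")
model = BlenderbotSmallModel.from_pretrained("blenderbot_small-90M")
model.eval()

inputs = tokenizer("My friends are cool but they eat too many carbs.",
                   return_attention_mask=True, return_token_type_ids=False)
inputs = {k: paddle.to_tensor([v]) for (k, v) in inputs.items()}

# With use_cache=True the return value is a (decoder_output, cache) tuple.
# Whether an initial cache is built automatically when cache=None is an
# assumption in this sketch.
with paddle.no_grad():
    decoder_output, cache = model(**inputs, use_cache=True)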
class BlenderbotSmallPretrainedModel(*args, **kwargs)

Bases: paddlenlp.transformers.model_utils.PretrainedModel

An abstract class for pretrained BlenderbotSmall models. It provides BlenderbotSmall related model_config_file, resource_files_names, pretrained_resource_files_map, pretrained_init_configuration and base_model_prefix for downloading and loading pretrained models. Refer to PretrainedModel for more details.

base_model_class

alias of paddlenlp.transformers.blenderbot_small.modeling.BlenderbotSmallModel
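The pretrained resources listed above are what back from_pretrained and save_pretrained on the concrete subclasses. A brief sketch; the local directory path is arbitrary:

from paddlenlp.transformers import BlenderbotSmallModel

# Download and load pretrained weights via the resources registered on
# BlenderbotSmallPretrainedModel; the directory path below is arbitrary.
model = BlenderbotSmallModel.from_pretrained("blenderbot_small-90M")
model.save_pretrained("./blenderbot_small-90M-local")

# The saved copy can be reloaded from the local directory.
model = BlenderbotSmallModel.from_pretrained("./blenderbot_small-90M-local")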
class BlenderbotSmallEncoder(vocab_size, embed_tokens=None, pad_token_id=0, d_model=512, num_encoder_layers=6, encoder_attention_heads=12, encoder_ffn_dim=2048, dropout=0.1, activation_function='gelu', attention_dropout=0.0, activation_dropout=0.0, max_position_embeddings=1024, init_std=0.02, scale_embedding=True, normalize_before=False)

Bases: paddlenlp.transformers.blenderbot_small.modeling.BlenderbotSmallPretrainedModel

The encoder of the BlenderbotSmall Model. Please refer to PretrainedModel or BlenderbotSmallModel for more details regarding methods and arguments.
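A short sketch of running the encoder on its own and reusing its output through the encoder_output argument of BlenderbotSmallModel.forward documented above. That the encoder is exposed as model.encoder and can be called with input_ids directly is an assumption here:

import paddle
from paddlenlp.transformers import BlenderbotSmallTokenizer, BlenderbotSmallModel

tokenizer = BlenderbotSmallTokenizer.from_pretrained("blenderbot_small-90M")
model = BlenderbotSmallModel.from_pretrained("blenderbot_small-90M")
model.eval()

inputs = tokenizer("My friends are cool but they eat too many carbs.",
                   return_attention_mask=True, return_token_type_ids=False)
inputs = {k: paddle.to_tensor([v]) for (k, v) in inputs.items()}

# Assumption: the encoder sub-layer is exposed as `model.encoder` and accepts
# input_ids directly.
with paddle.no_grad():
    encoder_output = model.encoder(input_ids=inputs["input_ids"])
    # Reuse the precomputed encoder output so forward does not recompute it.
    decoder_output = model(**inputs, encoder_output=encoder_output)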
class BlenderbotSmallDecoder(vocab_size, embed_tokens=None, pad_token_id=1, d_model=768, num_decoder_layers=6, decoder_attention_heads=12, decoder_ffn_dim=3072, dropout=0.1, activation_function='gelu', attention_dropout=0.1, activation_dropout=0.1, max_position_embeddings=1024, init_std=0.02, scale_embedding=True, normalize_before=False)

Bases: paddlenlp.transformers.blenderbot_small.modeling.BlenderbotSmallPretrainedModel

The decoder of the BlenderbotSmall Model. Please refer to PretrainedModel and BlenderbotSmallModel for more information regarding methods and arguments.
class BlenderbotSmallForConditionalGeneration(blenderbot_small)

Bases: paddlenlp.transformers.blenderbot_small.modeling.BlenderbotSmallPretrainedModel

Please refer to BlenderbotSmallModel for more information regarding arguments.

- Returns
  If use_cache=False, the return will be a tensor with shape [batch_size, seq_lens, hidden_size]. Otherwise, the return will be a tuple of (decoder_output, cache).
- Return type
  Tensor|tuple
Example
import paddle
from paddlenlp.transformers import BlenderbotSmallTokenizer, BlenderbotSmallForConditionalGeneration

pretrained_model_name = "blenderbot_small-90M"
tokenizer = BlenderbotSmallTokenizer.from_pretrained(pretrained_model_name)
model = BlenderbotSmallForConditionalGeneration.from_pretrained(pretrained_model_name)

sample_text = "My friends are cool but they eat too many carbs."
inputs = tokenizer(sample_text, return_attention_mask=True, return_token_type_ids=False)
inputs = {k: paddle.to_tensor([v]) for (k, v) in inputs.items()}

result_ids, score = model.generate(input_ids=inputs['input_ids'],
                                   max_length=60,
                                   min_length=20,
                                   decode_strategy='beam_search',
                                   num_beams=10,
                                   length_penalty=0.65)

for sequence_ids in result_ids.numpy().tolist():
    print("User: ", sample_text)
    print("bot: ", tokenizer.convert_ids_to_string(sequence_ids))
forward(input_ids=None, attention_mask=None, decoder_input_ids=None, decoder_attention_mask=None, encoder_output=None, use_cache=False, cache=None)

Defines the computation performed at every call. Should be overridden by all subclasses.

- Parameters
  - *inputs (tuple) – unpacked tuple arguments
  - **kwargs (dict) – unpacked dict arguments
class BlenderbotSmallForCausalLM(blenderbot_small)

Bases: paddlenlp.transformers.blenderbot_small.modeling.BlenderbotSmallPretrainedModel

Constructs the BlenderbotSmall For Causal Language Model. This model is equivalent to the BlenderbotSmall decoder without cross-attention.
forward(input_ids=None, attention_mask=None, use_cache=False, cache=None, **kwargs)

- Parameters
  - input_ids (Tensor) – Indices of input sequence tokens in the vocabulary. They are numerical representations of tokens that build the input sequence. Its data type should be int64 and it has a shape of [batch_size, sequence_length].
  - attention_mask (Tensor, optional) – Mask to indicate whether to perform attention on each input token or not. The values should be either 0 or 1. The attention scores will be set to -infinity for any positions in the mask that are 0, and will be unchanged for positions that are 1. 1 for tokens that are not masked, 0 for tokens that are masked. Its data type should be float32 and it has a shape of [batch_size, sequence_length]. Defaults to None.
  - use_cache (bool, optional) – Indicates whether to use cache to speed up decoding. Defaults to False.
  - cache (list, optional) – A list in which each element is a tuple ((incremental_cache, static_cache)). See paddle.nn.TransformerDecoder.gen_cache for more details. It is only used for inference and should be None for training. Defaults to None.
- Returns
  If use_cache=False, the return will be a tensor with shape [batch_size, seq_lens, hidden_size]. Otherwise, the return will be a tuple of (lm_logits, cache).
- Return type
  Tensor|tuple
Example
import paddle
from paddlenlp.transformers import BlenderbotSmallTokenizer, BlenderbotSmallForCausalLM

use_cache = False
text = "My friends are cool but they eat too many carbs."
model_name = "blenderbot_small-90M"

tokenizer = BlenderbotSmallTokenizer.from_pretrained(model_name)
model = BlenderbotSmallForCausalLM.from_pretrained(model_name)
model.eval()

inputs = tokenizer(text, return_attention_mask=True, return_token_type_ids=False)
inputs = {k: paddle.to_tensor([v]) for (k, v) in inputs.items()}

with paddle.no_grad():
    outputs = model(**inputs, use_cache=use_cache)
    # outputs is a tuple of (lm_logits, cache) if use_cache=True.
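As a follow-up to the example, the logits for the last position can be used to pick a next token. That the last dimension of lm_logits indexes the vocabulary is an assumption here:

# Continuing from the example above with use_cache = False, so `outputs`
# is the lm_logits tensor. Assumption: its last dimension indexes the vocabulary.
next_token_logits = outputs[:, -1, :]
next_token_id = paddle.argmax(next_token_logits, axis=-1)
print(tokenizer.convert_ids_to_string(next_token_id.numpy().tolist()))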