modeling¶

class ErnieGenPretrainedModel(*args, **kwargs)[source]¶
Bases: paddlenlp.transformers.model_utils.PretrainedModel

An abstract class for pretrained ErnieGen models. It provides the ErnieGen related model_config_file, pretrained_init_configuration, resource_files_names, pretrained_resource_files_map and base_model_prefix for downloading and loading pretrained models. See PretrainedModel for more details.
classmethod from_pretrained(pretrained_model_name_or_path, *args, **kwargs)[source]¶
Creates an instance of PretrainedModel. Model weights are loaded by specifying the name of a built-in pretrained model, the name of a community-contributed model, or a local file directory path.

Parameters
pretrained_model_name_or_path (str) – Name of the pretrained model or a directory path to load from. The string can be:
- Name of a built-in pretrained model.
- Name of a pretrained model from the HF hub.
- Name of a community-contributed pretrained model.
- Local directory path containing the model weights file ("model_state.pdparams") and the model config file ("model_config.json").
from_hf_hub (bool, optional) – Whether to load from the Huggingface Hub.
subfolder (str, optional) – Only works when loading from the Huggingface Hub.
*args (tuple) – Positional arguments for the model __init__. If provided, they are used as positional argument values for model initialization.
**kwargs (dict) – Keyword arguments for the model __init__. If provided, they update the pre-defined keyword argument values for model initialization. If a keyword is among the __init__ argument names of the base model, the base model's argument value is updated; otherwise the derived model's argument value is updated.
load_state_as_np (bool, optional) – Controls whether the loaded weights are placed on CPU or on the default device. If True, the model weights are loaded as numpy.ndarray on CPU. Otherwise, the weights are loaded as tensors on the default device. Note that on GPU the latter creates extra temporary tensors in addition to the model weights, which doubles the memory usage, so True is suggested for big models on GPU. Defaults to False.
Returns
An instance of PretrainedModel.

Return type
PretrainedModel

Example
from paddlenlp.transformers import BertForSequenceClassification

# Name of built-in pretrained model
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')

# Name of community-contributed pretrained model
model = BertForSequenceClassification.from_pretrained('yingyibiao/bert-base-uncased-sst-2-finetuned')

# Load from local directory path
model = BertForSequenceClassification.from_pretrained('./my_bert/')
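The same pattern applies to the ErnieGen classes documented on this page. A minimal sketch, assuming the weight name 'ernie-gen-base-en' is registered in your paddlenlp version; load_state_as_np=True follows the memory note above and is optional:

from paddlenlp.transformers import ErnieForGeneration

# Keep the checkpoint as numpy arrays on CPU while loading (see load_state_as_np above).
# The weight name below is an assumption and may differ between paddlenlp releases.
model = ErnieForGeneration.from_pretrained('ernie-gen-base-en', load_state_as_np=True)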
base_model_class¶
alias of paddlenlp.transformers.ernie_gen.modeling.ErnieModel
class ErnieForGeneration(cfg, name=None)[source]¶
Bases: paddlenlp.transformers.ernie_gen.modeling.ErnieModel

Ernie Model for sequence-to-sequence generation.

This model inherits from ErnieModel. Refer to the superclass documentation for the generic methods.
forward(*args, **kwargs)[source]¶

Parameters
tgt_labels (Tensor, optional) – The ground-truth target sequence ids (hard labels) or distribution (soft labels). Its data type should be int64, and it has a shape of [batch_size, sequence_length] or [batch_size, sequence_length, sequence_length].
tgt_pos (Tensor, optional) – Indices of tgt_labels in src_ids. Its data type should be int64, and it has a shape of [n_targets, 2].
encode_only (bool, optional) – Whether the model only encodes the inputs instead of outputting logits. If encode_only is True, loss and logits_2d are not returned.
Returns
Returns a tuple (None, None, info) if encode_only is True; returns (output_ids, logits, info) if tgt_labels or tgt_pos is None; otherwise returns (loss, logits_2d, info).

With the fields:
- `info` (dict):
  Middle-level info, including all hidden states and k/v caches.
- `output_ids` (Tensor):
  The output indices. Its data type should be float32 and its shape is [batch_size]. If encode_only, returns None.
- `logits` (Tensor):
  Logits for every target. Its data type should be float32 and its shape is [batch_size, sequence_length]. If encode_only, returns None.
- `loss` (Tensor):
  Cross entropy loss averaged over every target label. If encode_only, returns None.
- `logits_2d` (Tensor):
  Logits for every target if tgt_labels or tgt_pos is not None. Its data type should be float32 and its shape is [batch_size, sequence_length].
Return type
tuple
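A minimal sketch of the return patterns described above. Only tgt_labels, tgt_pos and encode_only come from the documentation of forward; the positional inputs (src_ids, sent_ids) are assumptions based on the encoder inputs inherited from ErnieModel, and the weight name 'ernie-gen-base-en' is assumed to be registered in your paddlenlp version:

import paddle
from paddlenlp.transformers import ErnieForGeneration

# Weight name is an assumption; substitute the checkpoint you actually use.
model = ErnieForGeneration.from_pretrained('ernie-gen-base-en')
model.eval()

# Dummy token ids standing in for a tokenized source sequence.
src_ids = paddle.to_tensor([[1, 2, 3, 4, 5]], dtype='int64')
sent_ids = paddle.zeros_like(src_ids)

# encode_only=True -> (None, None, info); info carries hidden states and k/v caches.
_, _, info = model(src_ids, sent_ids, encode_only=True)

# Without tgt_labels/tgt_pos -> (output_ids, logits, info).
output_ids, logits, info = model(src_ids, sent_ids)

# Passing tgt_labels and tgt_pos (shape [n_targets, 2]) instead would return
# (loss, logits_2d, info), as documented above.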
ErnieGenModel¶
alias of paddlenlp.transformers.ernie_gen.modeling.ErnieForGeneration