class ErnieGenPretrainedModel(*args, **kwargs)[source]

Bases: paddlenlp.transformers.model_utils.PretrainedModel

An abstract class for pretrained ErnieGen models. It provides the ErnieGen-related model_config_file, pretrained_init_configuration, resource_files_names, pretrained_resource_files_map, and base_model_prefix used for downloading and loading pretrained models. See PretrainedModel for more details.

classmethod from_pretrained(pretrained_model_name_or_path, *args, **kwargs)[source]

Creates an instance of PretrainedModel. Model weights are loaded by specifying the name of a built-in pretrained model, the name of a community-contributed model, or a local directory path.

  • pretrained_model_name_or_path (str) –

    Name of a pretrained model or directory path to load from. The string can be:

    • Name of a built-in pretrained model.

    • Name of a pretrained model from the Hugging Face Hub.

    • Name of a community-contributed pretrained model.

    • Local directory path which contains the model weights file (“model_state.pdparams”) and the model config file (“model_config.json”).

  • from_hf_hub (bool, optional) – Whether to load from the Hugging Face Hub.

  • subfolder (str, optional) – Only works when loading from the Hugging Face Hub.

  • *args (tuple) – Positional arguments for model __init__. If provided, these are used as positional argument values for model initialization.

  • **kwargs (dict) – Keyword arguments for model __init__. If provided, these update the pre-defined keyword argument values for model initialization. If a keyword is among the __init__ argument names of the base model, it updates the argument values of the base model; otherwise it updates the argument values of the derived model.

  • load_state_as_np (bool, optional) – The weights that are read in can be placed on CPU or GPU regardless of the model's default device. If True, the model weights are loaded as numpy.ndarray on CPU. Otherwise, the weights are loaded as tensors on the default device. Note that on GPU, the latter creates extra temporary tensors in addition to the model weights, which doubles the memory usage. It is therefore suggested to use True for big models on GPU. Defaults to False.
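
The routing rule described for **kwargs can be illustrated with a small stand-alone sketch. It is not PaddleNLP's implementation: BaseModel, DerivedModel and split_init_kwargs are hypothetical names introduced here, and the sketch only assumes that routing is decided by comparing each keyword against the base model's __init__ argument names.

```python
import inspect


class BaseModel:
    """Hypothetical stand-in for a base model; not a PaddleNLP class."""

    def __init__(self, hidden_size=768, num_layers=12):
        self.hidden_size = hidden_size
        self.num_layers = num_layers


class DerivedModel:
    """Hypothetical stand-in for a task model that wraps a base model."""

    def __init__(self, base, num_classes=2, dropout=0.1):
        self.base = base
        self.num_classes = num_classes
        self.dropout = dropout


def split_init_kwargs(base_cls, kwargs):
    """Split kwargs into (base_kwargs, derived_kwargs).

    A keyword goes to the base model if it matches one of the base
    model's __init__ argument names; otherwise it goes to the derived model.
    """
    base_names = set(inspect.signature(base_cls.__init__).parameters) - {"self"}
    base_kwargs = {k: v for k, v in kwargs.items() if k in base_names}
    derived_kwargs = {k: v for k, v in kwargs.items() if k not in base_names}
    return base_kwargs, derived_kwargs


# "hidden_size" is a base-model argument, "num_classes" is not.
base_kwargs, derived_kwargs = split_init_kwargs(
    BaseModel, {"hidden_size": 256, "num_classes": 3}
)
model = DerivedModel(BaseModel(**base_kwargs), **derived_kwargs)
```

Here hidden_size overrides the base model's default, while num_classes is passed to the derived model; arguments that are not supplied keep their pre-defined values.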


An instance of PretrainedModel.

Return type

PretrainedModel

from paddlenlp.transformers import BertForSequenceClassification

# Name of built-in pretrained model
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')

# Name of community-contributed pretrained model
model = BertForSequenceClassification.from_pretrained('yingyibiao/bert-base-uncased-sst-2-finetuned')

# Load from local directory path
model = BertForSequenceClassification.from_pretrained('./my_bert/')

alias of paddlenlp.transformers.ernie_gen.modeling.ErnieModel

class ErnieForGeneration(cfg, name=None)[source]

Bases: paddlenlp.transformers.ernie_gen.modeling.ErnieModel

Ernie Model for sequence to sequence generation.

This model inherits from ErnieModel. Refer to the superclass documentation for the generic methods.

forward(*args, **kwargs)[source]
  • tgt_labels (Tensor, optional) – The ground truth target sequence ids (hard label) or distribution (soft label). Its data type should be int64 and its shape is [batch_size, sequence_length] or [batch_size, sequence_length, sequence_length].

  • tgt_pos (Tensor, optional) – Index of tgt_labels in src_ids. Its data type should be int64 and its shape is [n_targets, 2].

  • encode_only (bool, optional) – Whether the model will output the logits or only encode the inputs. If encode_only is True, loss and logits_2d will not be returned.
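
To make the [n_targets, 2] shape of tgt_pos concrete, here is a minimal pure-Python sketch. It assumes, which the text above does not state explicitly, that each row of tgt_pos is a (batch index, sequence position) pair used in a gather_nd-style lookup. The helper gather_positions and the toy data are hypothetical and are not part of PaddleNLP.

```python
def gather_positions(encoded, tgt_pos):
    """Pick encoded[b][t] for every (b, t) pair in tgt_pos.

    Assumption: each row of tgt_pos is (batch index, sequence position).
    """
    return [encoded[b][t] for b, t in tgt_pos]


# Toy "encoded" output with batch_size=2 and sequence_length=4.
encoded = [
    ["a0", "a1", "a2", "a3"],
    ["b0", "b1", "b2", "b3"],
]

# n_targets=3 rows of (batch index, sequence position), i.e. shape [3, 2].
tgt_pos = [[0, 1], [0, 3], [1, 2]]

selected = gather_positions(encoded, tgt_pos)
```

Under that assumption, the number of selected entries equals n_targets rather than batch_size * sequence_length.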


Returns a tuple. If encode_only is True, it is (None, None, info). If tgt_labels or tgt_pos is None, it is (output_ids, logits, info). Otherwise, it is (loss, logits_2d, info).

With the fields:

  • `info`(dict):

    Intermediate information, including all hidden states and k/v caches.

  • `output_ids`(Tensor):

    The output index. Its data type should be float32 and its shape is [batch_size]. If encode_only, returns None.

  • `logits`(Tensor):

    Logits for every target. Its data type should be float32 and its shape is [batch_size, sequence_length]. If encode_only, returns None.

  • `loss`(Tensor):

    Mean cross-entropy loss over all target labels. If encode_only, returns None.

  • `logits_2d`(Tensor):

    Logits for every target if neither tgt_labels nor tgt_pos is None. Its data type should be float32 and its shape is [batch_size, sequence_length].
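
The three return cases can be summarized as a small decision function. This sketch only mirrors the branching described above; forward_output_names is a hypothetical helper that returns the names of the fields in each position and does not run or call the model.

```python
def forward_output_names(tgt_labels=None, tgt_pos=None, encode_only=False):
    """Return the names of the three tuple fields for a given call.

    Mirrors the documented cases only; a None entry means the
    corresponding position of the returned tuple is None.
    """
    if encode_only:
        # Only the encoder runs; no loss or logits are produced.
        return (None, None, "info")
    if tgt_labels is None or tgt_pos is None:
        # Without both targets and their positions, predictions are returned.
        return ("output_ids", "logits", "info")
    # Both targets and positions are given, so a loss can be computed.
    return ("loss", "logits_2d", "info")


# Placeholder values stand in for real tensors in this sketch.
inference_fields = forward_output_names()
training_fields = forward_output_names(tgt_labels=[1, 2], tgt_pos=[[0, 1], [0, 2]])
encode_fields = forward_output_names(encode_only=True)
```

Note that supplying only one of tgt_labels and tgt_pos falls into the second case, so callers that expect a loss need to pass both.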

Return type

tuple

alias of paddlenlp.transformers.ernie_gen.modeling.ErnieForGeneration