model_utils¶
class PretrainedModel(*args, **kwargs)[source]¶
The base class for all pretrained models. It mainly provides common methods for loading (construction and loading) and saving pretrained models. Loading and saving also rely on the following class attributes, which should be overridden by derived classes accordingly:
- model_config_file (str): File name of the model configuration used for saving and loading in the local file system. The value is model_config.json.
- resource_files_names (dict): Names of the local files where model resources can be saved and loaded. Currently, resources only include the model state, so the dict contains only 'model_state' as key, with 'model_state.pdparams' as the corresponding value, for model weights saving and loading.
- pretrained_init_configuration (dict): Provides the model configurations of built-in pretrained models (as opposed to models in the local file system). Keys are pretrained model names (such as bert-base-uncased), and values are dicts preserving the corresponding configuration for model initialization.
- pretrained_resource_files_map (dict): Provides resource URLs of built-in pretrained models (as opposed to models in the local file system). It has the same key as resource_files_names (that is, "model_state"), and the corresponding value is a dict mapping specific model names to model weights URLs (such as "bert-base-uncased" -> "https://bj.bcebos.com/paddlenlp/models/transformers/bert-base-uncased.pdparams").
- base_model_prefix (str): Name of the attribute that holds the base model in derived classes of the same architecture which add layers on top of the base model. Note: a base model class is a pretrained model class decorated by register_base_model, such as BertModel; a derived model class is a pretrained model class that adds layers on top of the base model and has the base model as an attribute, such as BertForSequenceClassification.
Methods common to models for text generation are defined in GenerationMixin and are also inherited here.
Besides, the metaclass InitTrackerMeta is used to create PretrainedModel, by which subclasses can track arguments for initialization automatically.
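The sketch below illustrates how a hypothetical model family might set these class attributes; the classes, configuration values, and URL are purely illustrative and are not part of PaddleNLP:

import paddle.nn as nn
from paddlenlp.transformers import PretrainedModel, register_base_model

class ToyPretrainedModel(PretrainedModel):
    # File names used when saving to / loading from a local directory.
    model_config_file = "model_config.json"
    resource_files_names = {"model_state": "model_state.pdparams"}
    # Built-in model names mapped to init configurations and weight URLs.
    pretrained_init_configuration = {
        "toy-base": {"vocab_size": 100, "hidden_size": 8},
    }
    pretrained_resource_files_map = {
        "model_state": {"toy-base": "https://example.com/toy-base.pdparams"},
    }
    # Derived classes store the base model under this attribute name.
    base_model_prefix = "toy"

@register_base_model
class ToyModel(ToyPretrainedModel):
    def __init__(self, vocab_size=100, hidden_size=8):
        super().__init__()
        self.embeddings = nn.Embedding(vocab_size, hidden_size)

    def forward(self, input_ids):
        return self.embeddings(input_ids)

class ToyForSequenceClassification(ToyPretrainedModel):
    def __init__(self, toy, num_classes=2):
        super().__init__()
        self.toy = toy  # attribute name matches base_model_prefix
        self.classifier = nn.Linear(8, num_classes)

    def forward(self, input_ids):
        return self.classifier(self.toy(input_ids))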
property base_model¶
The body of the same model architecture. It is the base model itself for a base model, or the base model attribute for a derived model.
- Type
PretrainedModel
property model_name_list¶
Contains all supported built-in pretrained model names of the current PretrainedModel class.
- Type
list
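A brief usage sketch of the two properties above; the printed model names are illustrative:

from paddlenlp.transformers import BertModel, BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained('bert-base-uncased')

# base_model is the underlying BertModel attribute of the derived model.
assert isinstance(model.base_model, BertModel)

# model_name_list contains the built-in pretrained model names of this class.
print(model.model_name_list)  # e.g. ['bert-base-uncased', 'bert-large-uncased', ...]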
get_input_embeddings() → paddle.nn.layer.common.Embedding[source]¶
Get the input embeddings of the model.
- Returns
The input embedding layer of the model.
- Return type
nn.Embedding
set_input_embeddings(value: paddle.nn.layer.common.Embedding)[source]¶
Set new input embeddings for the model.
- Parameters
value (Embedding) – the new input embeddings to set for the model
- Raises
NotImplementedError – if the model has not implemented the set_input_embeddings method
get_output_embeddings() → Optional[paddle.nn.layer.common.Embedding][source]¶
To be overridden by models with output embeddings.
- Returns
The output embedding layer of the model, if any.
- Return type
Optional[Embedding]
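A usage sketch of the three embedding accessors above, assuming the loaded model implements them for its input embeddings; the shapes in the comments are illustrative:

import paddle.nn as nn
from paddlenlp.transformers import BertModel

model = BertModel.from_pretrained('bert-base-uncased')

# Inspect the current input embedding layer.
old_embeddings = model.get_input_embeddings()
print(old_embeddings.weight.shape)  # e.g. [30522, 768]

# Swap in a freshly initialized embedding of the same shape
# (illustrative only; real code would usually copy or resize the pretrained weights).
new_embeddings = nn.Embedding(*old_embeddings.weight.shape)
model.set_input_embeddings(new_embeddings)

# get_output_embeddings returns None for models that do not expose output embeddings.
print(model.get_output_embeddings())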
resize_position_embeddings(new_num_position_embeddings: int)[source]¶
Resizes the position embeddings of the model. This method should be overridden by downstream models.
- Parameters
new_num_position_embeddings (int) – the new number of position embeddings
- Raises
NotImplementedError – if called on a model that has not implemented this method
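Since the base implementation only raises NotImplementedError, downstream models are expected to provide their own override. Below is a rough, hypothetical sketch of such an override, assuming the model keeps its position embeddings at self.embeddings.position_embeddings; it is not the actual implementation of any particular PaddleNLP model:

import paddle
import paddle.nn as nn

def resize_position_embeddings(self, new_num_position_embeddings: int):
    # Hypothetical method body, meant to live inside a concrete model class.
    old = self.embeddings.position_embeddings
    old_num, dim = old.weight.shape
    new = nn.Embedding(new_num_position_embeddings, dim)
    # Copy over the overlapping rows of the pretrained position embeddings.
    num_to_copy = min(old_num, new_num_position_embeddings)
    new_weight = new.weight.numpy()
    new_weight[:num_to_copy] = old.weight.numpy()[:num_to_copy]
    new.weight.set_value(paddle.to_tensor(new_weight))
    self.embeddings.position_embeddings = new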
classmethod constructed_from_pretrained_config(init_func=None) → bool[source]¶
Check whether the model is constructed from a PretrainedConfig.
- Returns
True if the model is constructed from a PretrainedConfig, False otherwise.
- Return type
bool
classmethod from_pretrained(pretrained_model_name_or_path, *args, from_hf_hub=False, subfolder=None, **kwargs)[source]¶
Creates an instance of PretrainedModel. Model weights are loaded by specifying the name of a built-in pretrained model, a community-contributed model, or a local file directory path.
- Parameters
pretrained_model_name_or_path (str) – Name of the pretrained model or directory path to load from. The string can be:
  - Name of a built-in pretrained model
  - Name of a pretrained model from HF Hub
  - Name of a community-contributed pretrained model
  - Local directory path which contains the model weights file ("model_state.pdparams") and the model config file ("model_config.json")
from_hf_hub (bool, optional) – Whether to load from the Hugging Face Hub.
subfolder (str, optional) – Only works when loading from the Hugging Face Hub.
*args (tuple) – Positional arguments for the model __init__. If provided, use these as positional argument values for model initialization.
**kwargs (dict) – Keyword arguments for the model __init__. If provided, use these to update pre-defined keyword argument values for model initialization. If a keyword is among the __init__ argument names of the base model, the base model's argument value is updated; otherwise, the derived model's argument value is updated.
load_state_as_np (bool, optional) – Whether to load the model weights as numpy.ndarray on CPU. If True, the weights are loaded as numpy.ndarray on CPU; otherwise, they are loaded as tensors on the default device. Note that on GPU the latter creates extra temporary tensors in addition to the model weights, which roughly doubles the memory usage. Thus it is suggested to use True for big models on GPU. Defaults to False.
- Returns
An instance of PretrainedModel.
- Return type
PretrainedModel
Example
from paddlenlp.transformers import BertForSequenceClassification

# Name of built-in pretrained model
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')

# Name of community-contributed pretrained model
model = BertForSequenceClassification.from_pretrained('yingyibiao/bert-base-uncased-sst-2-finetuned')

# Load from local directory path
model = BertForSequenceClassification.from_pretrained('./my_bert/')
get_model_config()[source]¶
Get the model configuration.
- Returns
The config of the model.
- Return type
config
save_model_config(save_dir: str)[source]¶
Saves the model configuration to a file named "config.json" under save_dir.
- Parameters
save_dir (str) – Directory to save the model config file into.
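A brief usage sketch of the two configuration helpers above; the output directory name is arbitrary and assumed writable:

import os
from paddlenlp.transformers import BertModel

model = BertModel.from_pretrained('bert-base-uncased')

# Inspect the configuration the model was constructed with.
print(model.get_model_config())

# Persist only the configuration (no weights) as ./bert_config/config.json.
os.makedirs('./bert_config', exist_ok=True)
model.save_model_config('./bert_config')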
save_pretrained(save_dir: str)[source]¶
Saves the model configuration and related resources (model state) as files under save_dir. The model configuration is saved into a file named "model_config.json", and the model state is saved into a file named "model_state.pdparams".
The save_dir can then be used in from_pretrained as the value of pretrained_model_name_or_path to re-load the trained model.
- Parameters
save_dir (str) – Directory to save files into.
Example
from paddlenlp.transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained('bert-base-uncased')
model.save_pretrained('./trained_model/')

# reload from save_directory
model = BertForSequenceClassification.from_pretrained('./trained_model/')
save_to_hf_hub(repo_id: str, private: Optional[bool] = None, subfolder: Optional[str] = None, commit_message: Optional[str] = None, revision: Optional[str] = None, create_pr: bool = False)[source]¶
Uploads all elements of this model to a new Hugging Face Hub repository.
- Parameters
repo_id (str) – Repository name for your model/tokenizer in the Hub.
private (bool, optional) – Whether the model/tokenizer repository is set to private.
subfolder (str, optional) – Push to a subfolder of the repo instead of the root.
commit_message (str, optional) – Defaults to f"Upload {path_in_repo} with huggingface_hub".
revision (str, optional) –
create_pr (bool, optional) – If revision is not set, the PR is opened against the "main" branch. If revision is set and is a branch, the PR is opened against this branch. If revision is set and is not a branch name (for example, a commit oid), a RevisionNotFoundError is returned by the server.
- Returns
The URL of the commit of your model in the given repository.
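A hedged usage sketch: it assumes you are already authenticated with the Hugging Face Hub (for example via huggingface-cli login), and the repository name is purely illustrative:

from paddlenlp.transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained('bert-base-uncased')

# Push the model weights and config to a (new) private Hub repository.
url = model.save_to_hf_hub(
    repo_id="your-username/bert-sst2-demo",
    private=True,
    commit_message="Upload fine-tuned model",
)
print(url)  # URL of the commit in the created repository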
resize_token_embeddings(new_num_tokens: Optional[int] = None) → paddle.nn.layer.common.Embedding[source]¶
Resizes the input token embedding matrix of the model according to new_num_tokens.
- Parameters
new_num_tokens (Optional[int]) – The new number of tokens in the embedding matrix. Increasing the size adds newly initialized vectors at the end; reducing the size removes vectors from the end. If not provided or None, just returns a pointer to the input token embedding module of the model without doing anything.
- Returns
The input token embedding module of the model.
- Return type
paddle.nn.Embedding
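A short usage sketch; the vocabulary sizes in the comments are illustrative:

from paddlenlp.transformers import BertModel

model = BertModel.from_pretrained('bert-base-uncased')
old_embeddings = model.get_input_embeddings()
print(old_embeddings.weight.shape)  # e.g. [30522, 768]

# Grow the vocabulary by two tokens; the two new rows are randomly initialized.
new_embeddings = model.resize_token_embeddings(old_embeddings.weight.shape[0] + 2)
print(new_embeddings.weight.shape)  # e.g. [30524, 768]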
classmethod from_pretrained_v2(pretrained_model_name_or_path, from_hf_hub: bool = False, subfolder: str | None = None, *args, **kwargs)[source]¶
Creates an instance of PretrainedModel. Model weights are loaded by specifying the name of a built-in pretrained model, a pretrained model from HF Hub, a community-contributed model, or a local file directory path.
- Parameters
pretrained_model_name_or_path (str) – Name of the pretrained model or directory path to load from. The string can be:
  - Name of a built-in pretrained model
  - Name of a pretrained model from HF Hub
  - Name of a community-contributed pretrained model
  - Local directory path which contains the model weights file ("model_state.pdparams") and the model config file ("model_config.json")
from_hf_hub (bool) – Whether to load the model from the Hugging Face Hub. Defaults to False.
subfolder (str, optional) – Only works when loading from the Hugging Face Hub.
*args (tuple) – Positional arguments for the model __init__. If provided, use these as positional argument values for model initialization.
**kwargs (dict) – Keyword arguments for the model __init__. If provided, use these to update pre-defined keyword argument values for model initialization. If a keyword is among the __init__ argument names of the base model, the base model's argument value is updated; otherwise, the derived model's argument value is updated.
load_state_as_np (bool, optional) – Whether to load the model weights as numpy.ndarray on CPU. If True, the weights are loaded as numpy.ndarray on CPU; otherwise, they are loaded as tensors on the default device. Note that on GPU the latter creates extra temporary tensors in addition to the model weights, which roughly doubles the memory usage. Thus it is suggested to use True for big models on GPU. Defaults to False.
- Returns
An instance of PretrainedModel.
- Return type
PretrainedModel
Example
from paddlenlp.transformers import BertForSequenceClassification

# Name of built-in pretrained model
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')

# Name of pretrained model from PaddleHub
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')

# Name of community-contributed pretrained model
model = BertForSequenceClassification.from_pretrained('yingyibiao/bert-base-uncased-sst-2-finetuned', num_labels=3)

# Load from local directory path
model = BertForSequenceClassification.from_pretrained('./my_bert/')
save_pretrained_v2(save_dir: str)[source]¶
Saves the model configuration and related resources (model state) as files under save_dir. The model configuration is saved into a file named "model_config.json", and the model state is saved into a file named "model_state.pdparams".
The save_dir can then be used in from_pretrained as the value of pretrained_model_name_or_path to re-load the trained model.
- Parameters
save_dir (str) – Directory to save files into.
Example
from paddlenlp.transformers import BertForSequenceClassification

model = BertForSequenceClassification.from_pretrained('bert-base-uncased')
model.save_pretrained('./trained_model/')

# reload from save_directory
model = BertForSequenceClassification.from_pretrained('./trained_model/')
register_base_model(cls)[source]¶
A decorator for PretrainedModel classes. It first retrieves the parent class of the class being decorated, then sets the base_model_class attribute of that parent class to the class being decorated. In summary, the decorator registers the decorated class as the base model class for all derived classes under the same architecture.
- Parameters
cls (PretrainedModel) – The class (inherited from PretrainedModel) to be decorated.
- Returns
The input class cls after decorating.
- Return type
PretrainedModel
Example
from paddlenlp.transformers import BertModel, register_base_model

BertModel = register_base_model(BertModel)
assert BertModel.base_model_class == BertModel