modeling
class ErnieDualEncoder(query_model_name_or_path=None, title_model_name_or_path=None, share_parameters=False, output_emb_size=None, dropout=None, reinitialize=False, use_cross_batch=False)
Bases: paddle.fluid.dygraph.layers.Layer
This class wraps two ErnieEncoder models in a single model, so that both query embeddings and title embeddings can be obtained from one object. It also allows the two ErnieEncoder models to be trained at the same time.
Example

    import paddle
    from paddlenlp.transformers import ErnieDualEncoder, ErnieTokenizer

    model = ErnieDualEncoder("rocketqa-zh-dureader-query-encoder", "rocketqa-zh-dureader-para-encoder")
    tokenizer = ErnieTokenizer.from_pretrained("rocketqa-zh-dureader-query-encoder")

    inputs = tokenizer("Welcome to use PaddlePaddle and PaddleNLP!")
    inputs = {k: paddle.to_tensor([v]) for (k, v) in inputs.items()}

    # Get query embedding
    query_embedding = model.get_pooled_embedding(**inputs)

    # Get title embedding
    title_embedding = model.get_pooled_embedding(**inputs, is_query=False)
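Query and title embeddings such as the two obtained above are usually compared with a similarity score. The following is a minimal sketch of that comparison, assuming cosine similarity as the metric (this page does not say which metric the model is trained with) and using plain Python lists in place of paddle tensors. `cosine_similarity` is a hypothetical helper, not part of PaddleNLP.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Stand-ins for one query embedding and one title embedding converted to lists,
# e.g. via query_embedding.numpy()[0].tolist()
query_vec = [0.1, 0.3, -0.2]
title_vec = [0.2, 0.6, -0.4]
score = cosine_similarity(query_vec, title_vec)
```

Because the toy title vector is a positive multiple of the query vector, the two point in the same direction and the score is at its maximum.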
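get_pooled_embedding(input_ids, token_type_ids=None, position_ids=None, attention_mask=None, is_query=True)
Get the first feature of each sequence for classification. As the class example shows, is_query=True (the default) yields the query embedding and is_query=False yields the title embedding.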
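"The first feature of each sequence" can be read as the hidden state at position 0 of the encoder output, which for ERNIE-style inputs is the [CLS] token. A rough sketch of that selection on a nested list shaped [batch, seq_len, hidden] follows; `first_token_pooling` is an illustrative name, and the real method operates on paddle tensors rather than lists.

```python
def first_token_pooling(sequence_output):
    """Return the position-0 hidden state of every sequence in the batch.

    sequence_output: nested list shaped [batch, seq_len, hidden].
    """
    return [sequence[0] for sequence in sequence_output]


# Toy batch: 2 sequences, 3 tokens each, hidden size 2.
toy_output = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]],
]
pooled = first_token_pooling(toy_output)
```

On a tensor, the same selection is normally written as a slice such as `sequence_output[:, 0]`.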
forward(query_input_ids, pos_title_input_ids, neg_title_input_ids, is_prediction=False, query_token_type_ids=None, query_position_ids=None, query_attention_mask=None, pos_title_token_type_ids=None, pos_title_position_ids=None, pos_title_attention_mask=None, neg_title_token_type_ids=None, neg_title_position_ids=None, neg_title_attention_mask=None)
Defines the computation performed at every call. Should be overridden by all subclasses. This description and the parameter list below are the generic ones from the base Layer class; the arguments this method accepts are those in the signature above.
Parameters
*inputs (tuple) – unpacked tuple arguments
**kwargs (dict) – unpacked dict arguments
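The forward signature takes a query together with a positive and a negative title. One common way to train on such triples is to score the query against both titles and apply a softmax cross-entropy that favors the positive. The sketch below assumes dot-product scores and that loss purely for illustration, since this page does not document what forward computes or returns. `dot` and `triple_loss` are hypothetical names.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def triple_loss(query, pos_title, neg_title):
    """Cross-entropy over [score(q, pos), score(q, neg)] with the positive as target.

    Assumed formulation for illustration; not taken from the library.
    """
    scores = [dot(query, pos_title), dot(query, neg_title)]
    max_score = max(scores)  # subtract the max for numerical stability
    log_sum_exp = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
    return log_sum_exp - scores[0]


query = [1.0, 0.0]
# Positive title aligned with the query: low loss.
loss_good = triple_loss(query, pos_title=[2.0, 0.0], neg_title=[-2.0, 0.0])
# Positive title opposed to the query: high loss.
loss_bad = triple_loss(query, pos_title=[-2.0, 0.0], neg_title=[2.0, 0.0])
```

Under this assumed loss, the value shrinks as the query embedding moves closer to the positive title and away from the negative one.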
class ErnieCrossEncoder(pretrain_model_name_or_path, num_classes=2, reinitialize=False, dropout=None)
Bases: paddle.fluid.dygraph.layers.Layer
Example

    import paddle
    from paddlenlp.transformers import ErnieCrossEncoder, ErnieTokenizer

    model = ErnieCrossEncoder("rocketqa-zh-dureader-cross-encoder")
    tokenizer = ErnieTokenizer.from_pretrained("rocketqa-zh-dureader-cross-encoder")

    inputs = tokenizer("你们好", text_pair="你好")
    inputs = {k: paddle.to_tensor([v]) for (k, v) in inputs.items()}

    # Get embedding of text pair.
    embedding = model.matching(**inputs)
matching(input_ids, token_type_ids=None, position_ids=None, attention_mask=None, return_prob_distributation=False)
Use the pooled_output as the feature for pointwise prediction, e.g. RocketQAv1.
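In pointwise prediction each text pair is scored independently of any other pair. A minimal sketch, assuming the classifier emits num_classes=2 logits per pair and that index 1 is the positive (matching) class; the page does not state the class order, so that is an assumption. The `return_distribution` flag mirrors the idea suggested by the name return_prob_distributation but is a hypothetical parameter of this helper.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    max_logit = max(logits)
    exps = [math.exp(v - max_logit) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pointwise_score(pair_logits, return_distribution=False):
    """Score one text pair from its two class logits (hypothetical helper)."""
    probs = softmax(pair_logits)
    # Assumption: index 1 is the "match" class.
    return probs if return_distribution else probs[1]


match_prob = pointwise_score([0.0, 0.0])
full_dist = pointwise_score([0.0, 0.0], return_distribution=True)
```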
matching_v2(input_ids, token_type_ids=None, position_ids=None, attention_mask=None)
Use the [CLS] token embedding as the feature for listwise prediction, e.g. RocketQAv2.
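In listwise prediction the candidates for one query are considered jointly, so each candidate's probability depends on the others in the list. The sketch below assumes that each candidate yields one scalar score and that the scores are normalized with a softmax over the list; neither detail is stated on this page. `listwise_probs` is an illustrative helper, not a PaddleNLP function.

```python
import math


def listwise_probs(candidate_scores):
    """Softmax across all candidates of one query (assumed normalization)."""
    max_score = max(candidate_scores)
    exps = [math.exp(s - max_score) for s in candidate_scores]
    total = sum(exps)
    return [e / total for e in exps]


# Toy scores for three candidate passages of a single query.
probs = listwise_probs([2.0, 1.0, 0.0])
# Index of the highest-probability candidate.
best = max(range(len(probs)), key=probs.__getitem__)
```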
class ErnieEncoder(config: ErnieConfig, output_emb_size: int | None = None)
Bases: paddlenlp.transformers.ernie.modeling.ErniePretrainedModel