modeling

class ErnieDualEncoder(query_model_name_or_path=None, title_model_name_or_path=None, share_parameters=False, output_emb_size=None, dropout=None, reinitialize=False, use_cross_batch=False)[source]

Bases: paddle.fluid.dygraph.layers.Layer

This class wraps two ErnieEncoder models in a single model, so that query embeddings and title embeddings can both be obtained from one object. It also allows the two ErnieEncoder models to be trained at the same time.

Example

import paddle
from paddlenlp.transformers import ErnieDualEncoder, ErnieTokenizer

model = ErnieDualEncoder("rocketqa-zh-dureader-query-encoder", "rocketqa-zh-dureader-para-encoder")
tokenizer = ErnieTokenizer.from_pretrained("rocketqa-zh-dureader-query-encoder")

inputs = tokenizer("Welcome to use PaddlePaddle and PaddleNLP!")
# Add a batch dimension and convert each field to a paddle.Tensor
inputs = {k: paddle.to_tensor([v]) for k, v in inputs.items()}

# Get query embedding
query_embedding = model.get_pooled_embedding(**inputs)

# Get title embedding
title_embedding = model.get_pooled_embedding(**inputs, is_query=False)
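Once a query embedding and a title embedding are available, they are typically compared with a similarity score. The source does not say which measure is intended, so the sketch below assumes cosine similarity and uses short hypothetical vectors (`query_vec`, `title_vec`, `other_vec`) as stand-ins for one row of each embedding, converted to plain Python lists.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors given as lists."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector matches nothing.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical stand-ins for one row of query_embedding / title_embedding.
query_vec = [1.0, 0.0, 1.0]
title_vec = [1.0, 0.0, 1.0]
other_vec = [0.0, 1.0, 0.0]

same_score = cosine_similarity(query_vec, title_vec)  # identical direction
diff_score = cosine_similarity(query_vec, other_vec)  # orthogonal vectors
```

A higher score means the title lies closer to the query in embedding space. A dot product without normalization is another common choice.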

init_weights(layer)[source]

Initialization hook

get_pooled_embedding(input_ids, token_type_ids=None, position_ids=None, attention_mask=None, is_query=True)[source]

Get the first feature of each sequence for classification
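As a rough illustration of "the first feature of each sequence", the sketch below selects the vector at position 0 from a batch of per-token outputs. The function name `first_token_pooling` and the nested-list layout `[batch][seq_len][hidden]` are assumptions made for this example. The constructor's `output_emb_size` and `dropout` parameters suggest the real method may apply further steps, which are omitted here.

```python
def first_token_pooling(sequence_output):
    """Return the position-0 vector of every sequence in the batch.

    sequence_output is a nested list shaped [batch][seq_len][hidden].
    """
    return [tokens[0] for tokens in sequence_output]


# Hypothetical encoder output: 2 sequences, 3 tokens each, hidden size 2.
batch = [
    [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]],
    [[0.7, 0.8], [0.9, 1.0], [1.1, 1.2]],
]

pooled = first_token_pooling(batch)  # one vector per sequence
```

The result has one vector per input sequence, which is the shape a pooled embedding is expected to have.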

forward(query_input_ids, pos_title_input_ids, neg_title_input_ids, is_prediction=False, query_token_type_ids=None, query_position_ids=None, query_attention_mask=None, pos_title_token_type_ids=None, pos_title_position_ids=None, pos_title_attention_mask=None, neg_title_token_type_ids=None, neg_title_position_ids=None, neg_title_attention_mask=None)[source]

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters
  • *inputs (tuple) -- unpacked tuple arguments

  • **kwargs (dict) -- unpacked dict arguments

class ErnieCrossEncoder(pretrain_model_name_or_path, num_classes=2, reinitialize=False, dropout=None)[source]

Bases: paddle.fluid.dygraph.layers.Layer

Example

import paddle
from paddlenlp.transformers import ErnieCrossEncoder, ErnieTokenizer

model = ErnieCrossEncoder("rocketqa-zh-dureader-cross-encoder")
tokenizer = ErnieTokenizer.from_pretrained("rocketqa-zh-dureader-cross-encoder")

inputs = tokenizer("你们好", text_pair="你好")
inputs = {k: paddle.to_tensor([v]) for k, v in inputs.items()}

# Get embedding of text pair.
embedding = model.matching(**inputs)

init_weights(layer)[source]

Initialization hook

matching(input_ids, token_type_ids=None, position_ids=None, attention_mask=None, return_prob_distributation=False)[source]

Use the pooled_output as the feature for pointwise prediction, e.g. RocketQAv1.
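Pointwise prediction scores each text pair on its own. The sketch below shows one plausible reading under stated assumptions: the classifier yields `num_classes=2` logits per pair, a softmax turns them into probabilities, and class index 1 is taken to mean "relevant". The function `pointwise_scores` and its `return_distribution` flag are hypothetical; the flag only mirrors the role that `return_prob_distributation` appears to play in the signature above.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pointwise_scores(batch_logits, return_distribution=False):
    """Score each pair independently from its two-class logits.

    Assumption for this sketch: index 1 is the 'relevant' class.
    """
    probs = [softmax(row) for row in batch_logits]
    if return_distribution:
        return probs  # full probability distribution per pair
    return [row[1] for row in probs]  # probability of the relevant class


# Hypothetical logits for two text pairs.
logits = [[0.0, 0.0], [0.0, 2.0]]
scores = pointwise_scores(logits)
distributions = pointwise_scores(logits, return_distribution=True)
```

Because each pair is scored independently, the outputs can be thresholded or sorted without reference to other candidates.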

matching_v2(input_ids, token_type_ids=None, position_ids=None, attention_mask=None)[source]

Use the cls token embedding as the feature for listwise prediction, e.g. RocketQAv2.

matching_v3(input_ids, token_type_ids=None, position_ids=None, attention_mask=None)[source]

Use the pooled_output as the feature for listwise prediction, e.g. ERNIE-Search.
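Listwise prediction, in contrast to pointwise, compares all candidates for one query together. A minimal sketch, assuming each candidate receives a single scalar score and that training uses a softmax cross-entropy over the candidate list (the source does not confirm the loss). The names `listwise_cross_entropy` and `rank_candidates` are hypothetical.

```python
import math


def listwise_cross_entropy(scores, positive_index):
    """Negative log-probability of the positive candidate under a
    softmax taken over all candidate scores for one query."""
    m = max(scores)
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scores))
    return log_sum_exp - scores[positive_index]


def rank_candidates(scores):
    """Indices of candidates sorted from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Hypothetical scores for four candidate passages of a single query.
uniform_loss = listwise_cross_entropy([0.0, 0.0, 0.0, 0.0], positive_index=0)
confident_loss = listwise_cross_entropy([5.0, 0.0, 0.0, 0.0], positive_index=0)
ranking = rank_candidates([0.2, 1.5, -0.3, 0.9])
```

With uniform scores the loss equals the log of the number of candidates, and it shrinks as the positive candidate's score rises relative to the others.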

forward(input_ids, token_type_ids=None, position_ids=None, attention_mask=None, labels=None)[source]

Defines the computation performed at every call. Should be overridden by all subclasses.

Parameters
  • *inputs (tuple) -- unpacked tuple arguments

  • **kwargs (dict) -- unpacked dict arguments