bleu
- class BLEU(trans_func=None, vocab=None, n_size=4, weights=None, name='bleu')
Bases: Metric
BLEU (bilingual evaluation understudy) is an algorithm for evaluating the quality of text that has been machine-translated from one natural language to another. The metric uses a modified form of precision to compare a candidate translation against multiple reference translations.
BLEU can be used either as a paddle.metric.Metric class or as an ordinary class. When BLEU is used as a paddle.metric.Metric class, a function is needed that transforms the network output into candidate strings and the label into reference string lists. By default, default_trans_func is provided, which obtains the target sequence ids by taking the maximum-probability token at each step. In this case, the user must provide vocab. Note that the BLEU computed here differs from the BLEU calculated in prediction; it is intended only for observation during training and evaluation.

\[
\begin{aligned}
BP &= \begin{cases} 1, & \text{if } c > r \\ e^{1 - r/c}, & \text{if } c \leq r \end{cases} \\
BLEU &= BP \cdot \exp\left(\sum_{n=1}^{N} w_{n} \log p_{n}\right)
\end{aligned}
\]

where c is the length of the candidate sentence and r is the length of the reference sentence.
- Parameters:
trans_func (callable, optional) -- Function that transforms the network output and label into the strings on which BLEU is computed.
vocab (dict|paddlenlp.data.vocab, optional) -- Vocab for the target language. If trans_func is None and BLEU is used as a paddle.metric.Metric instance, default_trans_func is applied and vocab must be provided.
n_size (int, optional) -- Maximum n-gram order used by the BLEU metric. Defaults to 4.
weights (list, optional) -- The weights of the precision for each n-gram order. Defaults to None.
name (str, optional) -- Name of the paddle.metric.Metric instance. Defaults to "bleu".
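The brevity penalty and the weighted geometric mean in the formula above can be sketched in plain Python. This is a minimal illustration, not PaddleNLP's implementation: the function names brevity_penalty and bleu_from_precisions are hypothetical, the n-gram precisions are assumed to be already computed and strictly positive, c is assumed to be greater than zero, and uniform weights 1/N are an assumption made here for the case where no weights are given.

```python
import math


def brevity_penalty(c, r):
    """BP = 1 if c > r, else exp(1 - r / c)."""
    if c > r:
        return 1.0
    return math.exp(1.0 - r / c)


def bleu_from_precisions(precisions, c, r, weights=None):
    """BLEU = BP * exp(sum_n w_n * log p_n)."""
    n = len(precisions)
    if weights is None:
        # Assumption for this sketch: uniform weights over the N orders.
        weights = [1.0 / n] * n
    log_mean = sum(w * math.log(p) for w, p in zip(weights, precisions))
    return brevity_penalty(c, r) * math.exp(log_mean)


# Made-up precisions for a 9-token candidate and a 10-token reference.
score = bleu_from_precisions([0.8, 0.6, 0.4, 0.2], c=9, r=10)
print(round(score, 4))
```

Because the precisions enter through their logarithms, a single zero precision would make the sum undefined, which is why the sketch requires strictly positive inputs.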
Examples
Using as a general evaluation object.

    from paddlenlp.metrics import BLEU

    bleu = BLEU()
    cand = ["The", "cat", "The", "cat", "on", "the", "mat"]
    ref_list = [["The", "cat", "is", "on", "the", "mat"],
                ["There", "is", "a", "cat", "on", "the", "mat"]]
    bleu.add_inst(cand, ref_list)
    print(bleu.score())  # 0.4671379777282001
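To see where the printed score comes from, the modified (clipped) n-gram precisions for this candidate can be reproduced by hand. The sketch below is plain Python and independent of PaddleNLP; ngram_counts and clipped_precision are hypothetical helper names, and uniform weights over 1-grams to 4-grams are assumed.

```python
from collections import Counter
from fractions import Fraction
import math


def ngram_counts(tokens, n):
    """Count every contiguous n-gram in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(cand, ref_list, n):
    """Modified precision: each candidate n-gram count is clipped to the
    maximum count of that n-gram in any single reference."""
    cand_counts = ngram_counts(cand, n)
    max_ref = Counter()
    for ref in ref_list:
        for gram, count in ngram_counts(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    matched = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return Fraction(matched, sum(cand_counts.values()))


cand = ["The", "cat", "The", "cat", "on", "the", "mat"]
ref_list = [["The", "cat", "is", "on", "the", "mat"],
            ["There", "is", "a", "cat", "on", "the", "mat"]]

precisions = [clipped_precision(cand, ref_list, n) for n in range(1, 5)]
print(precisions)

# c = 7 and r is 6 or 7, so the brevity penalty is 1 for either reference.
score = math.exp(sum(0.25 * math.log(float(p)) for p in precisions))
print(score)
```

The four precisions are 5/7, 4/6, 2/5, and 1/4, whose product is 1/21. With a brevity penalty of 1, the score is the fourth root of 1/21, about 0.4671, which agrees with the value printed by bleu.score() in the example.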
Using as an instance of paddle.metric.Metric.

    # You could add the code below to the Seq2Seq example in this repo to
    # use BLEU as a `paddle.metric.Metric` class. If you run the
    # following code alone, you may get an error, because names such as
    # src_vocab, model, optimizer, CrossEntropyCriterion and ppl_metric
    # are expected to come from that example.
    # log example:
    # Epoch 1/12
    # step 100/507 - loss: 308.7948 - Perplexity: 541.5600 - bleu: 2.2089e-79 - 923ms/step
    # step 200/507 - loss: 264.2914 - Perplexity: 334.5099 - bleu: 0.0093 - 865ms/step
    # step 300/507 - loss: 236.3913 - Perplexity: 213.2553 - bleu: 0.0244 - 849ms/step

    from paddlenlp.data import Vocab
    from paddlenlp.metrics import BLEU

    bleu_metric = BLEU(vocab=src_vocab.idx_to_token)
    model.prepare(optimizer, CrossEntropyCriterion(), [ppl_metric, bleu_metric])
- update(output, label, seq_mask=None)
Updates the states of the metric.
The inputs of update are the outputs of Metric.compute. If compute is not defined, the inputs of update are the flattened outputs of the model and the labels from the data: update(output1, output2, ..., label1, label2, ...).
See Metric.compute.
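The default transformation described for this class (taking the most probable token id at each step and mapping it through vocab) can be illustrated with plain Python lists. This is a hypothetical sketch, not PaddleNLP's default_trans_func: the name greedy_tokens, the nested-list input layout [batch][step][vocab_id], and the 0/1 handling of seq_mask are assumptions made for illustration.

```python
def greedy_tokens(output, vocab, seq_mask=None):
    """Turn per-step scores into token lists by taking the argmax.

    output:   nested list shaped [batch][step][vocab_id] of scores.
    vocab:    dict mapping token id -> token string.
    seq_mask: optional nested list [batch][step]; 0 marks padding steps.
    """
    sentences = []
    for b, steps in enumerate(output):
        tokens = []
        for t, scores in enumerate(steps):
            if seq_mask is not None and not seq_mask[b][t]:
                continue  # skip padded positions
            best_id = max(range(len(scores)), key=scores.__getitem__)
            tokens.append(vocab[best_id])
        sentences.append(tokens)
    return sentences


vocab = {0: "<pad>", 1: "the", 2: "cat"}
output = [[[0.1, 0.7, 0.2], [0.2, 0.1, 0.7], [0.9, 0.05, 0.05]]]
seq_mask = [[1, 1, 0]]
print(greedy_tokens(output, vocab, seq_mask))
```

Each resulting token list could then serve as a candidate sentence for add_inst, paired with the reference token lists derived from the label.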
- add_inst(cand, ref_list)
Update the states based on a pair of candidate and references.
- Parameters:
cand (list) -- Tokenized candidate sentence.
ref_list (list of list) -- List of tokenized ground truth sentences.
- class BLEUForDuReader(n_size=4, alpha=1.0, beta=1.0)
Bases: BLEU
BLEU metric with bonus for the DuReader contest.
Please refer to the DuReader Homepage (https://ai.baidu.com//broad/subordinate?dataset=dureader) for more details.
- Parameters:
n_size (int, optional) -- Maximum n-gram order used by the BLEU metric. Defaults to 4.
alpha (float, optional) -- Weight of YesNo dataset when adding bonus for DuReader contest. Defaults to 1.0.
beta (float, optional) -- Weight of Entity dataset when adding bonus for DuReader contest. Defaults to 1.0.