compute_prediction(examples, features, predictions, version_2_with_negative=False, n_best_size=20, max_answer_length=30, null_score_diff_threshold=0.0)
Post-processes the predictions of a question-answering model to convert them to answers that are substrings of the original contexts. This is the base postprocessing function for models that only return start and end logits.
examples (list) -- List of raw squad-style data (see run_squad.py for more information).
features (list) -- List of processed squad-style features (see run_squad.py for more information).
predictions (tuple) -- The predictions of the model. Should be a tuple of two lists containing the start logits and the end logits.
version_2_with_negative (bool, optional) -- Whether the dataset contains examples with no answers. Defaults to False.
n_best_size (int, optional) -- The total number of candidate predictions to generate. Defaults to 20.
max_answer_length (int, optional) -- The maximum length of a predicted answer. Defaults to 30.
null_score_diff_threshold (float, optional) -- The threshold used to select the null answer. Only useful when version_2_with_negative is True. Defaults to 0.0.
Returns a tuple of three dictionaries containing the final selected answer, all n_best answers along with their probabilities and scores, and the score_diff of each example.
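The roles of n_best_size, max_answer_length and null_score_diff_threshold can be illustrated with a minimal, library-independent sketch. The helpers best_spans and select_answer below are hypothetical names introduced for illustration, not part of this module, and they work on token indices only; the actual function additionally maps spans back to substrings of the context. The null-answer rule shown (predict "no answer" when the null score exceeds the best span score by more than the threshold) follows common SQuAD v2 post-processing and is an assumption about how the threshold is applied here.

```python
def best_spans(start_logits, end_logits, n_best_size=20, max_answer_length=30):
    """Return up to n_best_size (start, end, score) spans, best first.

    Hypothetical helper: scores a span as start_logit + end_logit.
    """
    # Indices of the n_best_size highest start and end logits.
    top_starts = sorted(range(len(start_logits)),
                        key=lambda i: start_logits[i], reverse=True)[:n_best_size]
    top_ends = sorted(range(len(end_logits)),
                      key=lambda i: end_logits[i], reverse=True)[:n_best_size]

    candidates = []
    for s in top_starts:
        for e in top_ends:
            # Skip spans that end before they start or exceed the length limit.
            if e < s or e - s + 1 > max_answer_length:
                continue
            candidates.append((s, e, start_logits[s] + end_logits[e]))

    candidates.sort(key=lambda c: c[2], reverse=True)
    return candidates[:n_best_size]


def select_answer(spans, null_score, null_score_diff_threshold=0.0):
    """Return the best span, or None when the null answer wins (assumed rule)."""
    if not spans:
        return None
    score_diff = null_score - spans[0][2]
    return None if score_diff > null_score_diff_threshold else spans[0]


start_logits = [0.1, 2.0, 0.3, 0.2]
end_logits = [0.0, 0.5, 3.0, 0.1]
spans = best_spans(start_logits, end_logits)
print(spans[0])                               # (1, 2, 5.0)
print(select_answer(spans, null_score=6.0))   # None
```

Raising null_score_diff_threshold in this sketch makes the null answer harder to select, which is why the threshold only matters when unanswerable examples are present.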
squad_evaluate(examples, preds, na_probs=None, na_prob_thresh=1.0, is_whitespace_splited=True)
Computes and prints the F1 score and EM (exact match) score of the input predictions.
examples (list) -- List of raw squad-style data (see run_squad.py <https://github.com/PaddlePaddle/PaddleNLP/blob/develop/examples/machine_reading_comprehension/SQuAD/run_squad.py> for more information).
preds (dict) -- Dictionary of final predictions. Usually generated by
na_probs (dict, optional) -- Dictionary of the score_diffs of each example. Used to decide whether an answer exists and to compute the best score_diff threshold for the null answer. Defaults to None.
na_prob_thresh (float, optional) -- The threshold used to select the null answer. Defaults to 1.0.
is_whitespace_splited (bool, optional) -- Whether the predictions and references can be tokenized by whitespace. Usually set to True for English and False for Chinese. Defaults to True.
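The effect of is_whitespace_splited on the scores can be seen in a simplified sketch of EM and token-level F1. The functions tokenize, exact_match and f1_score are hypothetical helpers written for illustration; they only lowercase the text and omit the punctuation and article stripping that SQuAD-style evaluation normally applies, so the numbers may differ from what squad_evaluate reports.

```python
from collections import Counter


def tokenize(text, is_whitespace_splited=True):
    # Whitespace tokens for English-like text, single characters otherwise.
    return text.split() if is_whitespace_splited else list(text.replace(" ", ""))


def exact_match(gold, pred):
    # 1 if the answers match after trimming and lowercasing, else 0.
    return int(gold.strip().lower() == pred.strip().lower())


def f1_score(gold, pred, is_whitespace_splited=True):
    gold_toks = tokenize(gold.lower(), is_whitespace_splited)
    pred_toks = tokenize(pred.lower(), is_whitespace_splited)
    if not gold_toks or not pred_toks:
        # No-answer case: full credit only when both sides are empty.
        return float(gold_toks == pred_toks)
    # Multiset intersection counts tokens shared by both answers.
    common = sum((Counter(gold_toks) & Counter(pred_toks)).values())
    if common == 0:
        return 0.0
    precision = common / len(pred_toks)
    recall = common / len(gold_toks)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Paris", "paris"))                              # 1
print(f1_score("北京市", "北京", is_whitespace_splited=True))      # 0.0
print(f1_score("北京市", "北京", is_whitespace_splited=False) > 0)  # True
```

In the Chinese example, whitespace splitting treats each answer as a single token, so a partially correct prediction gets no credit; character-level tokens recover the overlap.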