squad

compute_prediction(examples, features, predictions, version_2_with_negative=False, n_best_size=20, max_answer_length=30, null_score_diff_threshold=0.0)

Post-processes the predictions of a question-answering model to convert them into answers that are substrings of the original contexts. This is the base post-processing function for models that return only start and end logits.

Parameters
  • examples (list) -- List of raw SQuAD-style data (see run_squad.py for more information).

  • features (list) -- List of processed SQuAD-style features (see run_squad.py for more information).

  • predictions (tuple) -- The predictions of the model. Should be a tuple of two lists containing the start logits and the end logits.

  • version_2_with_negative (bool, optional) -- Whether the dataset contains examples with no answers. Defaults to False.

  • n_best_size (int, optional) -- The total number of candidate predictions to generate. Defaults to 20.

  • max_answer_length (int, optional) -- The maximum length of a predicted answer. Defaults to 30.

  • null_score_diff_threshold (float, optional) -- The threshold used to select the null answer. Only useful when version_2_with_negative is True. Defaults to 0.0.

Returns

A tuple of three dictionaries: the final selected answer for each example, all n_best answers with their probabilities and scores, and the score_diff of each example.
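To make the roles of n_best_size, max_answer_length and null_score_diff_threshold concrete, here is a minimal sketch of span selection from start and end logits. It is not the library's implementation: select_span and is_null_answer are hypothetical helpers, the logits are assumed to cover context tokens only, and the null rule (predict "no answer" when null score minus best span score exceeds the threshold) is the convention commonly used for SQuAD 2.0, assumed here rather than taken from this page.

```python
import numpy as np


def select_span(start_logits, end_logits, n_best_size=20, max_answer_length=30):
    """Hypothetical helper: return (start, end, score) of the best valid span, or None."""
    start_logits = np.asarray(start_logits, dtype=float)
    end_logits = np.asarray(end_logits, dtype=float)
    # Keep only the n_best_size highest-scoring start and end positions.
    start_idx = np.argsort(start_logits)[::-1][:n_best_size]
    end_idx = np.argsort(end_logits)[::-1][:n_best_size]
    best = None
    for s in start_idx:
        for e in end_idx:
            # Discard spans that end before they start or exceed the length limit.
            if e < s or e - s + 1 > max_answer_length:
                continue
            score = start_logits[s] + end_logits[e]
            if best is None or score > best[2]:
                best = (int(s), int(e), float(score))
    return best


def is_null_answer(null_score, best_score, null_score_diff_threshold=0.0):
    """Hypothetical helper: assumed rule for choosing the empty answer."""
    score_diff = null_score - best_score
    return score_diff > null_score_diff_threshold


start = [0.1, 5.0, 0.2, 0.3]
end = [0.2, 0.1, 4.0, 6.0]
print(select_span(start, end))                       # longest allowed span wins
print(select_span(start, end, max_answer_length=2))  # length limit changes the choice
```

Under these assumptions, lowering max_answer_length rules out the span from token 1 to token 3, so the next best valid span is returned instead.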

squad_evaluate(examples, preds, na_probs=None, na_prob_thresh=1.0, is_whitespace_splited=True)

Computes and prints the F1 score and exact-match (EM) score of the input predictions.

Parameters
  • examples (list) -- List of raw SQuAD-style data (see run_squad.py for more information).

  • preds (dict) -- Dictionary of final predictions. Usually generated by compute_prediction.

  • na_probs (dict, optional) -- Dictionary of the score_diff of each example. Used to decide whether an answer exists and to compute the best score_diff threshold for the null answer. Defaults to None.

  • na_prob_thresh (float, optional) -- The threshold used to select the null answer. Defaults to 1.0.

  • is_whitespace_splited (bool, optional) -- Whether the predictions and references can be tokenized by whitespace. Usually set True for English and False for Chinese. Defaults to True.
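The effect of is_whitespace_splited and na_prob_thresh can be illustrated with a small sketch of token-level F1. This is a simplified illustration rather than the code squad_evaluate runs: tokenize, f1_score and apply_na_threshold are hypothetical helpers, answer normalization (lowercasing, punctuation and article removal) is omitted, and replacing a prediction with the empty string when its na_probs value exceeds the threshold is an assumption based on the usual SQuAD 2.0 evaluation convention.

```python
from collections import Counter


def tokenize(text, is_whitespace_splited=True):
    """Hypothetical helper: split on whitespace, or into characters (e.g. for Chinese)."""
    if is_whitespace_splited:
        return text.split()
    return [ch for ch in text if not ch.isspace()]


def f1_score(pred, ref, is_whitespace_splited=True):
    """Token-overlap F1 between one prediction and one reference."""
    pred_tokens = tokenize(pred, is_whitespace_splited)
    ref_tokens = tokenize(ref, is_whitespace_splited)
    if not pred_tokens or not ref_tokens:
        # With an empty side, the score is 1.0 only if both are empty.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def apply_na_threshold(preds, na_probs, na_prob_thresh=1.0):
    """Assumed rule: predict the empty answer when na_probs exceeds the threshold."""
    return {qid: "" if na_probs[qid] > na_prob_thresh else ans
            for qid, ans in preds.items()}


print(f1_score("the Eiffel Tower", "Eiffel Tower"))
print(f1_score("北京市", "北京", is_whitespace_splited=False))
print(f1_score("北京市", "北京", is_whitespace_splited=True))
```

In this sketch, whitespace splitting treats an unsegmented Chinese answer as a single token, so a partially correct prediction scores zero; character-level splitting gives it partial credit, which is the motivation for setting is_whitespace_splited to False for Chinese.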