rdrop

class RDropLoss(reduction='none')

R-Drop loss implementation. For more information about R-Drop, refer to the paper: https://arxiv.org/abs/2106.14448. For the original implementation, refer to the code at dropreg/R-Drop.

Parameters:

reduction (str, optional) – Indicates how to reduce the loss. The candidates are 'none', 'batchmean', 'mean', and 'sum'. If reduction is 'mean', the reduced mean loss is returned; if reduction is 'batchmean', the sum loss divided by the batch size is returned; if reduction is 'sum', the reduced sum loss is returned; if reduction is 'none', no reduction is applied. Defaults to 'none'.
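The reduction modes described above can be illustrated with a small pure-Python sketch. The helper `reduce_loss` and the sample values are hypothetical and are not part of the library. The sketch only mirrors the wording of the `reduction` parameter and assumes the first dimension is the batch dimension.

```python
def reduce_loss(elementwise, reduction="none"):
    """Apply a reduction to per-element losses (hypothetical helper).

    `elementwise` is a list of rows, one row per example in the batch.
    """
    flat = [value for row in elementwise for value in row]
    if reduction == "none":
        # No reduction: return the per-element losses unchanged.
        return elementwise
    if reduction == "sum":
        return sum(flat)
    if reduction == "mean":
        # Mean over every element.
        return sum(flat) / len(flat)
    if reduction == "batchmean":
        # Sum over every element, divided by the batch size (number of rows).
        return sum(flat) / len(elementwise)
    raise ValueError(f"unsupported reduction: {reduction!r}")


# Made-up per-element losses for a batch of 2 examples with 3 values each.
losses = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
for mode in ("none", "sum", "mean", "batchmean"):
    print(mode, reduce_loss(losses, mode))
```

With these sample values, 'mean' divides the total by 6 elements, while 'batchmean' divides it by 2 examples.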

forward(p, q, pad_mask=None)
Parameters:
  • p (Tensor) – The logits from the first forward pass over the training examples.

  • q (Tensor) – The logits from the second forward pass over the same training examples.

  • pad_mask (Tensor, optional) – The tensor containing the binary mask to index with. Its data type is bool.

Returns:

The R-Drop loss of p and q.

Return type:

Tensor
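As a rough illustration of what the loss measures, the following pure-Python sketch computes the symmetric KL divergence between two sets of logits, following the regularization term in the R-Drop paper. It is not the library's code: the function names, the row-wise handling of `pad_mask` (True means the row is kept), and the final summation are all assumptions made for this example.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax of one row of logits."""
    top = max(logits)
    log_z = top + math.log(sum(math.exp(x - top) for x in logits))
    return [x - log_z for x in logits]


def kl_div(log_a, log_b):
    """KL(a || b) for two distributions given as log-probabilities."""
    return sum(math.exp(la) * (la - lb) for la, lb in zip(log_a, log_b))


def rdrop_loss(p, q, pad_mask=None):
    """Symmetric KL between two batches of logits, summed over kept rows.

    Assumption for this sketch: pad_mask holds one bool per row, and
    rows whose mask entry is False are skipped.
    """
    total = 0.0
    for i, (p_row, q_row) in enumerate(zip(p, q)):
        if pad_mask is not None and not pad_mask[i]:
            continue
        log_p, log_q = log_softmax(p_row), log_softmax(q_row)
        # Average the two KL directions so the result is symmetric in p and q.
        total += 0.5 * (kl_div(log_p, log_q) + kl_div(log_q, log_p))
    return total


# Made-up logits standing in for two forward passes with different dropout.
p = [[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]]
q = [[1.5, 1.0, -0.5], [0.0, 0.0, 0.0]]
print(rdrop_loss(p, q))
print(rdrop_loss(p, q, pad_mask=[False, True]))
```

The loss is zero when both passes produce the same distribution and grows as the two predictions diverge, which is why it acts as a consistency regularizer between the two forward passes.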