rdrop
class RDropLoss(reduction='none')
R-Drop loss implementation.
For more information about R-Drop, see the paper: https://arxiv.org/abs/2106.14448
For the original implementation, see: https://github.com/dropreg/R-Drop
- Parameters
reduction (str, optional) – Indicates how to reduce the loss. The candidates are 'none', 'batchmean', 'mean', and 'sum'. If reduction is 'mean', the mean of the loss is returned; if reduction is 'batchmean', the summed loss divided by the batch size is returned; if reduction is 'sum', the summed loss is returned; if reduction is 'none', no reduction is applied. Defaults to 'none'.
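The four reduction modes described above can be illustrated with a minimal pure-Python sketch. The helper `reduce_loss` and the sample values are hypothetical and are not part of the library; the sketch assumes the unreduced loss is a batch of rows of per-element values.

```python
def reduce_loss(elementwise, reduction="none"):
    """Reduce a 2-D list of per-element losses (batch x classes).

    Hypothetical helper that mirrors the documented reduction modes;
    it is not the library's implementation.
    """
    flat = [value for row in elementwise for value in row]
    if reduction == "none":
        return elementwise                    # no reduction applied
    if reduction == "sum":
        return sum(flat)                      # summed loss
    if reduction == "mean":
        return sum(flat) / len(flat)          # mean over all elements
    if reduction == "batchmean":
        return sum(flat) / len(elementwise)   # sum divided by batch size
    raise ValueError(f"unknown reduction: {reduction!r}")


# Illustrative values: a batch of 2 examples with 3 elements each.
losses = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
total = reduce_loss(losses, "sum")             # 21.0
mean = reduce_loss(losses, "mean")             # 3.5  (21 / 6 elements)
batch_mean = reduce_loss(losses, "batchmean")  # 10.5 (21 / 2 examples)
```

Note that 'mean' and 'batchmean' differ only in the divisor: the total number of elements versus the number of examples in the batch.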
forward(p, q, pad_mask=None)
- Parameters
p (Tensor) – The logits from the first forward pass of the training examples.
q (Tensor) – The logits from the second forward pass of the training examples.
pad_mask (Tensor, optional) – The Tensor containing the binary mask to index with; its data type is bool.
- Returns
loss, the R-Drop loss of p and q.
- Return type
Tensor
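As a rough illustration of what an R-Drop loss computes, the sketch below follows the idea in the paper: a KL divergence in both directions between the softmax distributions of the two forward passes, averaged. It is a pure-Python sketch under stated assumptions, not the library's implementation: `rdrop_loss_sketch`, `softmax`, and `kl_elementwise` are hypothetical names, the mask is simplified to one boolean per example, and the result is summed over the kept examples.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_elementwise(target, pred):
    """Per-class terms of KL(target || pred)."""
    return [t * (math.log(t) - math.log(p)) for t, p in zip(target, pred)]


def rdrop_loss_sketch(p, q, pad_mask=None):
    """Average of the two KL directions, summed over unmasked examples.

    p, q: lists of logit rows from two forward passes of the same inputs.
    pad_mask: optional list of bools (assumption: one per example);
    rows whose mask entry is False are skipped.
    """
    total = 0.0
    for i, (p_logits, q_logits) in enumerate(zip(p, q)):
        if pad_mask is not None and not pad_mask[i]:
            continue
        p_prob, q_prob = softmax(p_logits), softmax(q_logits)
        kl_pq = sum(kl_elementwise(p_prob, q_prob))
        kl_qp = sum(kl_elementwise(q_prob, p_prob))
        total += (kl_pq + kl_qp) / 2
    return total


# Illustrative logits from two dropout-perturbed passes over 2 examples.
p = [[2.0, 0.5, -1.0], [0.0, 1.0, 0.0]]
q = [[1.5, 0.7, -0.5], [0.2, 0.8, 0.1]]
full_loss = rdrop_loss_sketch(p, q)
masked_loss = rdrop_loss_sketch(p, q, pad_mask=[True, False])
```

Because both KL directions are averaged, the result is symmetric in p and q, and it is zero when the two passes produce identical logits.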