DeBERTa Model Summary

The table below summarizes the pretrained weights of the DeBERTa models currently supported by PaddleNLP.

| Pretrained Weight | Language | Details of the model |
| --- | --- | --- |
| microsoft/deberta-base | English | 12-layer, 768-hidden, 12-heads, 100M parameters. It outperforms BERT and RoBERTa on the majority of NLU tasks, trained with 80GB of data. |