TinyBert Model Summary

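The table below summarizes the TinyBert models currently supported by PaddleNLP and their corresponding pretrained weights. For details of each model, refer to the corresponding link.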

| Pretrained Weight | Language | Details of the model |
|---|---|---|
| tinybert-4l-312d | English | 4-layer, 312-hidden, 12-heads, 14.5M parameters. The TinyBert model distilled from the BERT model bert-base-uncased |
| tinybert-6l-768d | English | 6-layer, 768-hidden, 12-heads, 67M parameters. The TinyBert model distilled from the BERT model bert-base-uncased |
| tinybert-4l-312d-v2 | English | 4-layer, 312-hidden, 12-heads, 14.5M parameters. The TinyBert model distilled from the BERT model bert-base-uncased |
| tinybert-6l-768d-v2 | English | 6-layer, 768-hidden, 12-heads, 67M parameters. The TinyBert model distilled from the BERT model bert-base-uncased |
| tinybert-4l-312d-zh | Chinese | 4-layer, 312-hidden, 12-heads, 14.5M parameters. The TinyBert model distilled from the BERT model bert-base-uncased |
| tinybert-6l-768d-zh | Chinese | 6-layer, 768-hidden, 12-heads, 67M parameters. The TinyBert model distilled from the BERT model bert-base-uncased |
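The parameter counts listed above can be roughly sanity-checked from the layer and hidden sizes. The sketch below is an estimate, not PaddleNLP code: the function `bert_param_count` is a hypothetical helper, and the vocabulary size (30522, as in bert-base-uncased), the maximum position count (512), the token-type count (2), and the feed-forward sizes (1200 and 3072) are assumptions that do not appear in the table. It counts only the embeddings, the encoder layers, and the pooler of a standard BERT-style encoder.

```python
def bert_param_count(layers, hidden, intermediate,
                     vocab_size=30522, max_positions=512, type_vocab=2):
    """Estimate the parameter count of a BERT-style encoder.

    All default sizes are assumptions (bert-base-uncased conventions),
    not values taken from the table above.
    """
    layer_norm = 2 * hidden  # scale and bias vectors
    # Word, position and token-type embeddings, followed by one LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm
    # Query, key, value and output projections, each with a bias.
    attention = 4 * (hidden * hidden + hidden)
    # Two dense layers of the feed-forward block, each with a bias.
    feed_forward = (hidden * intermediate + intermediate) \
        + (intermediate * hidden + hidden)
    per_layer = attention + layer_norm + feed_forward + layer_norm
    # Dense pooler applied to the first token.
    pooler = hidden * hidden + hidden
    return embeddings + layers * per_layer + pooler


# Assumed feed-forward sizes: 1200 for the 312-hidden model, 3072 for the 768-hidden one.
small = bert_param_count(layers=4, hidden=312, intermediate=1200)
large = bert_param_count(layers=6, hidden=768, intermediate=3072)

print(f"4-layer, 312-hidden: {small / 1e6:.2f}M")  # 14.35M
print(f"6-layer, 768-hidden: {large / 1e6:.2f}M")  # 66.96M
```

Under these assumptions the estimates land close to the 14.5M and 67M figures in the table. The small gap for the 4-layer model may come from components this sketch does not count, and the Chinese weights likely use a different vocabulary size, so their totals would differ from this estimate.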