NeZha Model Summary

The table below lists the pretrained weights currently supported by PaddleNLP for the NeZha model. See the corresponding links for details of each model.

| Pretrained Weight | Language | Details of the model |
|---|---|---|
| `nezha-base-chinese` | Chinese | 12-layer, 768-hidden, 12-heads, 108M parameters. Trained on Chinese text. |
| `nezha-large-chinese` | Chinese | 24-layer, 1024-hidden, 16-heads, 336M parameters. Trained on Chinese text. |
| `nezha-base-wwm-chinese` | Chinese | 12-layer, 768-hidden, 12-heads, 108M parameters. Trained on Chinese text with whole word masking. |
| `nezha-large-wwm-chinese` | Chinese | 24-layer, 1024-hidden, 16-heads, 336M parameters. Trained on Chinese text with whole word masking. |