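ERNIE Model Summary
The table below summarizes the pretrained weights for the ERNIE models currently supported by PaddleNLP. For details on each model, see the corresponding link.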
| Pretrained Weight | Language | Details of the model |
|---|---|---|
| | Chinese | 12-layer, 768-hidden, 12-heads, 108M parameters. Trained on Chinese text. |
| | Chinese | 12-layer, 768-hidden, 12-heads, 118M parameters. Trained on Chinese text. |
| | Chinese | 24-layer, 1024-hidden, 16-heads, 272M parameters. Trained on Chinese text. |
| | Chinese | 3-layer, 1024-hidden, 16-heads, _M parameters. Trained on Chinese text. |
| | English | 12-layer, 768-hidden, 12-heads, 103M parameters. Trained on lower-cased English text. |
| | English | 12-layer, 768-hidden, 12-heads, 110M parameters. Fine-tuned on SQuAD text. |
| | English | 24-layer, 1024-hidden, 16-heads, 336M parameters. Trained on lower-cased English text. |
| | Chinese | 20-layer, 1024-hidden, 16-heads, 296M parameters. Trained on Chinese text. |
| | Chinese | 12-layer, 768-hidden, 12-heads, 118M parameters. Trained on Chinese text. |
| | Chinese | 6-layer, 768-hidden, 12-heads, 75M parameters. Trained on Chinese text. |
| | Chinese | 6-layer, 384-hidden, 12-heads, 27M parameters. Trained on Chinese text. |
| | Chinese | 4-layer, 384-hidden, 12-heads, 23M parameters. Trained on Chinese text. |
| | Chinese | 4-layer, 312-hidden, 12-heads, 18M parameters. Trained on Chinese text. |
| | Chinese | 12-layer, 768-hidden, 12-heads, 118M parameters. Trained on DuReader retrieval text. |
| | Chinese | 6-layer, 768-hidden, 12-heads, 75M parameters. Trained on DuReader retrieval text. |
| | Chinese | 6-layer, 384-hidden, 12-heads, 27M parameters. Trained on DuReader retrieval text. |
| | Chinese | 4-layer, 384-hidden, 12-heads, 23M parameters. Trained on DuReader retrieval text. |
| | Chinese | 4-layer, 312-hidden, 12-heads, 18M parameters. Trained on DuReader retrieval text. |
| | Chinese | 12-layer, 768-hidden, 12-heads, 118M parameters. Trained on DuReader retrieval text. |
| | Chinese | 12-layer, 768-hidden, 12-heads, 118M parameters. Trained on DuReader retrieval text. |
| | Chinese | 6-layer, 768-hidden, 12-heads, 75M parameters. Trained on DuReader retrieval text. |
| | Chinese | 6-layer, 768-hidden, 12-heads, 75M parameters. Trained on DuReader retrieval text. |
| | Chinese | 6-layer, 384-hidden, 12-heads, 27M parameters. Trained on DuReader retrieval text. |
| | Chinese | 6-layer, 384-hidden, 12-heads, 27M parameters. Trained on DuReader retrieval text. |
| | Chinese | 4-layer, 384-hidden, 12-heads, 23M parameters. Trained on DuReader retrieval text. |
| | Chinese | 4-layer, 384-hidden, 12-heads, 23M parameters. Trained on DuReader retrieval text. |
| | Chinese | 4-layer, 312-hidden, 12-heads, 18M parameters. Trained on DuReader retrieval text. |
| | Chinese | 4-layer, 312-hidden, 12-heads, 18M parameters. Trained on DuReader retrieval text. |
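
To make the "N-layer, H-hidden, A-heads" shorthand in the table more concrete, the following is a minimal sketch that parses such a description and derives two quantities from it: the per-head dimension and a rough count of encoder parameters. The names `EncoderShape`, `parse_shape`, and `estimate_encoder_params` are hypothetical helpers written for this illustration and are not part of PaddleNLP. The estimate assumes a standard BERT-style encoder layer with a feed-forward width of 4 × hidden, which may not hold for every ERNIE variant.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class EncoderShape:
    """Hypothetical container for the shape figures quoted in the table."""
    layers: int
    hidden: int
    heads: int

    @property
    def head_dim(self) -> int:
        # Each attention head operates on an equal slice of the hidden vector.
        if self.hidden % self.heads != 0:
            raise ValueError("hidden size must be divisible by the number of heads")
        return self.hidden // self.heads


def parse_shape(details: str) -> EncoderShape:
    """Extract layer/hidden/head counts from a 'Details of the model' cell."""
    match = re.search(r"(\d+)-layer, (\d+)-hidden, (\d+)-heads", details)
    if match is None:
        raise ValueError(f"unrecognized description: {details!r}")
    layers, hidden, heads = (int(group) for group in match.groups())
    return EncoderShape(layers, hidden, heads)


def estimate_encoder_params(shape: EncoderShape, ffn_multiplier: int = 4) -> int:
    """Rough encoder-only parameter count.

    Assumption: BERT-style layers with an FFN width of ffn_multiplier * hidden.
    Embeddings, pooler, and task heads are deliberately left out.
    """
    h = shape.hidden
    f = ffn_multiplier * h
    attention = 4 * (h * h + h)               # Q, K, V, output projections with biases
    feed_forward = (h * f + f) + (f * h + h)  # two linear layers with biases
    layer_norms = 2 * (2 * h)                 # scale and shift for two LayerNorms
    return shape.layers * (attention + feed_forward + layer_norms)


shape = parse_shape("12-layer, 768-hidden, 12-heads, 108M parameters. Trained on Chinese text.")
print(shape.head_dim)                  # 64
print(estimate_encoder_params(shape))  # 85054464
```

Because the estimate omits embeddings and any pooler or task head, it is expected to come out lower than the totals listed in the table; treat it as a sanity check on the shape figures rather than a reproduction of the published parameter counts.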