Summary of ERNIE Models

The table below summarizes the pretrained weights for ERNIE models currently supported by PaddleNLP. For details of each model, refer to the corresponding link. A minimal loading example follows the table.

| Pretrained Weight | Language | Details of the model |
| --- | --- | --- |
| ernie-1.0-base-zh | Chinese | 12-layer, 768-hidden, 12-heads, 108M parameters. Trained on Chinese text. |
| ernie-1.0-base-zh-cw | Chinese | 12-layer, 768-hidden, 12-heads, 118M parameters. Trained on Chinese text. |
| ernie-1.0-large-zh-cw | Chinese | 24-layer, 1024-hidden, 16-heads, 272M parameters. Trained on Chinese text. |
| ernie-tiny | Chinese | 3-layer, 1024-hidden, 16-heads, _M parameters. Trained on Chinese text. |
| ernie-2.0-base-en | English | 12-layer, 768-hidden, 12-heads, 103M parameters. Trained on lower-cased English text. |
| ernie-2.0-base-en-finetuned-squad | English | 12-layer, 768-hidden, 12-heads, 110M parameters. Finetuned on the SQuAD dataset. |
| ernie-2.0-large-en | English | 24-layer, 1024-hidden, 16-heads, 336M parameters. Trained on lower-cased English text. |
| ernie-3.0-xbase-zh | Chinese | 20-layer, 1024-hidden, 16-heads, 296M parameters. Trained on Chinese text. |
| ernie-3.0-base-zh | Chinese | 12-layer, 768-hidden, 12-heads, 118M parameters. Trained on Chinese text. |
| ernie-3.0-medium-zh | Chinese | 6-layer, 768-hidden, 12-heads, 75M parameters. Trained on Chinese text. |
| ernie-3.0-mini-zh | Chinese | 6-layer, 384-hidden, 12-heads, 27M parameters. Trained on Chinese text. |
| ernie-3.0-micro-zh | Chinese | 4-layer, 384-hidden, 12-heads, 23M parameters. Trained on Chinese text. |
| ernie-3.0-nano-zh | Chinese | 4-layer, 312-hidden, 12-heads, 18M parameters. Trained on Chinese text. |
| rocketqa-base-cross-encoder | Chinese | 12-layer, 768-hidden, 12-heads, 118M parameters. Trained on DuReader retrieval text. |
| rocketqa-medium-cross-encoder | Chinese | 6-layer, 768-hidden, 12-heads, 75M parameters. Trained on DuReader retrieval text. |
| rocketqa-mini-cross-encoder | Chinese | 6-layer, 384-hidden, 12-heads, 27M parameters. Trained on DuReader retrieval text. |
| rocketqa-micro-cross-encoder | Chinese | 4-layer, 384-hidden, 12-heads, 23M parameters. Trained on DuReader retrieval text. |
| rocketqa-nano-cross-encoder | Chinese | 4-layer, 312-hidden, 12-heads, 18M parameters. Trained on DuReader retrieval text. |
| rocketqa-zh-base-query-encoder | Chinese | 12-layer, 768-hidden, 12-heads, 118M parameters. Trained on DuReader retrieval text. |
| rocketqa-zh-base-para-encoder | Chinese | 12-layer, 768-hidden, 12-heads, 118M parameters. Trained on DuReader retrieval text. |
| rocketqa-zh-medium-query-encoder | Chinese | 6-layer, 768-hidden, 12-heads, 75M parameters. Trained on DuReader retrieval text. |
| rocketqa-zh-medium-para-encoder | Chinese | 6-layer, 768-hidden, 12-heads, 75M parameters. Trained on DuReader retrieval text. |
| rocketqa-zh-mini-query-encoder | Chinese | 6-layer, 384-hidden, 12-heads, 27M parameters. Trained on DuReader retrieval text. |
| rocketqa-zh-mini-para-encoder | Chinese | 6-layer, 384-hidden, 12-heads, 27M parameters. Trained on DuReader retrieval text. |
| rocketqa-zh-micro-query-encoder | Chinese | 4-layer, 384-hidden, 12-heads, 23M parameters. Trained on DuReader retrieval text. |
| rocketqa-zh-micro-para-encoder | Chinese | 4-layer, 384-hidden, 12-heads, 23M parameters. Trained on DuReader retrieval text. |
| rocketqa-zh-nano-query-encoder | Chinese | 4-layer, 312-hidden, 12-heads, 18M parameters. Trained on DuReader retrieval text. |
| rocketqa-zh-nano-para-encoder | Chinese | 4-layer, 312-hidden, 12-heads, 18M parameters. Trained on DuReader retrieval text. |
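Each name in the Pretrained Weight column can be passed to `from_pretrained`. The snippet below is a minimal sketch of loading one of the Chinese weights with PaddleNLP; it picks `ernie-3.0-medium-zh` purely as an example and assumes the standard `ErnieModel` and `ErnieTokenizer` classes from `paddlenlp.transformers`.

```python
import paddle
from paddlenlp.transformers import ErnieModel, ErnieTokenizer

# Any pretrained weight name from the table can be used here;
# ernie-3.0-medium-zh is chosen only for illustration.
model_name = "ernie-3.0-medium-zh"

tokenizer = ErnieTokenizer.from_pretrained(model_name)
model = ErnieModel.from_pretrained(model_name)

# Tokenize a Chinese sentence and add a batch dimension.
encoded = tokenizer("欢迎使用PaddleNLP")
inputs = {k: paddle.to_tensor([v]) for k, v in encoded.items()}

# sequence_output: per-token hidden states; pooled_output: [CLS] representation.
sequence_output, pooled_output = model(**inputs)
print(sequence_output.shape, pooled_output.shape)
```

The rocketqa-* weights load in the same way; the query and para encoders are generally used as a pair for dense retrieval, while the cross-encoder weights score a query-passage pair jointly (e.g. for reranking).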