decoder#
- class InferTransformerDecoder(decoder, n_head, size_per_head, decoder_lib=None, use_fp16_decoder=False, use_batch_major_op_cache=False)[source]#
FasterTransformer decoder block.
- Parameters:
decoder (TransformerDecoder) -- Transformer decoder block.
n_head (int) -- The number of heads used in multi-head attention.
size_per_head (int) -- The size per head used in multi-head attention.
decoder_lib (str, optional) -- The path to decoder_lib. Defaults to None.
use_fp16_decoder (bool, optional) -- Whether to use fp16 for the decoder. Defaults to False.
- forward(from_tensor, memory_tensor, mem_seq_len, self_cache_key, self_cache_value, mem_cache, step, memory_hidden_dim, is_fuse_qkv)[source]#
Defines the computation performed at every call. Should be overridden by all subclasses.
- Parameters:
*inputs (tuple) -- unpacked tuple arguments
**kwargs (dict) -- unpacked dict arguments
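The forward signature above threads self-attention caches (self_cache_key, self_cache_value) through each decoding step. As a rough illustration of that caching pattern, here is a minimal NumPy sketch (not the FasterTransformer implementation; the shapes and the append_to_cache helper are assumptions for illustration) showing how per-step keys and values accumulate along the time axis:

```python
import numpy as np

def append_to_cache(cache_key, cache_value, new_key, new_value):
    # Illustrative only: append the current step's key/value
    # (shape [batch, n_head, 1, size_per_head]) to the running
    # self-attention caches along the time axis (axis=2).
    cache_key = np.concatenate([cache_key, new_key], axis=2)
    cache_value = np.concatenate([cache_value, new_value], axis=2)
    return cache_key, cache_value

# Hypothetical shapes: batch=2, n_head=4, size_per_head=8.
batch, n_head, size_per_head = 2, 4, 8
cache_k = np.zeros((batch, n_head, 0, size_per_head), dtype=np.float32)
cache_v = np.zeros_like(cache_k)

for step in range(3):  # three decoding steps
    k = np.random.rand(batch, n_head, 1, size_per_head).astype(np.float32)
    v = np.random.rand(batch, n_head, 1, size_per_head).astype(np.float32)
    cache_k, cache_v = append_to_cache(cache_k, cache_v, k, v)

print(cache_k.shape)  # (2, 4, 3, 8)
```

Because each step only computes keys/values for one new token and reuses the cache for all earlier tokens, the per-step cost stays constant instead of growing with the output length.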
- class FasterDecoder(src_vocab_size, trg_vocab_size, max_length, num_encoder_layers, num_decoder_layers, n_head, d_model, d_inner_hid, dropout, weight_sharing, bos_id=0, eos_id=1, max_out_len=256, decoder_lib=None, use_fp16_decoder=False, use_batch_major_op_cache=False)[source]#
FasterTransformer decoder for auto-regressive generation.
- Parameters:
src_vocab_size (int) -- The size of the source vocabulary.
trg_vocab_size (int) -- The size of the target vocabulary.
max_length (int) -- The maximum length of input sequences.
num_encoder_layers (int) -- The number of sub-layers stacked in the encoder.
num_decoder_layers (int) -- The number of sub-layers stacked in the decoder.
n_head (int) -- The number of heads used in multi-head attention.
d_model (int) -- The dimension of word embeddings, which is also the last dimension of the input and output of multi-head attention, the position-wise feed-forward networks, and the encoder and decoder.
d_inner_hid (int) -- The size of the hidden layer in the position-wise feed-forward networks.
dropout (float) -- The dropout rate, used for pre-process, activation, and inside attention.
weight_sharing (bool) -- Whether to use weight sharing.
bos_id (int, optional) -- The start token id, which is also used as the padding id. Defaults to 0.
eos_id (int, optional) -- The end token id. Defaults to 1.
max_out_len (int, optional) -- The maximum output length. Defaults to 256.
decoder_lib (str, optional) -- The path to decoder_lib. Defaults to None.
use_fp16_decoder (bool, optional) -- Whether to use fp16 for the decoder. Defaults to False.
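The bos_id, eos_id, and max_out_len parameters govern the auto-regressive generation loop: decoding starts from the start token and stops when the end token is produced or the length budget is exhausted. The following pure-Python sketch (illustrative only; step_fn and toy_step_fn are hypothetical stand-ins for one decoder step, not FasterDecoder's API) shows that control flow with greedy selection:

```python
import numpy as np

def greedy_generate(step_fn, bos_id=0, eos_id=1, max_out_len=256):
    # Illustrative greedy loop: step_fn maps the tokens decoded so far
    # to logits over the target vocabulary. Generation starts from
    # bos_id and stops at eos_id or after max_out_len steps.
    tokens = [bos_id]
    for _ in range(max_out_len):
        logits = step_fn(tokens)
        next_id = int(np.argmax(logits))
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens

# Toy step_fn over a vocabulary of size 5: emit token 3 twice, then eos.
def toy_step_fn(tokens):
    logits = np.zeros(5)
    logits[3 if len(tokens) < 3 else 1] = 1.0
    return logits

print(greedy_generate(toy_step_fn, bos_id=0, eos_id=1, max_out_len=8))
# [0, 3, 3, 1]
```

The real class runs this loop with the fused FasterTransformer decoder op per step; the sketch only captures the stopping semantics of bos_id, eos_id, and max_out_len.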