distill_utils

to_distill(self, return_qkv=False, return_attentions=False, return_layer_outputs=False, layer_index=-1)

Can be bound to an object that has transformer encoder layers, so that the model exposes the attributes outputs.qs, outputs.ks, outputs.vs, outputs.scaled_qks, outputs.hidden_states, and outputs.attentions of the object for distillation.
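The binding pattern described above can be illustrated with a minimal, pure-Python sketch. It is not the library's implementation: ToyLayer, ToyEncoder, and to_distill_sketch are hypothetical names, the "layers" operate on plain lists rather than tensors, and only outputs.qs and outputs.ks are captured. It only shows how a function taking self can be bound to an object and wrap one layer's forward so that intermediate values appear under outputs.

```python
from types import MethodType, SimpleNamespace


class ToyLayer:
    """Hypothetical stand-in for a transformer encoder layer."""

    def __init__(self, scale):
        self.scale = scale

    def forward(self, x):
        # Toy "query" and "key" values; a real layer would use projections.
        q = [v * self.scale for v in x]
        k = [v + self.scale for v in x]
        return q, k


class ToyEncoder:
    """Hypothetical stand-in for a model that owns encoder layers."""

    def __init__(self):
        self.layers = [ToyLayer(2), ToyLayer(3)]

    def forward(self, x):
        for layer in self.layers:
            x, _ = layer.forward(x)
        return x


def to_distill_sketch(self, return_qkv=False, layer_index=-1):
    """Illustrative only: record q/k of one layer on self.outputs."""
    self.outputs = SimpleNamespace(qs=None, ks=None)
    if not return_qkv:
        return self
    layer = self.layers[layer_index]
    original_forward = layer.forward

    def recording_forward(x):
        # Call the untouched forward, then stash its intermediates.
        q, k = original_forward(x)
        self.outputs.qs, self.outputs.ks = q, k
        return q, k

    # Shadow the layer's forward on the instance only.
    layer.forward = recording_forward
    return self


model = ToyEncoder()
# Bind the function to the instance, then call it like a method.
model.to_distill = MethodType(to_distill_sketch, model)
model.to_distill(return_qkv=True, layer_index=-1)

result = model.forward([1, 2])
print(model.outputs.qs, model.outputs.ks)
```

With layer_index=-1 the sketch records values from the last layer only, which mirrors the default in the signature above; the remaining flags (return_attentions, return_layer_outputs) are left out of the sketch for brevity.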