T5

We currently support loading the following checkpoints via T5.from_pretrained(identifier) (see the loading example after the list):

  • t5-small

  • t5-base

  • t5-large

  • t5-3b

  • t5-11b
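
A minimal loading sketch, assuming the model_center package is installed and the chosen checkpoint can be downloaded or is already cached; any identifier from the list above can be substituted:

from model_center.model import T5

# Load a pretrained T5 checkpoint by its identifier.
model = T5.from_pretrained("t5-base")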

T5Config

class model_center.model.T5Config(vocab_size=32128, dim_model=768, num_heads=12, dim_head=64, dim_ff=3072, num_encoder_layers=12, num_decoder_layers=12, dropout_p=0, emb_init_mean=0.0, emb_init_std=1, pos_bias_type='relative', position_bias_num_buckets=32, position_bias_max_distance=128, pos_init_mean=0.0, pos_init_std=1, norm_init_var=1.0, norm_bias=False, norm_eps=1e-06, att_init_mean=0.0, att_init_std=1, att_bias=False, att_mask_value=-inf, ffn_init_mean=0.0, ffn_init_std=1, ffn_bias=False, ffn_activate_fn='relu', proj_init_mean=0.0, proj_init_std=1, proj_bias=False, length_scale=False, attn_scale=False, half=True, int8=False, tied=False, cls_head=None, post_layer_norm=False)

This is a configuration class that stores the configuration of the model and inherits from the Config class. It is used to instantiate a T5 model according to the specified parameters and defines the model architecture. You can set specific parameters to control the model's output.

For example, dim_model determines the dimension of the encoder layers. You can keep the default value of 768 or customize the dimension.
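
As an illustrative sketch, the following builds a configuration with a customized dim_model (the remaining arguments keep the defaults shown in the signature above) and instantiates a randomly initialized model from it:

from model_center.model import T5Config, T5

# Override the hidden size; all other fields keep their default values.
config = T5Config(dim_model=1024, num_heads=16, dim_ff=4096)

# Build a T5 model from the configuration (weights are randomly initialized).
model = T5(config)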

T5Model

class model_center.model.T5(config: model_center.model.config.t5_config.T5Config)
forward(input_ids=None, length=None, decoder_input_ids=None, decoder_length=None, attention_mask=None, decoder_attention_mask=None, head_mask=None, decoder_head_mask=None, cross_attn_head_mask=None, encoder_outputs=None, inputs_embeds=None, decoder_inputs_embeds=None, output_attentions=None, output_hidden_states=None, return_dict=True, return_logits=False)
T5 is an encoder-decoder model that converts problems into a text-to-text format.

This model inherits from BaseModel. It is also a PyTorch torch.nn.Module subclass, so you can use it as a regular PyTorch Module. You can also select the data and data type that you want the model to return by changing the values of return_dict and return_logits.

Parameters
  • input_ids (torch.Tensor of shape (batch, seq_enc)) – Indices of input sequence tokens. It will be embedded by model’s internal embedding lookup matrix.

  • length (torch.Tensor of shape (batch)) – Length of input sequence before padding.

  • attention_mask (torch.Tensor of shape (batch, seq_enc)) – Used to avoid performing attention on padding token indices in input.

  • decoder_input_ids (torch.Tensor of shape (batch, seq_dec)) – Indices of decoder input sequence tokens.

  • decoder_length (torch.Tensor of shape (batch)) – Length of decoder input sequence before padding.

  • decoder_attention_mask (torch.Tensor of shape (batch, seq_dec)) – Used to avoid performing attention on padding token indices in the decoder input.

  • head_mask (torch.Tensor of shape (num_layers, num_heads)) – Unused.

  • decoder_head_mask (torch.Tensor of shape (num_layers, num_heads)) – Unused.

  • cross_attn_head_mask (torch.Tensor of shape (num_layers, num_heads)) – Unused.

  • encoder_outputs (torch.Tensor of shape (batch, dim_model, seq_enc)) – Outputs of encoder.

  • inputs_embeds (torch.Tensor of shape (batch, seq_enc, dim_model)) – Embedding of the input. You can choose to directly pass the inputs embedding to control the way of embedding.

  • decoder_inputs_embeds (torch.Tensor of shape (batch, seq_dec, dim_model)) – Embedding of the decoder input. You can choose to directly pass the inputs embedding to control the way of embedding.

  • output_attentions (bool) – Unused.

  • output_hidden_states (bool) – Unused.

  • return_dict (bool) – Whether to return a Seq2SeqModelOutput instead of just a tuple.

  • return_logits (bool) – Whether to return the prediction score for each token in vocabulary (before softmax).

Returns

The T5 output, depending on the values of return_dict and return_logits.

Return type

Seq2SeqModelOutput or tuple or torch.Tensor of shape (batch, seq_dec, vocab_output_size) or (batch, seqlen, cls_head)
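
A hedged forward-pass sketch, assuming the tokenizer follows the Hugging Face interface (see the T5Tokenizer section below) and using a single padded sequence; starting the decoder from token id 0 is illustrative rather than prescribed by this API:

import torch
from model_center.model import T5
from model_center.tokenizer import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5.from_pretrained("t5-base")

# Encoder input: token indices of shape (batch, seq_enc) and true lengths of shape (batch,).
enc = tokenizer("translate English to German: Hello, world!", return_tensors="pt")
input_ids = enc["input_ids"]
length = torch.tensor([input_ids.size(1)])

# Decoder input: a single start token per sequence (id 0 is assumed here for illustration).
decoder_input_ids = torch.zeros(1, 1, dtype=torch.long)
decoder_length = torch.tensor([1])

# With return_logits=True the model returns scores of shape (batch, seq_dec, vocab_output_size).
logits = model(
    input_ids=input_ids,
    length=length,
    decoder_input_ids=decoder_input_ids,
    decoder_length=decoder_length,
    return_logits=True,
)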

T5Tokenizer

class model_center.tokenizer.T5Tokenizer

The current implementation is mainly an alias to the T5Tokenizer of Hugging Face Transformers. We will switch to our SAM implementation in the future, which will be a more efficient tokenizer.
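
Because it is currently an alias, a usage sketch simply mirrors the Hugging Face tokenizer interface (assumed here):

from model_center.tokenizer import T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")

# Encode a sentence into token indices and decode them back, via the Hugging Face interface.
ids = tokenizer.encode("Hello, world!")
text = tokenizer.decode(ids)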