emmi_inference.models.modules.blocks.transformer_block

Classes

TransformerBlock

A transformer block with a single attention layer and a feedforward layer.

Module Contents

class emmi_inference.models.modules.blocks.transformer_block.TransformerBlock(dim, num_heads, attn_ctor=DotProductAttention)

Bases: torch.nn.Module

A transformer block with a single attention layer and a feedforward layer.

Parameters:
  • dim (int) – Hidden dimension of the transformer block.

  • num_heads (int) – Number of attention heads.

  • attn_ctor (type[torch.nn.Module]) – Constructor for the attention module. Defaults to DotProductAttention.

Initialize internal Module state, shared by both nn.Module and ScriptModule.

norm1
attn
norm2
mlp
forward(x, attn_kwargs=None)

Forward pass of the transformer block.

Parameters:
  • x (torch.Tensor) – Input tensor with shape (batch_size, num_tokens, dim), where num_tokens is the sequence length.

  • attn_kwargs (dict[str, Any] | None) – Dict of keyword arguments forwarded to the attention module (such as the RoPE frequencies). Defaults to None.

Returns:

Output tensor with shape (batch_size, num_tokens, dim).

Return type:

torch.Tensor
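The attribute list (norm1, attn, norm2, mlp) and the forward signature above can be illustrated with a minimal sketch. This is not the emmi_inference implementation: the pre-norm residual layout, the DotProductAttention stand-in, and the MLP expansion factor are all assumptions made for illustration.

```python
import torch
from torch import nn


class DotProductAttention(nn.Module):
    # Hypothetical stand-in for the library's default attention constructor;
    # the real class likely has a different implementation.
    def __init__(self, dim: int, num_heads: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x: torch.Tensor, **attn_kwargs) -> torch.Tensor:
        out, _ = self.attn(x, x, x, need_weights=False)
        return out


class TransformerBlock(nn.Module):
    """Sketch of a transformer block with one attention layer and one MLP.

    Assumes a pre-norm residual layout; the documented class may differ.
    """

    def __init__(self, dim: int, num_heads: int, attn_ctor=DotProductAttention):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.attn = attn_ctor(dim, num_heads)
        self.norm2 = nn.LayerNorm(dim)
        # 4x expansion is a common choice, assumed here for illustration.
        self.mlp = nn.Sequential(
            nn.Linear(dim, 4 * dim),
            nn.GELU(),
            nn.Linear(4 * dim, dim),
        )

    def forward(self, x: torch.Tensor, attn_kwargs=None) -> torch.Tensor:
        # attn_kwargs (e.g. RoPE frequencies) are passed through to attention.
        attn_kwargs = attn_kwargs or {}
        x = x + self.attn(self.norm1(x), **attn_kwargs)
        x = x + self.mlp(self.norm2(x))
        return x


# Usage: shapes match the documented forward contract.
block = TransformerBlock(dim=64, num_heads=4)
x = torch.randn(2, 16, 64)  # (batch_size, num_tokens, dim)
y = block(x)                # (2, 16, 64)
```

Note that passing the attention constructor (rather than an instance) lets callers swap in alternative attention variants while the block owns construction with its own dim and num_heads.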