emmi.modules.attention.anchor_attention.multi_branch_anchor_attention¶
Exceptions¶
| MissingBranchTokensError | Raised when expected tokens for a configured branch are not present. |
| MissingAnchorTokenError | Raised when a required anchor token is not present. |
| UnexpectedTokenError | Raised when an unexpected token is present. |
Classes¶
| MultiBranchAnchorAttention | A base class for multi-branch anchor-based attention modules with shared parameters between branches. |
Module Contents¶
- exception emmi.modules.attention.anchor_attention.multi_branch_anchor_attention.MissingBranchTokensError¶
Bases: ValueError
Raised when expected tokens for a configured branch are not present.
- exception emmi.modules.attention.anchor_attention.multi_branch_anchor_attention.MissingAnchorTokenError¶
Bases: ValueError
Raised when a required anchor token is not present.
- exception emmi.modules.attention.anchor_attention.multi_branch_anchor_attention.UnexpectedTokenError¶
Bases: ValueError
Raised when an unexpected token is present.
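All three exceptions derive from ValueError, so callers can catch them individually or as a group. A minimal, hedged sketch; the error message shown is hypothetical:

```python
from emmi.modules.attention.anchor_attention.multi_branch_anchor_attention import (
    MissingAnchorTokenError,
    MissingBranchTokensError,
    UnexpectedTokenError,
)

# The three validation errors subclass ValueError, so a broad ValueError
# handler also covers them; narrower handlers distinguish the failure mode.
try:
    raise MissingAnchorTokenError("no anchor token found for branch 'image'")  # hypothetical message
except (MissingBranchTokensError, MissingAnchorTokenError, UnexpectedTokenError) as err:
    print(f"token validation failed: {err}")
```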
- class emmi.modules.attention.anchor_attention.multi_branch_anchor_attention.MultiBranchAnchorAttention(config)¶
Bases: torch.nn.Module
A base class for multi-branch anchor-based attention modules with shared parameters between branches.
Anchor attention restricts self-attention to anchor tokens, while all other tokens use cross-attention. The branches for the different modalities share the same linear-projection parameters. This base class provides the common constructor, validation logic, and forward implementation; subclasses only need to implement _create_attention_patterns to define their specific attention patterns. A usage sketch follows the member documentation below.
- Parameters:
dim – Model dimension.
num_heads – Number of attention heads.
use_rope – Whether to use rotary position embeddings.
bias – Whether to use bias in the linear projections.
init_weights – Weight initialization method.
branches – A sequence of all participating branch names.
anchor_suffix – Suffix identifying anchor tokens.
config (emmi.schemas.modules.attention.AttentionConfig)
Initialize internal Module state, shared by both nn.Module and ScriptModule.
- mixed_attention¶
- branches = None¶
- anchor_suffix = None¶
- forward(x, token_specs, freqs=None)¶
Apply attention using the patterns defined by the subclass.
- Parameters:
x (torch.Tensor)
token_specs (collections.abc.Sequence[emmi.schemas.modules.attention.anchor_attention.TokenSpec])
freqs (torch.Tensor | None)
- Return type:
torch.Tensor
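A minimal usage sketch under stated assumptions: the field names passed to AttentionConfig mirror the parameter list above, TokenSpec is assumed to take a token name and a length, and the _create_attention_patterns signature shown is a guess at the hook the base class expects. None of these signatures are confirmed by this page, so treat the snippet as illustrative only.

```python
import torch

from emmi.modules.attention.anchor_attention.multi_branch_anchor_attention import (
    MultiBranchAnchorAttention,
)
from emmi.schemas.modules.attention import AttentionConfig
from emmi.schemas.modules.attention.anchor_attention import TokenSpec


class TwoBranchAnchorAttention(MultiBranchAnchorAttention):
    """Hypothetical subclass: only the attention-pattern hook is supplied."""

    def _create_attention_patterns(self, token_specs):
        # Assumed hook signature. A real subclass returns whatever pattern
        # structure the base class consumes: self-attention restricted to
        # anchor tokens, cross-attention for the remaining tokens. The
        # contract is not documented on this page, so the body is omitted.
        ...


# Hypothetical configuration; field names mirror the parameter list above.
config = AttentionConfig(
    dim=256,
    num_heads=8,
    use_rope=False,
    bias=True,
    init_weights="xavier",  # assumed initialization identifier
    branches=("image", "text"),
    anchor_suffix="_anchor",
)
attention = TwoBranchAnchorAttention(config)

# Hypothetical token layout: one anchor token per branch plus regular tokens.
token_specs = [
    TokenSpec(name="image_anchor", length=1),
    TokenSpec(name="image", length=16),
    TokenSpec(name="text_anchor", length=1),
    TokenSpec(name="text", length=32),
]
x = torch.randn(2, 50, 256)  # (batch, total tokens, dim); 50 = 1 + 16 + 1 + 32

try:
    out = attention(x, token_specs)  # -> torch.Tensor of shape (2, 50, 256)
except ValueError as err:
    # The validation errors documented above all subclass ValueError.
    print(f"token validation failed: {err}")
```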