emmi.modules.attention.anchor_attention.joint_anchor_attention
Classes

| Class | Summary |
| --- | --- |
| JointAnchorAttention | Anchor attention within and across branches: all tokens attend to anchors from all configured branches. |
Module Contents
- class emmi.modules.attention.anchor_attention.joint_anchor_attention.JointAnchorAttention(config)

Bases: emmi.modules.attention.anchor_attention.multi_branch_anchor_attention.MultiBranchAnchorAttention

Anchor attention within and across branches: all tokens attend to anchors from all configured branches.
For a list of branches (e.g., A, B, C), this creates a pattern where all tokens (A_anchors, A_queries, B_anchors, B_queries, C_anchors, C_queries) attend to (A_anchors + B_anchors + C_anchors). It requires at least one anchor token to be present in the input.
Example: all tokens attend to (surface_anchors, volume_anchors). This is achieved via the following attention pattern:
- AttentionPattern(
    query_tokens=["surface_anchors", "surface_queries", "volume_anchors", "volume_queries"],
    key_value_tokens=["surface_anchors", "volume_anchors"]
  )
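The same pattern can be derived mechanically from the configured branch names. Below is a minimal sketch (not the library's internal code) of that construction; the helper name `build_joint_anchor_pattern` and the `_queries` suffix are assumptions for illustration, analogous to `anchor_suffix`:

```python
# Hypothetical helper (not part of emmi): derive the joint anchor attention
# pattern from a list of branch names.
def build_joint_anchor_pattern(branches, anchor_suffix="_anchors", query_suffix="_queries"):
    # Every token group of every branch acts as a query ...
    query_tokens = [f"{b}{suffix}" for b in branches for suffix in (anchor_suffix, query_suffix)]
    # ... while only the anchor groups of all branches are attended to.
    key_value_tokens = [f"{b}{anchor_suffix}" for b in branches]
    return query_tokens, key_value_tokens

# For the surface/volume example above this yields:
# (['surface_anchors', 'surface_queries', 'volume_anchors', 'volume_queries'],
#  ['surface_anchors', 'volume_anchors'])
print(build_joint_anchor_pattern(["surface", "volume"]))
```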
- Parameters:
dim – Model dimension.
num_heads – Number of attention heads.
use_rope – Whether to use rotary position embeddings.
bias – Whether to use bias in the linear projections.
init_weights – Weight initialization method.
branches – A sequence of all participating branch names.
anchor_suffix – Suffix identifying anchor tokens.
config (emmi.schemas.modules.attention.anchor_attention.config.JointAnchorAttentionConfig) – Configuration object providing the fields listed above.
Initialize internal Module state, shared by both nn.Module and ScriptModule.
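As a usage sketch, the module is constructed from its config object. The field values below are hypothetical and only mirror the parameters documented above; check JointAnchorAttentionConfig for the actual field names, types, and defaults:

```python
from emmi.schemas.modules.attention.anchor_attention.config import JointAnchorAttentionConfig
from emmi.modules.attention.anchor_attention.joint_anchor_attention import JointAnchorAttention

# Hypothetical configuration; field names follow the parameter list above,
# but the real schema may differ.
config = JointAnchorAttentionConfig(
    dim=256,                          # model dimension
    num_heads=8,                      # number of attention heads
    use_rope=True,                    # rotary position embeddings
    bias=False,                       # bias in the linear projections
    init_weights="truncnormal",       # weight initialization method (assumed value)
    branches=["surface", "volume"],   # participating branch names
    anchor_suffix="_anchors",         # suffix identifying anchor tokens
)
attn = JointAnchorAttention(config)
```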