emmi.modules.attention.anchor_attention.joint_anchor_attention

Classes

JointAnchorAttention

Anchor attention within and across branches: all tokens attend to anchors from all configured branches.

Module Contents

class emmi.modules.attention.anchor_attention.joint_anchor_attention.JointAnchorAttention(config)

Bases: emmi.modules.attention.anchor_attention.multi_branch_anchor_attention.MultiBranchAnchorAttention

Anchor attention within and across branches: all tokens attend to anchors from all configured branches.

For a list of branches (e.g., A, B, C), this creates a pattern in which all tokens (A_anchors, A_queries, B_anchors, B_queries, C_anchors, C_queries) attend to (A_anchors + B_anchors + C_anchors). At least one anchor token must be present in the input.

Example: with the branches surface and volume, all tokens attend to (surface_anchors, volume_anchors). This is achieved via the following attention pattern:

AttentionPattern(
    query_tokens=["surface_anchors", "surface_queries", "volume_anchors", "volume_queries"],
    key_value_tokens=["surface_anchors", "volume_anchors"],
)
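The construction generalizes directly from the configured branch names and the anchor suffix: every branch contributes both its anchor and query tokens on the query side, while only anchor tokens appear on the key/value side. A minimal sketch of that construction follows; the helper build_joint_anchor_pattern and the plain AttentionPattern dataclass are hypothetical stand-ins used only to make the pattern concrete, not the library's own API.

from dataclasses import dataclass
from typing import Sequence

@dataclass
class AttentionPattern:
    # Hypothetical stand-in for the library's attention pattern container.
    query_tokens: list[str]
    key_value_tokens: list[str]

def build_joint_anchor_pattern(branches: Sequence[str], anchor_suffix: str = "anchors") -> AttentionPattern:
    # Every branch contributes its anchor and query tokens on the query side ...
    query_tokens = []
    for branch in branches:
        query_tokens.append(f"{branch}_{anchor_suffix}")
        query_tokens.append(f"{branch}_queries")
    # ... but only anchor tokens are visible as keys/values.
    key_value_tokens = [f"{branch}_{anchor_suffix}" for branch in branches]
    return AttentionPattern(query_tokens=query_tokens, key_value_tokens=key_value_tokens)

# build_joint_anchor_pattern(["surface", "volume"]) reproduces the pattern shown above.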

Parameters:
  • dim – Model dimension.

  • num_heads – Number of attention heads.

  • use_rope – Whether to use rotary position embeddings.

  • bias – Whether to use bias in the linear projections.

  • init_weights – Weight initialization method.

  • branches – A sequence of all participating branch names.

  • anchor_suffix – Suffix identifying anchor tokens.

  • config (emmi.schemas.modules.attention.anchor_attention.config.JointAnchorAttentionConfig)

Initialize internal Module state, shared by both nn.Module and ScriptModule.
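A hedged usage sketch, assuming the config fields mirror the parameters listed above; the exact constructor signature of JointAnchorAttentionConfig and the values used here are assumptions, not confirmed by this page.

from emmi.schemas.modules.attention.anchor_attention.config import JointAnchorAttentionConfig
from emmi.modules.attention.anchor_attention.joint_anchor_attention import JointAnchorAttention

# Assumed field names and placeholder values; consult the config schema for the
# authoritative signature.
config = JointAnchorAttentionConfig(
    dim=256,
    num_heads=8,
    use_rope=False,
    bias=True,
    init_weights="truncnormal",
    branches=["surface", "volume"],
    anchor_suffix="anchors",
)
attn = JointAnchorAttention(config)

The forward calling convention (how anchor and query tokens are passed per branch) is not documented on this page, so it is omitted here.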