emmi.modules.attention.anchor_attention.joint_anchor_attention

Classes

JointAnchorAttention

Anchor attention within and across branches: all tokens attend to anchors from all configured branches.

Module Contents

class emmi.modules.attention.anchor_attention.joint_anchor_attention.JointAnchorAttention(config)

Bases: emmi.modules.attention.anchor_attention.multi_branch_anchor_attention.MultiBranchAnchorAttention

Anchor attention within and across branches: all tokens attend to anchors from all configured branches.

For a list of branches (e.g., A, B, C), this creates a pattern in which all tokens (A_anchors, A_queries, B_anchors, B_queries, C_anchors, C_queries) attend to (A_anchors + B_anchors + C_anchors). At least one anchor token must be present in the input.

Example: with the branches surface and volume, all tokens attend to (surface_anchors, volume_anchors). This is achieved via the following attention pattern:

AttentionPattern(
    query_tokens=["surface_anchors", "surface_queries", "volume_anchors", "volume_queries"],
    key_value_tokens=["surface_anchors", "volume_anchors"],
)
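The construction generalizes directly from the configured branch names and the anchor suffix: every branch contributes both its anchor and query tokens on the query side, while only anchor tokens appear on the key/value side. A minimal sketch of that construction follows; the helper build_joint_anchor_pattern and the plain AttentionPattern dataclass are hypothetical stand-ins used only to make the pattern concrete, not the library's own API.

from dataclasses import dataclass
from typing import Sequence

@dataclass
class AttentionPattern:
    # Hypothetical stand-in for the library's attention pattern container.
    query_tokens: list[str]
    key_value_tokens: list[str]

def build_joint_anchor_pattern(branches: Sequence[str], anchor_suffix: str = "anchors") -> AttentionPattern:
    # Every branch contributes its anchor and query tokens on the query side ...
    query_tokens = []
    for branch in branches:
        query_tokens.append(f"{branch}_{anchor_suffix}")
        query_tokens.append(f"{branch}_queries")
    # ... but only anchor tokens are visible as keys/values.
    key_value_tokens = [f"{branch}_{anchor_suffix}" for branch in branches]
    return AttentionPattern(query_tokens=query_tokens, key_value_tokens=key_value_tokens)

# build_joint_anchor_pattern(["surface", "volume"]) reproduces the pattern shown above.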

Parameters:
  • dim – Model dimension.

  • num_heads – Number of attention heads.

  • use_rope – Whether to use rotary position embeddings.

  • bias – Whether to use bias in the linear projections.

  • init_weights – Weight initialization method.

  • branches – A sequence of all participating branch names.

  • anchor_suffix – Suffix identifying anchor tokens.

  • config (emmi.schemas.modules.attention.anchor_attention.config.JointAnchorAttentionConfig)

Initialize internal Module state, shared by both nn.Module and ScriptModule.
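A hedged usage sketch, assuming the config fields mirror the parameters listed above; the exact constructor signature of JointAnchorAttentionConfig and the values used here are assumptions, not confirmed by this page.

from emmi.schemas.modules.attention.anchor_attention.config import JointAnchorAttentionConfig
from emmi.modules.attention.anchor_attention.joint_anchor_attention import JointAnchorAttention

# Assumed field names and placeholder values; consult the config schema for the
# authoritative signature.
config = JointAnchorAttentionConfig(
    dim=256,
    num_heads=8,
    use_rope=False,
    bias=True,
    init_weights="truncnormal",
    branches=["surface", "volume"],
    anchor_suffix="anchors",
)
attn = JointAnchorAttention(config)

The forward calling convention (how anchor and query tokens are passed per branch) is not documented on this page, so it is omitted here.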