emmi.modules.attention.anchor_attention.joint_anchor_attention
==============================================================

.. py:module:: emmi.modules.attention.anchor_attention.joint_anchor_attention


Classes
-------

.. autoapisummary::

   emmi.modules.attention.anchor_attention.joint_anchor_attention.JointAnchorAttention


Module Contents
---------------

.. py:class:: JointAnchorAttention(config)

   Bases: :py:obj:`emmi.modules.attention.anchor_attention.multi_branch_anchor_attention.MultiBranchAnchorAttention`


   Anchor attention within and across branches: all tokens attend to anchors from all configured branches.

   For a list of branches (e.g., A, B, C), this creates a pattern where all tokens
   (A_anchors, A_queries, B_anchors, B_queries, C_anchors, C_queries) attend to (A_anchors + B_anchors + C_anchors).
   It requires at least one anchor token to be present in the input.

   Example: all tokens attend to (surface_anchors, volume_anchors).
   This is achieved via the following attention pattern::

       AttentionPattern(
           query_tokens=["surface_anchors", "surface_queries", "volume_anchors", "volume_queries"],
           key_value_tokens=["surface_anchors", "volume_anchors"]
       )
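
   The following is a minimal sketch of how such a pattern could be derived from a list of
   branch names. It is not the library's implementation: the ``AttentionPattern`` dataclass is a
   hypothetical stand-in, and ``joint_anchor_pattern``, ``query_suffix``, and the
   ``"<branch>_<suffix>"`` token naming are assumptions inferred from the example above.

   .. code-block:: python

      from dataclasses import dataclass
      from typing import List, Sequence


      @dataclass(frozen=True)
      class AttentionPattern:
          """Hypothetical stand-in for the library's attention pattern type."""

          query_tokens: List[str]
          key_value_tokens: List[str]


      def joint_anchor_pattern(
          branches: Sequence[str],
          anchor_suffix: str = "anchors",  # assumed default
          query_suffix: str = "queries",   # assumed; not a documented parameter
      ) -> AttentionPattern:
          """Build one pattern in which every token attends to all anchors."""
          if not branches:
              raise ValueError("at least one branch is required")
          query_tokens: List[str] = []
          key_value_tokens: List[str] = []
          for branch in branches:
              anchors = f"{branch}_{anchor_suffix}"
              # Anchors and queries of every branch act as attention queries.
              query_tokens += [anchors, f"{branch}_{query_suffix}"]
              # Only anchors are used as keys and values.
              key_value_tokens.append(anchors)
          return AttentionPattern(query_tokens, key_value_tokens)


      pattern = joint_anchor_pattern(["surface", "volume"])

   With ``["surface", "volume"]`` as input, the sketch reproduces the token lists shown in the
   example above. Because every branch shares the same key/value set, a single pattern is enough
   regardless of the number of branches.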

   :param dim: Model dimension.
   :param num_heads: Number of attention heads.
   :param use_rope: Whether to use rotary position embeddings.
   :param bias: Whether to use bias in the linear projections.
   :param init_weights: Weight initialization method.
   :param branches: A sequence of all participating branch names.
   :param anchor_suffix: Suffix identifying anchor tokens.
