emmi.modules.blocks.perceiver_transformer_blockpair
====================================================

.. py:module:: emmi.modules.blocks.perceiver_transformer_blockpair


Classes
-------

.. autoapisummary::

   emmi.modules.blocks.perceiver_transformer_blockpair.PerceiverTransformerBlock


Module Contents
---------------

.. py:class:: PerceiverTransformerBlock(hidden_dim, num_heads, transformer_attn_ctor = DotProductAttention, init_weights = 'truncnormal002', mlp_hidden_dim = None, drop_path = 0.0)

   Bases: :py:obj:`torch.nn.Module`


   Base class for all neural network modules.

   Your models should also subclass this class.

   Modules can also contain other Modules, allowing them to be nested in a
   tree structure. You can assign the submodules as regular attributes::

       import torch.nn as nn
       import torch.nn.functional as F


       class Model(nn.Module):
           def __init__(self):
               super().__init__()
               self.conv1 = nn.Conv2d(1, 20, 5)
               self.conv2 = nn.Conv2d(20, 20, 5)

           def forward(self, x):
               x = F.relu(self.conv1(x))
               return F.relu(self.conv2(x))

   Submodules assigned in this way will be registered, and their parameters
   will also be converted when you call :meth:`to`, etc.

   .. note::
      As per the example above, an ``__init__()`` call to the parent class
      must be made before assignment on the child.

   :ivar training: Boolean representing whether this module is in training or
                   evaluation mode.
   :vartype training: bool


   Instantiates a block which contains a perceiver block followed by a transformer block.

   :param hidden_dim: Hidden dimension of the transformer block.
   :param num_heads: Number of attention heads.
   :param mlp_hidden_dim: Hidden dimension of the feed-forward MLP after the self-attention. Defaults to None.
   :param init_weights: Initialization method for the weight matrices of the network. Defaults to "truncnormal002".


   .. py:attribute:: perceiver


   .. py:attribute:: transformer


   .. py:method:: forward(q, kv, transformer_attn_kwargs = None)

      Forward pass of the block.

      :param q: Query input tensor with shape (batch_size, num_query_tokens, hidden_dim).
      :param kv: Key/value input tensor with shape (batch_size, num_kv_tokens, hidden_dim).
      :param transformer_attn_kwargs: Dict with arguments for the attention of the transformer block
                                      (such as the attention mask). Defaults to None.
      :returns: Result with shape (batch_size, num_query_tokens, hidden_dim).
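
   A minimal usage sketch based only on the signature and shapes documented
   above; the concrete dimensions and batch size are illustrative assumptions,
   not values prescribed by the library::

       import torch

       from emmi.modules.blocks.perceiver_transformer_blockpair import (
           PerceiverTransformerBlock,
       )

       # Illustrative sizes (assumptions for this sketch, not library defaults).
       block = PerceiverTransformerBlock(hidden_dim=256, num_heads=8)

       q = torch.randn(2, 16, 256)    # (batch_size, num_query_tokens, hidden_dim)
       kv = torch.randn(2, 128, 256)  # (batch_size, num_kv_tokens, hidden_dim)

       # The perceiver block cross-attends the queries to kv; the transformer
       # block then self-attends over the resulting query tokens.
       out = block(q, kv)
       assert out.shape == (2, 16, 256)  # (batch_size, num_query_tokens, hidden_dim)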