ksuit.optimizer.param_group_modifiers.weight_decay_by_name_modifier

Classes

WeightDecayByNameModifier

Changes the weight decay value for a single parameter.

Module Contents

class ksuit.optimizer.param_group_modifiers.weight_decay_by_name_modifier.WeightDecayByNameModifier(param_group_modifier_config)

Bases: ksuit.optimizer.param_group_modifiers.base.ParamGroupModifierBase

Changes the weight decay value for a single parameter. Use-cases:

  • ViT: exclude CLS token parameters

  • Transformer learned positional embeddings

  • Learnable query tokens for cross attention (“PerceiverPooling”)

Parameters:

param_group_modifier_config (ksuit.schemas.optim.ParamGroupModifierConfig)

name
value
param_was_found = False
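
A minimal construction sketch follows; the exact fields of ParamGroupModifierConfig (a target parameter name and a weight decay value) are assumptions inferred from the name and value attributes above, not a confirmed constructor signature.

    # Hypothetical usage sketch: the ParamGroupModifierConfig fields
    # (name, value) are assumptions inferred from the attributes above.
    from ksuit.schemas.optim import ParamGroupModifierConfig
    from ksuit.optimizer.param_group_modifiers.weight_decay_by_name_modifier import (
        WeightDecayByNameModifier,
    )

    # Exclude the ViT CLS token from weight decay.
    config = ParamGroupModifierConfig(name="cls_token", value=0.0)
    modifier = WeightDecayByNameModifier(config)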
get_properties(model, name, param)

This method is called for every item of model.named_parameters() to compose the parameter groups for the whole model. If the parameter name matches the configured name, the returned properties set the weight decay for that parameter's group.

Parameters:
  • model (torch.nn.Module) – Model from which the parameter originates. Used to extract properties (e.g., the number of layers for layerwise learning rate decay).

  • name (str) – Name of the parameter as stored inside the model.

  • param (torch.Tensor) – The parameter tensor.

Return type:

dict[str, float]
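
The sketch below illustrates how get_properties could be used while composing parameter groups; the loop and the merging logic are simplified assumptions for illustration, not the library's actual group-composition code.

    import torch
    import torch.nn as nn

    def compose_param_groups(model: nn.Module, modifier, base_weight_decay: float = 0.05):
        # Simplified sketch: one optimizer group per distinct property set.
        groups = {}
        for name, param in model.named_parameters():
            properties = {"weight_decay": base_weight_decay}
            # Assumed behavior: get_properties returns e.g. {"weight_decay": 0.0}
            # for the matching parameter name and an empty dict otherwise.
            properties.update(modifier.get_properties(model, name, param))
            key = tuple(sorted(properties.items()))
            groups.setdefault(key, {"params": [], **properties})["params"].append(param)
        return list(groups.values())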

was_applied_successfully()

Check whether the configured parameter name was found within the model.
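
Building on the sketch above, this flag can guard against a typo in the configured parameter name after the groups have been composed; the error handling shown is an illustrative choice, not prescribed by the library.

    # Illustrative check after composing the parameter groups.
    param_groups = compose_param_groups(model, modifier)
    if not modifier.was_applied_successfully():
        raise ValueError(f"parameter '{modifier.name}' was not found in the model")
    optimizer = torch.optim.AdamW(param_groups, lr=1e-3)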