MCPcopy
hub / github.com/huggingface/transformers / get_decay_parameter_names

Method get_decay_parameter_names

src/transformers/trainer.py:1289–1299  ·  view source on GitHub ↗

Get all parameter names that weight decay will be applied to. This function filters out parameters in two ways: 1. By layer type (instances of layers specified in ALL_LAYERNORM_LAYERS) 2. By parameter name patterns (containing 'bias', or variation of 'norm')

(self, model: nn.Module)

Source from the content-addressed store, hash-verified

1287 return handler(ctx)
1288
1289 def get_decay_parameter_names(self, model: nn.Module) -> list[str]:
1290 """
1291 Get all parameter names that weight decay will be applied to.
1292
1293 This function filters out parameters in two ways:
1294 1. By layer type (instances of layers specified in ALL_LAYERNORM_LAYERS)
1295 2. By parameter name patterns (containing 'bias', or variation of 'norm')
1296 """
1297 forbidden_name_patterns = [r"bias", r"layernorm", r"rmsnorm", r"(?:^|\.)norm(?:$|\.)", r"_norm(?:$|\.)"]
1298 decay_parameters = get_parameter_names(model, [nn.LayerNorm], forbidden_name_patterns)
1299 return decay_parameters
1300
1301 def _get_learning_rate(self) -> float:
1302 """

Callers 1

create_optimizerMethod · 0.95

Calls 1

get_parameter_namesFunction · 0.85

Tested by

no test coverage detected