`activate_neftune(model, neftune_noise_alpha, accelerator=None)`

Activates NEFTune (Noisy Embeddings for Fine-Tuning) on the model. NEFTune adds random noise to the input embedding vectors during training, which the NEFTune paper reports improves instruction fine-tuning performance. See https://huggingface.co/papers/2310.05914 for details.
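To make the role of `neftune_noise_alpha` more concrete, here is a minimal sketch of how the per-element noise bound could be computed. It assumes the scaling described in the NEFTune paper, where noise is drawn uniformly from a range of half-width `alpha / sqrt(L * d)` for sequence length `L` and embedding dimension `d`. The helper name `neftune_noise_bound` is hypothetical and is not part of the library.

```python
import math


def neftune_noise_bound(noise_alpha: float, seq_len: int, hidden_dim: int) -> float:
    """Half-width of the uniform noise range, assuming the paper's alpha / sqrt(L * d) scaling.

    Hypothetical helper for illustration only; it is not a library function.
    """
    return noise_alpha / math.sqrt(seq_len * hidden_dim)


# Same alpha, two sequence lengths: the bound shrinks as the sequence grows.
bound_short = neftune_noise_bound(5.0, seq_len=128, hidden_dim=4096)
bound_long = neftune_noise_bound(5.0, seq_len=512, hidden_dim=4096)
print(f"L=128: {bound_short:.6f}  L=512: {bound_long:.6f}")
```

Under this assumption, quadrupling the sequence length halves the bound, so a single `neftune_noise_alpha` value produces weaker per-element noise on longer inputs.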
```python
def activate_neftune(model, neftune_noise_alpha, accelerator=None):
    """
    Activates NEFTune (Noisy Embeddings for Fine-Tuning) on the model.

    NEFTune adds noise to embedding vectors during training, which has been shown to improve
    fine-tuning performance. See https://huggingface.co/papers/2310.05914 for details.

    Args:
        model (`torch.nn.Module`):
            The model to activate NEFTune on.
        neftune_noise_alpha (`float`):
            The noise alpha value controlling the magnitude of the noise.
        accelerator (`Accelerator`, *optional*):
            The accelerator instance. If provided, the model will be unwrapped before
            accessing embeddings. Required when using distributed training.

    Returns:
        `torch.utils.hooks.RemovableHandle`: The hook handle that can be used to deactivate NEFTune.
    """
    if accelerator is not None:
        unwrapped_model = accelerator.unwrap_model(model)
    else:
        unwrapped_model = model

    if _is_peft_model(unwrapped_model):
        embeddings = unwrapped_model.base_model.model.get_input_embeddings()
    else:
        embeddings = unwrapped_model.get_input_embeddings()

    embeddings.neftune_noise_alpha = neftune_noise_alpha
    hook_handle = embeddings.register_forward_hook(neftune_post_forward_hook)

    return hook_handle
```
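The hook lifecycle used by `activate_neftune` can be sketched on a bare `torch.nn.Embedding`. This is an illustration under stated assumptions, not the library's implementation: `neftune_hook_sketch` is a hypothetical stand-in for `neftune_post_forward_hook` (whose body is not shown here), and it assumes the uniform noise scaling from the NEFTune paper. The embedding layer and token IDs are toy values.

```python
import torch
from torch import nn


def neftune_hook_sketch(module, inputs, output):
    # Hypothetical stand-in for neftune_post_forward_hook (assumed behavior).
    # Noise is added only in training mode; eval-mode outputs stay untouched.
    if module.training:
        dims = output.size(1) * output.size(2)  # sequence length * embedding dim
        mag = module.neftune_noise_alpha / dims**0.5
        output = output + torch.zeros_like(output).uniform_(-mag, mag)
    return output


# Toy embedding layer standing in for model.get_input_embeddings().
embeddings = nn.Embedding(num_embeddings=100, embedding_dim=16)
embeddings.neftune_noise_alpha = 5.0
handle = embeddings.register_forward_hook(neftune_hook_sketch)

token_ids = torch.randint(0, 100, (2, 8))  # batch of 2 sequences, 8 tokens each

embeddings.train()
noisy = embeddings(token_ids)  # hook active: noise added
embeddings.eval()
clean = embeddings(token_ids)  # hook skipped: no noise in eval mode

max_dev = (noisy - clean).abs().max().item()
bound = 5.0 / (8 * 16) ** 0.5

# Removing the handle detaches the hook, so training-mode outputs are clean again.
handle.remove()
embeddings.train()
after_removal = embeddings(token_ids)
```

Storing the alpha value as an attribute on the embedding module, as `activate_neftune` does, lets the hook read it without a closure, and returning the `RemovableHandle` gives the caller a way to undo the change.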
def deactivate_neftune(model, hook_handle, accelerator=None):