hub / github.com/huggingface/transformers / log_metrics

Function log_metrics

src/transformers/trainer_pt_utils.py:830–917

Log metrics in a specially formatted way. In a distributed environment this is done only by the process with rank 0. Args: split (`str`): mode/split name, one of `train`, `eval`, `test`; metrics (`dict[str, float]`): the metrics returned from train/evaluate/predict.

(self, split, metrics)

Source from the content-addressed store, hash-verified

# Trainer helper method: imported into the Trainer class and used as a method (takes `self` as first argument).
def log_metrics(self, split, metrics):
    """
    Log metrics in a specially formatted way.

    In a distributed environment this is done only by the process with rank 0.

    Args:
        split (`str`):
            Mode/split name: one of `train`, `eval`, `test`
        metrics (`dict[str, float]`):
            The metrics returned from train/evaluate/predict

    Notes on memory reports:

    To get a memory usage report you need to install `psutil`. You can do that with `pip install psutil`.

    Now when this method is run, you will see a report that will include:

    ```
    init_mem_cpu_alloc_delta   =     1301MB
    init_mem_cpu_peaked_delta  =      154MB
    init_mem_gpu_alloc_delta   =      230MB
    init_mem_gpu_peaked_delta  =        0MB
    train_mem_cpu_alloc_delta  =     1345MB
    train_mem_cpu_peaked_delta =        0MB
    train_mem_gpu_alloc_delta  =      693MB
    train_mem_gpu_peaked_delta =        7MB
    ```

    **Understanding the reports:**

    - the first segment, e.g., `train_`, tells you which stage the metrics are for. Reports starting with `init_`
      will be added to the first stage that gets run. So if only evaluation is run, the memory usage for the
      `__init__` will be reported along with the `eval_` metrics.
    - the third segment is either `cpu` or `gpu` and tells you whether it's the general RAM or the gpu0 memory
      metric.
    - `*_alloc_delta` - is the difference in the used/allocated memory counter between the end and the start of the
      stage - it can be negative if a function released more memory than it allocated.
    - `*_peaked_delta` - is any extra memory that was consumed and then freed - relative to the current allocated
      memory counter - it is never negative. When you look at the metrics of any stage you add up `alloc_delta` +
      `peaked_delta` and you know how much memory was needed to complete that stage.

    The reporting happens only for the process of rank 0 and gpu 0 (if there is a gpu). Typically this is enough since
    the main process does the bulk of the work, but it may not be if model parallelism is used, since other GPUs may
    then use a different amount of gpu memory. This is also not the same under DataParallel, where gpu0 may require
    much more memory than the rest since it stores the gradient and optimizer states for all participating GPUs.
    Perhaps in the future these reports will evolve to measure those too.

    The CPU RAM metric measures RSS (Resident Set Size), which includes both the memory that is unique to the process
    and the memory shared with other processes. It is important to note that it does not include swapped out memory,
    so the reports could be imprecise.

    The CPU peak memory is measured using a sampling thread. Due to python's GIL it may miss some of the peak memory
    if that thread didn't get a chance to run when the highest memory was used. Therefore this report can be less than
    reality. Using `tracemalloc` would have reported the exact peak memory, but it doesn't report memory allocations
    outside of python. So if some C++ CUDA extension allocated its own memory it won't be reported. It was therefore
    dropped in favor of the memory sampling approach, which reads the current process memory usage.
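
The docstring's rule that `alloc_delta` + `peaked_delta` gives the memory needed for a stage can be checked against its own sample report. The sketch below is illustrative only: `memory_needed_mb` is a hypothetical helper, not part of transformers, and it assumes keys follow the `<stage>_mem_<device>_<kind>_delta` pattern shown in the report, with values already converted to MB.

```python
from collections import defaultdict


def memory_needed_mb(report):
    """Return {(stage, device): alloc_delta + peaked_delta}, all in MB.

    Hypothetical helper for illustration; not a transformers API.
    """
    totals = defaultdict(int)
    for key, value_mb in report.items():
        parts = key.split("_")
        # Only keys shaped like "<stage>_mem_<device>_<kind>_delta" are memory metrics.
        if len(parts) != 5 or parts[1] != "mem" or parts[4] != "delta":
            continue
        stage, device = parts[0], parts[2]
        totals[(stage, device)] += value_mb
    return dict(totals)


# Values copied from the sample report in the docstring (MB).
sample_report = {
    "init_mem_cpu_alloc_delta": 1301,
    "init_mem_cpu_peaked_delta": 154,
    "init_mem_gpu_alloc_delta": 230,
    "init_mem_gpu_peaked_delta": 0,
    "train_mem_cpu_alloc_delta": 1345,
    "train_mem_cpu_peaked_delta": 0,
    "train_mem_gpu_alloc_delta": 693,
    "train_mem_gpu_peaked_delta": 7,
}

needed = memory_needed_mb(sample_report)
print(needed[("train", "gpu")])  # 693 + 7 = 700
print(needed[("init", "cpu")])   # 1301 + 154 = 1455
```

Under these assumptions the training stage needed about 700MB on gpu0 and the `__init__` stage about 1455MB of CPU RAM.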

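The sampling-thread approach described in the docstring can be sketched as follows. This is a minimal illustration, not the library's tracker: `PeakSampler` and `rss_bytes` are hypothetical names, the `peaked_delta` formula (`max(0, peak - end)`) is one reading of "relative to the current allocated memory counter", and the demo uses a fake memory reader so that it runs without `psutil`.

```python
import threading
import time


def rss_bytes():
    """Current process RSS in bytes. Requires `pip install psutil`."""
    import psutil  # imported lazily so the sketch loads without psutil

    return psutil.Process().memory_info().rss


class PeakSampler:
    """Record the highest value `read_bytes` returns while the block runs."""

    def __init__(self, read_bytes, interval_s=0.001):
        self.read_bytes = read_bytes
        self.interval_s = interval_s
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Polls until asked to stop; a short spike between polls is missed.
        while not self._stop.is_set():
            self.peak = max(self.peak, self.read_bytes())
            time.sleep(self.interval_s)

    def __enter__(self):
        self.start = self.read_bytes()
        self.peak = self.start
        self._thread.start()
        return self

    def __exit__(self, *exc_info):
        self._stop.set()
        self._thread.join()
        self.end = self.read_bytes()
        self.peak = max(self.peak, self.end)
        return False

    @property
    def alloc_delta(self):
        return self.end - self.start  # may be negative

    @property
    def peaked_delta(self):
        return max(0, self.peak - self.end)  # never negative


# Fake reader standing in for rss_bytes(), to simulate a temporary allocation.
current = {"bytes": 100}
with PeakSampler(lambda: current["bytes"]) as sampler:
    current["bytes"] = 500  # memory consumed...
    time.sleep(0.05)        # ...long enough for the sampler to see it
    current["bytes"] = 150  # ...and then mostly freed

print(sampler.alloc_delta)   # 150 - 100 = 50
print(sampler.peaked_delta)  # 350 if the spike was sampled, 0 if it was missed
```

Passing `rss_bytes` instead of the fake reader would sample real process memory. The possibility of `peaked_delta` coming out lower than the true peak is the limitation the docstring attributes to sampling.
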
Callers

nothing calls this directly

Calls 4

metrics_format · Function · 0.85
is_world_process_zero · Method · 0.80
values · Method · 0.45
keys · Method · 0.45
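
The callees listed above suggest a rank check followed by formatting and width computation over the dict's keys and values. Since the function body is not shown on this page, the following is only a sketch of that shape. `format_metrics_lines` is a hypothetical name, `str()` stands in for `metrics_format`, a boolean parameter stands in for `is_world_process_zero()`, and the header line format is an assumption.

```python
def format_metrics_lines(split, metrics, is_main_process=True):
    """Return the lines a rank-0-only logger might print (empty on other ranks).

    Hypothetical sketch; not the body of transformers' `log_metrics`.
    """
    if not is_main_process:
        return []  # other ranks stay silent

    # Assumption: str() stands in for the library's `metrics_format`.
    formatted = {key: str(value) for key, value in metrics.items()}

    # Column widths come from the longest key and the longest value.
    key_width = max((len(key) for key in formatted.keys()), default=0)
    value_width = max((len(value) for value in formatted.values()), default=0)

    lines = [f"***** {split} metrics *****"]  # header format is assumed
    for key in sorted(formatted):
        lines.append(f"  {key:<{key_width}} = {formatted[key]:>{value_width}}")
    return lines


lines = format_metrics_lines("train", {"train_loss": 0.5, "epoch": 3.0})
print("\n".join(lines))
```

Keys are left-aligned and values right-aligned, which produces the column layout seen in the sample memory report.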

Tested by

no test coverage detected