hub / github.com/huggingface/transformers / main

Function main

examples/pytorch/multiple-choice/run_swag.py:176–423
()


def main():
    # See all possible arguments in src/transformers/training_args.py
    # or by passing the --help flag to this script.
    # We now keep distinct sets of args, for a cleaner separation of concerns.

    parser = HfArgumentParser((ModelArguments, DataTrainingArguments, TrainingArguments))
    if len(sys.argv) == 2 and sys.argv[1].endswith(".json"):
        # If we pass only one argument to the script and it's the path to a json file,
        # let's parse it to get our arguments.
        model_args, data_args, training_args = parser.parse_json_file(json_file=os.path.abspath(sys.argv[1]))
    else:
        model_args, data_args, training_args = parser.parse_args_into_dataclasses()

    # Set up logging
    logging.basicConfig(
        format="%(asctime)s - %(levelname)s - %(name)s - %(message)s",
        datefmt="%m/%d/%Y %H:%M:%S",
        handlers=[logging.StreamHandler(sys.stdout)],
    )

    if training_args.should_log:
        # The default of training_args.log_level is passive, so we set log level at info here to have that default.
        transformers.utils.logging.set_verbosity_info()

    log_level = training_args.get_process_log_level()
    logger.setLevel(log_level)
    datasets.utils.logging.set_verbosity(log_level)
    transformers.utils.logging.set_verbosity(log_level)
    transformers.utils.logging.enable_default_handler()
    transformers.utils.logging.enable_explicit_format()

    # Log a short summary on each process:
    logger.warning(
        f"Process rank: {training_args.local_process_index}, device: {training_args.device}, n_gpu: {training_args.n_gpu}, "
        + f"distributed training: {training_args.parallel_mode.value == 'distributed'}, 16-bits training: {training_args.fp16}"
    )
    logger.info(f"Training/evaluation parameters {training_args}")

    # Set seed before initializing model.
    set_seed(training_args.seed)

    # Get the datasets: you can either provide your own CSV/JSON/TXT training and evaluation files (see below)
    # or just provide the name of one of the public datasets available on the hub at https://huggingface.co/datasets/
    # (the dataset will be downloaded automatically from the datasets Hub).

    # For CSV/JSON files, this script will use the column called 'text' or the first column if no column called
    # 'text' is found. You can easily tweak this behavior (see below).

    # In distributed training, the load_dataset function guarantees that only one local process can concurrently
    # download the dataset.
    if data_args.train_file is not None or data_args.validation_file is not None:
        data_files = {}
        if data_args.train_file is not None:
            data_files["train"] = data_args.train_file
            extension = data_args.train_file.split(".")[-1]
        if data_args.validation_file is not None:
            data_files["validation"] = data_args.validation_file
            extension = data_args.validation_file.split(".")[-1]
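
The last branch of the excerpt builds a `data_files` mapping and infers a loader format from the file extension. The sketch below isolates that logic in a standalone helper so its behavior can be checked without the rest of the script. The name `build_data_files` and its signature are hypothetical and are not part of `run_swag.py`; only the branching itself mirrors the excerpt.

```python
from typing import Dict, Optional, Tuple


def build_data_files(
    train_file: Optional[str], validation_file: Optional[str]
) -> Tuple[Dict[str, str], Optional[str]]:
    """Hypothetical helper mirroring the data_files logic shown in main()."""
    data_files: Dict[str, str] = {}
    extension: Optional[str] = None
    if train_file is not None:
        data_files["train"] = train_file
        # Text after the last "." is taken as the format name.
        extension = train_file.split(".")[-1]
    if validation_file is not None:
        data_files["validation"] = validation_file
        # As in the excerpt, the validation file's extension overwrites
        # the one inferred from the training file.
        extension = validation_file.split(".")[-1]
    return data_files, extension


if __name__ == "__main__":
    files, ext = build_data_files("data/train.csv", "data/dev.csv")
    print(files, ext)
```

One consequence of this branching is that when both files are supplied, the extension comes from the validation file, so the two files are expected to share a format.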

Callers 2

_mp_fn · Function · 0.70
run_swag.py · File · 0.70

Calls 15

parse_json_file · Method · 0.95
train · Method · 0.95
save_model · Method · 0.95
evaluate · Method · 0.95
push_to_hub · Method · 0.95
create_model_card · Method · 0.95
HfArgumentParser · Class · 0.90
set_seed · Function · 0.90
Trainer · Class · 0.90
get_process_log_level · Method · 0.80

Tested by

no test coverage detected