Eval harness (#212)

* Add functionality for running the evaluation harness on a single GPU
* Add support for pipelining
* Support tensor parallelism
* Save the results
* Minor cleanup
* Experimental DeepSpeed support
* Proper DeepSpeed integration, now working on combined TP and PP
* Update model loading and clean up code
* Add some options
* Fix pipelining + fp32 evaluation
* Remove dummy paths in examples/run_evalharness.sh (Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>)
* Simplify offline loading with `export HF_DATASETS_OFFLINE=1`
* Remove accidental copy-paste
* Experimental DeepSpeed evaluation path
* Make it work with DeepSpeed; add instructions
* Improve
* Make adaptive_seq_len work with DeepSpeed
* Move to SLURM
* Fixes
* Cleanup
* Add instructions on how to import data into the spreadsheet
* Stop tracking ppl/em
* Add task version
* Make compatible with lm-eval@master
* Switch to 16GB SLURM; simplify; improve instructions
* DeepSpeed model loading hack
* Restore correct ZeRO state
* Fix conversion script
* Simpler config
* Corrections
* Add logiqa
* Deal with custom tokenizers
* Fix
* Update examples/run_evalharness_deepspeed.md
* Check that the checkpoint path is valid
* Skip --abort_on_unmet_fused_kernel_constraints during eval
* Disable sanity check on layers-2%pp==0
* Sort skip_keys
* Make the default path unique to avoid overwrites
* Add bootstrap_iters arg
* Explain bootstrap_iters flag
* Intermediate results flag
* Add backup file
* Add arg to reduce bubble for pipeline parallel
* Fix adaptive_seq_len via resetting activation shape
* Extract args.load prior to load_ds_checkpoint_and_setup_megatron
* Parse args prior to loading function to get load_path
* Add run_evalharness-tr11-176b-ml SLURM script

Co-authored-by: Daniel Hesslow <daniel@lighton.ai>
Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Muennighoff <n.muennighoff@gmail.com>
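One item above simplifies offline loading via `export HF_DATASETS_OFFLINE=1`. A minimal sketch of how that would be wired into a launch script, assuming the harness is started from a shell wrapper like the commit's examples/run_evalharness.sh (the actual invocation is elided here):

```shell
# Force the HF datasets library to read only from the local cache,
# so no network access is attempted on offline compute nodes.
export HF_DATASETS_OFFLINE=1

# The evaluation command itself goes here; see the commit's
# examples/run_evalharness.sh for the full argument list.
echo "HF_DATASETS_OFFLINE=$HF_DATASETS_OFFLINE"
```

Datasets must be downloaded into the cache beforehand on a node with internet access; with this variable set, any dataset missing from the cache fails fast instead of hanging on a download.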