Releases: ggerganov/llama.cpp
b3131
b3130
tests : check the Python version (#7872) ggml-ci
b3091
ggml : refactor rope norm/neox (#7634)
* ggml : unify rope norm/neox (CPU)
* ggml : fix compile warning
* ggml : remove GLM rope mode ggml-ci
* metal : better rope implementation ggml-ci
* cuda : better rope implementation ggml-ci
* naming : n_orig_ctx -> n_ctx_orig ggml-ci
* dev : add reminders to update backends ggml-ci
* vulkan : fix ggml_rope_ext() usage
* cuda : fix array size + indents ggml-ci
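For readers unfamiliar with the two rope modes named above, the sketch below illustrates the usual distinction: "norm" rotates adjacent element pairs, while "neox" pairs element i with element i + n/2. It is a minimal illustration under stated assumptions, not ggml's code; the function name `rope_rotate`, its parameters, and the default `base` are hypothetical and chosen only for this example.

```python
import math

def rope_rotate(x, pos, mode="norm", base=10000.0):
    """Apply a rotary position embedding to one head vector x (even length).

    Hypothetical helper for illustration only; not part of ggml or llama.cpp.
    """
    n = len(x)
    half = n // 2
    out = list(x)
    for i in range(half):
        # Standard RoPE frequency for the i-th pair.
        theta = pos * base ** (-2.0 * i / n)
        c, s = math.cos(theta), math.sin(theta)
        # "norm" pairs adjacent elements; "neox" pairs i with i + n/2.
        a, b = (2 * i, 2 * i + 1) if mode == "norm" else (i, i + half)
        out[a] = x[a] * c - x[b] * s
        out[b] = x[a] * s + x[b] * c
    return out
```

Under these assumptions the two modes differ only in how elements are paired, so applying "neox" to a suitably permuted input gives the same values as "norm" on the original, up to the same permutation.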
b3089
Fix per token attributes bits (#7749)
b3088
Allow number of nodes in CUDA graph to change (#7738)
Previously, the code failed when the number of nodes changed in an existing CUDA graph. This fixes the issue by removing an unnecessary conditional.
b3087
common : refactor cli arg parsing (#7675)
* common : gpt_params_parse do not print usage
* common : rework usage print (wip)
* common : valign
* common : rework print_usage
* infill : remove cfg support
* common : reorder args
* server : deduplicate parameters ggml-ci
* common : add missing header ggml-ci
* common : remove --random-prompt usages ggml-ci
* examples : migrate to gpt_params ggml-ci
* batched-bench : migrate to gpt_params
* retrieval : migrate to gpt_params
* common : change defaults for escape and n_ctx
* common : remove chatml and instruct params ggml-ci
* common : passkey use gpt_params
b3086
ggml : remove OpenCL (#7735) ggml-ci
b3085
llama : remove beam search (#7736)
b3083
llama-bench : allow using a different printer for stderr with -oe (#7722)
compare-commits.sh : hide stdout, use -oe to print markdown
b3082
Improve hipBLAS support in CMake (#7696)
* Improve hipBLAS support in CMake
  This improves the detection of the correct CMAKE_PREFIX_PATH when using different distributions or a self-built ROCm SDK.
* Set ROCM_PATH correctly