Thanks for this, really appreciate the SYCL port. We reviewed it and are happy to merge: it is cleanly isolated to ggml/src/ggml-sycl/, so it can't affect the CUDA, Metal, or ROCm paths, and the hardcoded centroid and WHT sign tables byte-match our canonical turbo tables, so the kernels look like a faithful port of the CUDA set-rows.cu.
Before we pull it we just need some compilation and test evidence, since we don't have an Intel/oneAPI box on our side to verify it ourselves:
A clean -DGGML_SYCL=ON build log (icpx), ideally on current feature/turboquant-kv-cache.
A quick functional check that turbo KV actually works on your A380, e.g. llama-server/llama-cli with -ctk turbo3 -ctv turbo3 producing coherent output, and if you can, a short llama-perplexity run showing turbo3 PPL close to q8_0 (our gate is within 5%).
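For reference, the 5% perplexity gate mentioned above could be checked with something like the sketch below. It assumes the gate means the relative difference between the turbo3 and q8_0 perplexities, measured against q8_0; the function name `within_ppl_gate` and the sample numbers are hypothetical and are not taken from the repository or from any real run.

```python
def within_ppl_gate(ppl_candidate: float, ppl_baseline: float,
                    tolerance: float = 0.05) -> bool:
    """Return True if the candidate perplexity is within `tolerance`
    (relative) of the baseline perplexity.

    Assumption: "within 5%" is read as |candidate - baseline| / baseline <= 0.05.
    """
    if ppl_baseline <= 0:
        raise ValueError("baseline perplexity must be positive")
    relative_diff = abs(ppl_candidate - ppl_baseline) / ppl_baseline
    return relative_diff <= tolerance


if __name__ == "__main__":
    # Placeholder values for illustration only, not measured results.
    q8_0_ppl = 6.0
    turbo3_ppl = 6.2
    verdict = "pass" if within_ppl_gate(turbo3_ppl, q8_0_ppl) else "fail"
    print(f"turbo3 vs q8_0 gate: {verdict}")
```

Both perplexity values would come from running llama-perplexity on the same model and text file, once with the q8_0 KV cache types and once with turbo3.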
Our build-sycl.yml CI (ubuntu-24.04 + oneAPI) can also produce the compile half. It is currently sitting at action_required because it is a fork PR, so once you push any update we can approve the run and let it compile against current HEAD.
A couple of small things worth a look while you're at it: the new src[0]->type == GGML_TYPE_F32 gate tightens the non-turbo set_rows path too (intended?), and please double check WHT_SIGNS2 and the turbo4 rnorm=0 write against the CUDA reference. Nothing blocking. Thanks again, nice work.
Overview
SYCL implementation written using Claude, lightly tested on an Intel A380 with oneAPI 2025.2.
Additional information
Code not properly reviewed.
Requirements