Creating Device Context for Non-NVIDIA/Metal devices #1783
Replies: 3 comments 2 replies
Hello! Thanks for your interest in tract. tract-gpu is a support crate: we did Metal first, then moved on to CUDA, and started by extracting from tract-metal whatever we thought would also make sense for CUDA (or other GPUs). So, absolutely: if you want to start working on OpenCL support, your starting point should be extending/implementing the tract-gpu traits, similarly to tract-metal and tract-cuda.

You will first need to figure out what does what in OpenCL. The "Context" is a somewhat process-wide singleton that you can use to allocate memory, while the Metal queue and the CUDA stream allow scheduling operations. You will also need to find some bindings for OpenCL. We probably want to stay away from overly ambitious projects that actually allow authoring kernels in Rust (like rust-gpu); a driver-level binding focused on the Context (or its equivalent) and the Stream/CommandQueue is what you're after. Hopefully there is something not too heavy and maintained that could be used. For CUDA we went with cudarc: its ability to dynamically load CUDA was great, because we can ship a universal command-line binary that will use CUDA if it is there, but will gracefully handle a server where CUDA is missing.

If you can figure out these dependencies, then I guess we will be able to guide you through the next steps. We have been through that relatively recently, so it's still fresh in our memories :)
Pinging @LouisChourakiSonos for more insights on the GPU stuff
Hey @jbrockett-reach! I am the person who worked on tract-metal and is now working on tract-cuda. Very enthusiastic about a potential tract-opencl crate! Are you trying to run an LLM or more classic NNs?

I think mimicking the CUDA and Metal crates is a good approach. GPU work in tract is pretty recent, so it will help a lot to have implementations that are as uniform as possible in case we need to do some general refactoring. Hopefully tract-gpu is generic enough to let you do what you want with OpenCL, but if it is not the case, we can discuss possible modifications.

I suggest you start with a simple operator, like an element-wise operator, because the kernels are simple to write and you can iterate on them quickly. Actually, do you plan on writing the kernels yourself or taking them from some open source repository? On our side, we took inspiration from llama.cpp and Apple's MLX code, if that helps.

I don't really know much about OpenCL, so I hope the binding crate you have found will provide what you need. I am a bit worried by the fact that the project is not very active, but if you don't have the choice then go for it!

I hope I answered most of your questions. Don't hesitate to ask here if you have others! We are very grateful for your interest in tract and we look forward to running some models on OpenCL.
Hi there! I started using tract for an embedded project and it's been working out well. However, I'm using an embedded device with a GPU, and I would like to run my model's inference on it to free up the CPU. That said, my GPU is neither Metal- nor NVIDIA-based, so the Metal and CUDA crates won't work for me. My current model is stored in tflite format (though I could change this), if that's relevant. Though there is a GPU crate, there's currently no example and I can't quite gather how to set it up. Currently, my error is that I don't have a device context to set; how might I go about creating one?

I have OpenCL working on the device, so my current best guess is to try to mimic what's going on in the CUDA and Metal crates with OpenCL, but I'm not familiar enough with the process to know if there's a better approach.