What would a basic GPU support package look like?
Let's assume we want to support Array, Int, and Float64, possibly packed into smaller types. What if we add a few low-rank tensors to cover the common cases, say up to rank 4? What are the minimal operations needed to do something useful? Dot-product-style operations, obviously, but what else?
How would we model loading data into the GPU, operating on it, and then fetching it back?
How could we support Metal on macOS, a no-GPU fallback (plain CPU memory), and CUDA?
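
To make the questions above concrete, here is a minimal sketch of one possible shape for such a package, written in Python only for illustration. Everything in it is an assumption: the `Backend` interface, the `upload`/`dot`/`download` method names, and the `pick_backend` helper are hypothetical, and only the CPU-memory fallback is implemented. A Metal or CUDA backend would implement the same three steps against real device buffers.

```python
from abc import ABC, abstractmethod
from array import array


class Backend(ABC):
    """Hypothetical interface that each backend (CPU, Metal, CUDA) would implement."""

    name = "abstract"

    @abstractmethod
    def upload(self, host_data):
        """Copy host data into backend-owned memory and return a buffer handle."""

    @abstractmethod
    def dot(self, a, b):
        """Compute the dot product of two uploaded buffers; the result stays on the backend."""

    @abstractmethod
    def download(self, buf):
        """Copy a backend buffer back into an ordinary host list."""


class CpuBackend(Backend):
    """No-GPU fallback: the 'device' buffers are just Float64 arrays in CPU memory."""

    name = "cpu"

    def upload(self, host_data):
        # "d" is the typecode for C doubles, standing in for Float64 here.
        return array("d", host_data)

    def dot(self, a, b):
        if len(a) != len(b):
            raise ValueError("dot requires buffers of equal length")
        # Return a one-element buffer so the result follows the same fetch path.
        return array("d", [sum(x * y for x, y in zip(a, b))])

    def download(self, buf):
        return list(buf)


# Hypothetical registry; "metal" and "cuda" entries would be added by real backends.
_BACKENDS = {"cpu": CpuBackend}


def pick_backend(preferred):
    """Return the preferred backend if registered, otherwise fall back to CPU memory."""
    return _BACKENDS.get(preferred, CpuBackend)()


backend = pick_backend("cuda")       # not registered in this sketch, so CPU is used
a = backend.upload([1.0, 2.0, 3.0])  # load
b = backend.upload([4.0, 5.0, 6.0])
result = backend.dot(a, b)           # operate
print(backend.name, backend.download(result))  # fetch
```

The design choice being illustrated is that calling code only ever sees opaque buffer handles and the three-step load, operate, fetch cycle, so swapping the CPU fallback for a GPU backend would not change the calling code. Whether results should stay on the device by default, as `dot` does here, is one of the open questions.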