We could use VQSort, the vectorized quicksort from Google's Highway library, to get SIMD-accelerated sorting. Header: https://github.com/google/highway/blob/master/hwy/contrib/sort/vqsort.h Paper: https://arxiv.org/abs/2205.05982
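
A rough sketch of how this might be wired in, assuming we sort plain `float` keys in ascending order. `USE_VQSORT` and `SortAscendingInPlace` are hypothetical names made up for this example, not part of Highway or of our code. The `hwy::VQSort(keys, n, hwy::SortAscending())` call follows the overloads declared in the header linked above; building that path also requires linking Highway's contrib library. Without the flag, the sketch falls back to `std::sort`, so the change could be made opt-in.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// USE_VQSORT is a hypothetical build flag for this sketch, not a Highway macro.
#ifdef USE_VQSORT
#include "hwy/contrib/sort/vqsort.h"
#endif

// Hypothetical wrapper: sorts `keys` in place, in ascending order.
inline void SortAscendingInPlace(std::vector<float>& keys) {
#ifdef USE_VQSORT
  // SIMD path: VQSort takes a raw pointer, a key count, and an order tag.
  hwy::VQSort(keys.data(), keys.size(), hwy::SortAscending());
#else
  // Portable fallback when Highway is not available.
  std::sort(keys.begin(), keys.end());
#endif
}
```

Callers would only see `SortAscendingInPlace(values);`, so the two paths could be compared on our own data before committing to the dependency.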