-
Notifications
You must be signed in to change notification settings - Fork 2
Description
By now we have a spatial index that works on top of a z-order space filling curve
- Space filling curve based spatial index #22
- Adds a space filling curve based spatial index, closes #22 #68
and we have the (optional) functionality in the public API for users to sort graph nodes by a z-order space filling curve
with the idea that it will benefit graph traversals due to better memory locality when we walk a graph e.g. during dijkstra's s-t shortest path.
We need to integrate the spatial index into the public API so that users can
- start and end their search at locations and not node ids
- get back a path of locations and not just node ids
Questions in terms of designing the API surface
- Should we decouple the spatial index from the graph and expose both low-level structures to the user?
- Or should we tighly couple the graph and the spatial index so that users can do
graph.search(start, stop)with locations?
One key realization I had while building the spatial index: If we already sort graph nodes based on the z-order space filling curve, then the spatial index on the very same nodes is free! 😮 This is because all our spatial index is doing is sorting nodes by z value and then doing binary searches with search space pruning on the z values.
In practice what this could mean if we want to tighly couple graph and spatial index
- We always sort nodes by z order space filling curve and construct the graph based on that
- We then put the spatial index on top of it; in theory there is nothing to do in construction; in practice I believe we want to at least compute and cache the z values because they are not the cheapest to re-compute
- When the user searches for a location in the spatial index we simply do binary search on the z values and return indices as results
- These indices are the graph node ids to start searches from and stop searches at
What we then have to store for the graph and the spatial index
- a permutation to reverse the z value sorting to return results back to users
- one lng, lat array with one item per graph node, keyed by node id
- one z value array with one item per graph node, keyed by node id, computed from the lng, lat (purely as cache and an optimization)
We should explore if this sort of tighly coupling is worth it for the synnergy effects: graph and spatial index will be very small.