Mastodawn

May 10, 2024

#kdtree and ball trees seem cool, but require full knowledge of the thing I'm searching for. What if it's 7 dimensional and I only know 4 of the values?

I feel like a "parallel kd tree" with a separate binary index on each dimension would work better here.

Reduce depth. Allow unspecified values. It'd also be a snap to create and search each dim in parallel.

This must already exist...

#computerscience #math

Show thread

bob

@davidr "it's 7 dimensional and I know 4 of the values" = "it lies within this bounding hypercube". you're probably not going to be able to do much better than find the LCA of the hypercube in the kd-tree and do an O(n) search of the subtree rooted at that node.

Show thread

bob May 10, 2024

@davidr if the dimensions you know are the first 4 you could also use a sorted array of morton codes. the values of those 4 dimensions are a common prefix so you can just do a single binary search and a linear scan

Show thread

dr 🛠️🛰️📡🎧

May 11, 2024

@bob Part of my hesitance is that I'm not that familiar with these data structures. My understanding is that the kd tree splits first on dim 1, then dim 2....k, then dim 1 again, etc, until max depth/min node size is reached.

If I'm missing some dimensional values, how do I bridge those gulfs and continue on into further subtrees of the dims I do have?

Maybe that's what "LCA of the hypercube" means.