Over the last few hours I implemented one more BVH construction mode. It is based on the 3D dynamic AABB tree in my physics engine KRAFT, which in turn is based on your Box2D BVH concepts, but with tree data structure optimizations such as stack-free traversal in a linear skip-list style and merging of leaf nodes to ensure a minimum triangle count, applied as post-processing steps. Unfortunately it didn't help much, although there was at least a slight improvement. Next I'll try the 'Mean/Largest-Axis-Variance' splitting approach. If that doesn't help either, I guess I'll have to implement mesh cleanup preprocessing and vertex decimation to keep the number of collision mesh triangles at a moderate level.
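
For readers unfamiliar with the "linear skip-list style" traversal mentioned above, here is a minimal Python sketch of the general idea. It is not KRAFT's actual code. It assumes the nodes are stored in depth-first preorder and that each node carries a `skip` index pointing to the next node to visit when its subtree is rejected. The names `Node`, `skip`, `leaf_id`, `query`, and `EXAMPLE_TREE` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Node:
    lo: Vec3                        # AABB minimum corner
    hi: Vec3                        # AABB maximum corner
    skip: int                       # next index to visit if this subtree is rejected
    leaf_id: Optional[int] = None   # payload for leaves, None for inner nodes


def overlaps(a_lo: Vec3, a_hi: Vec3, b_lo: Vec3, b_hi: Vec3) -> bool:
    """Closed-interval AABB overlap test on all three axes."""
    return all(a_lo[k] <= b_hi[k] and b_lo[k] <= a_hi[k] for k in range(3))


def query(nodes: List[Node], q_lo: Vec3, q_hi: Vec3) -> List[int]:
    """Collect the leaf ids whose boxes overlap the query box, without a stack."""
    hits: List[int] = []
    i = 0
    while i < len(nodes):
        node = nodes[i]
        if overlaps(node.lo, node.hi, q_lo, q_hi):
            if node.leaf_id is not None:
                hits.append(node.leaf_id)
            # Preorder layout: the next array slot is either the first child
            # (inner node) or the next node to visit (leaf).
            i += 1
        else:
            # Reject the whole subtree in one jump.
            i = node.skip
    return hits


# Tiny hand-built tree in preorder: root, left inner node, leaf 0, leaf 1, leaf 2.
EXAMPLE_TREE = [
    Node((0.0, 0.0, 0.0), (4.0, 1.0, 1.0), skip=5),
    Node((0.0, 0.0, 0.0), (2.0, 1.0, 1.0), skip=4),
    Node((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), skip=3, leaf_id=0),
    Node((1.5, 0.0, 0.0), (2.0, 1.0, 1.0), skip=4, leaf_id=1),
    Node((3.0, 0.0, 0.0), (4.0, 1.0, 1.0), skip=5, leaf_id=2),
]

if __name__ == "__main__":
    print(query(EXAMPLE_TREE, (3.2, 0.2, 0.2), (3.8, 0.8, 0.8)))
```

The appeal of this layout is that traversal is a single forward walk over one array, so it needs no recursion and no explicit stack. The trade-off is that the array has to be rebuilt, or at least its skip indices patched, after the tree changes, which is presumably why such a linearization fits best as a post-processing step.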