In a previous posting, I explained a minor problem in the “Bounding Box” and “Dynamic Bounding Box” algorithms I describe in chapter 9 of Adam Machanic’s book Expert SQL Server 2005 Development. In short, the problem is that I calculated the boundaries of the search area just a tiny bit too small, introducing a (very small) chance of relevant data not being found. I also presented a quick and dirty fix that basically works by deliberately making the search area too big, sacrificing (a small bit of) performance for correctness. And I promised to keep trying to find a better solution.
In this post, I’ll first explain the logic of the final version of the boundary box algorithm. I’ll then quickly cover some other adjustments and improvements that I have made to all versions of the code. The last part of this post shows the rather surprising results of a head to head performance match between the three versions (the incorrect one, the quick but simple fix, and the exact but more complex fix) of the algorithm.
The problem with the original algorithm was that I tried to calculate the boundaries of the bounding box by going straight east and west from the center point. However, moving in that direction will not take you to the highest or lowest longitude, unless the origin is exactly on the equator. To find the real minimum and maximum longitude, I need to move in another direction. Unfortunately, that direction changes with the center of the search and even with the search radius.
For any given starting point and distance, the longitude (and latitude) of the point reached by moving said distance in any direction can be expressed as a function of the direction (a function that eventually describes the circular outline of the search area). The longitude values I am trying to find, the boundaries of the bounding box, are of course the minimum and maximum values of this function.
My high school math teacher once told me that a minimum or maximum value of a function is always characterized by the derivative function being 0. I believed him then and I see no reason to stop believing him now – so finding the minimum and maximum longitude now becomes a “simple” matter of first expressing the longitude of a point on the circular outline of the search area as a function of direction, treating longitude and latitude of the centre and the maximum distance of the search as unknown constants, then finding the derivative of that function, and finally finding all values of direction for which that derivative function is 0.
Trust me – this is just as much as fun as it sounds like. I now fully recall why I swore never to do anything related to derivative functions or trigonometric logic after I finished high school, and I have just sworn never to break that oath again J. I decided not to scare away my readers by including all the nasty arithmetic in the post; if you really feel like torturing yourself, you can find all details in the MS Word document “Finding min and max.doc” that is included in the ZIP file attached to this post.
As is so often the case with calculations like these, the formula kept getting longer and more complex with each next step – until, somewhere halfway through the process, the formulas suddenly start to get simpler and shorter. In the end, I was left with a formula that was so staggering simple that it just had to be correct (and that I was left with a strong feeling that there should have been an easier way to accomplish this result – please mail me if you know one!). To find the maximum and minimum longitude, we just have to move the maximum search distance in the directions
arccos(tan(Dist) * tan(Lat1))
– arccos(tan(Dist) * tan(Lat1))
When I went back to my code to create a second fixed version, something happened that often happens when I get back to code I wrote a few months ago – I saw another (very tiny) issue, and I noticed a few performance optimization opportunities I had previously missed. I decided to correct and repair all versions before starting any comparative performance test, as I would not want to run any apple to orange comparison.
The issue I identified has to do with possible rounding errors. Since I’m using datatype FLOAT in the T-SQL code and Double in the .Net code, very small rounding errors (in the order of magnitude of 1e–16 – that is 0.0000000000000001, less than one billionth of a millimeter – can occur). In extremely unlikely scenarios, where the search area exactly touches a point of interest, these rounding errors might result in the point not being found. I decided to exclude this possibility, however unlikely, by simply adding a very tiny fraction (1e–10, less than a millimeter) to the maximum longitude and latitude, and subtracting a tiny fraction from the minimum values.
In the T-SQL code, I had to repeat the logic for moving a given distance in a direction, since SQL Server will not allow me to call a stored procedure from a user-defined function. This had already given me the opportunity to eliminate two calculations of unused longitudes, but I now saw another opportunity to simplify more – whenever I calculate with a direction of 0, 90, 180, or –90 degrees, I can easily replace the functions that calculate the sine of cosine of the corner by their outcome, which in all these cases is one of –1, 1, or 0. Where the outcome is 0 and it was multiplied by some other terms, I removed the entire multiplication.
In the CLR code, where calling a procedure from within a user-defined function is allowed, I could not apply this same optimization. However, I did apply some other optimizations. I applied the lessons I learned from reviewing the CLR code in Microsoft’s sample code for the Hierarchical Triangular Mesh: I used variables to store intermediate calculations results instead of repeating the expression as is more common in a set-based language such as SQL, and I used multiplication instead of Math.Pow() to find the square of a value. I also refactored some of the functions: by separating conversions between degrees or kilometers/miles to radians and back from the real calculations, I could vastly reduce the number of conversions required. I also changed the configuration setting in Visual Studio from “Debug” to “Release” to allow it to compile optimized code by not including debug symbols.
The file SQLScripts.sql in the ZIP file attached to this post contains CREATE FUNCTION scripts for all the functions I tested against each other for performance. GetNeighborsWrong is an optimized version of the original (incorrect) algorithm to search the neighborhood; GetNeighborsFix1 implements the original “quick and dirty” fix I described in the previous post, and GetNeighborsFix2 implements the supposedly better fix described in this post.
The file SpatialCLR.zip that is included in the attachment as well contains the complete CLR project. Functions of interest are CLRNeighborsWrong, CLRNeighborsFix1, and CLRNeighborsFix2 (CLR versions of the functions mentioned above), but also CLRDynamicBBWrong, CLRDynamicBBFix1, and CLRDynamicBBFix2 – implementations of the search for the nearest neighbor using the dynamic bounding box, again employing the original but incorrect algorithm, the first “quick and dirty” fix and the final fix described here. There are no T-SQL versions of these, since the T-SQL implementation of the dynamic bounding box merely calls one of the GetNeighbors or CLRNeighbors functions.
My expectations for the performance test were quite straightforward. Since the original algorithm failed by making the search area too small, I expected this one to perform fastest; the only reason I did include it in the test was just to see how much performance I had “gained” on the HTM versions provided by Microsoft by missing some results. The first fix corrected the error, but did this by overshooting. The algorithm was still almost as simple as the original version, but the search area became bigger – so I expected the “Fix1” versions to perform better than the “Wrong” versions. The “Fix2” versions narrow the search area down again to exactly the correct size, but at the price of some (slightly) more complex calculations in the code. I expected “Fix2” to be about on par with “Fix1”, figuring that the performance gain achieved by not overshooting the search area would be compensated by the more complex calculations.
Boy, was I wrong!
The code I used for testing was almost the same as what I used for other tests throughout writing the chapter. I decided to test the various bounding box implementations with the query to find all pairs of cities in Texas that are at least 5 but at most 10 kilometers apart, and to test the dynamic bounding box implementations with the query to find the nearest city to each city in the USA. Both queries are included in the chapter, so I won’t repeat them here. The Texas query was included in six versions (“Wrong”, “Fix1”, and “Fix2”, in both the T-SQL and the CLR versions); the nearest city query was even included in nine versions (the three CLR versions, plus six versions of the T-SQL version, employing each of the six versions of the neighborhood search). Each of these 15 test cases was enclosed in a loop that clears the cache, then calls the query ten times so that one in ten tests was on a cold cache. Results of the queries were stored in a temporary table that was subsequently truncated, to make sure that network traffic and speed of formatting and outputting results did non influence the test results. All this was then enclosed in an endless loop that I left to run on my laptop for three days, after which each of the fifteen test cases was executed a total of 3,000 times.
The average execution times, ordered by test case and performance, were as follows:
Testcase Version Avg.Duration
-------- -------------------------------- ------------
Texas CLRNeighbors, fix 1 361.28
Texas CLRNeighbors, fix 2 362.31
Texas CLRNeighbors, original 362.77
Texas SQLNeighbors, fix 2 420.75
Texas SQLNeighbors, fix 1 439.24
Texas SQLNeighbors, original 442.00
DynBB SQL with CLRNeighbors, original 12243.28
DynBB SQL with CLRNeighbors, fix 2 12251.26
DynBB SQL with CLRNeighbors, fix 1 12265.54
DynBB SQL with SQLNeighbors, fix 2 14043.38
DynBB SQL with SQLNeighbors, original 14419.06
DynBB SQL with SQLNeighbors, fix 1 14424.31
DynBB Completely in CLR, original 15588.18
DynBB Completely in CLR, fix 1 15626.54
DynBB Completely in CLR, fix 2 15652.72
As you see when you analyze these numbers, my expectations were trashed. Only two out of five tests confirmed my expectation of the original but incorrect algorithm being fastest. And these two disagree about which fix is best. The other tests also disagree about the relative performance of each of the fixes. I must admit that I am completely unable to explain the numbers I got from my performance tests – so if any of you readers can enlighten me, please do!
One thing does stand out clear from these results: the differences between the original version and the two fixed versions are very small. Since there is no performance argument to choose either of the fixed versions over the other, my preference would lean towards the “cleaner” solution described in this post and implemented in the “Fix2” versions of the code, but it appears that the quick and dirty fix implemented in the “Fix1” versions can be used just as well. The original version should of course be removed from your system – not because of its performance, but because it’s incorrect. And that should be the nightmare of any database developer, just as it is the nightmare of any DBA.