Harshvardhan Gupta commented on DERBY-6949:
The whole lifecycle of Query Optimization in Derby is as follows -
Optimizer Interface's implementation OptimizerImpl.java is responsible to iterate through all permitted semantically equivalent execution plans and orchestrating calls to the various entities that provide their 'estimated cost' of execution considering the set of constraints that will be associated with them during that particular plan.
Before getting into the details of the above mentioned entities and their estimated cost, it is important to note that the search space of all the possible semantically equivalent plans is limited to only left deep trees. The selection of join order also influences the way we want to constraint the execution. For example - A left deep HJ has to wait for each join to finished completely so that the result set can be hashed before the next step can probe it, on the other hand in a right deep HJ, it is not the intermediate results that are being hashed but the base tables. So, Right deep HJ trees use substantially more memory / disk spillovers.
It has been shown in the [VLDB Optimizer Paper|http://www.vldb.org/pvldb/vol9/p204-leis.pdf] (Section 6.2) that left deep trees perform quite reasonably while restricting search space and making the Optimizer simple and fast as opposed to right deep trees where disk spills eat away a lot of performance. The author further emphasize on cardinality estimates as opposed to search space which may not offer significant performance benefits. This area definitely needs to be further explored in Derby's context.
Coming back to the entities for cost estimation, they are Join Strategy, Optimizables and Ordering. Each of these implement the 'estimateCost' function and gives out their estimated cost of execution to the Optimizer. Optimizer permutes through the various combinations and produces the best possible execution plan it is able to find.