News, examples, tips, ideas and plans.
Thoughts around ORM, .NET and SQL databases.

Wednesday, July 22, 2009

Query transformation pipeline inside out: APPLY rewriter

DO4 is designed to support various RDBMS having different capabilities. One of such capabilities is possibility to use reference values from the left side of join operation in its right side. Microsoft SQL Server supports this kind of join via APPLY statement. But, for example, PostgreSQL does not provide any similar feature. However, an operator like APPLY is required to translate of many LINQ queries - such backreferences are very natural to LINQ.

As you might know, our LINQ translation layer translates LINQ queries to RSE - in fact, query plans. Further these plans are sent to our transformation & optimization pipeline, which is different for different RDBMS, although it is combined from the common set of optimizers (transformers). So if some RDBMS does not support certain feature, we add a transform rewriting the query to make it compatible with this RDBMS. In the end of all we translate the final query plan (RSE query) to native query for the current RDBMS. Note that if we meet something that can't be translated to native query on this step, an exception is thrown.

Let's return back to the subject of this artcile. There is ApplyProvider in RSE, which does the same job as APPLY statement. When query is compiled from LINQ to RSE, we freely use ApplyProvider everywhere where it is necessary. But as I've mentioned, there are some RDBMS that does not support it, and we must take care about this. That's the story behind APPLY rewriter.

To archive the same behavior for different RDBMS, we try to rewrite queries containing ApplyProvider (i.e. requiring APPLY statement to be translated "as is") to get rid of references to the left side from the right side. We perform the following modifications in the source query during rewriting process:
Obviously, such rewriting is not always possible. Below there is the list of cases when we can rewrite the source query:
  • Right part returns a single row: in such a case rewriting is not required, since we can translate such ApplyProvider to SQL as "SELECT [left columns], [right columns] FROM ...", which is supported by any RDBMS
  • Columns used in expressions of considered FilterProviders or CalculateProviders are not removed by other providers (e.g. by AggregateProvider).
  • Reference to ApplyParameter is contained only in one of sources of a BinaryProvider, if such a provider exists in the right part of the original ApplyProvider.
  • Right part of ApplyProvider contains references only to its own ApplyParameter.
That's it. This is enough to get rid of APPLY from the most part of queries initially looking as if they'd involve it. Obviously, we can't eliminate APPLY from any queries.

An example LINQ query leading to CROSS APPLY on LINQ to SQL:

from category in Categories
from productGroup in (
  from product in Products
  where product.Category==category
  group product by product.UnitPrice
)
select new {category, productGroup}

Its SQL:

SELECT [t0].[CategoryID], [t0].[CategoryName], [t0].[Description], [t0].[Picture], [t2].[UnitPrice] AS [Key]
FROM [Categories] AS [t0]
CROSS APPLY (
  SELECT [t1].[UnitPrice]
  FROM [Products] AS [t1]
  WHERE [t1].[CategoryID] = [t0].[CategoryID]
  GROUP BY [t1].[UnitPrice]
  ) AS [t2]

We can't get rid of CROSS APPLY in such a query as well. On the other hand, the good thing is that we can translate it properly, if APPLY is supported - in contrast to almost any other ORM we've looked up (except EF and LINQ to SQL). The same is correct for APPLY-related query transformations - as we know, only EF and LINQ to SQL are aware about them.

No comments:

Post a Comment