
2009 07 06 another update

Fabian Schmied edited this page Jul 6, 2009 · 1 revision

Published on July 6th, 2009 at 8:03

Another Update

As I wrote about four weeks ago, re-linq is currently undergoing a major overhaul in order to make it more usable as a generic LINQ provider foundation. Two weeks ago, I wrote that we had rewritten re-linq’s structural parsing engine, which parses an expression tree into a QueryModel construct, and its expression resolution mechanism, which determines the source of the data used in an expression.

What remained from our original plan was the following:

3. After those two changes, we will be able to clean up the clause classes a lot.
4. One of the big challenges of generating queries from LINQ expressions is to transform what was written by the user into something that can be expressed in the target query language. Let clauses, for example, generally cannot be expressed in SQL in a good way. However, at least as long as they are side effect-free, you can simply insert their right sides everywhere their left side is used.
Similarly, as Frans Bouma recently explained, Where clauses sometimes have to be moved to the right side of Join clauses to avoid derived table expressions. 
We want to make such transformations really easy to perform, and we will refactor the clause classes and QueryModel to facilitate this.

Note that Let clauses aren’t even present in the QueryModel any more. We’ve decided to simply substitute their right sides by default. Nonetheless, these two broad issues shaped our work of the last two weeks.
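To illustrate what "substituting the right side" of a Let clause means at the expression level, here is a minimal sketch. It does not use re-linq's own classes: ParameterInliner and LetInliningDemo are hypothetical names, and the sketch relies on the System.Linq.Expressions.ExpressionVisitor base class that ships with .NET 4 and later. It models the query `from s in source let len = s.Length where len > 3 select s` and rewrites the where predicate so that `len` is replaced by `s.Length`.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical helper (not part of re-linq): replaces every occurrence of one
// parameter with another expression, which is what inlining a let amounts to.
public sealed class ParameterInliner : ExpressionVisitor
{
    private readonly ParameterExpression _target;
    private readonly Expression _replacement;

    public ParameterInliner(ParameterExpression target, Expression replacement)
    {
        _target = target;
        _replacement = replacement;
    }

    protected override Expression VisitParameter(ParameterExpression node)
    {
        // Only the let variable is swapped; all other parameters stay untouched.
        return node == _target ? _replacement : base.VisitParameter(node);
    }
}

public static class LetInliningDemo
{
    // Builds "len > 3", inlines "len = s.Length", and returns "s => s.Length > 3".
    public static Expression<Func<string, bool>> BuildInlinedPredicate()
    {
        ParameterExpression s = Expression.Parameter(typeof(string), "s");
        ParameterExpression len = Expression.Parameter(typeof(int), "len");

        Expression letRightSide = Expression.Property(s, "Length");
        Expression predicate = Expression.GreaterThan(len, Expression.Constant(3));

        Expression inlined = new ParameterInliner(len, letRightSide).Visit(predicate);
        return Expression.Lambda<Func<string, bool>>(inlined, s);
    }
}
```

As noted above, this kind of substitution is only safe as long as the right side is side effect-free, because the inlined expression is evaluated once per use rather than once per element.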

In detail, these were our cleanup tasks:

  • We refactored QueryModel as well as the clause classes (SelectClause, WhereClause, etc.) so that nearly every property now has a public setter (as well as a getter), which means that you can take a QueryModel and simply change its where predicates, select expressions, data source (from) expressions, and so on. With the help of reference expressions (expressions that indicate data stemming from a query data source), this is really simple now.
  • All the collections (e.g. QueryModel.BodyClauses) are now fully modifiable, which means you have the typical Add, Remove, Clear collection API on them.
  • We cleaned up and unified the public APIs of the clauses and of QueryModel.
  • We removed all interdependencies between the clause classes (e.g. IClause.PreviousClause) because they hindered transformability. Previously, it wasn’t easy to move a clause to another query because all the clauses essentially formed a linked list in addition to being held by the QueryModel.BodyClauses collection. This has now changed: you can simply take a clause and put it somewhere else. (But don’t forget to adjust your reference expressions.)
  • The QueryModel.ParentQuery interdependency was also removed.
  • We removed MemberFromClause and SubQueryFromClause; these were remnants from our back-end. All other back-end things showing through (e.g. GetColumnSource) were also removed.
  • We also decided to strip QueryModel.GetExpressionTree – with the advanced transformability we were planning, this would get out of sync with the QueryModel too easily. QueryModel is now completely independent of the original expression tree.
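The effect of these cleanups can be pictured with a deliberately tiny model. The classes below (SketchWhereClause, SketchQueryModel, ClauseMovingDemo) are hypothetical stand-ins, not the real re-linq types, and they assume nothing about the actual member signatures beyond what is described above: settable properties, a modifiable clause collection, and no links between clauses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

// Hypothetical, heavily simplified stand-in for a clause class.
public sealed class SketchWhereClause
{
    // Public setter: the predicate can be replaced in place.
    public LambdaExpression Predicate { get; set; }
}

// Hypothetical, heavily simplified stand-in for a query model.
public sealed class SketchQueryModel
{
    public SketchQueryModel()
    {
        BodyClauses = new List<SketchWhereClause>();
    }

    // The collection is the only place a clause is registered; there are no
    // PreviousClause or ParentQuery links that would need to be patched.
    public IList<SketchWhereClause> BodyClauses { get; private set; }
}

public static class ClauseMovingDemo
{
    // Moving a clause is just a remove followed by an add.
    public static void MoveClause(SketchQueryModel from, SketchQueryModel to, int index)
    {
        SketchWhereClause clause = from.BodyClauses[index];
        from.BodyClauses.RemoveAt(index);
        to.BodyClauses.Add(clause);
    }
}
```

With a linked-list design, the same move would also have to repair the neighbours' links in both queries. In this sketch, the only remaining obligation is the one mentioned above: any expression inside the moved clause that refers to the old query's data sources still has to be adjusted.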

After the cleanup, we pimped up re-linq’s transformation possibilities:

  • We updated IQueryModelVisitor to receive more context for visited clauses. Every visited item now gets its parent object and its index in the parent’s collection passed to the IQueryModelVisitor.Visit… method.
  • We provided a QueryModelVisitorBase class that visits all the collections (such as QueryModel.BodyClauses) automatically; previously, the user had to iterate over the collections manually.
  • We made QueryModelVisitorBase resistant to changes in the collections currently being iterated. For example, while all the body clauses of a query are visited, the visitor can change that body clause collection, adding and removing clauses at will. Clauses inserted after the current clause will be visited later (according to their new position in the collection); clauses inserted before the current clause will not be visited.
  • We added a TransformExpressions method to both QueryModel and all clauses. This will call a transformation object on all expressions held by the QueryModel or clause (including its child objects, transitively), which makes it easy to e.g. replace all reference expressions in a clause when it is moved between QueryModels. (Use ReferenceReplacingExpressionTreeVisitor and ClauseMapping for exactly this scenario.)
  • Based on that, we then implemented a sample transformation that flattens subqueries in from clauses (like in from x in (from y in Ys where y.ID > 10 select y)). I’ll write about this in a separate blog post.
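The change-resistant iteration described above can be approximated with an index that is adjusted whenever the collection reports a change. The following is only a sketch of that idea under my own assumptions, not the actual QueryModelVisitorBase code: ChangeResistantIteration and VisitAll are hypothetical names, the collection is assumed to be an ObservableCollection, and only Add, Remove, and Reset notifications are handled.

```csharp
using System;
using System.Collections.ObjectModel;
using System.Collections.Specialized;

public static class ChangeResistantIteration
{
    // Visits every item, tolerating insertions and removals made by the callback.
    public static void VisitAll<T>(ObservableCollection<T> items, Action<T, int> visit)
    {
        int index = 0;

        NotifyCollectionChangedEventHandler handler = (sender, e) =>
        {
            if (e.Action == NotifyCollectionChangedAction.Add && e.NewStartingIndex <= index)
            {
                // Inserted at or before the current position: the current item
                // moved right, so follow it. The new item is never visited.
                index += e.NewItems.Count;
            }
            else if (e.Action == NotifyCollectionChangedAction.Remove && e.OldStartingIndex <= index)
            {
                // Removed at or before the current position: step back so that the
                // increment after the visit lands on the next unvisited item.
                index -= e.OldItems.Count;
            }
            else if (e.Action == NotifyCollectionChangedAction.Reset)
            {
                // Collection was cleared: restart at the beginning after the increment.
                index = -1;
            }
        };

        items.CollectionChanged += handler;
        try
        {
            while (index < items.Count)
            {
                visit(items[index], index);
                index++;
            }
        }
        finally
        {
            items.CollectionChanged -= handler;
        }
    }
}
```

For example, if the callback inserts one item at position 0 and appends another at the end while the second of three items is being visited, the remaining original item and the appended item are visited, and the item inserted at the front is skipped.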

And then, there were a few other refactorings and simplifications:

  • We’ve updated and completed the documentation comments of those classes an adopter of re-linq would have to use.
  • We’ve added a DefaultQueryProvider so that a user of re-linq only has to derive from QueryableBase and not implement QueryProviderBase any longer.
  • We’ve added a ThrowingExpressionTreeVisitor base class that throws exceptions for every expression type not explicitly handled by a specific subclass. This is usable when implementing a transformation from LINQ expressions to, for example, SQL expressions – you don’t want unknown LINQ expressions to be ignored (as ExpressionTreeVisitor does), you want an exception to be thrown for them.
  • We’ve given each clause a ToString override to help with debugging and outputting of queries.
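The difference between an ignoring and a throwing visitor is easiest to see in a small translation example. The sketch below is not re-linq's ThrowingExpressionTreeVisitor. It builds on the .NET 4 System.Linq.Expressions.ExpressionVisitor instead, and ThrowingVisitorSketch, SqlTextSketch, and Item are hypothetical names chosen for illustration. The base class rejects every node type that the subclass has not declared as supported, and the subclass emits a SQL-like string for a very small set of expressions.

```csharp
using System;
using System.Linq.Expressions;
using System.Text;

// Hypothetical base class: fails loudly for anything the subclass does not support.
public abstract class ThrowingVisitorSketch : ExpressionVisitor
{
    protected abstract bool IsSupported(Expression node);

    public override Expression Visit(Expression node)
    {
        if (node != null && !IsSupported(node))
            throw new NotSupportedException("Unsupported expression type: " + node.NodeType);
        return base.Visit(node);
    }
}

// Hypothetical translator that understands only "member > constant" and "member = constant".
public sealed class SqlTextSketch : ThrowingVisitorSketch
{
    private readonly StringBuilder _sql = new StringBuilder();

    public static string Translate(Expression expression)
    {
        var visitor = new SqlTextSketch();
        visitor.Visit(expression);
        return visitor._sql.ToString();
    }

    protected override bool IsSupported(Expression node)
    {
        switch (node.NodeType)
        {
            case ExpressionType.GreaterThan:
            case ExpressionType.Equal:
            case ExpressionType.Constant:
            case ExpressionType.MemberAccess:
                return true;
            default:
                return false;
        }
    }

    protected override Expression VisitBinary(BinaryExpression node)
    {
        Visit(node.Left);
        _sql.Append(node.NodeType == ExpressionType.Equal ? " = " : " > ");
        Visit(node.Right);
        return node;
    }

    protected override Expression VisitConstant(ConstantExpression node)
    {
        _sql.Append(node.Value);
        return node;
    }

    protected override Expression VisitMember(MemberExpression node)
    {
        // Emits the member name as a column; the owning parameter is not visited.
        _sql.Append("[" + node.Member.Name + "]");
        return node;
    }
}

// Hypothetical entity type used only for the example.
public sealed class Item
{
    public int ID { get; set; }
}
```

Translating the body of `y => y.ID > 10` yields a comparison on the ID column, whereas the body of `y => y.ID + 1 > 10` raises a NotSupportedException for the addition instead of silently producing incomplete output.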

All in all, quite a lot for just two weeks, I’d say. Next on our list are two parsing issues we noticed last week, a redesign of query modifiers (currently called ResultModifications: Distinct, Take, Count, and so on), some other minor refactorings and namespace changes (to more cleanly separate back-end from front-end), GroupClause, and JoinClause. Stay tuned!
