News, examples, tips, ideas and plans.
Thoughts around ORM, .NET and SQL databases.

Tuesday, July 28, 2009

What models do we maintain

Since there are at least 3 visible models, it's necessary to explain why we maintain so many of them ;)

Our model stack consists of the following models:

Xtensive.Storage.Model (+ Xtensive.Storage.Building.Definitions)

That's the top-level model used by storage. In fact, it consists of three parts:
- Definitions model: XxxDef types, e.g. TypeDef
- Runtime model: everything else. E.g. TypeInfo. Exposed via Domain.Model property.
- Serializable version of runtime model: see Xtensive.Storage.Model.Stored namespace.

Definitions are used in the first step of the Domain build process. We reflect all the registered types and build a "model definition". The definitions can be freely added, removed or modified by your own IModules - you just implement the OnDefinitionsBuilt method there. This allows modules to dynamically build or change anything they want - e.g. they can add a property to any of the registered types, or register a companion (associated) type for each one of them. Note that nothing special must be done to make this happen: you should just ensure that the necessary module is added to the set of registered types.

So definitions describe a crude model - they contain just the minimal information needed to build a runtime model.

And as you might assume, the runtime model is built from the definitions gathered on the previous step. It is much more complex - e.g. it fully describes every association and mapping.

In general, it is built to immediately answer any question arising during the Domain runtime.
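The two-step build described above can be pictured with a small sketch. It is not DataObjects.Net code: it is written in Python for brevity, and every name in it (TypeDef, TypeInfo, AuditModule, on_definitions_built, build_model) is a simplified, hypothetical stand-in for the real types and for the OnDefinitionsBuilt hook. It only shows the idea: mutable definitions first, a module hook in between, a precomputed runtime model at the end.

```python
from dataclasses import dataclass, field


@dataclass
class TypeDef:
    """Crude definition: a name and a mutable list of field names."""
    name: str
    fields: list = field(default_factory=list)


@dataclass(frozen=True)
class TypeInfo:
    """Runtime description: read-only, with lookups precomputed."""
    name: str
    fields: tuple
    field_index: dict  # field name -> position


class AuditModule:
    """Hypothetical module that adds a field to every registered type."""

    def on_definitions_built(self, definitions):
        for type_def in definitions.values():
            type_def.fields.append("ModifiedOn")


def build_model(registered_types, modules):
    # Step 1: turn the registered types into mutable definitions.
    definitions = {name: TypeDef(name, list(fields))
                   for name, fields in registered_types.items()}
    # Step 2: let every module add, remove or change definitions.
    for module in modules:
        module.on_definitions_built(definitions)
    # Step 3: build the runtime model from the final definitions.
    return {
        d.name: TypeInfo(d.name, tuple(d.fields),
                         {f: i for i, f in enumerate(d.fields)})
        for d in definitions.values()
    }


model = build_model({"Customer": ["Id", "Name"], "Order": ["Id"]},
                    [AuditModule()])
```

Here the module never touches the runtime model: it edits definitions only, and the builder derives everything else from them.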

Finally, there is an XML-serializable version of the model. It is loaded from and serialized to the database during each schema upgrade. Our schema upgrade layer uses it to properly translate type-level hints related to the old model into schema-level hints, which makes the upgrade process more intelligent. You can find it serialized into one of the rows of the Metadata.Extension table.


Let's call it the schema model. This is a low-level model of storage that we use during the schema comparison and upgrade process. It differs from Storage.Model, because:
- Storage model maintains two-way relationships between type-level and storage-level objects, e.g. between types and tables, properties and columns. But here we need only the part of this model related to storage-level objects.
- Storage model is built to quickly answer common questions. Schema model is designed to handle schema change and comparison well.
- Storage model is more crude. E.g. DO isn't much interested in foreign key details - it should just know that there must be a foreign key. Schema model knows all the details about it.

Schema model is used to:
- Compare extracted and required schema. This process is actually more complex than you might expect - to generate the upgrade actions well, we split the comparison process into a set of steps and compare the models related to them. In fact, we do something like: ExtractedModel -> Step1Model -> Step2Model -> ... -> RequiredModel. That's why we must be able to clone and change it nearly as any SQL server does. Step1Model here may refer to a model with dropped foreign key constraints, Step2Model can be e.g. a model containing temporarily renamed schema objects, and so on. Yes, we can safely handle rename loops like A->B', B->C', C->A' - we detect & break such loops by renaming one of the objects in the loop to a temporary name on an intermediate step ;)
- Serve as the native schema format for index engines. Actually it is quite fast as well, if locked ;)
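To illustrate the rename loop trick mentioned above, here is a minimal sketch in Python. It is not the actual upgrade code: plan_renames, apply_renames and the "__tmp_" prefix are hypothetical, and the sketch assumes that all target names are distinct and collide only with names that are themselves being renamed.

```python
def plan_renames(renames, temp_prefix="__tmp_"):
    """Order renames so that no step targets a name that is still in use."""
    pending = {old: new for old, new in renames.items() if old != new}
    steps = []
    deferred = []  # (temporary name, final name) pairs, applied last
    counter = 0
    while pending:
        # A rename is safe when its target is not waiting to be renamed itself.
        safe = [old for old, new in pending.items() if new not in pending]
        if safe:
            for old in safe:
                steps.append((old, pending.pop(old)))
            continue
        # Only loops are left: park one member under a temporary name.
        old = next(iter(pending))
        temp = f"{temp_prefix}{counter}"
        counter += 1
        steps.append((old, temp))
        deferred.append((temp, pending.pop(old)))
    return steps + deferred


def apply_renames(names, steps):
    """Apply steps to a {name: object} mapping, refusing any collision."""
    current = dict(names)
    for old, new in steps:
        if new in current:
            raise ValueError(f"{new} already exists")
        current[new] = current.pop(old)
    return current


loop_steps = plan_renames({"A": "B", "B": "C", "C": "A"})
result = apply_renames({"A": 1, "B": 2, "C": 3}, loop_steps)
```

A three-object loop needs four steps here: one to park an object under a temporary name, two ordinary renames, and one to move the parked object to its final name.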

Schema models are available via two Domain properties:
- ExtractedSchema
- Schema.


This is the SQL schema model - a schema model of an SQL database in its native form. It differs from the one above - it describes all the SQL schema terms instead of only the part of them we need. For example, you can find View and Partition objects there, although for now we don't have their analogues in the schema model.

This model is used to:
- Produce the extracted schema model. SQL DOM provides an Extractor that extracts it for any supported database; the result of its work is sent to SqlModelConverter (a part of any SQL storage provider) to produce the schema model from it. So this converter is responsible for such decisions as ignoring unsupported SQL schema objects and so on.
- Produce SQL statements (commands). SQL DOM statement model objects, such as SqlAlterTable, refer to the objects of this model.
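The converter's filtering role can be sketched roughly as follows. This is a Python illustration only, not the real SqlModelConverter: SqlObject, SUPPORTED_KINDS and convert are hypothetical names, and the set of supported kinds is an assumption made for the example (the post only says that View and Partition have no analogues in the schema model yet).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SqlObject:
    """One object of the native SQL schema model."""
    kind: str  # e.g. "Table", "View", "Partition"
    name: str


# Assumption for this sketch: kinds that the schema model can represent.
SUPPORTED_KINDS = frozenset({"Table", "Index", "ForeignKey"})


def convert(extracted):
    """Split extracted objects into converted and ignored ones."""
    converted, ignored = [], []
    for obj in extracted:
        target = converted if obj.kind in SUPPORTED_KINDS else ignored
        target.append(obj)
    return converted, ignored


extracted = [
    SqlObject("Table", "Customer"),
    SqlObject("View", "ActiveCustomers"),
    SqlObject("ForeignKey", "FK_Order_Customer"),
    SqlObject("Partition", "P2009"),
]
converted, ignored = convert(extracted)
```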

You can't access this model at runtime, but it is available to any SQL storage provider via its DomainHandler.Mappings member (see the Xtensive.Storage.Providers.Sql namespace - it looks like I excluded this part from the brief version of the API reference). These mappings are used to produce practically any SQL command sent by the provider.


This is our model of the SQL language - the so-called SQL DOM. You might think that objects from this model (except SQL schema model objects) normally have a rather short lifetime, since they represent parts of particular SQL commands. But actually this isn't true:
- We cache almost any SQL request model we build. Cached request models are bound to particular CRUD operations, LINQ and RSE queries. So in general we almost never build a request model twice.
- We even cache pre-translated versions of request parts: relatively long strings from which we combine the final version of a request. This allows us to produce a version of a request with differently named parameters almost instantly.
- Moreover, we support branching in the SQL DOM request model. It is used to produce different versions of a request containing external boolean parameters. As I wrote earlier, this can be quite important: imagine we compiled a request with all the branches. One of these branches may require a table scan in the query plan, and thus the query plan for the whole SQL request will rely on a table scan. This "slow" branch could be a rarely used one (i.e. the condition turning its logic "on" rarely evaluates to true). But the plan will always use a table scan, since the RDBMS produces the most generic version of the query plan. An example of such a query is "Select * from A where @All = 1 or @Id = A.Id". Check out its plan on SQL Server. Then imagine that @All is normally 0. Branching & pre-translated query parts allow us to handle such cases perfectly.
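The idea behind branching and pre-translated parts can be shown with a toy sketch in Python. It does not use the SQL DOM API: BranchingRequest and its members are hypothetical, and the two SQL texts are simply what one might expect for each value of the boolean parameter in the example query. The point is that the boolean is resolved on the client, so the server only ever sees the variant it actually needs to plan.

```python
class BranchingRequest:
    """Hypothetical request whose tail depends on a boolean parameter."""

    def __init__(self, head, when_true, when_false):
        # Pre-translated parts: plain strings with parameter placeholders.
        self.head = head
        self.variants = {True: when_true, False: when_false}
        self._cache = {}

    def sql(self, flag, **param_names):
        """Return the SQL text for one branch with the given parameter names."""
        key = (flag, tuple(sorted(param_names.items())))
        if key not in self._cache:
            tail = self.variants[flag].format(**param_names)
            self._cache[key] = f"{self.head} {tail}".strip()
        return self._cache[key]


request = BranchingRequest(
    head="SELECT * FROM A",
    when_true="",                      # "all rows" branch: no filter at all
    when_false="WHERE A.Id = {id}",    # "single row" branch: filter by key
)

all_rows = request.sql(True)
one_row = request.sql(False, id="@Id")
renamed = request.sql(False, id="@p0")
```

Producing the version with a differently named parameter costs one string substitution here, and repeated calls are served from the cache.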
