News, examples, tips, ideas and plans.
Thoughts around ORM, .NET and SQL databases.

Monday, November 28, 2011

DataObjects.Net 4.3.8 RC 1 & 4.4.1 RC 1 are published

Today DataObjects.Net 4.3.8 RC 1 & DataObjects.Net 4.4.1 RC 1 are published and available for download.

Just as a reminder: the list of fixes, updates and new features is here, the detailed overview of index management-related features is here.

No special actions for migration to the new version are required except updating the references.

We are waiting for your feedback! Write to us to support@x-tensive.com.

Thank you!

Thursday, November 24, 2011

DataObjects.Net 4.5 Beta 4 binaries are updated

DataObjects.Net 4.5 Beta 4 package is updated to the latest revision, incorporating all the fixes and features from branches 4.3 & 4.4. The updated binaries can be downloaded from here.

Monday, November 21, 2011

DataObjects.Net Extensions, part 1

I'm writing this post on behalf of Alexander Ovchinnikov who is the author of DataObjects.Net Extensions project. Earlier Alexander developed LINQPad provider for DataObjects.Net, which is included into standard DataObjects.Net installation package starting from version 4.5, so he is deservedly one of the most productive contributors in DataObjects.Net community.

This project is officially called DataObjects.Net Extensions and is distributed under MIT license. Source code is available on github. The latest version of DataObjects.Net Extensions contains 2 main set of features:
- Batch server-side update and delete operations
- Operation reprocessing

In this post I'll describe the first part, batch server-side update and delete operations.

How to install the extension

1. The extension is distributed in a form of NuGet package, so to use it you should install NuGet Manager if you haven't installed it already. As far as I know, Visual Studio 2010 SP 1 already includes this manager by default.

2. Open the desired solution and in the context menu select "Manage NuGet packages..."


3. Search for "DataObjectsExtensions" package


4. Click "Install", choose projects to add the reference to and click "OK".


5. Check that the package is successfully installed and start using it.

Sample domain model

This model will be used in code samples.
[HierarchyRoot]
    public class Bar : Entity {

        public Bar(Session session) : base(session){}

        [Field, Key]
        public int Id { get; set; }

        [Field]
        public string Name { get; set; }

        [Field]
        public int Count { get; set; }

        [Field]
        public string Description { get; set; }
    }

    [HierarchyRoot]
    [Index("Name", Unique = true)]
    public class Foo : Entity{

        public Foo(Session session) : base(session){}

        public Foo(Session session, int id) : base(session, id){}

        [Field, Key]
        public int Id { get; set; }

        [Field]
        public string Name { get; set; }

        [Field]
        [Association("Foo")]
        public Bar Bar { get; set; }
    }

Batch server-side operations

Why they are important?

Well, Captain Obvious to the rescue, sometimes there could be a scenario when a set of records needs to be updated without fetching them on client. There might be various reasons for that such as performance issues, limitations of business logic, etc. In such cases users have to utilize old good plain SQL to execute "UPDATE" or "DELETE" commands via underlying ADO.NET providers. But while this is not always convenient, this also could lead to potentially erroneous results because as soon as domain model changes, it becomes out of sync with these SQL commands.

DataObjects.Net Extensions sorts out the problem providing a set of IQueryable extension methods that are translated to the desired UPDATE or DELETE commands. For instance:
Query.All<Bar>()
  .Where(a => a.Id == 1)
  .Set(a => a.Count, 2)
  .Set(a => a.Name, a => a.Name + "suffix")
  .Update();
Listed below are some scenarios which you might encounter while using the package. Every case is followed by the corresponding translation to SQL.

Updating persistent property with constant value

Query.All<Bar>()
  .Where(a => a.Id == 1)
  .Set(a => a.Count, 2)
  .Update();
is translated to:
DECLARE @p0 Int SET @p0 = 2
UPDATE [dbo].[Bar]
SET [Count] = @p0
FROM [dbo].[Bar] AS j0 INNER JOIN (
SELECT
  [a].[Id]
FROM
  [dbo].[Bar] [a]
WHERE
  ([a].[Id] = 1)
) AS j1 ON (j0.[Id] = j1.[Id])

Updating persistent property with expression, computed on server

Query.All<Bar>()
  .Where(a => a.Id==1)
  .Set(a => a.Count, a => a.Description.Length + a.Count * 2)
  .Update();
is translated to:
UPDATE [dbo].[Bar]
SET [Count] = ((DATALENGTH([Description]) / 2) + ([Count] * 2))
FROM [dbo].[Bar] AS j0 INNER JOIN (
SELECT
  [a].[Id]
FROM
  [dbo].[Bar] [a]
WHERE
  ([a].[Id] = 1)
) AS j1 ON (j0.[Id] = j1.[Id])

Setting a reference to an entity that is already loaded into Session

// Emulating entity loading
var bar = Query.Single<Bar>(1);

Query.All<Foo>()
  .Where(a => a.Id == 2)
  .Set(a => a.Bar, bar)
  .Update();
is translated to:
DECLARE @p0 Int SET @p0 = 1
UPDATE [dbo].[Foo]
SET [Bar.Id] = @p0
FROM [dbo].[Foo] AS j0 INNER JOIN (
SELECT
  [a].[Id]
FROM
  [dbo].[Foo] [a]
WHERE
  ([a].[Id] = 2)
) AS j1 ON (j0.[Id] = j1.[Id])

Setting a reference to an entity that is not loaded into Session, 1st way

Query.All<Foo>()
  .Where(a => a.Id == 1)
  .Set(a => a.Bar, a => Query.Single<Bar>(1))
  .Update();
In this case DataObjectsExtensions are smart enough to extract the value of key from Query.Single method argument and use that value in command translation:
DECLARE @p0 Int SET @p0 = 1
UPDATE [dbo].[Foo]
SET [Bar.Id] = @p0
FROM [dbo].[Foo] AS j0 INNER JOIN (
SELECT
  [a].[Id]
FROM
  [dbo].[Foo] [a]
WHERE
  ([a].[Id] = 1)
) AS j1 ON (j0.[Id] = j1.[Id])
All overloads of Query.Single, Query.SingleOrDefault methods are also supported.

Setting a reference to an entity that is not loaded into Session, 2nd way

Query.All<Foo>()
  .Where(a => a.Id == 1)
  .Set(a => a.Bar, a => Query.All<Bar>().Single(b => b.Name == "test"))
  .Update();
is translated to:
UPDATE [dbo].[Foo]
SET [Bar.Id] = (SELECT TOP 1 [a].[Id] FROM [dbo].[Bar] [a] WHERE ([a].[Name] = N'test'))
FROM [dbo].[Foo] AS j0 INNER JOIN (
SELECT
  [a].[Id]
FROM
  [dbo].[Foo] [a]
WHERE
  ([a].[Id] = 1)
) AS j1 ON (j0.[Id] = j1.[Id])
All overloads of Queryable.Single, Queryable.SingleOrDefault, Queryable.First, Queryable.FirstOrDefault methods are supported.
Note, this way always leads to a subquery, so if key of referenced entity is known, it is strongly recommended to use the 1st way.

Constructing update expressions of the fly

Set extension method can be used in scenarios when the update expression is constructed in runtime, for example:
bool condition = CheckCondition();
var query = Query.All()<Bar>
  .Where(a => a.Id == 1)
  .Set(a => a.Count, 2);

if(condition)
  query = query.Set(a => a.Name, a => a.Name + "test");
query.Update();

Updating lots of properties at once

In a scenario when many properties must be updated at once, the alternative version of update extension syntax might be more convenient to use. The main method is called Update instead of Set and it takes a constructor with object initializer as argument:
Query.All<Bar>()
  .Where(a => a.Id == 1)
  .Update(a => new Bar(null) { Count = 2, Name = a.Name + "test", dozens of other properties... });
Note, while you have to populate the constructor with arguments because compiler requires that, it will never be called, only the object initializer will be processed. So it is safe you set fake arguments to the constructor. In all other ways these 2 kinds of syntax are equal, just choose the one that is convenient for you.

Deleting entities

Delete syntax is pretty straightforward, no new rocket science is here. It is almost the same as Update syntax, but use Delete instead.
Query.All<Foo>()
  .Where(a => a.Id == 1)
  .Delete();
is translated to:
DELETE [dbo].[Foo]
FROM [dbo].[Foo] AS j0 INNER JOIN (
SELECT
  [a].[Id]
FROM
  [dbo].[Foo] [a]
WHERE
  ([a].[Id] = 1)
) AS j1 ON (j0.[Id] = j1.[Id])

Known limitations

1. This version supports Microsoft SQL Server only.
2. When updating a reference field, you can't refer to the entity that is being updated, for instance:
Query.All<Foo>()
  .Where(a => a.Id == 1)
  .Set(a => a.Bar, a => Query.All<Bar>().FirstOrDefault(b => b.Name == a.Name))
  .Update();
3. Session.SaveChanges() method is executed before any batch server-side operation. This might lead to validation of entities, depending on validation configuration.
4. Every batch server-side operation cleans Session cache to provide data consistency.
5. Batch server-side operations don't support compiled LINQ queries.

Credits

All credits go to Alexander Ovchinnikov. The extension is a must-have, I'm absolutely sure that it will be very popular among DataObjects.Net community.

Thanks, Alexander!

P.S.
The second part is coming soon. Stay tuned.

Friday, November 18, 2011

Precise control over indexes

This is a detailed description of the index-related features from the upcoming release mentioned in the previous post.

Database indexes were invented to improve the speed of data retrieval operations on a database table. But as always, they have the cost: slower writes and increased storage space. So database architects have to keep the balance between faster data retrieval and slower writes. Ideally, indexes should be applied only on columns that are used in filtering, ordering, etc.

1. Control over indexes on reference fields

Versions of DataObjects.Net prior to the upcoming one were pretty straightforward and applied indexes on every reference property, for example, in this model index would be placed on Owner property automatically.

public class Penguin : Entity {

  [Field]
  public Person Owner { get; set; }
}

DataObjects.Net behaved so in assumption that such indexes would boost performance in queries like:
var PoppersPenguins = session.Query.All<Penguin>().Where(a => a.Owner == MrPopper);

But what if the application doesn't have such queries? And the index that is created and maintained by the database is absolutely useless, moreover, it makes the performance worse? The upcoming version of DataObjects.Net has the answers. Here is what you can do to prevent automatic index creation on a reference field:

public class Penguin : Entity {

  [Field(Indexed = false)]
  public Person Owner { get; set; }
}

2. Control over index clustering

There are 2 main approaches in physical organization of data tables: unordered and ordered. The first one means that the records are stored in the order they are inserted, the second implies that the records are pre-sorted before storing. Both of them have advantages and disadvantages, depending on structure of primary key, usage scenarios, etc. Some database servers supports both cases, others - only one. Microsoft SQL Server supports both: unordered approach is a "heap table (file)" and ordered one is based on "clustered index" when data rows are physically sorted according to position of primary key in internal index structures (B+ tree).

For Microsoft SQL Server DataObjects.Net always chose "clustered index" as an organization of physical storage because this is quite efficient in most cases with only few exceptions: when your primary key is not integer auto-incremented data type but uniqueidentifier or char/varchar, clustered indexes become less efficient on data inserts. Until now there was no way to say that there is no necessity to make primary index clustered, but that has changed: DataObjects.Net provides easy way to control clustering of indexes. Here is how:

// Penguin table should not be clustered
[HierarchyRoot(Clustered = false)]
public class Penguin : Entity {

  [Field]
  public Person Owner { get; set; }
}

Note, removing clustering from hierarchy root you remove it for all its descendants.

Additional feature is the ability to define an index that is not primary and then use it for clustering, like this:

// Person table should be clustered by Name column
[HierarchyRoot]
[Index("Name", Clustered = true)]
public class Person : Entity {

  [Field]
  public string Name { get; set; }
}

This can be done for each class separately except the case when SingleTable inheritance scheme is used. Another obvious restriction is that there can be only one clustered index per table.

3. Support for partial (filtered) indexes

A partial index, also known as filtered index is an index which has some condition applied to it so that it includes a subset of rows in the table. This allows the index to remain small, even though the table may be rather large, and have extreme selectivity.

Our goal was not only to add the support, but to make it easy to use and prevent users from the necessity to write database server-dependent expressions in index definitions. Therefore, we decided to implement the following technique that gives us compile-time validation, type-safety and the power of standard .NET expressions:

[HierarchyRoot
[Index("Owner", Filter = "OwnerIndex")]
public class Penguin : Entity {

    public static Expression<Func<Penguin, bool>> OwnerIndex()
    {
      return p => p.Owner != null;
    }

    [Field]
    public Person Owner { get; set; } 
}

Note that depending on inheritance scheme applied, partial indexes can or can't contain expressions with columns defined in ancestors/descendants.

In case you want to move filter expression to a separate class, you should use the following construct:

[HierarchyRoot]
[Index("Owner", Filter = "PenguinOwnerIndex", FilterType = typeof(FilterExpressions))]
public class Penguin : Entity {

    [Field]
    public Person Owner { get; set; } 
}

public static class FilterExpressions {

    public static Expression<Func<Penguin, bool>> PenguinOwnerIndex()
    {
      return p => p.Owner != null;
    }
}

Partial indexes are supported in Microsoft SQL Server and PostgreSQL providers.

In the next post I'll describe other features of the upcoming release, stay tuned. In the meantime, we are preparing the new builds, installers and stuff.

Monday, November 14, 2011

Upcoming major update

Hello everybody,

Despite the fact that there was no news in this blog from DataObjects.Net team for a couple of months, this doesn't mean that we were having rest or watching endless TV-series. We've been fixing bugs and inventing some really cool features for our customers. Let's take a closer look at the details.

First of all, bugs fixed:
Updates to the existing functionality:
The most interesting part, features:
Simply put, the features are so dramatic that we are releasing the stuff as a major version update. DataObjects.Net 4.3.7 branch will become 4.3.8 and 4.4.0 will be known as 4.4.1 (one of our lead developers call this update as DataObjects.Net R2 and I understand him pretty much). 

While the above mentioned changes are being thoroughly tested under various environments and being prepared for the release, we'll describe the features listed above in a dedicated series of posts.

Thank you! Stay tuned.