News, examples, tips, ideas and plans.
Thoughts around ORM, .NET and SQL databases.

Thursday, September 02, 2010

DataObjects.Net-based application example: "Single Window" system for SeverRegionGaz (Gazprom subdivision)

This post actually starts a sequence of posts dedicated to DataObjects.Net design - more precisely, to its design goals. I decided to open the cycle with a practical example, since a picture is frequently worth more than a thousand words. A certain amount of advertisement for our outsourcing team is just a side effect of this post, although if you have a serious project for these guys, you're welcome.

SeverRegionGaz is a regional natural gas provider. It is a big organization that, although being a subdivision of Gazprom itself, has a set of branches of its own. There is a bunch of legacy software systems there, varying from pretty old to new. Some of them have very similar functionality: earlier, SeverRegionGaz branches were independent from each other, and thus they use partially equivalent software.

Most of the data they maintain is related to:
  • Billing - obviously.
  • Equipment. E.g. they know exactly what's installed at each particular location (home, office, etc.).
  • Incidents and customer interaction history.
  • Bookkeeping. Unfortunately for us, they wanted to see certain information from two different instances of 1C Enterprise 7.5 there as well.
The main goal of the "Single Window" system is to provide a single access point allowing users to browse all this data and, importantly, to search for any piece of information there.

Let me illustrate the importance of this goal: earlier, to find a necessary piece of information (e.g. billing and interaction history for a particular person), they had to identify one or a few of these legacy systems first, and then request the necessary information from the appropriate people (almost no one knows all these systems precisely). In short, getting the information was a really complex and long process -- and we were happy to change this.

There were a few other, minor goals - e.g. it was necessary to:
  • Provide a web interface allowing customers to report the values of natural gas consumption counters and interact with support staff.
  • Integrate with the external payment processing system provided by their bank and automatically process the payments made by customers.
  • Implement reporting. Only a part of the reports needed by SeverRegionGaz was available in the legacy systems, so we had to implement the missing ones.
To sum this up: we had to develop an application allowing users to browse a really huge database (I'll explain this later). Its editing capabilities could be pretty limited - mainly because:
  • Most of this information must be imported from external sources. If we'd been asked to support editing, it would bring the complexity to a completely different level. In fact, we would either have to be capable of syncing back all the changes (that's really hard, taking into account that none of the legacy systems is ready for syncing, and almost no one knows exactly how these systems work at all), or, alternatively, replace all the legacy systems there with our own (hard as well).
  • The fact that different systems are still necessary for editing (mainly, data entry) is acceptable for SeverRegionGaz. The people working with them are used to them; a single person there normally deals with a single system. With "Single Window" in place, they need to study just one more system to be capable of accessing all the data - that's much better than ~10.
  • Full editing support (likely, a complete replacement of most of the legacy systems) is the primary goal for "Single Window v2" - and the idea of moving toward this big goal step by step is really good. For now it's OK if we allow editing only the data we fully control.
That was the story behind the "Single Window" project. Now some facts about its implementation:
  • Time: 7 months -- February 2010 ... July 2010. The first 1.5 months were spent almost entirely on specifications.
  • People: initially -- 3.5 developers + 1.5 managers (Alex Ustinov was playing both roles there); closer to completion -- 6 developers.
  • Database: ~12 GB of data, 507 tables, 440 types! (all unique, i.e. there are no generic-based tables)
  • External data sources: 8; full data import is implemented for all of them; in addition, continuous change migration is implemented for 3 of them.
  • Other elements: hundreds of lists, forms and reports. I suspect almost one thousand in total.
  • Complexities: lots of them, but mostly related to ETL processes, ranging from some funny ones to real problems.
  • Used technologies:
    - DataObjects.Net 4 - btw, it's our first really big application based on it. And that's why I'm writing this post :)
    - LiveUI - it would be really hard to generate that huge UI without this framework. Btw, Alex Ilyin adapted its core part to WPF pretty fast. We also intensively used T4 templates there.
    - WCF - we've implemented a 3-tier architecture relying on WCF as the communication layer.
    - WPF - an obvious choice for the UI,
    and lots of other stuff, ending with the pretty exotic COM+ (used for integration with 1C).

The main tab is designed in a very minimalistic fashion:

That's what happens when the user hits the "Search" button:

As you can see, we use the full-text capabilities of DO4 at full power here. In fact, we index all objects whose content is interesting from the point of view of search. Full-text indexing here is implemented in nearly the same way as it was in v3.9 - i.e. there is a single special type for a full-text document, per-type full-text content extractors running continuously in the background, and so on.
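To make the idea of "a single full-text document type plus per-type content extractors" more concrete, here is a rough, hypothetical sketch. This is not the actual DO4 API - all type and member names below are made up for illustration, and the real implementation runs extractors continuously in the background over changed entities, while this sketch just indexes a batch in one pass:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch: one shared document type plus per-type extractors.
// Names are illustrative, not the real DataObjects.Net API.

class FullTextDocument
{
    public object Source;   // the indexed entity
    public string Content;  // its searchable plain-text representation
}

interface IContentExtractor
{
    bool CanHandle(object entity);
    string ExtractText(object entity);
}

class Customer
{
    public string Name;
    public string Address;
}

class CustomerExtractor : IContentExtractor
{
    public bool CanHandle(object entity) { return entity is Customer; }
    public string ExtractText(object entity)
    {
        var c = (Customer)entity;
        return c.Name + " " + c.Address; // concatenate the searchable fields
    }
}

static class Indexer
{
    // Builds a full-text document for every entity some extractor can handle.
    public static List<FullTextDocument> Index(
        IEnumerable<object> entities, IList<IContentExtractor> extractors)
    {
        var docs = new List<FullTextDocument>();
        foreach (var entity in entities)
        {
            var extractor = extractors.FirstOrDefault(x => x.CanHandle(entity));
            if (extractor != null)
                docs.Add(new FullTextDocument { Source = entity, Content = extractor.ExtractText(entity) });
        }
        return docs;
    }
}

class Program
{
    static void Main()
    {
        var docs = Indexer.Index(
            new object[] { new Customer { Name = "Ivanov", Address = "Main St. 1" }, 42 },
            new IContentExtractor[] { new CustomerExtractor() });
        Console.WriteLine(docs.Count);       // 1 - the int has no extractor
        Console.WriteLine(docs[0].Content);  // Ivanov Main St. 1
    }
}
```

The resulting documents are then fed to the database's full-text engine, which is what makes the single "Search" box above work across all entity types.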

Here is a typical list (take a look at the grid settings and the search box):

Some forms:


The list of actions in the left panel is actually pretty long - i.e. you see maybe 30% of it. There is no scrollbar, but you may notice "Up" and "Down" arrows indicating that the list will automatically scroll when the mouse pointer approaches its top or bottom edge.

Integration services control list:

Web site, customer's home page:

Pages for registering natural gas consumption counter values and interacting with support staff:

And finally, a single screenshot exposing the complexity of the domain model:

As you can see, our team did a huge job, and it is directly related to DataObjects.Net.

The main point of this post is: DataObjects.Net is designed to develop really complex business applications fast. Why? Well, that's the topic for my subsequent posts, but for now I'd like to touch on the key points:
  • The code-only approach allows developers to work fully independently, without caring about schema changes at all - even in different branches.
  • Integrated schema upgrade capabilities are ideal for unit testing (and testing in general). It's easy to launch unit tests for any part of your application.
  • The rich event and interaction model simplifies development of shared logic, such as full-text indexing and change tracking.
  • Excellent LINQ support brings significant advantages when you start writing reports. Queries there might require a really good translator.
  • Can you imagine dealing with 500+ types in EF? Actually, I just tried to find some reports about this on the web, but found only "avoid this" statements, with tons of reasons. The funniest one is about the performance and usability of IntelliSense when you type "dataContext.".
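To give a flavor of the report-style queries meant above, here is a small stand-in example using plain LINQ to Objects. All names are made up; the point is that in an ORM like DO4, this very same expression tree has to be handed to the query translator and turned into efficient SQL:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in example: the kind of grouping/aggregation query reports need.
// Here it runs over in-memory objects; an ORM's LINQ translator must turn
// the same expression into a single SQL statement.

class Payment
{
    public string Branch;
    public int Year;
    public decimal Amount;
}

static class Reports
{
    // Total payments per branch for a given year, largest first.
    public static List<KeyValuePair<string, decimal>> TotalsByBranch(
        IEnumerable<Payment> payments, int year)
    {
        return payments
            .Where(p => p.Year == year)
            .GroupBy(p => p.Branch)
            .Select(g => new KeyValuePair<string, decimal>(g.Key, g.Sum(p => p.Amount)))
            .OrderByDescending(kv => kv.Value)
            .ToList();
    }
}

class Program
{
    static void Main()
    {
        var payments = new[]
        {
            new Payment { Branch = "North", Year = 2010, Amount = 100m },
            new Payment { Branch = "South", Year = 2010, Amount = 250m },
            new Payment { Branch = "North", Year = 2010, Amount = 50m },
            new Payment { Branch = "North", Year = 2009, Amount = 999m }, // filtered out by year
        };
        foreach (var kv in Reports.TotalsByBranch(payments, 2010))
            Console.WriteLine(kv.Key + ": " + kv.Value);
    }
}
```

Real reports combine several such groupings with joins and subqueries, which is exactly where the quality of the LINQ translator starts to matter.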
To be frank, of course I got lots of issue reports from our "Single Window" team during these months, and actually, I still get them. Mostly they were related to LINQ and schema upgrade. So if you use DO, you can say "thanks" to Alex Ilyin - he simply tortured me and Alexis Kochetov, and still does. E.g., mainly because of him:
  • DO4 translates really complex LINQ expressions involving DTOs (custom types and anonymous types).
  • Schema upgrade works really well and fast now. Domain.Build(...) performance is 2-3 times higher there (Alex hates waiting for launch). E.g. now our 440-type Domain requires ~4 seconds to be built in Skip mode on my moderate office PC (Core 2 Duo). To achieve this, I had to parallelize some stages of the build process.
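A minimal sketch of what "LINQ expressions involving DTOs" means in practice. Again, the names here are purely illustrative and the query runs over in-memory objects; in DO4 such projections into custom DTO types and anonymous types are what the translator has to turn into SQL:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative sketch: projecting query results into a custom DTO type
// and into an anonymous type. Translating such projections to SQL is the
// hard part an ORM's LINQ provider has to handle.

class Customer { public int Id; public string Name; }
class Account  { public int CustomerId; public decimal Balance; }

class CustomerDto
{
    public string Name;
    public decimal Balance;
}

static class Queries
{
    // Join two sources and shape the result as a DTO list.
    public static List<CustomerDto> CustomersWithBalance(
        IEnumerable<Customer> customers, IEnumerable<Account> accounts)
    {
        return (from c in customers
                join a in accounts on c.Id equals a.CustomerId
                select new CustomerDto { Name = c.Name, Balance = a.Balance })
               .ToList();
    }
}

class Program
{
    static void Main()
    {
        var customers = new[] { new Customer { Id = 1, Name = "Ivanov" } };
        var accounts  = new[] { new Account { CustomerId = 1, Balance = 12.5m } };

        foreach (var dto in Queries.CustomersWithBalance(customers, accounts))
            Console.WriteLine(dto.Name + " " + dto.Balance);

        // The same data shaped through an anonymous type:
        var pairs = customers.Select(c => new { c.Id, Upper = c.Name.ToUpper() });
        foreach (var p in pairs)
            Console.WriteLine(p.Id + " " + p.Upper);
    }
}
```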
Let me finish with one more screenshot:

Download DataObjects.Net v4.3.5 build 5887 (just published, not yet announced!) and test the newest build of our framework on your own. Btw, IMO we've fixed all known bugs related to schema upgrade in this version.

P.S. I'm really interested in examples of complex applications relying on popular ORM tools. If you know some, please share a link. I got the impression that people still avoid using ORM tools in such cases (there are opinions like "ORM is not for the enterprise!"). So I'd like to "measure the length" in terms of model complexity (count of tables, types, etc.) - just for fun, of course. You know we like to measure various features of ORM tools.


  1. Nice post, btw I'm curious how Alex Ilyin adapted LiveUI to WPF - can he disclose some details on this task?

  2. This comment has been removed by the author.

  3. I think he can, but since the LiveUI project isn't well adapted for public use, it won't change much for us. LiveUI is a good tool for in-company use (at least currently); the process of turning it into a tool available for public use is, in fact, frozen.

    I can only add that this adaptation doesn't imply the possibility of easy migration of your application from WPF to ASP.NET. We used LiveUI mainly as a composition framework there.

  4. And another thing: how did you transfer data between client <> server through the WCF channel? Do you have some internal "remote://" protocol on the DO4 side? If yes, I'm interested in having it too :-)

  5. In most cases they were exchanging DTOs - there were 4 main reasons for this:

    1) Most of the data must be accessed in a read-only fashion.

    2) DisconnectedState wasn't that good at the moment they started.

    3) Alex Ilyin almost instantly developed a special "virtual query" module allowing one to write client-side queries involving DTOs and execute them. When executed, they are submitted to the server, where the actual execution takes place, and the results are transparently sent back to the client.

    4) Alex Ilyin is a big fan of DTOs. In addition, he's the architect of this system :) Actually, it's simply impossible to make him do something he doesn't want to do. I also think it's a bad idea to limit developers in their decisions, even if I don't like them. Of course, I tried to insist they must rely on DS, but they made their own decision. And since I saw it wasn't completely crazy from the point of view of complexity, I didn't object.

  6. Btw, in their place, I'd implement a hybrid 2.5-tier architecture:
    - All data read operations on the client would hit the DB directly. The easiest way to implement this is to disable any changes for the DB user the client runs under.
    - All the changes must flow through the application server. It's really easy with DisconnectedState.OperationLog.

    In fact, this workaround brings identical results to the "remote://" protocol we're thinking about.

    Developing a 3-tier architecture is _always_ much more complex than a 2-tier one, so I'd prefer lower complexity in this case.

  7. Alex, is your goal when implementing the "remote://" protocol to reuse some part of the existing DO4 framework, e.g. DisconnectedState.OperationLog, or something else?

  8. I would like to explain why we did not use DisconnectedState and used DTOs instead; it has nothing to do with my personal preference.

    1. Risks. I have no practical experience with using DS in big applications, so there could be problems with using it. Even if these problems are solvable, they eat time.

    2. A good user interface always works with data asynchronously. DS does not support this.

    3. Start time. Even if the Domain takes just 7 seconds to build, it is too long for client applications.

    4. Business logic and external services tend to change more frequently than the user interface. So a 3-tier architecture helps with updating the application.

    5. Simplicity. It is much easier to build the application this way, because DTOs can be optimized for specific forms and binding; Entities cannot.

    Really, there is a whole lot of other reasons, but I think any of these 5 is enough. :)

  9. I missed Alex Ilyin's comment; here are my answers:

    1 - Absolutely agree. When they started, DS was quite new and untested feature.

    2 - That's wrong. You can run the query (or request the data from a service) asynchronously, but merge the result into DS+Session synchronously in the UI thread (that's quite fast even for very large result sets).

    3 - IMO, that's absolutely acceptable, especially for such a large application. Even Picasa takes ~10 seconds to start for the first time, not to mention e.g. Microsoft Word. And if this were a real problem (currently none of the customers is really worried about this), we'd decrease that time further.

    Also, you didn't test the time you need to build the Domain with the memory provider, which should be used in such a case (a 3-tier app with DS).

    4 - Partially agree. Or, better to say, in many cases it's possible to allow an old client to deal with a newer server. On the other hand, I doubt it's possible (and really necessary) in this case.

    Do you really use this mode now (i.e. different client & server versions)?

    IMO, detection of the correct client version & its automatic update is a minor problem (esp. in this case) that needs a few days to be resolved.

    5 - "DTO simplicity" here was a good point mainly because most of the data the client-side code deals with is used in a read-only fashion. Otherwise DTOs won't suffice (INotifyXxx, save/cancel, binding to a view model and all similar problems must be solved on your own here).

    Btw, what's good is that we consider and use the whole set of options even inside our own company. Initial disappointment in some feature (DS in this case) makes us improve it, + provide stuff solving the issues a team faces while going another way (i.e. with DTOs).