Thursday, January 27, 2011

What is CQRS? My experience with implementing it. Part 1

A short post about CQRS and my experience with it.

One of the more interesting design patterns I came across while learning Java, way back, was the command pattern. Sure, this pattern has been around a long time and has been well defined and discussed. But the first time I implemented it was in Java.

The pattern itself is conceptually easy. There is a command handler that handles any command submitted to it. The command typically implements an interface that the handler executes. Because every command passes through the handler, one can introduce pre- and post-execution methods, an execute-around method similar to an aspect, and so on.
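To make that concrete, here is a minimal sketch of a handler with pre- and post-execution hooks. The names (`Command`, `GreetCommand`, `CommandHandler`) are made up for illustration and are not from any particular framework; the hooks just record log entries so the wrapping is visible.

```java
import java.util.ArrayList;
import java.util.List;

public class CommandPatternSketch {

    /** A command is anything the handler can execute. */
    interface Command {
        String execute();
    }

    /** Hypothetical command, used only for illustration. */
    static class GreetCommand implements Command {
        private final String name;

        GreetCommand(String name) {
            this.name = name;
        }

        @Override
        public String execute() {
            return "Hello, " + name;
        }
    }

    /** Wraps every submitted command with a pre- and a post-execution step. */
    static class CommandHandler {
        final List<String> log = new ArrayList<>();

        String handle(Command command) {
            String commandName = command.getClass().getSimpleName();
            log.add("before " + commandName);   // pre-execution hook
            String result = command.execute();
            log.add("after " + commandName);    // post-execution hook
            return result;
        }
    }

    public static void main(String[] args) {
        CommandHandler handler = new CommandHandler();
        System.out.println(handler.handle(new GreetCommand("CQRS")));
        System.out.println(handler.log);
    }
}
```

The handler never needs to know what a command does; anything that should apply to every command (logging, auditing, transactions) goes around the `execute()` call.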

While I had used this pattern as one component of an application's architecture, it was almost never the basis of the architecture on any project I worked on. That changed recently, when I had the opportunity to implement a CQRS-style framework for a greenfield project.

What is CQRS?
It stands for Command Query Responsibility Segregation. The concept, as the name (sort of) explains, is to separate the concern of a system being commanded to do something from the concern of a system being queried for something. Wait! Isn't that pretty much obvious and a given in any system? When I command a system to save some data I use one syntax, and when I query it I use a different syntax. Well, not so fast. This is not about syntax. Rather, it is about the semantics of how one saves and queries, the architecture that supports these operations, and how they should be separated for better performance and maintainability.

So, let's look at a typical architecture of a web application. We have the three tiers (perhaps more but let's keep it simple here).

The UI/Client layer interacts with the user and asks the business layer to respond to the user's actions - to save data, to show data, to perform a calculation and show results, to collect data from separate sources and show as a dashboard/portal etc.

The business layer accepts a request, acts upon it by persisting, querying or calculating, and returns the result to the UI/Client.
A few issues I have with this approach:
1. Traditionally, the business layer provides an interface for each of these actions. For instance, there is usually a handler (action/controller etc) for Create, one for Update, one for Query and so on. Every new kind of request then forces a change to the architecture, which should not be necessary.
2. This also makes the UI very CRUD based. If one wants to update, say, a user profile to change the address, one opens the entire profile screen with all editable fields, updates the address line and submits. Lost in this update (unless there is a lot of scripting or code involved) is the fact that the user only wanted to update the address as opposed to the entire record.
3. Of course, some architectures combine this into one action that performs the appropriate task based on a parameter in the incoming request. But this is not preferred since the one handler can get large and messy. Needless to say, a simple requirement change makes it susceptible to dangerous side effects.
4. An architecture such as this typically serves one kind of client (a UI or a web service), so supporting another client leads to code duplication.
5. The ORM becomes the mechanism for both persistence and querying. Anyone who has dealt with complex database structures and/or large datasets to be queried will know that the ORM can typically be optimized to persist or to read but not both, especially if the reads bring back large datasets coupled with complex object maps.

How does CQRS help overcome these issues?
  • CQRS separates the read and write concerns while opening a whole new architectural avenue to implement apps. Basically, every operation becomes a command and a command handler will inspect the command to determine the action to take.
  • This makes the command and command handler a standard interface of the application to the entire client base. UI, Web Services, JMS Messages etc can all now interact with the business layer through the same interface.
  • The command can implement business logic and hence this will be reused and standardized across the application. Required fields, sizes, formats etc can be validated.
  • The command handler can do pre- and post-processing to send notifications, write audits, initiate events based on data updates and deletes, etc.
  • CQRS allows creating a task-based UI where a specific task can be targeted. In the earlier example of a user updating an address, if the UI allows the user to change just one field, and this operation becomes a command called update zip code, then the command handler knows exactly what changes, and the audit trail is granular as well. Any business logic that depends on zip code can be implemented cleanly, without a gazillion lines of code to work out what changed before taking action.
  • In this case, the command carries all that is needed to be done, and that command is audited. If partner systems need to know that the zip code changed, an event/message can be sent out for interested parties to listen on.
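As a rough sketch of the update zip code example above, the Java below shows a command that validates itself and a handler that applies it, audits it and records an event for interested parties. All names are hypothetical, the in-memory map and lists stand in for a real database, audit log and message queue, and the 5-digit zip code rule is an assumption made just for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TaskBasedCommandSketch {

    interface Command {
        /** Business rules travel with the command. */
        void validate();
    }

    /** One task, one command: it carries only what this task needs. */
    static final class UpdateZipCodeCommand implements Command {
        final String userId;
        final String newZipCode;

        UpdateZipCodeCommand(String userId, String newZipCode) {
            this.userId = userId;
            this.newZipCode = newZipCode;
        }

        @Override
        public void validate() {
            // Assumed format for illustration: exactly five digits.
            if (newZipCode == null || !newZipCode.matches("\\d{5}")) {
                throw new IllegalArgumentException("zip code must be 5 digits");
            }
        }
    }

    static final class CommandHandler {
        final Map<String, String> zipCodesByUser = new HashMap<>(); // stand-in for the database
        final List<String> auditTrail = new ArrayList<>();          // stand-in for an audit log
        final List<String> publishedEvents = new ArrayList<>();     // stand-in for a queue/topic

        void handle(Command command) {
            command.validate(); // invalid commands are rejected before anything changes
            if (command instanceof UpdateZipCodeCommand) {
                UpdateZipCodeCommand c = (UpdateZipCodeCommand) command;
                zipCodesByUser.put(c.userId, c.newZipCode);
                auditTrail.add("user " + c.userId + " changed zip code to " + c.newZipCode);
                publishedEvents.add("ZipCodeChanged:" + c.userId);
            } else {
                throw new UnsupportedOperationException(
                        "no handling for " + command.getClass().getSimpleName());
            }
        }
    }

    public static void main(String[] args) {
        CommandHandler handler = new CommandHandler();
        handler.handle(new UpdateZipCodeCommand("u1", "94105"));
        System.out.println(handler.auditTrail);
        System.out.println(handler.publishedEvents);
    }
}
```

Because the command names the task, the handler does not have to diff an entire profile record to find out that only the zip code changed.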
An extension of this pattern is to complement it with event sourcing. A command can publish an event to a queue/topic as it updates the OLTP database. The write to the queue and the write to the database can be wrapped in a distributed transaction. A queue listener can then push the event out to the datamart and/or other interested applications in the enterprise.
One can even implement the command itself as an event that is consumed by the handler and perhaps updates the OLTP database and the datamart.
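Here is a small sketch of that flow, under heavy simplifying assumptions: an in-memory `Queue` stands in for a real queue/topic, plain maps stand in for the OLTP database and the datamart, and there is no transaction at all, let alone a distributed one. The class and field names are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

public class EventPublishingSketch {

    /** Event describing what happened on the write side. */
    static final class ZipCodeChanged {
        final String userId;
        final String zipCode;

        ZipCodeChanged(String userId, String zipCode) {
            this.userId = userId;
            this.zipCode = zipCode;
        }
    }

    /** Write side: updates the OLTP stand-in and publishes an event. */
    static final class WriteSide {
        final Map<String, String> oltp = new HashMap<>();
        private final Queue<ZipCodeChanged> queue;

        WriteSide(Queue<ZipCodeChanged> queue) {
            this.queue = queue;
        }

        void updateZipCode(String userId, String zipCode) {
            // In a real system these two writes would need to succeed or fail together.
            oltp.put(userId, zipCode);
            queue.add(new ZipCodeChanged(userId, zipCode));
        }
    }

    /** Read side: a listener that keeps a query-friendly summary up to date. */
    static final class DatamartListener {
        final Map<String, Integer> changesPerZipCode = new HashMap<>();

        void drain(Queue<ZipCodeChanged> queue) {
            ZipCodeChanged event;
            while ((event = queue.poll()) != null) {
                changesPerZipCode.merge(event.zipCode, 1, Integer::sum);
            }
        }
    }

    public static void main(String[] args) {
        Queue<ZipCodeChanged> queue = new ArrayDeque<>();
        WriteSide writeSide = new WriteSide(queue);
        DatamartListener datamart = new DatamartListener();

        writeSide.updateZipCode("u1", "94105");
        writeSide.updateZipCode("u2", "94105");
        datamart.drain(queue);

        System.out.println(datamart.changesPerZipCode);
    }
}
```

The point of the split is that queries can run against the datamart's shape without touching the write model, and the listener can lag behind or be replaced without changing the command side.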
