Sunday, February 25, 2007

Navigation Controller Pattern

When people use the term "website navigation", they usually refer to one of two different things. The first is the set of hyperlinks displayed on a web page that users can click in order to get to a specific part of the website. These links may be presented as a drop-down menu, a tree view, or a breadcrumb trail. Such navigation controls are easy to incorporate, especially with ASP.NET 2.0, where they all bind to a common data source.

The other meaning of the term is related to page flow: a sequence of pages the user needs to go through in order to accomplish some process. Ordering a book from Amazon.com is a simple example: from the shopping cart screen, the user has to go to the shipping details page, then billing details, the order summary, the order receipt, and, finally, the suggested items list. An example of a complex flow is creating a will on LegalZoom.com. In both cases, the entire sequence is controlled by the application -- the user cannot bypass or reorder steps. Thus, page flow is part of the application's business logic, while navigation controls are part of its user interface.

Traditionally, page flow is implemented either in the code-behind class or in the page presenter (if the application follows the MVP pattern) as a sprinkle of Response.Redirect() calls. Although this approach is straightforward for developers, it has an important limitation: page flow logic is inseparable from the rest of the application logic. Unless the system is fully documented (and we know how often that happens, right?), it is impossible to answer the question "How many pages lead to Foo.aspx?" without scanning the source code. Consequently, the application becomes difficult to maintain: a simple business request to change the default page flow becomes a challenge for a developer, especially one who isn't intimately familiar with the system, and requires extensive regression testing.

Since I mentioned the subject of technical documentation, one reason it's so unpopular is that any documentation almost immediately becomes outdated. Imagine how much easier it would be if we had a "live" system diagram. But that is a subject for a different post…

The Navigation Controller design pattern provides a better alternative. As its name implies, the entire page flow logic is abstracted into a new class. At the core of the navigation controller are two types of page transitions: default and named. A default transition transfers the user to the next web page within the flow, while a named transition represents an alternative navigation route. For instance, a page that displays credit card details will transition by default to the order summary page:

IPageFlow controller = this.NavigationController;
controller.Next();

However, if the user decides to add new credit card info, control should be redirected to that page instead:

IPageFlow controller = this.NavigationController;
controller.Navigate(TransitionNames.AddNewCreditCard);

Note that the string the Navigate method accepts isn't the URL of a page, but a generic value that represents the transition. The logic within the navigation controller maps the transition name to the actual page URL.

Essentially, the navigation controller is a state machine where each state is associated with a particular web page. Like any state machine, it needs to have start and end states and maintain current state information. From the web application's point of view, every user session requires a dedicated instance of the controller class.
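
To make this more concrete, here is a minimal sketch of what such a controller might look like. The IPageFlow members mirror the snippets in this post; the CheckoutPageFlow class, its transition dictionary, and the "Default" transition name are illustrative assumptions rather than a definitive implementation:

using System;
using System.Collections.Generic;
using System.Web;

public interface IPageFlow
{
    string CurrentPageUrl { get; }
    void Next();
    void Navigate(string transitionName);
    void RedirectToCurrentPage();
}

// Hypothetical controller: each state maps transition names to the next page URL.
public class CheckoutPageFlow : IPageFlow
{
    // _transitions[currentPageUrl][transitionName] = next page URL
    private readonly Dictionary<string, Dictionary<string, string>> _transitions;
    private string _currentPageUrl;

    public CheckoutPageFlow(Dictionary<string, Dictionary<string, string>> transitions, string startPageUrl)
    {
        _transitions = transitions;
        _currentPageUrl = startPageUrl;
    }

    public string CurrentPageUrl
    {
        get { return _currentPageUrl; }
    }

    public void Next()
    {
        // "Default" is the conventional name for the default transition in this sketch
        Navigate("Default");
    }

    public void Navigate(string transitionName)
    {
        _currentPageUrl = _transitions[_currentPageUrl][transitionName];
        HttpContext.Current.Response.Redirect(_currentPageUrl);
    }

    public void RedirectToCurrentPage()
    {
        // Pull the user back into the flow if the requested page isn't the current state
        if (!String.Equals(HttpContext.Current.Request.Path, _currentPageUrl, StringComparison.OrdinalIgnoreCase))
            HttpContext.Current.Response.Redirect(_currentPageUrl);
    }
}

An instance of this controller would typically be created when the flow starts and kept in session state for the duration of the process.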

As we know, nothing prevents a user from typing a URL directly into the browser or clicking the "Back" button. Both of these actions can potentially break the intended page flow, so another function of the navigation controller is to "keep" the user in:

protected override void OnLoad(EventArgs e)
{
    IPageFlow controller = this.NavigationController;
    controller.RedirectToCurrentPage();
    base.OnLoad(e);
}

The Microsoft Patterns & Practices group recently released the initial version of its Web Client Software Factory. The factory includes, among other things, the Page Flow Application Block, which is a versatile navigation controller. Like all PnP application blocks, it is built using the provider model, meaning that there may be different concrete implementations of the controller. The team chose Windows Workflow Foundation (WF) for their implementation. In WF, a page flow is represented by a state machine workflow and can be edited in the visual designer inside VS 2005. Because WF has its own persistence model, page flow instances need not be stored in user session state and can be paused and resumed as necessary. For more information, please visit the Web Client Software Factory wiki on CodePlex.

Monday, February 19, 2007

Design For Operations (Part II)

Last time I made a mistake by including the words "Part I" in the title of my post. The plan was to write Part II right away, and of course, it took me two months to get to it. I bet if I hadn't called it "Part I", I would have posted this essay before Christmas ;-)

So, my goal is to make the application more accessible to the people who actually have to maintain production servers. WMI is great for this purpose because it is a standard mechanism for accessing many system components at run-time: disk drives, Windows services, message queues, etc.

1. Publishing to WMI

If I design my application as a WMI provider, it will be able to publish information to the WMI infrastructure built into Windows. Application status information is represented by instances and events. The WMI infrastructure will ensure that any client with proper permissions can query this data, subscribe to events, and so on. All I need to do is define the classes that will be published and decorate them with InstrumentationClassAttribute.

[InstrumentationClass(InstrumentationType.Instance)]
public class MyInstance
{
    public MyInstance() {}

    public string ProcessName;
    public string Description;
    public string ProcessType;
    public string Status;
}

[InstrumentationClass(InstrumentationType.Event)]
public class MyTerminationEvent
{
    public MyTerminationEvent() {}

    public string ProcessName;
    public string TerminationReason;
}

Publishing data is extremely simple:

using System.Management.Instrumentation;
...
MyInstance myInstance = new MyInstance();
MyTerminationEvent myEvent = new MyTerminationEvent();
// Set field values...
Instrumentation.Publish(myInstance);
Instrumentation.Fire(myEvent);

2. Registering WMI Namespace

Now let's take a step back. In order to make the above code functional, I need to register a WMI namespace for my application. This is done using the ManagementInstaller class, but first, I have to decorate the assembly with a special attribute:

[assembly: Instrumented(@"root\MyCompanyName\MyNamespace")]

ManagementInstaller is trivial: it just needs to be added to the Installers collection of my application's Installer class:

[RunInstaller(true)]
public partial class MyAppInstaller : Installer
{
    private ManagementInstaller managementInstaller;

    public MyAppInstaller()
    {
        InitializeComponent();

        managementInstaller = new ManagementInstaller();
        Installers.Add(managementInstaller);
    }
}

Now, after I build the application, I can register the WMI namespace simply by running the "installutil" command against the assembly name (e.g., installutil MyApp.exe).

3. Developing WMI Client

Chances are, the operations team will ask me to write a WMI client for my provider. No problem: the .NET Framework has all the tools to get to my application's published data. One approach is to write a query using a SQL-like syntax and execute it with the WqlObjectQuery class. Another relies on the ManagementClass object:

ObjectGetOptions options = new ObjectGetOptions(null, new TimeSpan(0, 0, 30), true);
ManagementClass wmiClass = new ManagementClass(WmiScope, "MyInstance", options);
ManagementObjectCollection wmiInstances = wmiClass.GetInstances();
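
For comparison, here is a rough sketch of the query-based approach; it assumes the same WmiScope value pointing to the namespace registered earlier:

using System.Management;
...
// WQL query against the published instance class
WqlObjectQuery query = new WqlObjectQuery("SELECT * FROM MyInstance");
ManagementObjectSearcher searcher = new ManagementObjectSearcher(WmiScope, query);
foreach (ManagementObject obj in searcher.Get())
{
    Console.WriteLine(obj["ProcessName"]);
}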

In both cases, I will get back a collection of ManagementObject instances. Although I can extract all the data I want from a ManagementObject using field names (e.g., obj["ProcessName"]), I would rather have a strongly typed class to work with. It turns out there is a .NET tool called "mgmtclassgen" that does exactly that: it generates a strongly typed wrapper class for any WMI instance type.

***

WMI is a complex subject, and I realize that I barely scratched the surface in this post. Still, there is enough information to get you started. Good luck!

Monday, December 11, 2006

Design For Operations (Part I)

When I design an enterprise application, I need to accept one simple truth: the system is going to spend just 15% (or less) of its life in the development environment. After that it moves to the gated community known as production, and the only people who are supposed to have access to production are system operations engineers, a.k.a. "IT guys". And of course, I cannot assume that the IT guys will become familiar with the intricacies of the application's design. If I do, I am making a grave mistake that results in those dreaded 4:50 PM or 2:30 AM phone calls from the NOC.

So, I really need to design the system with operations in mind. It should be able to report its status and notify about any issues. I should allow operations to monitor my system with their usual tools, such as the event viewer, performance monitor, management console, or MOM, instead of running SQL queries and reviewing XML configuration files. This means instrumenting my application with event logs, performance counters, and WMI objects and events.

Event logging. Although a simple file log is a convenient place for all sorts of debugging and profiling information, I can't really expect IT to dig through megabytes of text looking for error information. Instead, they should be able to get it from the Windows Event Viewer. So, I will create an instance of EventLogInstaller in the application's installer class and specify the Source and Log properties. I will make sure to log all unhandled exceptions (see my previous post) using the EventLog.WriteEntry method.
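
A minimal sketch of both pieces might look like this (the installer could just as well live inside the application's existing installer class; the source and log names are placeholders):

using System.ComponentModel;
using System.Configuration.Install;
using System.Diagnostics;

[RunInstaller(true)]
public class EventLogSetupInstaller : Installer
{
    public EventLogSetupInstaller()
    {
        // Registers the custom event source during installation
        EventLogInstaller logInstaller = new EventLogInstaller();
        logInstaller.Source = "MyApplication";
        logInstaller.Log = "Application";
        Installers.Add(logInstaller);
    }
}

public static class ErrorReporter
{
    // Called from a global exception handler; the entry shows up in the Event Viewer
    public static void Report(Exception ex)
    {
        EventLog.WriteEntry("MyApplication", ex.ToString(), EventLogEntryType.Error);
    }
}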

Performance counters are invaluable for monitoring and profiling the system in production. They may also give an early indication of system issues. Windows and the .NET Framework already contain dozens of performance counters, but custom counters can provide insight into my application's processing logic. So, exactly what kind of information should I expose via performance counters, and what kind of counters (instantaneous or average) should I use? There is no standard answer; it really depends on the nature of the system. There is a good introduction to the concept on MSDN. In order to register custom performance counters I usually create a custom installer:

public class PerformanceCountersInstaller : Installer
{
    public const String CategoryName = "...";
    public const String CategoryHelp = "...";
    public const String CounterName = "...";
    public const String CounterHelp = "...";

    public override void Install(IDictionary state)
    {
        base.Install(state);
        Context.LogMessage("Installing performance counters...");
        SetupPerformanceCounters();
    }

    public override void Uninstall(IDictionary state)
    {
        Context.LogMessage("Uninstalling performance counters...");
        if (PerformanceCounterCategory.Exists(CategoryName))
            PerformanceCounterCategory.Delete(CategoryName);
        Context.LogMessage("Successfully uninstalled performance counters");
        base.Uninstall(state);
    }

    private void SetupPerformanceCounters()
    {
        try
        {
            if (PerformanceCounterCategory.Exists(CategoryName))
                PerformanceCounterCategory.Delete(CategoryName);

            CounterCreationDataCollection CCDC = new CounterCreationDataCollection();

            // Create and add the counters
            CounterCreationData ccd = new CounterCreationData();
            ccd.CounterType = PerformanceCounterType.CounterDelta32;
            ccd.CounterName = CounterName;
            ccd.CounterHelp = CounterHelp;
            CCDC.Add(ccd);

            // Create the category.
            PerformanceCounterCategory.Create(CategoryName,
                CategoryHelp,
                PerformanceCounterCategoryType.SingleInstance,
                CCDC);
            Context.LogMessage("Successfully installed performance counters");
        }
        catch (Exception ex)
        {
            Context.LogMessage(String.Concat("Could not install performance counters: ", ex.Message));
        }
    }
}
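
At run-time, the application updates the counter through the PerformanceCounter class; a minimal sketch (reusing the category and counter names from the installer above) might look like this:

using System.Diagnostics;
...
// false = writable instance (read-only access is the default)
PerformanceCounter counter = new PerformanceCounter(
    PerformanceCountersInstaller.CategoryName,
    PerformanceCountersInstaller.CounterName,
    false);

counter.Increment();        // bump the counter by one
counter.IncrementBy(10);    // or by an arbitrary amount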

In the next post I will discuss using WMI to publish application status information.

Tuesday, November 28, 2006

Dealing With Exceptions

Although I don't have exact statistics, it certainly feels like most .NET developers don't know how to deal with exceptions. I often see code where the author assumed that nothing ever goes wrong and decided not to put in any kind of exception handling. Such "infantile" code is clearly not ready for the hard realities of life. On the other end of the spectrum we've got programs that swallow all exceptions in an effort to make themselves bullet-proof. What developers don't realize is that this actually makes them more vulnerable to security attacks. When such attacks destabilize the operating environment, a normal system would fail, but the "exception-swallower" carries on, making an ideal target for exploitation.

So, when do I actually need to catch exceptions? In essence, there are three distinct scenarios. The first is called handling. It's when I know what kind of exception to expect and - more importantly - how to recover from it. For example, my stored procedure may become a victim of a SQL Server deadlock. In managed code, this results in a SqlException, which I should handle (retry the transaction up to a pre-defined number of times). Another example is trying to read some configuration data from a file:

try
{
    configData = File.ReadAllText(configFilePath);
}
catch(FileNotFoundException)
{
    configData = DefaultConfigData;
}

As you can see, I am handling FileNotFoundException by force-feeding some default configuration data into the variable. It's important to emphasize that I didn't attempt to handle any other kind of exception File.ReadAllText can throw. For instance, it may throw UnauthorizedAccessException or SecurityException, and I'd rather have these bubble to the top and, hopefully, force program termination.
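
The deadlock scenario mentioned above follows the same principle; here is a rough sketch, assuming error number 1205 (the SQL Server deadlock victim code) and a hypothetical ExecuteOrderProcedure data access call:

using System.Data.SqlClient;
...
const int MaxRetries = 3;

for (int attempt = 1; ; attempt++)
{
    try
    {
        ExecuteOrderProcedure();   // hypothetical data access call
        break;                     // success - leave the retry loop
    }
    catch (SqlException ex)
    {
        // 1205 = chosen as a deadlock victim; anything else bubbles up
        if (ex.Number != 1205 || attempt == MaxRetries)
            throw;
    }
}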

This brings us to the second scenario: unhandled exceptions. If an exception hasn't been handled anywhere in the call stack (which means either an unknown problem or a problem I don't know how to recover from), it should be caught and properly logged. Windows applications should display a generic error message to the user and shut down, web applications should redirect the user to a generic error page, and services can either shut down or terminate the failed thread.
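
For a Windows application, a minimal sketch of the "catch, log, shut down" behavior might hook the AppDomain event like this (the event source name is a placeholder):

using System;
using System.Diagnostics;

static class Program
{
    static void Main()
    {
        AppDomain.CurrentDomain.UnhandledException += OnUnhandledException;
        // ... run the application ...
    }

    static void OnUnhandledException(object sender, UnhandledExceptionEventArgs e)
    {
        // Log the exception for operations, then let the process terminate
        EventLog.WriteEntry("MyApplication",
            e.ExceptionObject.ToString(),
            EventLogEntryType.Error);
    }
}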

The third scenario is called exception wrapping. The idea is to substitute a low-level exception object with a higher-level exception class containing additional information (if you are absolutely positive that the original error is not sufficient). Wrapping is different from handling because there is no recovery - a new exception is thrown. In the example below, I am replacing SqlException with a ScriptException that adds the stored procedure name in an effort to facilitate debugging:

catch(SqlException ex)
{
    ScriptException e = new ScriptException(storedProcName, ex);
    throw e;
}

Wrapping should be used with caution because it changes the call stack and makes debugging more difficult. It is imperative to assign the original exception object to the InnerException property of the new exception (in the above example this is done through a constructor overload).
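
For illustration, such a wrapper exception might be defined along these lines (the class shape and message wording are assumptions for this sketch):

using System;

public class ScriptException : Exception
{
    private readonly string _storedProcName;

    public ScriptException(string storedProcName, Exception inner)
        : base("Error executing stored procedure " + storedProcName, inner)
    {
        _storedProcName = storedProcName;
    }

    public string StoredProcName
    {
        get { return _storedProcName; }
    }
}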

An interesting implication is that in order to handle exceptions I need to know what exceptions a method can throw in the first place. The list of exceptions should really be part of the method signature. In fact, Java has the concept of checked exceptions and the corresponding "throws" syntax, while in .NET we have to rely on class documentation. If you are interested in a comparative analysis of the two approaches, read this interview with Anders Hejlsberg, the creator of C#.

Saturday, November 11, 2006

Recruitment By [Lucky] Numbers

In the past few years I have interviewed a lot of people for various software development positions. Finding the right employee is always a challenge (as Franco DiAddezio put it, recruitment is the equivalent of finding the perfect spouse after just one or two dates). Candidates can have plenty of work experience, and you can fairly easily confirm whether or not they really know the technologies advertised in their resume. But are technology skills alone sufficient? My personal opinion is that a good software engineer is defined by his or her analytical thinking and problem-solving abilities. Specific technologies, such as programming languages and APIs, can always be learned.

My own litmus test for identifying the right engineer is a small but elegant programming problem called the "Lucky Numbers" problem. I first heard of it years ago at university, and more recently on Mikhail Gustokashin's site dedicated to programming problems, where it is ranked "Very Easy" (follow the link only if you can read Russian). Here it is:
A 6-digit ticket number is considered "lucky" if the sum of its first 3 digits equals the sum of its last 3 digits. For example, "006123" and "511304" are both lucky, while "980357" isn't. Write an efficient algorithm to determine how many lucky numbers exist among all 6-digit numbers (from 000000 to 999999).

First, let's write an inefficient algorithm. We will iterate through all six-digit numbers and increment a counter whenever the sum of the first 3 digits equals the sum of the last 3:

int luckyNumbersCount = 0;
for(int i=0; i<10; i++)
    for(int j=0; j<10; j++)
        for(int k=0; k<10; k++)
            for(int l=0; l<10; l++)
                for(int m=0; m<10; m++)
                    for(int n=0; n<10; n++)
                        if(i+j+k == l+m+n) luckyNumbersCount++;

This algorithm performs a million iterations, and it is the least I would expect from a candidate (amazingly, more than half failed to produce it). We can arrive at an efficient solution by carefully reading the problem. It doesn't ask us to produce all lucky numbers, only their quantity. Can we find it without generating the numbers? We know that the digit sums of both halves of a lucky number are equal. The sum of three digits can take values from 0 (0+0+0) to 27 (9+9+9). For each value, we need to find out how many combinations of digits produce it; e.g., "1" has 3 combinations: "001", "010", and "100". Evidently, there are 3 * 3 = 9 lucky numbers whose halves both sum to 1. So, here is the optimized algorithm, which performs only 1028 iterations:

int[] combinations = new int[28];
for(int i=0; i<10; i++)
    for(int j=0; j<10; j++)
        for(int k=0; k<10; k++)
            combinations[i+j+k]++;

int luckyNumbersCount = 0;
for(int i=0; i<28; i++)
    luckyNumbersCount += combinations[i] * combinations[i];

Wednesday, October 25, 2006

Dependency Injection

Let's talk about dependency injection (DI). DI is, essentially, a design pattern that can be applied to tightly coupled systems. For example, imagine that several modules of our system use a cryptographic component. The code may look like this:

public class OrderManager
{
    private CCryptography _Crypto;

    public OrderManager()
    {
        _Crypto = new CCryptography();
        ...
    }
}

The obvious drawback of this approach is that we cannot easily swap cryptographic algorithms - multiple references in the code would need to be changed. Using the DI pattern, we would extract an interface from the CCryptography class and delegate the creation of the concrete object to an outside factory:

public class OrderManager
{
    private ICryptography _Crypto;

    public OrderManager(ICryptography crypto)
    {
        _Crypto = crypto;
        ...
    }
}

This way, different kinds of cryptographic objects can be created and swapped at run-time; the OrderManager class doesn't know anything about it (and doesn't need to, either). Another important benefit is testability: we can now test OrderManager functionality without a fully functional cryptographic component. All we need is a mock object that implements the ICryptography interface and probably doesn't even encrypt/decrypt.
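
For example, a trivial mock might look like this (the ICryptography members shown here are assumptions for illustration):

public interface ICryptography
{
    byte[] Encrypt(byte[] data);
    byte[] Decrypt(byte[] data);
}

// Pass-through implementation used only in unit tests
public class MockCryptography : ICryptography
{
    public byte[] Encrypt(byte[] data) { return data; }
    public byte[] Decrypt(byte[] data) { return data; }
}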

By moving object creation to a new entity, we can address additional issues. For example, by caching objects in a dictionary we can apply the Singleton pattern and ensure that only one instance of the cryptography component is created. We can also control the order in which objects are created. Thus, we have complemented the original DI pattern with the concept of a lifetime container.
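
A bare-bones sketch of such a factory with singleton caching might look like this (the ComponentFactory name and shape are hypothetical):

using System;
using System.Collections.Generic;

public static class ComponentFactory
{
    // Maps an interface type to its cached singleton instance
    private static readonly Dictionary<Type, object> _instances = new Dictionary<Type, object>();

    public static ICryptography GetCryptography()
    {
        return (ICryptography)GetInstance(typeof(ICryptography), typeof(CCryptography));
    }

    private static object GetInstance(Type contract, Type implementation)
    {
        object instance;
        if (!_instances.TryGetValue(contract, out instance))
        {
            instance = Activator.CreateInstance(implementation);
            _instances[contract] = instance;
        }
        return instance;
    }
}

// Client code:
// OrderManager manager = new OrderManager(ComponentFactory.GetCryptography());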

Folks at the Microsoft Patterns & Practices group took the idea even further and created a dependency injection container called ObjectBuilder (OB). OB uses reflection to analyze classes and automatically fulfill their dependency requests. So, as long as we have explicitly expressed a dependency (by placing it in a constructor, as in the above example, or in a property decorated with a special attribute), OB will know what to do. There is much more to OB than I just mentioned, so if you'd like to read more, here are a couple of links:

- Download Object Builder from CodePlex: http://www.codeplex.com/Wiki/View.aspx?ProjectName=ObjectBuilder
- Great tutorial by Sayed Hashimi: http://www.sayedhashimi.com/PermaLink,guid,d05aed4f-a211-4969-893e-7ffea324a56c.aspx

Friday, September 15, 2006

Business Objects and Value Objects

Encapsulation of data and behavior is one of the cornerstones of object-oriented programming. This basically means that a business object contains both the data and the methods that manipulate the data. In the example below, a CreditCard object contains a public method Authorize and a public property AuthorizationCode. It's important for the object to establish a proper public interface. The authorization code value is returned by the payment processor; we wouldn't want clients to accidentally modify it. Therefore, I exposed a read-only property rather than a field.

public class CreditCard
{
    ...
    private string _AuthorizationCode;
    public string AuthorizationCode
    {
        get { return _AuthorizationCode; }
    }

    public bool Authorize(double amount) {...}
    ...
}


The object-oriented approach is great - within a single application tier. When designing a distributed application, we need to take other factors into consideration. Suppose my application needs to display customer credit card data on a web page. Should I pass the CreditCard object to the web tier? Sure, it is possible, but the web tier only needs the credit card data, not the behavior. Frankly, I wouldn't want web tier code to accidentally call the Authorize() method. So, for the sake of security, we should somehow limit the objects. Performance is another concern, especially when passing objects between physical tiers. Regardless of which remoting technology we use - web services, .NET Remoting, or COM+ - large objects with lots of methods and properties may not be ideal for this.

A simple and elegant solution is to combine the essential business object data into a "value object". Individual data elements of the value object are exposed to the business object's clients via public properties. The revised CreditCard class below demonstrates this approach. Note how an overload of the constructor allows us to easily create a business object from a value object. We can extract the value object just as easily and send it to another application tier.

public class CreditCard
{
    ...
    private CreditCardInfo _CCInfo;
    public CreditCardInfo CCInfo
    {
        get { return _CCInfo; }
    }

    public CreditCard(CreditCardInfo info)
    {
        _CCInfo = info;
    }

    public string AuthorizationCode
    {
        get { return _CCInfo.AuthorizationCode; }
    }

    public bool Authorize(double amount) {...}
    ...
}


public class CreditCardInfo
{
    ...
    public string CardNumber;
    public DateTime ExpDate;
    public string AuthorizationCode;
    public DateTime? LastTransactionDate;
    ...
}
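
For illustration, moving data between tiers then amounts to passing the value object around; a hypothetical usage (the LoadCreditCardInfo loader is made up for this sketch) might be:

// Business tier: authorize the card, then hand only the data to the web tier
CreditCard card = new CreditCard(LoadCreditCardInfo(customerId));  // hypothetical loader
card.Authorize(total);
CreditCardInfo info = card.CCInfo;   // value object crosses the tier boundary

// Web tier: rehydrate a business object if behavior is ever needed again
CreditCard sameCard = new CreditCard(info);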

Saturday, September 02, 2006

How Many Layers Is Enough?

A colleague recently asked my advice about the design of a web application he was working on. He showed me the draft: an elaborate architecture involving a web application calling a web service, which in turn invoked an application server over .NET Remoting. The problem? This was an intranet application with fewer than 50 concurrent users. Had he implemented the original design, he would have ended up with an extremely scalable system that would never get a chance to realize its potential. On the other hand, the price of that scalability - poor performance - would be obvious to every user.

So, how many layers are enough for an enterprise application? I am talking about physical layers, of course. With logical layers, the approach is well-known: you would typically have a data abstraction tier, a business objects tier, and a workflow tier. In addition, the web application itself should be designed using the Model-View-Presenter pattern.

With physical layers, it's far less straightforward. When we have all logical tiers running inside a single application domain, we achieve the best possible performance, because all calls are in-process. Once we place the business and data tiers in a dedicated application server, we lose performance to out-of-process calls, even if the application server is running on the same physical machine. It really doesn't matter which remote calling mechanism we use - web services, .NET Remoting, or Enterprise Services (COM+) - performance will suffer. When we move the application server to separate hardware, performance gets even worse because of network latency.

So why not run everything in-process? Well, many web applications do just that - every web server in the farm hosts all logical tiers of the application. The downside is that web servers have to process both web application logic and business logic. This limits the number of HTTP requests each server can handle and thus dramatically reduces system scalability. There are other drawbacks, too. Servers have to be really beefed up to handle the load, so hardware cost is high. Also, this deployment layout is inherently insecure: web servers are usually placed in a DMZ outside the company firewall. Think of all the connection strings, encryption keys, and other sensitive data that a hacker could obtain by breaking into a web server.

By placing the business and data tiers on a separate physical layer (an application server farm) we are trading performance for scalability. Web servers no longer have to process business logic, so they can handle many more page requests (and don't require high-end CPUs and tons of memory). Application servers can utilize connection pooling, object pooling, and data caching in order to effectively support the web layer. Better yet, if we move from homogenous app servers to more specialized "application services", we can improve performance even more by fine-tuning server configuration.

Coming back to the question in the title, there really isn't a universal rule. The number of physical layers can differ depending on the scalability, performance, security, and other requirements of a particular application.

Thursday, August 24, 2006

Caching in Distributed Applications

This is an overview of different approaches to caching in a distributed application environment. Distributed N-Tier applications generally have at least two server farms: web servers and application servers.

Web servers, of course, can cache entire HTTP responses using, for example, the ASP.NET OutputCache page directive. This is a blunt tool, though. It can result in high memory load and impact page processing logic. Sometimes it's not applicable at all. In these cases, application data - in the form of objects - should be cached instead. Naturally, this is the only caching option for application servers.

Application data that we need to cache can be either static or dynamic. I'm not suggesting that static data doesn't change at all (otherwise we could just build it into the application), only that it changes very infrequently. Static data can be loaded into a cache on every server (let's call it an isolated cache). We get the benefit of fast reads, because data is always stored locally. On the other hand, it is heavily duplicated - every single server has to have a copy, which may be a waste of memory. Here's another drawback of the isolated cache: imagine several servers joining the farm at the same time. How stressful will it be for the database server while they are all filling up their respective isolated caches?
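
A minimal sketch of such an isolated cache - a thread-safe dictionary with a loader callback - might look like this (the class and delegate names are hypothetical):

using System.Collections.Generic;

public static class IsolatedCache
{
    private static readonly Dictionary<string, object> _items = new Dictionary<string, object>();
    private static readonly object _syncRoot = new object();

    public delegate object LoadItem(string key);

    // Returns the cached value, loading it from the database on first access
    public static object Get(string key, LoadItem loader)
    {
        lock (_syncRoot)
        {
            object value;
            if (!_items.TryGetValue(key, out value))
            {
                value = loader(key);
                _items[key] = value;
            }
            return value;
        }
    }
}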

Caching dynamic data is much more complicated, because any server may need to modify it at any time. The first thing that comes to mind is to use a common caching data store, such as a database or a dedicated server (let's call it a centralized cache). For example, ASP.NET allows you to have a centralized data store for session state. Unfortunately, a centralized mechanism always creates a single point of failure, so it may not be a good solution depending on your availability requirements. Other drawbacks of a centralized cache are generally reduced performance and scalability.

A good alternative to a centralized cache is a distributed cache, which assumes some kind of communication and coordination among the servers. Distributed cache comes in two essential flavors: fully replicated and partitioned. In the fully replicated architecture, once the application puts an object into the local cache on one server, it is immediately copied to all other servers in the cluster. The end result may look very similar to the isolated cache, but remember that the isolated cache only works with static data. Still, as the server farm grows, it takes more and more time and memory to maintain a fully replicated cache.

Enter the partitioned cache. While the "get" operation in the previous scenario was always local, getting data from a partitioned cache could mean querying all servers in the cluster until one is found that holds the required object. "Put", on the other hand, is local. The partitioned architecture represents a trade-off: we utilize memory more efficiently and don't waste time replicating the data, but it may take longer to retrieve it.

Last but not least: tools. Enterprise Library from Microsoft is free and has a Caching Application Block, which unfortunately doesn't support distributed caching. The Enterprise version of NCache from Alachisoft does support distributed caching but is far from free.

Thursday, August 17, 2006

Coding Standards - Good, Bad, and Ugly

Are coding standards a good thing to have in a development organization? Most companies would say yes and cite a variety of reasons, among them ease of code maintenance and improved continuity (which is important in an industry with such a high turnover rate). In addition, good coding standards can help developers avoid common pitfalls. Code reuse, that holy grail of enterprise development, supposedly improves, too.

Yet the employees of the few companies I know that actually have a coding standards document are rarely excited about it. Usually the document is extremely large and unbelievably boring. In an effort to make it comprehensive, the authors put together lots of small rules, which makes the document feel like a programming textbook. Well, at least a textbook has a target audience, while the standards document contains a mixture of trivial, simple, moderate, and advanced items. Also, the rules in a textbook are supported by detailed explanations; a coding standards document can be very vague or simply omit the explanations.

The ugly part begins when project managers and team leaders require their engineers to follow the coding standards to the letter. This immediately kills all creativity; people think more about compliance than about solutions. Dogmatism in such a dynamic profession as software engineering can only mean one thing: stagnation.

So, how do we get all the benefits of coding standards without the drawbacks? First, we need to recognize that software engineering is a creative profession. I would put an emphasis on both words. It's creative, so we shouldn't limit the spectrum of algorithms, technologies, and patterns used to solve a programming problem. We need to treat engineers as professionals and assume that they don't need another textbook. Of course, there are plenty of bad programmers out there, but that is a subject for a different blog post.

An ideal standards document would concentrate on the specifics of the architecture adopted by the company: describe how the application layers are structured and what the common components are for logging, data access, exception handling, configuration management, and caching. Don't bother defining naming conventions for variables.