Monday, December 11, 2006

Design For Operations (Part I)

When I design an enterprise application, I need to keep one simple truth in mind: the system is going to spend just 15% (or less) of its life in a development environment. After that it moves to the gated community known as production, and the only people who are supposed to have access to production are system operations engineers, a.k.a. “IT guys”. And of course, I cannot assume that the IT guys will become familiar with the intricacies of the application’s design. If I do, I am making a grave mistake, one that results in those dreaded 4:50 PM or 2:30 AM phone calls from the NOC.

So, I really need to design the system with operations in mind. It should be able to report its status and notify operations of any issues. I should allow operations to monitor my system with their usual tools, such as Event Viewer, Performance Monitor, the management console, or MOM, instead of running SQL queries and reviewing XML configuration files. This means instrumenting my application with event logs, performance counters, and WMI objects and events.

Event Logging. Although a simple log file is a very convenient place to put all sorts of debugging and profiling information, I can’t really expect IT to dig through megabytes of text looking for error information. Instead, they should be able to get it from Windows Event Viewer. So, I will create an instance of EventLogInstaller in the application’s installer class and specify the Source and Log properties. I will make sure to log all unhandled exceptions (see my previous post) using the EventLog.WriteEntry method.
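The two steps above can be sketched roughly as follows. This is only a minimal illustration, not the code from my real project: the class names MyAppEventLogInstaller and UnhandledExceptionLogger and the source name "MyApplication" are made up for the example, and I am assuming the entries should go to the standard Application log.

```csharp
using System;
using System.ComponentModel;
using System.Configuration.Install;
using System.Diagnostics;

// Hypothetical installer class; InstallUtil.exe looks for [RunInstaller(true)].
[RunInstaller(true)]
public class MyAppEventLogInstaller : Installer
{
    // Assumed names, chosen for illustration only.
    public const string SourceName = "MyApplication";
    public const string LogName = "Application";

    public MyAppEventLogInstaller()
    {
        // Registers the event source at install time and removes it on uninstall.
        EventLogInstaller eventLogInstaller = new EventLogInstaller();
        eventLogInstaller.Source = SourceName;
        eventLogInstaller.Log = LogName;
        Installers.Add(eventLogInstaller);
    }
}

public static class UnhandledExceptionLogger
{
    // Call once at application startup.
    public static void Register()
    {
        AppDomain.CurrentDomain.UnhandledException += OnUnhandledException;
    }

    private static void OnUnhandledException(object sender, UnhandledExceptionEventArgs e)
    {
        // ExceptionObject.ToString() includes the message and the stack trace.
        EventLog.WriteEntry(MyAppEventLogInstaller.SourceName,
                            e.ExceptionObject.ToString(),
                            EventLogEntryType.Error);
    }
}
```

Registering the source at install time matters because creating an event source requires administrative rights, which the application most likely will not have when it runs in production.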

Performance counters are invaluable tools for monitoring and profiling the system in production. They may also give an early indication of system issues. Windows and the .NET Framework already provide dozens of performance counters, but custom counters can provide insight into my application’s processing logic. So, exactly what kind of information should I expose via performance counters, and what kind of counters (instantaneous or average) should I use? There is no standard answer; it really depends on the nature of the system. There is a good introduction to the concept on MSDN. In order to register custom performance counters I usually create a custom installer:

using System;
using System.Collections;
using System.ComponentModel;
using System.Configuration.Install;
using System.Diagnostics;

// RunInstaller(true) lets InstallUtil.exe discover this installer on its own.
[RunInstaller(true)]
public class PerformanceCountersInstaller : Installer
{
    public const String CategoryName = "...";
    public const String CategoryHelp = "...";
    public const String CounterName = "...";
    public const String CounterHelp = "...";

    public override void Install(IDictionary state)
    {
        base.Install(state);
        Context.LogMessage("Installing performance counters...");
        SetupPerformanceCounters();
    }

    public override void Uninstall(IDictionary state)
    {
        Context.LogMessage("Uninstalling performance counters...");
        if (PerformanceCounterCategory.Exists(CategoryName))
            PerformanceCounterCategory.Delete(CategoryName);
        Context.LogMessage("Successfully uninstalled performance counters");
        base.Uninstall(state);
    }

    private void SetupPerformanceCounters()
    {
        try
        {
            // Remove a stale category first so that reinstalling picks up
            // any changes to the counter definitions.
            if (PerformanceCounterCategory.Exists(CategoryName))
                PerformanceCounterCategory.Delete(CategoryName);

            CounterCreationDataCollection counters = new CounterCreationDataCollection();

            // Create and add the counters.
            CounterCreationData ccd = new CounterCreationData();
            ccd.CounterType = PerformanceCounterType.CounterDelta32;
            ccd.CounterName = CounterName;
            ccd.CounterHelp = CounterHelp;
            counters.Add(ccd);

            // Create the category.
            PerformanceCounterCategory.Create(CategoryName,
                                              CategoryHelp,
                                              PerformanceCounterCategoryType.SingleInstance,
                                              counters);
            Context.LogMessage("Successfully installed performance counters");
        }
        catch (Exception ex)
        {
            // The failure is logged but not rethrown, so it does not abort the installation.
            Context.LogMessage(String.Concat("Could not install performance counters: ", ex.Message));
        }
    }
}
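Once the category is installed, the application still has to update the counter while it runs. Here is a small sketch of how that could look; the ProcessingMetrics class and its RecordItemProcessed method are names I made up for this example, and I am assuming the counter tracks the number of processed items.

```csharp
using System;
using System.Diagnostics;

// Hypothetical wrapper around one custom counter; not part of the installer above.
public sealed class ProcessingMetrics : IDisposable
{
    private readonly PerformanceCounter counter;

    public ProcessingMetrics(string categoryName, string counterName)
    {
        // readOnly = false opens the counter for writing. This throws if the
        // category has not been installed on the machine.
        counter = new PerformanceCounter(categoryName, counterName, false);
    }

    // Call once for every unit of work the application completes.
    public void RecordItemProcessed()
    {
        counter.Increment();
    }

    public void Dispose()
    {
        counter.Dispose();
    }
}
```

The application would create a single instance at startup, passing PerformanceCountersInstaller.CategoryName and PerformanceCountersInstaller.CounterName, and call RecordItemProcessed from its processing code. With the CounterDelta32 type, Performance Monitor displays the change in the raw value between the two most recent samples, so all the application needs to do is keep incrementing.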

In the next post I will discuss using WMI to publish application status information.