When to Add an ORM Tool

I’m working on the code that parses VCalendar data so that it can be processed. I’m copying the data I care about into a simple data structure that can represent a calendar request in any format. Any logic that interacts with calendar requests will use this internal structure. I want it to be simple, with only the stuff that I need, but I don’t want to completely reinvent the wheel either, so I will use the VCalendar format as a guideline.

I’ve used this pattern of processing an established or complex data format into a simple proprietary one before with good results. It allows you to isolate the complexities of a data format in the code that processes it, keeping the rest of your code clean. Like everything, there are cases where I wouldn’t use it, but for this kind of scenario it should work well.
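As a rough illustration of the pattern, here is a minimal Python sketch (the project itself is .NET, so this is not its code). The CalendarRequest type, the from_vcalendar_fields function, and the choice of fields are all hypothetical, and the date handling assumes only the UTC form "YYYYMMDDTHHMMSSZ".

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List


@dataclass
class CalendarRequest:
    """Hypothetical neutral type: only the fields the rest of the code needs."""
    summary: str
    start: datetime
    end: datetime
    attendees: List[str] = field(default_factory=list)


def from_vcalendar_fields(fields: Dict[str, List[str]]) -> CalendarRequest:
    """Map already-parsed VCalendar properties onto the neutral type.

    `fields` maps a property name to its raw values. Format quirks, such as
    the compact date-time layout, stay inside this function.
    """
    def parse_utc(value: str) -> datetime:
        # Assumption: only the UTC form "YYYYMMDDTHHMMSSZ" is handled here.
        return datetime.strptime(value, "%Y%m%dT%H%M%SZ")

    return CalendarRequest(
        summary=fields["SUMMARY"][0],
        start=parse_utc(fields["DTSTART"][0]),
        end=parse_utc(fields["DTEND"][0]),
        # Drop a leading scheme such as "MAILTO:" from each attendee value.
        attendees=[a.split(":", 1)[-1] for a in fields.get("ATTENDEE", [])],
    )
```

Everything downstream would see only CalendarRequest, so a second input format would need just another mapping function.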

Now that I’m building this neutral representation, my next thought is how it will be saved to the database. I don’t need a database at this stage of the project, so part of me wants to ignore the decision until later, but the whole point of the system is to persist and interact with these types, so it’s important that I get it right. If I make simple POCO classes now, and start writing a bunch of code that uses them, I might have to change a lot of code later if I want to switch to types generated by Entity Framework. I could write my own custom code to read and write my own types from the database, so that I can use any type I want without restrictions, but it would be a waste to write this plumbing code myself when an ORM can do it faster and better.

Creating a table design now wouldn’t be a simple matter either. I don’t know enough about the needs of the scheduling service to know which parts of the VCalendar format to bring over. If I try to guess now I know I’ll bring over a bunch of stuff that I don’t need, but starting with a simple table and adding fields every time I need one is no fun either. Adding a database to a system is like attaching a ball and chain, and I want to wait until I can be sure I have my model correct before I do it.

I need to make a decision, so here it is: I’m going to keep building my own types, and try to keep them. When it comes time to hook up a database, I’ll play around with the new POCO support in Entity Framework 4, and if that fails I’ll try another ORM tool. I may need to change my model a bit to suit the ORM, but I’m hoping that it won’t need to change much.

Invitations and the VCard Format

My next goal for the Themis project is to parse an invitation from an email. I am starting with invitations generated by MS Outlook because that’s my target audience, but a peek inside a Google Calendar invitation gives me hope that I’ll be able to support multiple calendars without much trouble.

Outlook invitations are sent in the VCalendar format, content type “text/calendar”. The standard was published as RFC 2445 in 1998, where it is named iCalendar. It describes a standard layout for calendar data in the VCard format, whose underlying syntax is described in RFC 2425.

VCard is a simple text-based format with a nested structure that’s similar to XML. It uses colons and semicolons instead of angle brackets, supports attributes, and has standard representations for several common data types. Here is an example of the same data represented in both formats (content from RFC 2425):

An example of VCard data beside an XML representation of the same data.
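To make the colon-and-semicolon layout concrete, here is a rough Python sketch of splitting a single line into its name, parameters, and raw value. It is not the parser from the project; parse_content_line is a made-up name, and the sketch deliberately ignores line folding, group prefixes, and quoted parameter values.

```python
from typing import Dict, Tuple


def parse_content_line(line: str) -> Tuple[str, Dict[str, str], str]:
    """Split 'NAME;PARAM=VALUE:value' into (name, params, raw_value).

    Simplified on purpose: the first colon is treated as the value
    delimiter, so quoted parameter values containing ':' are not handled.
    """
    head, sep, raw_value = line.partition(":")
    if not sep:
        raise ValueError(f"no value delimiter in line: {line!r}")

    # Everything before the colon is the name followed by ;key=value pairs.
    name, *param_parts = head.split(";")
    params: Dict[str, str] = {}
    for part in param_parts:
        key, _, val = part.partition("=")
        params[key.upper()] = val

    # The value is returned still escaped; decoding it depends on its type.
    return name.upper(), params, raw_value
```

For example, parse_content_line("DTSTART;TZID=US-Eastern:19980119T020000") returns ("DTSTART", {"TZID": "US-Eastern"}, "19980119T020000"). The name plays the role of an XML element name and the parameters play the role of attributes.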

I looked around for a library to do the job, but decided to build my own. Writing this kind of code isn’t a good way to deliver business value, but it sure is fun. A bit of time with a whiteboard and a couple of evenings coding, and I have a mostly functional VCard parser ready to go.

When I am setting out to build something like a parser, I like to consider the designs of other established libraries that perform a similar function. As I’ve already mentioned, VCard is a lot like XML in structure, so XML parsers are a great place to look for inspiration. I built a structure that loosely mimics the XmlDocument model, an object tree that holds the entire document in memory. It won’t perform as fast as using something like an XmlReader, but it makes it easier to handle variations in document format with polymorphism. Since the VCard documents I’m processing should never be too big, it won’t be much of a performance burden anyway.

The object structure that holds VCard data looks like this:

A UML diagram showing the class structure for holding VCard data.

At first I wanted to have a property like Value, but there is no reliable way to know the type of a VCard value without knowing the structure of the document, and at this level I don’t want to be coupled to the structure of the document. Instead, I will be adding a series of methods to get and set the EscapedValue as a specific type.

The VCardSimpleValue class was a late addition to the model. I needed a way to hold parameter values (the equivalent of attributes in XML), but since they can’t have parameters of their own I didn’t want to pick up the Parameters collection. I also considered making a type separate from VCardValue for the parameters, but both these classes will need the ability to read and write the escaped value, and I don’t want to write that code twice.
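A loose Python sketch of how the two classes could share the escaped-value code is below. Only the class names, the Parameters collection, and the EscapedValue idea come from the design above; the inheritance direction, the method names (get_text, set_text, get_int), and the exact escaping rules (backslash before backslash, comma, and semicolon, with \n for a newline) are my assumptions for illustration.

```python
from typing import Dict, Optional


class VCardSimpleValue:
    """Holds only the escaped text, with typed accessors on top of it."""

    def __init__(self, escaped_value: str = "") -> None:
        self.escaped_value = escaped_value

    def get_text(self) -> str:
        # Undo backslash escapes; "\n" or "\N" becomes a real newline.
        out, chars = [], iter(self.escaped_value)
        for ch in chars:
            if ch == "\\":
                nxt = next(chars, "")
                out.append("\n" if nxt in ("n", "N") else nxt)
            else:
                out.append(ch)
        return "".join(out)

    def set_text(self, text: str) -> None:
        # Escape backslashes first so later escapes are not doubled.
        escaped = text.replace("\\", "\\\\")
        for ch in (";", ","):
            escaped = escaped.replace(ch, "\\" + ch)
        self.escaped_value = escaped.replace("\n", "\\n")

    def get_int(self) -> int:
        return int(self.escaped_value)


class VCardValue(VCardSimpleValue):
    """A named value that, unlike a parameter, can carry parameters."""

    def __init__(
        self,
        name: str,
        escaped_value: str = "",
        parameters: Optional[Dict[str, VCardSimpleValue]] = None,
    ) -> None:
        super().__init__(escaped_value)
        self.name = name
        self.parameters: Dict[str, VCardSimpleValue] = parameters or {}
```

The caller decides which accessor to use, because only the caller knows what type a given value is supposed to hold.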

I’ve added the first unit tests to the solution, checking the parser against a number of examples in the specifications and real snippets extracted from emails sent by Outlook. I’ve also added tests for a number of failure cases in the parser, such as groups without an ending or value lines without a value delimiter.
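The two failure cases mentioned above can be sketched in a few lines of Python. This is an illustration, not the project's parser: I am assuming that a "group" is a BEGIN/END block, and VCardParseError and parse_groups are hypothetical names.

```python
from typing import Iterable


class VCardParseError(Exception):
    """Hypothetical error type for malformed input."""


def parse_groups(lines: Iterable[str]) -> dict:
    """Build a tree of nested groups from BEGIN/END lines."""
    root = {"name": None, "values": [], "groups": []}
    stack = [root]  # the innermost open group is always stack[-1]
    for number, line in enumerate(lines, start=1):
        name, sep, value = line.partition(":")
        if not sep:
            raise VCardParseError(f"line {number}: missing value delimiter")
        if name.upper() == "BEGIN":
            group = {"name": value.upper(), "values": [], "groups": []}
            stack[-1]["groups"].append(group)
            stack.append(group)
        elif name.upper() == "END":
            # An END must close the group that is currently open.
            if len(stack) == 1 or stack[-1]["name"] != value.upper():
                raise VCardParseError(f"line {number}: unexpected END:{value}")
            stack.pop()
        else:
            stack[-1]["values"].append((name, value))
    if len(stack) > 1:
        raise VCardParseError(f"group {stack[-1]['name']} has no END")
    return root
```

A test for each failure case then only has to feed in a short list of lines and check that VCardParseError is raised.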

My next addition will be parsers for the value types and a stub in the test harness that replies to emails with some info about the original request.

Themis: System Design

I’m charging forward on the Shared Resource Schedule Service. There are a lot of things I’m getting into place before I start writing code. Don’t try to tell me that I’m not being agile either; you always need to design a little to write good code. The key to being agile is designing only as much as you need, and being prepared to change your mind later.

The Name

The first order of business was to choose a name. It helps to know the name before you create a project. I’m also hoping to put the source in a public repository right away so that I can talk about the coding process while I build it.

Themis comes from Greek mythology. According to Wikipedia, she “is the embodiment of divine order, law, and custom.” That seems somewhat applicable to the subject at hand. I know that Greek mythology isn’t too creative, but we can always change the name later if anyone has a better idea.

Application Structure

A diagram showing the two main applications and the database that they both reference.

This system should be very simple. It’ll have a Windows service to check for invitation emails, a web page to view schedules and configure the service, and a database to hold all the information so that both apps can access it. For simplicity, I want to put most of the code into a single assembly. I want the web UI and the service runner to be a thin veneer over the main assembly.

Application Design

Although this is just a home project, I do want it to be good quality, and I would even if my name weren’t going to be pasted into every source file with the license. To achieve this, I will be using unit tests in as many areas as I can add them, not only to catch my mistakes but also to help others understand and maintain the system after I’m gone. To make unit testing easier, I want to use an IoC / DI coding style.

I’m going to use Entity Framework to keep the application code ignorant about the database. I don’t want to be tied to a particular database system, though I suspect most installations would use a SQL CE database. I could use any ORM, but I’ve been wanting to try EF.

Processing Mail

Now that enough of the design (and some of the non-coding elements) has been straightened out, I can start to really focus on the first part of the project: processing email.

A sequence diagram showing the mail processing sequence.

I want to build the email handling logic as a group of small components, each responsible for a single aspect of the process. Aside from some shared types to hold email or request data, and the controller that knows how to orchestrate them, none of the components will know anything about each other. Even then the controller won’t know which types contain the actual implementation, thanks to the IoC container.

By building each part in isolation I get a few good advantages. First, I’m able to test each component completely and separately from the rest of the system. Second, it becomes possible to replace any part of the chain with nothing more than a configuration change, which will be especially handy if anyone wants to fork the code. Third, I can reuse the individual components for other parts of the system, and they are likely to require less refactoring before they can be reused.
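Here is a small Python sketch of that shape, under stated assumptions: the component names (MailSource, RequestParser, Responder, MailController) are invented for the example, and plain constructor injection stands in for the IoC container. The stand-in classes at the bottom are the kind of thing a unit test would inject.

```python
from dataclasses import dataclass
from typing import List, Optional, Protocol


@dataclass
class Email:
    sender: str
    subject: str
    body: str


# Each interface covers a single aspect of the process.
class MailSource(Protocol):
    def fetch(self) -> List[Email]: ...


class RequestParser(Protocol):
    def parse(self, email: Email) -> Optional[str]: ...


class Responder(Protocol):
    def reply(self, email: Email, text: str) -> None: ...


class MailController:
    """Orchestrates the components; it knows only the interfaces."""

    def __init__(self, source: MailSource, parser: RequestParser,
                 responder: Responder) -> None:
        self._source = source
        self._parser = parser
        self._responder = responder

    def run_once(self) -> int:
        handled = 0
        for email in self._source.fetch():
            details = self._parser.parse(email)
            if details is not None:  # skip mail that is not an invitation
                self._responder.reply(email, details)
                handled += 1
        return handled


# In-memory stand-ins for testing the controller in isolation.
class FakeSource:
    def __init__(self, emails: List[Email]) -> None:
        self.emails = emails

    def fetch(self) -> List[Email]:
        return list(self.emails)


class SubjectParser:
    def parse(self, email: Email) -> Optional[str]:
        return email.subject if "BEGIN:VCALENDAR" in email.body else None


class RecordingResponder:
    def __init__(self) -> None:
        self.sent = []

    def reply(self, email: Email, text: str) -> None:
        self.sent.append((email.sender, text))
```

Swapping FakeSource for a real mailbox reader would not touch MailController at all, which is the point of keeping the components ignorant of each other.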

This is the new coding style: you build a system with complementary building blocks loosely hooked together. You end up with a lot of classes, but you avoid the pain of complex inheritance trees.

The Next Step

Now that I’ve figured out my approach, I see that my first step is too big for an afternoon. Instead, I’m going to tackle it in three smaller steps:

  1. Receive any email and send a simple response back.
  2. Receive event invitations by email and send a human-readable response listing the details of the invitation.
  3. Receive event invitations by email and send an acceptance email. (The original requirement.)

All I have left is to make a good cup of tea, turn up the music, and bang some keys.