So, you want to write software?: January 2011

At one of my customers we have been doing a lot of work on rich domain models, based on the principles of Domain-Driven Design by Eric Evans. During this work we identified a particular anti-pattern in the existing code. I was trying to explain this to one of the developers on my team and it proved to be quite a subtle distinction. After thinking about it a bit more I decided to write this blog entry with some code examples to explain it in more detail.

Background

In the particular domain in question we have a (very simplified!) notion of Content and Metadata. Each piece of Content has some Metadata attached. However, the Content is sourced from one location (for example a CMS) and the Metadata is sourced from another (such as a remote web service). The challenge is combining the objects from these two different sources into a model that can be used by the higher layers of the application.

Anti-Pattern Solutions

The current solutions to this problem are based around the concept of an aggregator that pulls content from the different sources and merges it together into the returned objects. In all cases the returned objects follow the anaemic domain model pattern: just fields with setter and getter methods.

Anti-Pattern 1: Transient Relationship

In this pattern, the aggregator pulls one object (the Content) from its source, uses an id field in this object to obtain the other object (the Metadata) from its source, and then sets the Metadata on the Content as a transient field:

public class Content {
    private String id;
    private Long metadataId;
    // Other fields

    private transient Metadata metadata;

    // Constructors, setters & getters
}

public class Metadata {
    private Long id;
    // Other fields

    // Constructors, setters and getters
}

public class Aggregator {
    // Fields, constructors and setters

    public Content getContent(String id) {
        Content content = cms.getContent(id);
        Metadata metadata = metadataService.getMetadata(content.getMetadataId());
        content.setMetadata(metadata);
        return content;
    }
}

The above code has a number of major problems:

Clients of the Content class can use both the metadataId and metadata properties, possibly interchangeably. This makes for fragile client code that is hard to change.
Any serialisation will result in the transient field being lost. How do we get it back?
When we want to update the metadata relationship, do we update the metadataId, the metadata or both? How do we keep them in sync?
What happens if we can't get the metadata, but successfully get the content? Do we throw an exception and prevent the content being used (even if it is usable without its metadata)? Or do we return content with a null transient field, requiring the client code to do null checking?
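To make the serialisation problem concrete, here is a minimal sketch. It assumes Java's built-in object serialisation, and the class name TransientLossDemo, the roundTrip helper and the sample values are all made up for illustration rather than taken from the real codebase:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class TransientLossDemo {

    static class Metadata implements Serializable {
        private static final long serialVersionUID = 1L;
        final Long id;

        Metadata(Long id) {
            this.id = id;
        }
    }

    static class Content implements Serializable {
        private static final long serialVersionUID = 1L;
        final String id;
        final Long metadataId;
        transient Metadata metadata; // skipped by Java serialisation

        Content(String id, Long metadataId) {
            this.id = id;
            this.metadataId = metadataId;
        }
    }

    // Hypothetical helper: serialise to a byte array and read the object back.
    static Content roundTrip(Content original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Content) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Round trip failed", e);
        }
    }

    public static void main(String[] args) {
        Content content = new Content("article-1", 42L);
        content.metadata = new Metadata(42L);

        Content copy = roundTrip(content);
        System.out.println(copy.metadataId); // prints 42
        System.out.println(copy.metadata);   // prints null
    }
}
```

The id survives the round trip but the object it points to does not, so whoever deserialises the Content has to know to go back to the metadata service and repair the relationship.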

So, clearly, having an id and a transient field representing the same relationship is not a good approach. Can we do it without the transient field?

Anti-Pattern 2: Decoupled Objects

In this pattern we avoid the transient object by keeping the two objects (Content and Metadata) separate and using the aggregator to retrieve each:

public class Content {
    private String id;
    private Long metadataId;
    // Other fields

    // Constructors, setters & getters
}

public class Metadata {
    private Long id;
    // Other fields

    // Constructors, setters and getters
}

public class Aggregator {
    // Fields, constructors and setters

    public Content getContent(String id) {
        return cms.getContent(id);
    }

    public Metadata getMetadata(Long id) {
        return metadataService.getMetadata(id);
    }
}

This code solves SOME of the problems of the first solution, but introduces problems all of its own:

It is the total responsibility of the client code to manage the relationship between Content and Metadata - making the whole codebase more fragile.
It is very easy for client code to bypass the aggregator entirely and make direct calls to the metadata service, increasing complexity and coupling.
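A rough sketch of what this means for client code is shown here. The in-memory maps standing in for the CMS and the metadata service, the title field and the describe method are all assumptions made purely for illustration:

```java
import java.util.HashMap;
import java.util.Map;

class DecoupledClientSketch {

    static final class Content {
        final String id;
        final Long metadataId;

        Content(String id, Long metadataId) {
            this.id = id;
            this.metadataId = metadataId;
        }
    }

    static final class Metadata {
        final Long id;
        final String title; // hypothetical field, just to have something to display

        Metadata(Long id, String title) {
            this.id = id;
            this.title = title;
        }
    }

    // In-memory stand-ins for the CMS and the remote metadata service.
    static final class Aggregator {
        private final Map<String, Content> cms = new HashMap<>();
        private final Map<Long, Metadata> metadataService = new HashMap<>();

        void addContent(Content content) {
            cms.put(content.id, content);
        }

        void addMetadata(Metadata metadata) {
            metadataService.put(metadata.id, metadata);
        }

        Content getContent(String id) {
            return cms.get(id);
        }

        Metadata getMetadata(Long id) {
            return metadataService.get(id);
        }
    }

    // Every client that needs both halves has to repeat this two-step join
    // and decide for itself what a missing Metadata means.
    static String describe(Aggregator aggregator, String contentId) {
        Content content = aggregator.getContent(contentId);
        Metadata metadata = aggregator.getMetadata(content.metadataId);
        if (metadata == null) {
            return content.id + " (no metadata)";
        }
        return content.id + ": " + metadata.title;
    }

    public static void main(String[] args) {
        Aggregator aggregator = new Aggregator();
        aggregator.addContent(new Content("article-1", 42L));
        aggregator.addMetadata(new Metadata(42L, "Example title"));

        System.out.println(describe(aggregator, "article-1")); // prints article-1: Example title
    }
}
```

The join and the null handling in describe would be duplicated in every client that needs both objects, and each copy is free to handle them slightly differently.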

So, what's the better way to do this?

The Rich Domain Solution

Our solution to the problem is to correctly model the relationship between Content and Metadata. We can introduce a Repository for Content that correctly sets up and manages the relationship, hiding the underlying management of ids and using real objects instead:

public class Content {
    private String id;
    private Metadata metadata;
    // Other fields

    // Constructors, setters & getters
}

public class Metadata {
    private Long id;
    // Other fields

    // Constructors, setters and getters
}

public class ContentRepository {
    // Fields, constructors and setters

    public Content getContent(String id) {
        Content content = cms.getContent(id);

        // We could proxy the metadata here for lazy loading or to delay exceptions being
        // thrown until the metadata is requested. However, in this simple example we will
        // build the relationship directly. getPrivateMetadataId() is assumed to be a
        // non-public accessor, so the id never reaches client code.
        Metadata metadata = metadataService.getMetadata(content.getPrivateMetadataId());
        content.setMetadata(metadata);
        return content;
    }

    public void save(Content content) {
        // Persist changes and update the relationship if it has changed
    }
}

By hiding the management of the relationship between Content and Metadata in the core structure of the object model we prevent the details of how this is created and managed escaping from the domain model. Changes in the relationship between Content and Metadata can be correctly managed by the domain model. We also have options such as proxying the Metadata in Content to support lazy loading or delayed exception handling - something we can't do when details of the relationship are exposed to or managed by the client code.
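The proxying option could look something like the sketch below. It is one possible shape rather than the code we actually wrote: it assumes Metadata is turned into an interface, and LazyMetadata, SimpleMetadata, the title property and the hard-coded metadata id are all hypothetical names and values used only for illustration:

```java
import java.util.function.Function;
import java.util.function.Supplier;

class LazyMetadataSketch {

    interface Metadata {
        Long getId();
        String getTitle();
    }

    static final class SimpleMetadata implements Metadata {
        private final Long id;
        private final String title;

        SimpleMetadata(Long id, String title) {
            this.id = id;
            this.title = title;
        }

        @Override public Long getId() { return id; }
        @Override public String getTitle() { return title; }
    }

    // Proxy that defers the lookup until a property is first read, then caches it.
    // Any exception from the loader surfaces at that first read, not at getContent().
    static final class LazyMetadata implements Metadata {
        private final Supplier<Metadata> loader;
        private Metadata loaded;

        LazyMetadata(Supplier<Metadata> loader) {
            this.loader = loader;
        }

        private synchronized Metadata target() {
            if (loaded == null) {
                loaded = loader.get();
            }
            return loaded;
        }

        @Override public Long getId() { return target().getId(); }
        @Override public String getTitle() { return target().getTitle(); }
    }

    static final class Content {
        private final String id;
        private final Metadata metadata;

        Content(String id, Metadata metadata) {
            this.id = id;
            this.metadata = metadata;
        }

        String getId() { return id; }
        Metadata getMetadata() { return metadata; }
    }

    static final class ContentRepository {
        private final Function<Long, Metadata> metadataService;

        ContentRepository(Function<Long, Metadata> metadataService) {
            this.metadataService = metadataService;
        }

        Content getContent(String id) {
            // Assumption: a real repository would read the content and its
            // metadata id from the CMS; the id is hard-coded to keep this short.
            Long metadataId = 42L;
            return new Content(id, new LazyMetadata(() -> metadataService.apply(metadataId)));
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        ContentRepository repository = new ContentRepository(metadataId -> {
            calls[0]++;
            return new SimpleMetadata(metadataId, "Example title");
        });

        Content content = repository.getContent("article-1");
        System.out.println(calls[0]);                          // prints 0
        System.out.println(content.getMetadata().getTitle());  // prints Example title
        System.out.println(calls[0]);                          // prints 1
    }
}
```

Client code only ever sees a Content with a Metadata; whether that Metadata was loaded eagerly or on first use is a decision the repository can change without touching any caller.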

Just goes to show how much a bit of good rich domain modelling can improve the structure and maintainability of software, even in very subtle ways.

Addendum

Just been thinking further on the subject. The above adequately addresses the technical aspects of the anti-pattern and the solution. However, there is also a huge cognitive element. The aggregator anti-pattern encourages a mental model of the domain as one of disconnected objects that are being forced together. Clients of the domain are therefore much more likely to create code that treats the model in this way: code that will be more complex and fragile. However, following the rich domain approach, clients have a mental model of a single object (Content) and its relationship to other objects (such as Metadata) without being burdened with any of the separation knowledge. This cleaner mental model of the domain is much more likely to lead to clean, well-architected client code that is easier to test, maintain and enhance.

Tuesday, 25 January 2011

Anti-Patterns and Rich Domains