Friday 22 October 2010

From Java to Scala in O's and X's

I've been doing a lot of evangelising on the Scala language, mainly to the Java devs that I work with. A number of them have become very interested in the language and have started making an effort to learn more about it. The most common questions relate to how to move from Java to Scala and what the key differences are (such as a functional approach, building apps through composition and mixins, the type system and so on).

With this in mind I decided to build a small application that starts as a typical Java development and over a number of versions becomes a full-blown Scala implementation. All versions of the application are maintained so that it becomes possible to step through and see each new level of progression into the Scala way of doing things. The progressions can be presented quickly in an hour or so, or it is possible to spend much longer on each one discussing design decisions and so forth. Even without the Scala focus, it's also a good application for discussing different design and implementation strategies with Java developers.

To have an application where the domain is easy to understand I picked the age-old Noughts and Crosses game (Tic-Tac-Toe for my American readers). The advantages of this game are that it is very simple to learn and understand, it has some interesting data structures and an interesting set of rules that can be explored for selecting the best move to make. It can also be implemented quite happily in a handful of classes and a few hundred lines of code. The particular approach I selected was to have the game played by two computer players who take it in turns to pick the best move until one either wins or the game is drawn because the grid is full.

Aside: Initially the rules were so optimised that my implementation only ever resulted in a drawn game. The final solutions therefore ensure that the first couple of moves are randomly selected in order to introduce a bit of interest.

All of the source code that I'm going to discuss is available on my github account at: http://github.com/skipoleschris/OandX

Version 1 - Typical Java Code

After spending so much time writing Scala (and the most Scala-like Java code possible) it was surprisingly difficult dropping back into the type of Java code that you find on most projects. For this version I tried to adopt the coding style of a typical Java developer. The code is written in plain Java, with tests in Java using TestNG. The solution works fine but has a number of code and design smells, specifically:

  • 'null' is used to represent empty positions on the grid
  • The grid is represented in a Java Bean style and so all details of how to work with the contents of the Grid are actually outside the Grid class - leading to a pattern of utility classes that contain large amounts of domain logic
  • There is a lot of duplication in the WinScanner and MoveFinder classes
  • MoveFinder in particular makes it very difficult to separate the rules being applied from the complex code implementing them
  • MoveFinder has the most fantastic if ( ... == null ) multi-repeated check method!

Version 2 - Slightly Improved Java Code with ScalaTest

The next version improves the Java code slightly. (It's still code way below the level I would normally write, but this is an exercise in converting from Java to Scala, not in producing perfect Java code!) In particular, the Grid class now contains more domain logic; the WinScanner and MoveFinder classes have duplication removed; and the MoveFinder is implemented with a chain of filters rather than the nasty if clause. That said, the MoveFinder is still WAY too complex.

The other main feature of this version is that the tests have been ported over to ScalaTest. I went for the BDD style of tests using FeatureSpec, GivenWhenThen and MustMatchers as I feel this shows the full power of DSL-like test constructs. Most Java developers I have shown this to seem to find it very readable. Typical tests now look like:

feature("The OnX grid") {

  scenario("can have a token added to it") {
    given("a new grid")
    val grid = new Grid

    when("a token is added at a given position")
    grid.addToken(Grid.MIDDLE, Token.nought)

    then("that position contains the token")
    grid.getToken(Grid.MIDDLE) must be (Token.nought)
  }

}

Hopefully much more readable!

Version 3 - Java Without Semicolons

The third version ports the Java code directly to Scala with minimal use of advanced Scala language features. All I did here was convert the classes and data structures directly, remove boilerplate code and use Scala's for-loop syntax. At this point it still looks and reads pretty much like a Java application.

For example, here's the same method from the MoveFinder class implemented in both Java and Scala:

private static class DoubleWinPositionFinder implements PositionFinder {
    @Override
    public Position findPosition(final Grid grid, final Token token) {
        final Set<Position> positions = new HashSet<Position>();
        for ( final Line line : grid.getLinesWithMatchingTokenAndTwoSpaces(token) ) {
            for ( final Position position : line.getEmptyPositions() ) {
                if ( positions.contains(position) ) return position;
                else positions.add(position);
            }
        }
        return null;
    }
}

private class DoubleWinPositionFinder extends PositionFinder {
  def findPosition(grid: Grid, token: Token): Position = {
    var positions = Set[Position]()
    for ( line <- grid.linesWithMatchingTokenAndTwoSpaces(token) ) {
      for ( position <- line.emptyPositions.toList ) {
        if ( positions.contains(position) ) return position
        else positions += position
      }
    }
    null
  }
}

This approach is a great way to get working with Scala. Just bring all your Java knowledge and coding style with you, make use of the Scala tools, test frameworks and so on and gradually move towards advanced Scala when you feel comfortable.

Version 4 - Moving Forwards With Scala

In this version I moved to a much more Scala-like model; a short sketch illustrating the first two points follows the list below. In particular:

  • The Grid class is now immutable, returning a new Grid on each change
  • The Option[T] concept is now used as a replacement for null to allow better representation of empty positions
  • The code uses more advanced functional concepts such as map, flatMap and filter
  • Mixins are used to compose the Grid with traits representing line handling, win scanning and move finding. This gives a much better representation of the Grid concept and a smaller API surface, while still separating concerns into individual traits and classes
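
To illustrate the first two points, here is a minimal sketch (not the actual OandX code, which lives in the repository linked above) of an immutable grid that returns a new Grid on every change and uses Option rather than null for empty positions:

case class Position(row: Int, column: Int)

sealed trait Token
case object Nought extends Token
case object Cross extends Token

class Grid private (tokens: Map[Position, Token]) {
  def this() = this(Map.empty)

  // Adding a token never mutates state; it returns a new Grid instead
  def addToken(position: Position, token: Token): Grid =
    new Grid(tokens + (position -> token))

  // An empty position is None rather than null
  def tokenAt(position: Position): Option[Token] = tokens.get(position)
}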

While I think this is a much better implementation than version 3, I am still not happy with some of the line-building logic, and the move finder remains too complex: it is still difficult to separate the rules being applied from how they are implemented. In particular, code like this (which finds all the empty positions on a list of lines) is just way too messy:

def emptyPositions(lines: List[Tuple2[List[Option[Token]], Position]]) =
  lines.filter(_._1 contains None).flatMap(empties => empties._1 zip positions(empties._2)).filter(_._1 == None).map(_._2)

This leads me to believe that, even though I am using some of the more advanced language features, something is wrong with my underlying data structures if the code gets this complex. However, looking at our move finder function we can see some simplification starting to take place:

private case object DoubleFreePositionFinder extends PositionFinder {
  def findPosition(token: Token) = {
    val allEmptyPositions = emptyPositions(linesWithMatchingTokenAndTwoSpaces(token))
    if ( allEmptyPositions.contains(Position(1, 1))) Some(Position(1, 1))
    else if ( allEmptyPositions.isEmpty ) None
    else Some(allEmptyPositions.head)
  }
}

Version 5 - The Final Solution

For version 5 I decided to approach the problem from a much more functional perspective. By switching my thinking from an object approach (i.e. what is my state and how can I encapsulate it?) to a functional approach (i.e. what functions do I want to perform?) I realised an important point. Although the noughts and crosses game is represented as a grid, all of the functions we want to apply operate either on an individual position on the grid or on one of the eight lines that can be made from it. By switching to a data structure more suited to applying these functions we end up with a significantly simpler set of functions for evaluating the game state.

For example, we can take a list of tokens, zip them with a list of positions and then apply functions across either the token part or the position part of the pair:

class Grid(tokens: List[Option[Token]]) extends TokensWithPositions
                                        with Lines
                                        with LineQueryDSL
                                        with WinCheck
                                        with MoveFinder {

  require(tokens.length == 9)

  val positions = for (row <- 0 to 2; column <- 0 to 2) yield Position(row, column)
  private val values = tokens zip positions
  // ...
}

Lines can therefore be represented in exactly the same manner:

type Line = List[Pair[Option[Token], Position]]
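
As a rough illustration (this is a sketch based on the names above rather than the exact repository code, and it assumes Position exposes row and column fields and that the pairs are in row-major order), the eight lines can be derived from the zipped token/position pairs as three rows, three columns and the two diagonals:

def lines(values: List[Pair[Option[Token], Position]]): List[Line] = {
  val rows      = values.grouped(3).toList
  val columns   = (0 to 2).toList.map(c => values.filter(_._2.column == c))
  val diagonals = List(
    values.filter(p => p._2.row == p._2.column),      // top-left to bottom-right
    values.filter(p => p._2.row + p._2.column == 2))  // top-right to bottom-left
  rows ::: columns ::: diagonals
}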

This new structure greatly simplifies the rest of the application code when it comes to building and working with lines:

def emptyPositions(lines: List[Line]) = lines.flatMap(_.filter(_._1 == None).map(_._2))

The other main change in this version is separation of the rules for finding the best position for a player from how those rules are implemented. This is achieved through a custom DSL that allows defining the rules in a highly readable way. For example, the rule we saw previously now becomes:

private def doubleFreePosition(token: Token) =
  find linesHaving 2 positions Empty and 1 tokenMatching token select First take EmptyPosition

The DSL is a fairly simple one, implemented as a chain of filtering functions that narrow the set of lines, followed by a selector that pulls the correct result from the matching lines. It turns out to be easy to read and follow, and it also significantly simplifies the code and reduces duplication.
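
To give a flavour of how such a DSL can hang together, here is a heavily simplified, hypothetical sketch (not the actual OandX DSL, which is richer than this): each step in the chain filters the candidate lines and a final selector extracts the result.

class LineQuery(lines: List[Line]) {
  // Each filtering step returns a new query over the narrowed set of lines
  def having(p: Line => Boolean) = new LineQuery(lines.filter(p))

  // The selector pulls a result out of whichever lines remain
  def firstEmptyPosition: Option[Position] =
    lines.flatMap(_.collect { case (None, position) => position }).headOption
}

def find(lines: List[Line]) = new LineQuery(lines)

// Roughly equivalent to the rule above:
// find(allLines) having { line =>
//   line.count(_._1 == None) == 2 && line.count(_._1 == Some(token)) == 1
// } firstEmptyPosition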

Although version 5 is a much better solution, there is still room for improvement, and some functions could certainly be implemented in a better way. Let me know if you find them and send me your better solution :-)

Version 6 - Adding Some Actors

The final version takes the code from version 5 and wraps actors around the classes: one actor for the game coordinator, one for the grid and one for each player, all loosely coupled through message passing.
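
As a rough sketch of that shape (using the scala.actors library that shipped with Scala at the time; the messages and names here are illustrative rather than those in the repository), a player waits for its turn and sends its chosen move back to the coordinator:

import scala.actors.Actor
import scala.actors.Actor._

// Hypothetical messages; the real code defines its own protocol
case object YourTurn
case class Move(row: Int, column: Int)
case object GameOver

class Player(coordinator: Actor) extends Actor {
  def act() {
    loop {
      react {
        case YourTurn =>
          // in the real game the version 5 move finder chooses the position
          coordinator ! Move(1, 1)
        case GameOver => exit()
      }
    }
  }
}

class Coordinator(players: List[Actor]) extends Actor {
  def act() {
    players.head ! YourTurn
    loop {
      react {
        case Move(row, column) =>
          println("player moved at (" + row + ", " + column + ")")
          // apply the move to the grid actor, check for a win or draw,
          // then hand the turn to the other player
      }
    }
  }
}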

Noughts and Crosses is not the best demonstration of actors and concurrency, as its turn-based nature makes it a purely sequential game. However, version 6 does allow the concept of actors to be demonstrated in an easily understood way, and it leads on to more detailed conversations about using actors for concurrent problems.

Conclusions

The Noughts and Crosses example shows how a Java programmer can migrate to Scala in a gradual manner, adding new Scala concepts as they become more comfortable. However, it's important to point out that Scala is certainly not a silver bullet for good software. Like all development it requires careful thought, good design, solid programming practice, good testing and constant refactoring. Any Scala application will only be as good as the developer(s) building it.

Monday 11 October 2010

Solving Polymorphic Problems using Scala Type Classes

Last week I attended Scala Lift Off London. There were a couple of sessions on Type Classes in Scala, which I found very interesting. In fact, they offer a great solution to a problem that I encountered recently on a Java project (although I'm using Scala code for this entire post).

This particular project had a number of domain objects that could be 'published'. In this particular case, published meant converted to an XML representation which was then passed to a remote server for processing. Multiple domain objects can be published at the same time in a 'publish package'. The challenge was that none of these domain objects had any particular similarities or common interfaces. E.g.

class Page {
  // Page attributes and methods
}

class Story {
  // Story attributes and methods
}

// Other domain objects: image, component etc.

Now, there are a number of ways that these classes could be handled for publishing. An initial suggestion might be to add a Publishable interface to each domain object. However, this is a bad idea for a number of reasons:

  • It couples the knowledge of publishing into domain objects that are used for multiple different purposes
  • The domain objects are complicated for all users who don't require publishing
  • It may not be possible to change the domain objects if they are owned by another project.

So, how do we go about adding these domain objects to the publish package? Well, there are a number of approaches, none of which is ideal. The first is to add a separate method to the publish package for each class that can be published:

class PublishPackage {
  def add(page: Page) = ...  // convert to XML and add to package
  def add(story: Story) = ... // convert to XML and add to package
  ... // And so on
}

This creates a publish package that is very unstable and will soon grow beyond a maintainable size. Not good. Another approach is to have some kind of dynamic class-based lookup:

class PublishPackage {
  private val mappers: Map[Class[_], PublishMapper[_]] = ...

  def add(obj: AnyRef) = ... // Look up the mapper by the object's class and call it
}

This is much better, as the logic to map each domain object to its publish XML is now outside the publish package. However, it still has some issues: we need to update the map of mappers each time we have a new domain object type to publish, which may involve changing publish code or configuration. There is also a runtime risk, in that the compiler will let us pass any object to the add method even if it can't be published, so we lose some level of type safety. We also have to write additional code in the add method to handle this case.

This general problem (wanting to handle objects polymorphically when they don't share a common base type) occurs all too frequently in software, especially when dealing with legacy code bases or objects from external dependencies. Fortunately, Scala Type Classes give us an elegant solution. What we will do is take the second approach described above, but use Type Classes to solve the problems of runtime safety and the need to update the map of mappers.

First, we change the publish package to accept publish-ready XML, rather than binding it to any specific or general domain objects:

class PublishPackage {
   def +(nodes: NodeSeq) = ... // Add nodes to publish package
}     

Next we declare a common trait that will be implemented by all the mappers:

trait PublishMapper[T] {
  def formatForPublish(content: T): NodeSeq
}

Next we declare the type class instances (implicit objects in this case) that implement each of the mappers. Note that they are defined as implicit, allowing the correct one to be selected automatically for the type of domain object being used:

implicit object PagePublish extends PublishMapper[Page] {
  def formatForPublish(content: Page) = ...
}

implicit object StoryPublish extends PublishMapper[Story] {
  def formatForPublish(content: Story) = ...
} 

Next we declare an implicit conversion that takes an implicit PublishMapper, which it calls to convert the domain object into the NodeSeq for adding to the publish package:

implicit def objectToPublishXml[T](t: T)(implicit m: PublishMapper[T]): NodeSeq = m.formatForPublish(t)

Let's talk about this one in more detail. It's a parameterised function marked as implicit, so we can convert from any T to a NodeSeq, provided that there is a PublishMapper instance available for that same type T. If those conditions are met then the formatForPublish method is called on the mapper and the result returned.

Now that we have this in place, we can just write the following code:

val somePage = new Page
val someStory = new Story

val p = new PublishPackage
p + somePage
p + someStory

So, now we don't have to modify any existing code when we want to support a new publish type: we just implement a new PublishMapper for it and the compiler will notice that the new implicit conversion is available. Additionally, we are now fully typesafe, as the compiler will raise an error if we try to add any domain object to the publish package for which there is no publish mapper conversion.
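
For example, given a hypothetical Author class with no corresponding PublishMapper in scope, the code simply fails to compile:

class Author

p + new Author  // does not compile: no PublishMapper[Author] to drive the implicit conversion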

As we have hopefully seen, Scala Type Classes are an elegant solution to a very common problem, giving us continued type safety for a minimal amount of extra code. In fact, I'd argue that the tiny amount of Scala complexity here saves far more code than we would have to write if we were doing class-based dispatch and runtime type checking.

Friday 8 October 2010

Scala Lift Off London - Day 2

Day 2 of Scala Lift Off London started with some quick-fire presentations on a number of interesting subjects:
  • OSGi and Scala
  • Lifty
  • Minesweeper Problem
Lifty especially looks like a great tool for faster creation of Lift apps using SBT. Certainly something I'm going to be installing.

Scala Test Frameworks and DSLs
The first session I attended was a group discussion about some of the test frameworks for Scala. We looked at both Specs and ScalaTest and compared their approaches and the differences in their DSL offerings. We then looked at ScalaCheck as an alternative solution for testing, which is especially useful when you need to generate a sequence of values to test against.

Finally we took DSLs even further by looking at test narratives for making tests clearer. For example, you can write directly in the test case:

def test() = {
     Alive cell dies when there are 3 neighbours
}

This is implemented by a DSL such that Scala converts it into the following series of chained method calls:

Alive.cell(dies).when(there).are(3).neighbours

A very interesting approach.
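
Out of interest, here is a minimal, hypothetical sketch of how such a chain can be wired up in Scala (the names mirror the example above, but this is not the code from the session):

object dies
object there

class NeighbourCount(n: Int) {
  def neighbours {
    // the terminal call is where the actual assertion against the game model would run
    println("checking the rule with " + n + " neighbours")
  }
}

class WhenClause {
  def are(n: Int) = new NeighbourCount(n)
}

class CellClause {
  def when(t: there.type) = new WhenClause
}

object Alive {
  def cell(behaviour: dies.type) = new CellClause
}

// Scala's infix notation then lets you write:
// Alive cell dies when there are 3 neighbours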

Monads, Functional Concepts, Type Concepts
I decided to have a go at hosting my own session to try to get some clarification on a number of topics that have been bugging me for a while. I'm coming from a Java background, so the functional elements of Scala, along with its much more complex type system, are the two areas where I have found the transition most difficult.

The aim of the session was to get those people who wanted clarifications on topics to post them on a whiteboard and then get members from the audience who were experienced in these areas to explain what they mean and how to use them in 5-10 minutes each. The session started well with some good audience participation. After a while it then dropped down to just Kevin Wright and Jon Pretty doing most of the work (thanks guys!).

During the hour (and the extra bit that we ran over into lunchtime) we covered:
  • Monads
  • Partial Functions
  • Currying and Partially Applied Functions
  • Monoids and Type Classes
  • Implicit Conversions
  • Self Types
  • Covariance and Contravariance
I learnt a huge amount from this session and a lot of the stuff I was only partially clear on now makes much more sense. I'm pleased I proposed the session as it was exactly what I was looking for.

Minesweeper
The third session of the day was a great talk by Kevin Wright (@thecoda) about implementing the Minesweeper game in Scala and Lift. An excellent example of how to write really great Scala and provide an elegant solution to the problem. Very interesting to see how this was done with only immutable structures and how ajax and Lift were combined in the front-end implementation.

I'd certainly recommend visiting Kevin's github page to explore this code in more detail.

Type Classes and the Cake Pattern
Another great presentation, this time by Jon Pretty. The first part was a more detailed look at Type Classes and the Monoid pattern. It got pretty in-depth, but Jon explained it well. I've since built up an example of using Type Classes to solve a real-world problem that I encountered; I'll post a description next week.

The second part of the session covered the Cake pattern. This is a great approach to building componentised Scala applications as an alternative to dependency injection solutions like Spring or Guice. I'm already very familiar with the Cake pattern, as I've done a fair bit of experimenting with it on my 'discala' experimental project. I'm going to follow up with Jon to explore some ideas in more detail.

Overall I'd have to say that Scala Lift Off was a great two days. There's a very vibrant Scala community in London and across Europe. I learnt lots and was very privileged to be in the same room as some very talented people. Also a big mention to SkillsMatter for hosting such a great event and providing some fantastic facilities. I'm looking forward to next year already!

Thursday 7 October 2010

Scala Lift Off - Day 1

I've been at Scala Lift Off London today (#scalalol). I had a great time and attended talks by some really knowledgeable people. I also met some people doing some really great stuff with Scala. The 'unconference' format took some getting used to at the start, but the group created a great agenda and it all came together nicely. Here's my take on today's talks and some ideas that I might propose for tomorrow.


Session 1 - Moving from Java to Scala
An interesting discussion of how a company moved from building apps in Java to their first Scala deployment, and a useful study in how you can move along incrementally, bringing the whole team with you as you go. Their progression can be roughly described as:

  1. Start writing unit tests for Java code in Scala
  2. Use Scala for main code, but still using Java coding styles (i.e. Java without the semi-colons, getters/setters and boilerplate)
  3. Starting to make use of Scala features to improve code (e.g. Option instead of null)
  4. Using more of the Scala language features (e.g. closures, immutable types etc.)
  5. Adopt the additional functional aspects of the language (e.g. map/flatMap)

What I found particularly interesting was their approach of not moving on to the next level of Scala until all of the team were up to speed with the current one. Their approach of getting the whole team on board in terms of style was also excellent.


This certainly proves it is possible to move a team of Java programmers to the Scala language. I'd certainly love the opportunity to take one of the teams that I work with on this journey on a project over the space of 6-9 months.


Session 2 - Akka
A fantastic session with Jonas Boner (@jboner) presenting the main elements of the Akka framework. The actor model for building scalable concurrent applications really excites me. I can see many cases where things that I'm working on now could be made much simpler, but at the same time more scalable and performant, by adopting this model.


I can certainly see some major architectural changes in the future for building large systems. Applications composed of thousands of lightweight actors, all working concurrently and passing messages around has got to be better than the current model of a single thread handling a single request and trying to deal with all the synchronisation issues whenever shared state is involved.


I was also very impressed with the supervisor model within Akka, where a supervising actor can manage a number of other actors and control how they get restarted in failure cases and so forth. The distributed actor model looks great too and is almost seamless (although deployment might be more challenging!).


I'm certainly looking forward to building some actor based applications. I'll post details as I do.


Session 3 - Experience of using Scala and Akka
Although this was an excellent session, it was probably the least valuable one of my day. It was great to hear about a successful project that used Scala and the Akka framework, especially how quickly and easily it all came together. However, it didn't offer much new information.


Session 4 - Software Transactional Memory (STM)
Another great session with Jonas Boner, looking at the STM support within Akka. I've played a bit with STM in the past, but combining this with the actor model looks very powerful. It's all very new at the moment, but I can see some great applications for this approach. The session also turned into a great discussion of STM and I learnt a lot of good stuff.


Other Thoughts
Martin Odersky (@odersky) did what looked like a great talk on advanced Scala and implicit conversions. Sadly I missed this as I was in the STM session. Looking forward to watching the video on this one.


Today for me was definitely an Akka day. However, a lot of the other sessions seemed to be on quite advanced or specialised topics. I'm hoping tomorrow to focus more on Scala and Lift, with perhaps some more intermediate and generic sessions. Some suggestions I might propose:

  • WTF is a Monad (and other Functional Goodness): I'm coming from a Java background and while I get the functional approach in Scala I still find it fairly challenging to work with. Some real world examples would help
  • The Scala Type System
  • Building Lift Applications for developers familiar with Java frameworks (e.g. Wicket, Tapestry, JSP, JSF)

Looking forward to tomorrow.