Monday, January 01, 2007
Web 2.0 (Time to Think Again)
However, the concept of the ‘mash-up’, a term introduced in the book, got me thinking, especially about application development and integration. Mash-up is the term used to describe the mashing together of content, often programmatically, from several different sources. MySpace pages are a classic example, where pictures, music and video are drawn together from all over the web. But what is interesting is how the APIs from several sources can be combined, as in housingmaps.com. Here is a site that has taken the Google Maps API and combined it with a property rental directory. You click on the Google Map and up pops a listing of rental property in that area.
Not only are the Google APIs interesting, but so are those provided by the likes of Amazon. Take S3 (Simple Storage Service) for example. This is brilliantly simple. Amazon have globally deployed terabytes and petabytes of highly available, high-performing storage that they use to support their business. By signing up to S3 and using their API you can store and retrieve any data, in fact any amount of data. I don’t need to build a data centre; I can just start using theirs. The barriers to entry for new businesses have just been lowered. What is even more fascinating is that you can use your existing Amazon account for billing! I just registered and, as they say, I’m good to go.
Google have recently retired their SOAP search API in favour of an AJAX API. Just add some JavaScript to your web page and you can incorporate the power of Google search into your applications. A number of client-side controls are provided for doing this.
So in the new Web 2.0 age, as designers and architects we should be providing AJAX APIs in addition to WSDL, as this is a quick and easy mechanism for integrating applications. Is SOA dead or being reinvented?
Hammers and Nails (The perils of Mark-up Languages)
I have recently been dabbling with WPF (Windows Presentation Foundation), Microsoft’s new graphics framework for building Windows applications, which is at the core of Windows Vista. The framework is a departure from previous frameworks such as MFC and .NET 2.0 WinForms, which treated the screen as a canvas sub-divided into rectangular regions. Each region is owned by a piece of logic, usually a graphical control, that is responsible for drawing and creating the content for that region. Buttons are responsible for drawing themselves, windows for drawing their surrounds, and menus know how to drop down or pop up. As a general rule each area of the screen was owned by one control, and the drawing of rectangular regions could not overlap. These limitations were fine in the days when processing resources were limited. The approach was optimised to work out which regions to draw and redraw as users interacted with the computer.
With WPF a complete re-think has occurred. Essentially WPF is based upon the concept of a graphics tree, a tree of graphical objects arranged in a hierarchy. The tree is treated in a declarative manner where each graphical element describes itself, e.g. I’m a rectangle, this big, with this fill colour and border. WPF has been designed to take advantage of the graphics cards that are included in the majority of PCs. All WPF graphic display is done via the DirectX graphics engine. The WPF graphics tree is converted into a DirectX graphics tree. That’s it.
In order to program WPF, Microsoft have helpfully provided a .NET library. You can develop applications in your favourite .NET language; there is a beta extension to Visual Studio 2005 for this. Like any good IDE and tool-set, Visual Studio provides a graphical editor for creating .NET WPF applications, windows, dialogs and controls. I’m sure you are all familiar with the standard drag-and-drop style of interface where you compose windows via the layout of graphical components.
OK, now here is the rub. Some of you may already be familiar with .NET 3.0 and WPF, and you may be wondering why I haven’t mentioned XAML (Extensible Application Markup Language). The reason I haven’t mentioned XAML is that it isn’t fundamental to developing applications in WPF. However, according to some folks WPF is XAML. The use of XAML is yet another example of the misapplication of XML in the development of software. The industry is fixated on the use of XML for everything, when in the majority of cases it is totally the wrong thing to do.
Starting with WPF/XAML and a few other examples, I’ll explain why. Then, the next time you think XML is cool, take a moment to reflect.
When developing applications in Visual Studio for WPF, the tool helpfully creates the XAML definition for, say, the window that you have just created. There are tags for buttons <Button>, applications <Application>, menu items <MenuItem> etc. Each tag in fact corresponds to a class in the WPF framework. To create a button with the text ‘OK’, you use the following: <Button>OK</Button>. You get the idea. When you compile and execute the application, Visual Studio helpfully provides a code generator that creates .NET code (usually C#) from the XAML definition.
Sounds fine? Well not exactly. Not exactly because XML, even with validating parsers, is not semantically rich enough to capture the constraints of the WPF framework. For example nested menu structures are created in XAML using the <MenuItem> tag. Here is an example of a nested menu.
<Menu VerticalAlignment="Top" HorizontalAlignment="Left"
      Height="20" Width="50" >
  <MenuItem Header="File">
    <MenuItem Header="New" />
    <MenuItem Header="Open" />
    <MenuItem Mode="Separator" />
    <MenuItem Header="Database" >
      <MenuItem Header="Create"/>
      <MenuItem Header="Delete" />
    </MenuItem>
    <MenuItem Mode="Separator" />
    <MenuItem Header="Exit" />
  </MenuItem>
</Menu>
You can create tabbed windows in a similar way with the <TabItem> tag. I wanted to create a two level tabbed window, so thinking it behaved like the MenuItem I created a similar structure to the menu above. The XML parser didn’t complain. It was only when I came to debug the application that the code generated from the XAML reported an error. Apparently TabItem can’t be nested in the way I intended. You need an intermediate control, such as a TabControl, between nested TabItems.
So this got me thinking: what is the point of having XAML in the first place? It doesn’t provide the semantic richness required and it introduces that awful XML tagged syntax; you can’t see the wood for the trees. In fact in the previous version, .NET 2.0, the graphical editor for WinForms worked in a similar manner. It created a resource file, not in XML, that was compiled into code. WPF would work a lot better if Microsoft hid the XAML and created a proper language with real keywords that was both expressive and powerful. XML appears to have been chosen because XML is cool, not because it is an appropriate choice of technology for the problem to be solved.
XSLT is another example. Here is a declarative language for transforming XML from one format to another. The basic language contains approximately 35 keywords, small by programming language standards. What did the wonderful people defining the language decide to do? They decided to use XML as the syntax of the language. What a mistake: it is impossible to read, edit and maintain. A simple syntax using keywords instead of tags would have been much better. Why the fixation on XML?
Finally, a few months back I was working on a small project which required the transfer of a complex nested data structure with cyclic references between a Windows .NET C# environment and a J2EE Java environment. Naturally the engineers with whom I was working decided to adopt XML. Various strategies were adopted for the XSD to accommodate the cyclic references (I won’t go into them now). Both the C# and Java environments provide tools for binding to XSDs, which allow the serialisation and de-serialisation of XML documents to be performed. Could we get data to transfer between the two, bi-directionally? Well, no. It transpired that there are many different ways to interpret the XSD when binding; list structures and references are particularly problematic. After two weeks of banging our heads against the wall, and in frustration, we decided to write a very noddy CSV (de-)serialiser that used reflection to transfer the data between the two environments. It worked like a dream, was very fast and didn’t cause code bloat. It took us only three days to complete, so we got the job done much quicker.
So the moral of all this is: just because you have a hammer, not everything is a nail. The world has gone mark-up language crazy. I’m waiting for the world to wake up and realise that XML was a bad idea and that how we used to do data transfer about eight years ago is more appropriate, that is, thinking explicitly about the semantics and encoding. You can see this with WSDL and web services. Application integration, especially SOA, requires efficient inter-application communication. WSDL allows bindings to non-XML transport structures. Most Web Service vendors, including Microsoft, have discovered that it is much more efficient to build systems that don’t use XML. Microsoft’s Windows Communication Foundation (WCF, formerly Indigo) supports non-XML bindings for the implementation of web services.
Wednesday, August 16, 2006
The Problems with the State Pattern
I cannot rave enough about the Gang of Four (GoF) Design Patterns book. Ever since it came out over ten years ago I have plugged this book to all the software engineers that I meet. I’ve even had two personal copies stolen from me. I treat this like the Gideons and their distribution of the Bible. I’m happy for people to steal my copies because if they think it is worth stealing it means they appreciate the value of the book.
Alas Design Patterns is getting a little dated. The code examples are in languages that aren’t so commonly used these days. The notation for presenting design and interaction diagrams is pre-UML. All in all the book needs to be revised and released in a new 2nd edition. There are a couple of other books that attempt to present the patterns in more modern languages, Java and C#, but somehow these aren’t very good.
I Have a Hammer and Everything Is a Nail
Engineers who encounter Design Patterns for the first time often go through an ‘ah ha’ moment where they recognise, and find names for, patterns that they have been using all along. The pattern names have become more common as they have crept into the design and implementation of class libraries, such as Observable (based on the Observer pattern) in Java.
The ‘ah ha’ moment is usually followed by an ‘oh wow’ moment as new patterns are discovered in the book, solving problems that the engineer has encountered previously but for which they hadn’t found a satisfactory solution. From that moment on there is often a desire for the engineer to use all of the patterns in the book. I often call this pattern mania. It’s the hammer syndrome: when you have a hammer, everything is a nail. I can use the hammer (patterns) for all of my problems.
With a bit of experience and guidance, pattern mania passes and a more informed usage of Design Patterns then ensues.
Not All Patterns Are Equal
Of the twenty-three design patterns, two need to be used with caution, and in fact I recommend them only with a health warning. These are State and Singleton. In this blog entry we’ll cover State; Singleton will follow in a subsequent entry. Singleton, for future reference, causes problems similar to the use of global variables, a universally bad idea.
Why I Dislike the State Pattern
Each of the Design Patterns includes a consequences section outlining the issues relating to the use and implementation of that particular pattern. I encourage you to read the State pattern chapter as there is little value to be gained from reproducing it here. A summary of the pattern can be found at http://exciton.cs.rice.edu/JavaResources/DesignPatterns/StatePat.htm, but reading the chapter in the book is best.
There are several consequences in using the State pattern, which to me are fundamental and make it to all intents and purposes impractical to use. These are:
Class Explosion!
It has taken the OO concept and applied it to the extreme as a class has to be implemented for each state. Even for simple state machines a large number of classes need to be implemented resulting in:
- More code to write
- Difficulty in reviewing the structure of the state machine as its implementation is smeared across multiple classes
Brittle Interface
The state interface as defined in the pattern is expensive to maintain when a new event is introduced. The valid events are defined in the interface of the ‘context’ class where there is a method for dealing with each event. These also need to be added to each state class.
Alternative State Pattern Implementation Strategies
There are other approaches to implementing state machines instead of using the State Pattern. These strategies vary depending upon the complexity of the state machine to be implemented.
Introduce an Event class
As the GoF Design Patterns book states, ‘program to an interface, not an implementation’. One of the shortcomings of the State pattern is that the interface is brittle, as previously described. When we want to add a new event we have to add a new method to the state ‘context’ interface and to all of the state subclasses. This is expensive and a fiddle to do, especially if there are many states implemented as classes.
The State Pattern only really goes half-way in providing an object-oriented implementation. What is missing is a representation for events. We can introduce a class for the event or, better still, an enumerated type, where the values of the enumeration represent the different possible events. We can then introduce an onEvent( Event anEvent ) method to the ‘context’ class and the State class.
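To make this concrete, here is a minimal sketch of the idea. It is in Python purely for brevity (the same shape carries over to Java or C#), and the order workflow, the state names and the event names are all invented for illustration; none of them come from the GoF book.

```python
from enum import Enum, auto


class Event(Enum):
    """Hypothetical events for an illustrative order workflow."""
    SUBMIT = auto()
    APPROVE = auto()
    CANCEL = auto()


class State:
    """Base state: one entry point for every event instead of one method per event."""
    name = "state"

    def on_event(self, event: Event) -> "State":
        # Default behaviour: ignore events this state does not handle.
        return self


class Draft(State):
    name = "draft"

    def on_event(self, event: Event) -> State:
        if event is Event.SUBMIT:
            return Pending()
        if event is Event.CANCEL:
            return Cancelled()
        return self


class Pending(State):
    name = "pending"

    def on_event(self, event: Event) -> State:
        if event is Event.APPROVE:
            return Approved()
        if event is Event.CANCEL:
            return Cancelled()
        return self


class Approved(State):
    name = "approved"  # terminal: inherits the "ignore everything" default


class Cancelled(State):
    name = "cancelled"  # terminal


class Order:
    """The 'context' class: its interface no longer grows with each new event."""

    def __init__(self) -> None:
        self.state: State = Draft()

    def on_event(self, event: Event) -> None:
        self.state = self.state.on_event(event)
```

Adding a new event now means adding one value to the enumeration and touching only the states that care about it; the signatures of the context and the state base class stay the same.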
Once we have a standard onEvent() interface we have a couple of alternative implementation strategies.
Table Driven
This is actually covered in the original Design Patterns book. We can make the implementation table driven. This essentially involves adding a two-dimensional array with states as one dimension and events as the other. Each element of the array contains the next state. Additional arrays can be implemented to support other aspects of transitioning from one state to another within onEvent(), for instance an action to be performed (based on the Command pattern perhaps). The array can either be populated through initialization code in the class or, alternatively, be data driven and read from a file (including an XML document).
The table driven approach has the advantage that the whole state machine can be viewed in a single place making it easier to review and maintain.
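As a rough sketch of what such a table might look like (again in Python for brevity, with an invented order workflow; the state and event names are assumptions, not anything from the book):

```python
from enum import Enum


class State(Enum):
    DRAFT = 0
    PENDING = 1
    APPROVED = 2
    CANCELLED = 3


class Event(Enum):
    SUBMIT = 0
    APPROVE = 1
    CANCEL = 2


S = State  # short alias to keep the table readable

# Rows are indexed by current state, columns by event.
# Each cell holds the next state; "no transition" is the current state itself.
NEXT_STATE = [
    # SUBMIT       APPROVE      CANCEL
    [S.PENDING,   S.DRAFT,     S.CANCELLED],  # DRAFT
    [S.PENDING,   S.APPROVED,  S.CANCELLED],  # PENDING
    [S.APPROVED,  S.APPROVED,  S.APPROVED],   # APPROVED (terminal)
    [S.CANCELLED, S.CANCELLED, S.CANCELLED],  # CANCELLED (terminal)
]


def on_event(state: State, event: Event) -> State:
    """Look up the next state for the given current state and event."""
    return NEXT_STATE[state.value][event.value]
```

A second table of the same shape could hold the action to run on each transition. Because the table is plain data, it could equally be loaded from a file rather than written in code.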
Flags and Conditional Logic
The state of the class can also be represented by flags, or variables for holding the state, within the class (using the good old enumerated value again). Conditional logic within the method can be used to determine the next state based upon the current state and the event that has just occurred: if the state == x then do this, else if the state == y do the following.
This approach can be surprisingly simple to implement. Often the state machine to be implemented isn’t that complicated and so this is a very practical solution. Like the table driven approach the whole state machine can be viewed in a single place within the method and changes are easy to make as a result.
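A minimal sketch of this style, using the same kind of invented order workflow (Python for brevity; the class, state and event names are hypothetical):

```python
from enum import Enum, auto


class State(Enum):
    DRAFT = auto()
    PENDING = auto()
    APPROVED = auto()
    CANCELLED = auto()


class Event(Enum):
    SUBMIT = auto()
    APPROVE = auto()
    CANCEL = auto()


class Order:
    """Holds its state in a plain variable and switches on it in one method."""

    def __init__(self) -> None:
        self.state = State.DRAFT

    def on_event(self, event: Event) -> None:
        if self.state is State.DRAFT:
            if event is Event.SUBMIT:
                self.state = State.PENDING
            elif event is Event.CANCEL:
                self.state = State.CANCELLED
        elif self.state is State.PENDING:
            if event is Event.APPROVE:
                self.state = State.APPROVED
            elif event is Event.CANCEL:
                self.state = State.CANCELLED
        # APPROVED and CANCELLED are terminal: every event is ignored.
```

The whole machine fits on one screen, which is the main attraction for small state machines.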
Guidelines
Most business objects do not have a sufficiently complex state to warrant a full implementation of the State pattern. Usually they follow a typical CRUD (Create, Read, Update and Delete) cycle. Therefore I recommend the following guidelines:
- In 9 out of 10 cases you should stick to the Flags and Conditional Logic approach described above, as most state implementations are not that complicated. This is usually sufficient for implementations with up to five states.
- If the state machine is more complex, say with ~6 states and events, then consider introducing Events and combining this with a State Pattern implementation.
- Finally, consider the Table Driven approach for the most complex state machines. Often I would still use this in preference to a class-based State Pattern implementation.
Sunday, July 09, 2006
What’s so special about Agility?
So, the Agile Manifesto is based on a dozen ‘principles’. Let’s take a look and see. I’m concerned that the Agilites somehow think they are unique in their beliefs and that there is no other way in which quality software can be produced.
1. Our highest priority is to satisfy the customer through early and continuous delivery of valuable software.
This is the fast food philosophy of software development – clearly the Agilites can produce something for you quickly; however, it is unlikely to be something of quality that will inspire you or satisfy you beyond your immediate needs. Many customers are looking for something substantial beyond their immediate short-term gratification. It is arrogant to assume that all customers are motivated by and share this priority. Clearly time to market is important, but slapping out versions of half-baked software is not always the answer.
I’m reminded of Fred Brooks’s epigraph to chapter 2 of The Mythical Man-Month, the chapter that shares its name with the book. He quotes from the menu of Restaurant Antoine, New Orleans – ‘Good cooking takes time. If you are made to wait, it is to serve you better, and to please you’.
2. Welcome changing requirements, even late in development. Agile processes harness change for the customer's competitive advantage.
Have these guys never seen effective change management processes? Even ‘waterfall’ techniques can accommodate change late in development when an effective change management process is in operation. Further, the customer is given an informed choice based on an impact assessment (these are really cheap and easy to do) irrespective of the development stage. The Agilites would press ahead without any foresight – I know some customers who wouldn’t feel comfortable with that. Can you imagine the scenario – hmm, let’s try that idea out and see if it works? I’m not sure what impact it will have longer term but let’s give this a try? This is a rather speculative and potentially risky strategy at times, don’t you think?
3. Deliver working software frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale.
Continuous integration is possible with an iterative, architecture-driven approach. I would have it no other way: daily builds once construction has started. It is just a question of when construction starts. Perhaps a little later than the Agilites would like.
4. Business people and developers must work together daily throughout the project.
Do the Agilites live in a different universe? If the customer is paying, they’d surely like the business to talk to the developers. I’ve never seen this not happen – why is this exclusive to Agile development?
5. Build projects around motivated individuals. Give them the environment and support they need, and trust them to get the job done.
Yep – done this, but not with Agile techniques. I’ve known developers wanting to defer their holidays because they could see the value of a plan-based, architecture- and design-led approach.
6. The most efficient and effective method of conveying information to and within a development team is face-to-face conversation.
True. But I hope they write it down – the mind is a fickle thing and miscommunication is easy. We can all speak the same words but mean totally different things.
7. Working software is the primary measure of progress.
Why? This is based on the premise that customers are not educated, or are too stupid, to understand that progress can be measured in other ways. If I’m having an extension added to my house I hardly expect the builders to turn up and start to dig the foundations and lay bricks until we’ve agreed the detailed blueprints. With an Agile approach they’d lay the foundations in the wrong place (costly to change) or build walls that would need to be knocked down. Why would a customer pay for an army of Agilites to start cutting code which will be wrong and require modification? Surely paying someone to think before they code is a far more prudent plan.
8. Agile processes promote sustainable development. The sponsors, developers, and users should be able to maintain a constant pace indefinitely.
What on earth does this mean? Can someone explain this principle? It seems to imply that sustainable progress is unique to Agile approaches.
9. Continuous attention to technical excellence and good design enhances agility.
How is this possible – on the Agile projects that I have observed everyone is racing along so quickly that they don’t have time to think. Technical excellence is rarely achieved and poorly thought-out designs occur. Somehow there is an expectation that some combination of rudimentary code refactorings will salvage a poor design. This is rarely the case.
10. Simplicity--the art of maximizing the amount of work not done--is essential.
Yes I subscribe to the Ludwig Mies van der Rohe ‘less is more’ principle.
11. The best architectures, requirements, and designs emerge from self-organizing teams.
What effective teams aren’t self-organising?
12. At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly.
Continual review is an excellent principle, but again this isn’t exclusive to Agile approaches.
Wednesday, July 05, 2006
SOA, Designing Web Service APIs
I often get involved in discussions concerning the design of Web Service APIs. Invariably people get hung up on the implementation technology (WSDL, SOAP etc.) and lose sight of the underlying issues. Most people’s initial stab at a Web Service API is to think “oh, it’s like a local API call but with slower response times”. Well almost, but there is more to it than this.
Below I have identified five areas in which the design of local and remote APIs differs. I think these are useful when thinking about Web Services; what do you think? In addition I believe they can be reused in the future when Service Orientation becomes passé and is rebranded into something like Super Lightening Integration Metaphor (SLIM) or whatever …
API Consideration Areas
1. Calling Semantics
Remote API
Call by value. Parameters and return values adopt pass-by-value semantics (e.g. objects are serializable).
Local API
Both call by value and call by reference can be supported. The latter is often preferred for local APIs for performance reasons.
2. Function Granularity
Remote API
Due to the communication overhead (principally latency) the API has to be designed so that the number of interactions/invocations is minimised. Typically this encourages a coarse grained API. This is also consistent with the calling semantics above.
Tens to hundreds of invocations per second are possible. Beware if you are porting an application that previously used a local API.
Local API
Since calling is local, there is a minimal communication and invocation overhead. Millions of invocations per second are possible.
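To illustrate the granularity point, here is a small sketch that simply counts round trips. The service class is a hypothetical in-memory stand-in (there is no real network here), and its method names are invented; each method call represents one remote invocation.

```python
class RemoteCustomerService:
    """Hypothetical stand-in for a remote service: one call = one round trip."""

    def __init__(self) -> None:
        self.round_trips = 0
        self._record = {
            "name": "A. Customer",
            "email": "a@example.com",
            "postcode": "AB1 2CD",
        }

    def get_field(self, field: str) -> str:
        """Fine-grained: fine for a local API, chatty over a network."""
        self.round_trips += 1
        return self._record[field]

    def get_customer(self) -> dict:
        """Coarse-grained: returns the whole record, by value, in one call."""
        self.round_trips += 1
        return dict(self._record)  # a copy, mimicking pass-by-value


# Chatty usage: one round trip per field.
service = RemoteCustomerService()
chatty = {f: service.get_field(f) for f in ("name", "email", "postcode")}
chatty_trips = service.round_trips

# Coarse usage: the same data in a single round trip.
service = RemoteCustomerService()
coarse = service.get_customer()
coarse_trips = service.round_trips
```

Both styles return the same data, but the fine-grained version pays the latency cost three times over, and that multiplier grows with every field added.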
3. Security Issues
Remote API
Two aspects:
- Authentication – is the caller who they say they are (how do we prevent client identity being spoofed)
- Encryption – is some of the data being exchanged in each function invocation sensitive? The information may be transported via an untrusted channel. Plain text passwords are an obvious example, but there may be commercially sensitive information that also should not be passed in the clear.
Local API
With local invocation it isn’t usually necessary to worry about these issues.
4. Recovery Semantics
Remote API
The API must be designed to support recovery following communication failure or the loss of messages. This applies to both function invocation and response.
Think of a simple banking application which has a ‘Debit Bank Account Transaction’ as an example. Here we wish to debit £10 from a customer account, and we need to ensure that the debit takes effect exactly once, even if the request is sent more than once. Technically, an operation that can safely be repeated in this way is said to be idempotent.
The message used for the invocation of the function may get lost, or the invocation message may be received successfully but the return response message may get lost on the way back.
There are two strategies for dealing with this:
- the use of a token for each unique invocation. The recipient can check to see if the request has been processed previously. See Synchronizer Token pattern (http://www.refactoring.com/catalog/introduceSynchronizerToken.html)
- the provision of a ‘query’ function. If the caller is unsure whether the function has been executed, a request can be made to determine the state.
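Both strategies can be sketched together as follows. This is a minimal illustration only: the service is a hypothetical in-memory stand-in for the remote provider, the method names are invented, and a real implementation would need persistent storage and an expiry policy for the tokens.

```python
import uuid


class AccountService:
    """Hypothetical in-memory stand-in for the remote provider."""

    def __init__(self, balance: int) -> None:
        self.balance = balance
        self._processed = {}  # token -> balance returned for that request

    def debit(self, token: str, amount: int) -> int:
        """Debit the account once per token; a retry replays the stored result."""
        if token in self._processed:
            return self._processed[token]  # duplicate request: do not debit again
        self.balance -= amount
        self._processed[token] = self.balance
        return self.balance

    def was_processed(self, token: str) -> bool:
        """The 'query' function: lets a caller check after a lost response."""
        return token in self._processed


# The caller generates one token per logical request and reuses it on retry.
service = AccountService(balance=100)
token = str(uuid.uuid4())
service.debit(token, 10)
service.debit(token, 10)  # retry after a presumed lost response; no second debit
```

The token turns the debit into an operation that is safe to repeat, and the query function gives the caller a way to find out what happened without risking a second debit.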
Local API
A local API doesn’t need to worry about these issues as communication is typically within a machine/server.
5. API State
Remote API
To support scalability and recovery, the API is typically designed to be stateless. The provider of the API is not in direct contact with the caller. The caller may crash without the provider realising. For stateful APIs there are recovery strategies, but typically these revolve around the use of timeouts, which increase the complexity of the API implementation and use more resources (as these need to be kept until the timeout has expired).
Local API
With local invocation, and typically as a result of adopting call by reference semantics, a stateful API can be implemented and is often desirable for performance reasons.
Thursday, June 29, 2006
Beware of the Agilites
The underlying theme of the Agilites (the advocates of agile methods and approaches) is to pooh-pooh ‘waterfall’ development processes and to claim that there is only one approach, the agile one, to the application of iterative development techniques. We all accept that a pure waterfall development is not ideal for the development of large-scale software; however, they reject so much by dismissing it completely that it is like throwing out the baby with the bath water. Many of the Agilites base their views on not understanding large-scale development, being poor practitioners of it or having experienced it being done poorly. Iterative development can be conducted in a non-Scrum, non-XP, non-FDD manner without being a pure ‘waterfall’.
Agile projects fail just as often as ‘waterfall’ ones; I’ve experienced all kinds and seen both fail. It has nothing to do with the approach and everything to do with the experience of the practitioners.
You can see the misunderstandings of the Agilites by examining the Agile manifesto.
Through this work we have come to value:
- Individuals and interactions over processes and tools
- Working software over comprehensive documentation
- Customer collaboration over contract negotiation
- Responding to change over following a plan
Let me take each of these points in turn.
- Point 1. As the old adage goes ‘a fool with a tool is still a fool’. The application of a development process and the appropriate use of tools requires skill and experience. A development process is only as good as the people (no fools please) and an endeavour of any complexity requires interactions. No development process espouses the absence of individuals interacting. Many neo-Agilites reading this value interpret it as a rejection of process and tools. They roll their eyes at the merest mention of the dreaded ‘P’ word and heaven forbid if you suggest that capturing a UML diagram in a tool can provide anything other than an inconvenience. I’ve applied both processes and tools very successfully, thank you.
- Point 2. This value is based on the premise that there is no benefit to be gained in producing artefacts other than software. The word ‘comprehensive’ is used in an almost disparaging way. All documentation should be produced for a purpose. We’ve all seen people/projects who go through the motions producing documentation that serves no purpose. This is not a problem with documentation per se. Who can dispute the value of creating a succinct domain model that provides insight into a business problem? These things cannot be uncovered by hacking software, or if they can, the problem domain must be trivially simple.
- Point 3. Can’t really dispute this point. However, I would say there are few of us who are in the luxurious position where commercial accountability does not come to bear. We all have to produce software on time, to quality and on budget.
- Point 4. This completely misunderstands the purpose of a plan and the art of project management. This is more a problem with poor project management (managers) than with plans. A plan isn’t, as implied here, something that is created at the outset and blindly followed. It is a valuable tool that enables progress to be tracked, dependencies to be managed and changes to be easily assessed, and it is continually refined and revised (dare I say agilely!?). Following a plan does not exclude accommodating change, far from it. It is only poor project plan practitioners who create a rigid plan and blindly stick to it. A plan should be revised daily (and not just to reflect progress).
In my next entry I will discuss how an architecture-driven approach can be used to drive a plan-based project which is adaptable to change and incorporates iterative techniques. It embraces the pragmatic application of precision and formalism whilst rejecting the ‘dash to code’ and ‘random walk’ approach to developing software proffered by the Agilites.