Category Archives: web

The Crazy Talk Series – An introduction to an alternative web architecture and construction method

If you have met me or worked with me in the last 15 years or so, you will know that I am pretty mad about the web as a platform, HTTP as a protocol and HTML as a state engine. Add to this the last 6 years or so of practising agile techniques while building highly scalable websites, and you will find that I have fairly unusual views on how to build and architect web applications.

A lot of people say to me, “Dan, you must write about X, as the industry seems to be doing Y.” While I find it very easy to talk in a face-to-face conversation or even on a mailing list, writing an article or post has always been a challenge due to the one-way nature of the conversation. Anyway, after some gentle nudging by Sarah Taraporewalla, Christian Blunden and Martin Fowler, I have decided to jump in head first and try to create a series of articles that will hopefully open a few people’s eyes to some alternatives.

So enough with the why and on to the what. I’m planning to cover the following in the series:

  • Caching on the wire not in the app
  • Addressable components
  • Personalisation, cacheability and composition
  • Moving state off the web server
  • Web server decoupling, in-memory testing and lightning fast builds
  • Some alternative libraries
  • Common HTTP patterns

HTML Contracts – How semantic HTML can help your cross-functional team

One of the pain points we see on web projects is the divide between client side and back end development. This pain might show itself in a number of ways:

  • Small changes in the HTML cause lots of tests to fail
  • Small changes to visual layout require large changes to the HTML which then causes the above
  • Developers say the work is done but it can’t be signed off as it looks terrible or doesn’t work in certain browsers
  • CSS developers or QAs want developers to add ID attributes to lots of elements so they can target them more easily

Now ideally all your developers should be poly-skilled and understand JavaScript / CSS / HTML just as well as they understand Java / C# / Ruby, but often the reality is not quite so rosy.
So if we are working in a world where we don’t have the ideal but still need to get the job done, what can we do to reduce the pain?

Well, the technique I have used on a number of teams is to come up with an “HTML contract”. An example might be as follows:

  • Everyone must make the HTML as semantic as possible
  • IDs are deprecated in favor of class attributes. Just use more specific CSS selectors to target elements
  • Developers will liberally add class attributes with semantic names even if they don’t need them immediately (QAs and CSS will use them even if you don’t)
  • CSS developers / designers can add class attributes if needed but cannot remove class attributes without pairing with a dev/QA.
  • Changing the HTML to support visual display (I’m talking about document order, float and clear) is severely frowned upon. If you have to do it, consider doing it with JavaScript instead.
  • QAs are to use hand-written XPath expressions in tests that match the domain and make extensive use of contains(@class, 'someClassName') and descendant:: rather than IDs or specifying HTML tags
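
To make the last point concrete, here is a minimal sketch of how such XPath expressions might be assembled. It is written in Python purely for illustration; the helper names (by_class, within) and the class names (basket, total) are hypothetical and not from any real project. It also pads the class attribute with spaces before matching, a common tightening of plain contains(@class, 'x') so that "total" does not accidentally match "subtotal".

```python
def by_class(*class_names: str) -> str:
    """Build a relative XPath matching any descendant that carries every class.

    No tag name and no ID is used, so the expression survives markup changes
    as long as the semantic class names stay in place.
    """
    predicates = "".join(
        f"[contains(concat(' ', normalize-space(@class), ' '), ' {name} ')]"
        for name in class_names
    )
    return f"descendant::*{predicates}"


def within(*steps: str) -> str:
    """Chain steps so each one is searched for inside the previous match."""
    return "/".join(steps)


# Hypothetical domain-level locator: "the total inside the basket".
basket_total = within(by_class("basket"), by_class("total"))
print(basket_total)
```

The resulting string can be handed to whatever XPath-capable test tool the team already uses. Because it reads like the domain (basket, total) rather than the markup (div, span), a designer can restructure the HTML without breaking the test.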

Some more general tips that I find help:

  • Converting tables to divs is really no better apart from download size. Use divs to group things or as the root element for a control, use lists for lists of things, tables for tabular data etc.
  • Hacking HTML, CSS or JavaScript just because you are fed up with IE6 is not acceptable
  • No cutting and pasting from the web, and no mass importing of JavaScript / CSS fixes. Don’t put it in if you don’t understand what it does.
  • When you have a CSS issue, keep deleting rules until you work out which rule is causing the problem, then build the rules back up.

I will post some examples in my next post.

The implicit back link

A little while back I was chatting to someone about SEO and the power of outgoing links. In the bad old days your client would ask you to build a web site that had no links leaving the site, or to put up some horrible confirmation page warning you that you are leaving!

What’s interesting here is that by not having any external links you are reducing your interconnections with other sites, and so reducing your ranking in search engines. Not only do most search engines allow you to search for linked pages, but each additional connection in either direction increases the weighting given to the idea that matching symbols (words) in those documents share the same implied semantics. This is just like a neural network, where each additional connection reinforces the bond, further increasing the chance of a pattern match.

Add to this the fact that incoming links, trackbacks and pingbacks are very common on blogs and other modern publishing tools, and those implicit back links become explicit.

So if you want people to link to you, you should link to them.

What I would love to do

Smart Contracts

We may end up getting RESTful smart contracts for free due to all the goodness in the current design. This could lead (my client).com to an unheard-of level of security and possibly make it the most high-profile public implementation of smart contracts to date. Watch this space.

Auto Save Enhancement

We currently auto save on page transition, but for JavaScript clients we could do this on any change. Imagine a world where your browser behaves like IntelliJ and you never lose any data you type in.

Web 3.0 Semantic Web

Because our data is so dynamic, we may have the opportunity to go from Web 2.0 to Web 3.0, and again this could make this one of the first killer semantic web applications.

What we will be doing in the future

Decoupling Resource selection from Representation selection in our URLs

This will allow resources (domain objects/documents) to be selected (think SQL WHERE clauses) and presenters to be re-used by the client’s editors without specific coding by developers. So a URL like “/video/skins/latest” could return all videos tagged with skins, or possibly any resource tagged with video and skins. What we are aiming for is totally dynamic / fluid URLs with uses we never imagined.
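
As a rough illustration of the idea, here is a small Python sketch. It rests on an assumption that is mine and not a description of the real system: the last path segment names the presenter (the representation) and every earlier segment is a tag used to select resources. The Selection type, parse_path, select and the sample data are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Selection:
    tags: tuple[str, ...]  # resource selection: which documents we want
    presenter: str         # representation selection: how to show them


def parse_path(path: str) -> Selection:
    """Split a URL path into tags plus a presenter (assumed to be last)."""
    segments = [segment for segment in path.split("/") if segment]
    if len(segments) < 2:
        raise ValueError("expected at least one tag and a presenter")
    *tags, presenter = segments
    return Selection(tuple(tags), presenter)


def select(resources: list[dict], tags: tuple[str, ...]) -> list[dict]:
    """Keep the resources that carry every requested tag (the WHERE clause)."""
    wanted = set(tags)
    return [resource for resource in resources if wanted <= set(resource["tags"])]


# Hypothetical content store.
resources = [
    {"title": "Skins trailer", "tags": ["video", "skins"]},
    {"title": "Skins cast photo", "tags": ["image", "skins"]},
]

selection = parse_path("/video/skins/latest")
matches = select(resources, selection.tags)
print(selection.presenter, [resource["title"] for resource in matches])
```

Because selection and presentation are parsed independently, an editor could swap the presenter segment without anyone writing a new controller, which is the kind of re-use described above.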


Editors will be able to mash up YouTube videos, Facebook items, MySpace pages, Flickr images etc. into pretty much anywhere on the site. Users will be able to mash up client content / videos onto any other site.

Social Networking

Allowing users to rate each other’s content, interact with client content / users and other social networking sites, and generally use the site in new and interesting ways.


We will be totally embracing web caching so that all content that is not specific to the current user will be cacheable. User-specific content will be included client side (in an accessible way) so that containing pages do not need to be made non-cacheable.
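
A minimal sketch of how that split might look in terms of HTTP headers, in Python for illustration only. The paths, the five minute lifetime and the function names are assumptions of mine; only the Cache-Control directives themselves are standard HTTP.

```python
def cache_control(user_specific: bool, max_age_seconds: int = 300) -> str:
    """Pick a Cache-Control value based on whether content varies per user."""
    if user_specific:
        # "private" keeps it out of shared caches; "no-store" asks that it
        # not be stored anywhere at all.
        return "private, no-store"
    # Same bytes for everyone, so browsers and proxies alike may cache it.
    return f"public, max-age={max_age_seconds}"


# Hypothetical routes: the page shell is shared, the greeting fragment is
# fetched client side and depends on who is asking.
ROUTES = {
    "/articles/42": False,
    "/fragments/greeting": True,
}


def headers_for(path: str) -> dict:
    return {"Cache-Control": cache_control(ROUTES[path])}


print(headers_for("/articles/42"))
print(headers_for("/fragments/greeting"))
```

The containing page stays cacheable because the one part that differs per user lives at its own URL with its own caching policy.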

What I have been doing for the last few months

So I thought I would start my first post with what I have been working on for the last couple of months. Naturally I have removed all references to my current client.

What we have built already:

RESTful Web Site and Web Components

So just like REST exposes your web service API endpoints as URLs and embraces simple message / state transfer, we are going a step further and exposing all our web components (think HTML widgets) at addressable URLs. Composition of components can then be done either server side or client side and is therefore not language specific, so one could compose a widget made up of other widgets written in Ruby, Python, C# etc. In fact the widgets don’t even need to be on our server or written by us – think mashups on steroids.

Post/Redirect/Get pattern

We have built a simple interface contract that enforces that all POSTs return a redirect so that the Back button will always work. This combined with the transaction boundaries ensures that all GET requests are idempotent.
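
One way such a contract could be sketched, here in Python rather than the Java we actually use, is to make a redirect target the only thing a POST handler is able to return. The decorator name, the handler and the choice of 303 See Other are my assumptions for the example, not a description of the real interface.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, Dict[str, str], str]  # (status, headers, body)


def post_handler(handler: Callable[[dict], str]) -> Callable[[dict], Response]:
    """Wrap a POST handler so that its result is always a redirect."""

    def wrapped(form: dict) -> Response:
        location = handler(form)
        if not isinstance(location, str) or not location.startswith("/"):
            raise TypeError("POST handlers must return a site-relative URL")
        # 303 tells the browser to follow up with a GET, so Back and Refresh
        # never re-submit the form.
        return 303, {"Location": location}, ""

    return wrapped


@post_handler
def add_comment(form: dict) -> str:
    # The state change would happen here; the handler then names the
    # resource the browser should GET next.
    return f"/posts/{form['post_id']}/comments"


print(add_comment({"post_id": "7"}))
```

A handler that tries to render a page directly from a POST fails loudly instead of quietly breaking the Back button.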

Simple transaction boundaries

Transaction boundaries have been enforced so that developers do not need to worry about them at all in production code. GET requests run in a transaction that will always be rolled back, POST requests will always commit if successful and always roll back if an error is thrown. Some serious work went into taming Hibernate so that it did not auto commit, had no mutable static state and was completely encapsulated.
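
The rule itself is small enough to sketch. The following uses Python and an in-memory SQLite database purely as stand-ins for Java and Hibernate; the function and table names are hypothetical.

```python
import sqlite3


def run_in_transaction(conn: sqlite3.Connection, method: str, handler):
    """GET: always roll back. POST: commit on success, roll back on error."""
    try:
        result = handler(conn)
    except Exception:
        conn.rollback()
        raise
    if method == "POST":
        conn.commit()
    else:
        conn.rollback()
    return result


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body TEXT)")
conn.commit()


def add_note(c: sqlite3.Connection) -> None:
    c.execute("INSERT INTO notes VALUES ('hello')")


def count_notes() -> int:
    return conn.execute("SELECT COUNT(*) FROM notes").fetchone()[0]


run_in_transaction(conn, "GET", add_note)   # the write is discarded
after_get = count_notes()
run_in_transaction(conn, "POST", add_note)  # the write is kept
after_post = count_notes()
print(after_get, after_post)
```

Handlers never call commit or rollback themselves, which is the point: the boundary lives in one place and production code cannot get it wrong.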

Extended SiteMesh

We are using SiteMesh not only for decoration of the site but also for the composition of web components and for extracting any content from a page, so that when combined with AHAH only the smallest payload is returned to the client, keeping response times down.

AHAH (Asynchronous HTML and HTTP)

This enables much simpler JavaScript to be written (think JavaScript that never needs to replicate the domain model or business rules on the client) and allows for complete reuse of server side logic. This combined with behaviour CSS bindings leads to NO in-line JavaScript nastiness and more semantic HTML.

No Session State just persistent documents

We have NO session state. All state changes to documents are persisted, which leads to a number of advantages: users never lose data they have filled in, and marketing can see exactly how far a user got before bailing out of a workflow. Users can fill in data in any order they want and only when they are ready do we action their request. Domain objects are only updated once the document has been validated.

In memory web acceptance testing

Think super fast builds, no deployment (in fact no need for a web server), pure Java acceptance tests (i.e. refactorable) and the 80/20 rule on what is good enough to give you confidence that the system works.
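
Our tests are pure Java, but the shape of the idea can be shown with a short Python sketch using the standard library's WSGI helpers: the test calls the application as a plain function, so no socket is opened and nothing is deployed. The tiny app and the get helper are hypothetical.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """Tiny hypothetical application under test."""
    if environ["PATH_INFO"] == "/hello":
        start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
        return [b'<p class="greeting">Hello</p>']
    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"Not found"]


def get(application, path):
    """Issue a GET against the app in memory: no server, no network."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["REQUEST_METHOD"] = "GET"
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(application(environ, start_response))
    return captured["status"], captured["headers"], body.decode("utf-8")


status, headers, body = get(app, "/hello")
print(status, body)
```

Because the whole request is an ordinary function call, the test suite runs at unit test speed and the test code refactors along with the production code.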

Progressive Enhancement and Accessibility

All stories are played as a vanilla HTML version first, with a second story for the JavaScript enhancements. This leads to cleaner, simpler, semantic HTML and also allows feedback from the first story to be tacked onto the second story.

NO Logic in the View

We are using StringTemplate to enforce that NO logic can be written in the view. Plus it’s super fast and has no “for loops” (think Ruby each blocks).