And now for something completely different

After almost 10 years in Master Data Management, most of them spent with the rather lovely view below, I’ve moved on to Watson.


I can’t quite believe I stayed in the same department that long but there were plenty of fresh challenges along the way, and no shortage of people inside and outside IBM to keep it interesting.

I’ve been particularly lucky to have had so much support building up the MDM Developers community, which should be in safe hands to continue growing in the future. (If you’re interested in MDM and haven’t attended one of the live tech talk sessions, I would definitely recommend trying one. There are recordings of all the previous events on YouTube; check out Dany’s OSGi talk for a great example.)

If my first day in Watson is anything to go by, the next challenge is going to be far from dull!

Real Ale Train

Following a recommendation after my last trip on the Watercress Line way back in 2009, I finally made it on to the RAT! (And it was another birthday present… in fact it was a coordinated effort from the in-laws which included the ticket, beer, @mrsjtonline sitting and a lift back from Alton! Thank you!)

I suspect that there may have been a few regulars on the train who knew what they were doing and arrived earlier than we did. Fortunately there was still space right at the back/front of the train (it spends the evening going to Alresford and back):


It was a fantastic evening with a superb mix of beer, company, food, weather, views and the steam train of course. Best evening out for a long time and well worth the wait.

There was plenty of time to get off the train as well, with a chance to peek inside a signal box:


Or just spend time outside on the platform:


Definitely should have done this sooner! Maybe I’ll get a chance to try the dining train in another six years or so…

Decisions, decisions

I was caught unprepared by an actual parliamentary candidate knocking on the door this evening! Unfortunately I was about to bath the kraken so I didn’t have time to come up with any sensible questions. Still, it’s about time I took an interest in the election…

It’s not long since I last got to vote for an MP and, like last time, I’m undecided. Unfortunately there aren’t as many candidates to choose from this time, with the final list being:

  • Declan Clune (TUSC)
  • Patricia Culligan (UKIP)
  • Mims Davies (Conservative)
  • Ray Hall (Beer, Baccy and Scratchings)
  • Mark Latham (Labour)
  • Ron Meldrum (Green)
  • Mike Thornton (Lib Dem)

There’s more information about all the Eastleigh candidates on YourNextMP, which was handy for finding most of them on Twitter. At some point I hope to get round to contacting all the candidates with a few questions and, if I do, I’ll post any responses in case it helps anyone else. The first source of inspiration I’ve spotted is the Open Rights Group general election party manifesto wiki.

Early impressions are:

Hopefully I’ll have more to go on by 7th May! If anyone has any thoughts, including any of the candidates, please leave a comment.

Open Data Camp Day 1

If I don’t post a few notes from today’s Open Data Camp now, I never will, so here are a few things I scribbled down. It could be worse: I could have posted a PDF containing photos of the actual scribbles!

So out of this choice


…I picked Open Data for Elections, Open Addresses, Data Literacy, Designing Laws using Open Data, and Augmented Reality for Walkers.

Open Data for Elections

I’ve been following @floppy’s crazy plan to get elected for a while, so this was the easiest decision of the day: what drives someone to embrace the gory inner workings of democracy like this?

Falling turnout it would seem, and concern for a functioning democracy.

The first step of his journey was the Open Politics Manifesto, which I’ve so far failed to edit. Must try harder.

Perhaps more interesting was how this, and the use of open data, fits into a political platform as a service. It would be nice to have the opportunity to see a few additions to the usual suspects at the ballot box, and Eastleigh got a rare chance to see what that could be like with a by-election. Perhaps open data services for candidates could tip the balance enough to encourage more people to stand.

Things that sounded interesting:

  • Democracy Club
  • OpenCorporates
  • Data Packages
  • Open data certificates (food hygiene certificates for data?)
  • Candidates get one free leaflet delivery by Royal Mail. I wonder how big they expect those leaflets to be!

Open Addresses

@floppy and @giacecco introduced the (huge) problems they need to overcome to rebuild a large data set without polluting that data with any sources that carry intellectual property restrictions. Open Addresses still has a long way to go, and there were comments that OpenStreetMap still has gaps despite having been around for far longer.

They have some fun ideas about crowd sourcing address data (high vis jacket required) and there are some interesting philosophical questions around consent for addresses to be added.

It will be interesting to see whether Open Addresses can get enough data to provide real value, and what services they build.

Data Literacy

Mark and Laura led a discussion around data literacy founded in the observation that competent people, with all the skills you could reasonably expect them to have, still struggle with handling data sets.

Who needs to be data literate? Data scientists? Data professionals? Everyone?

Data plumbers? There were some analogies with actual plumbers! You might not be a plumber but it’s useful to know something about it.

If we live in a data-driven society, we should know how to ask the right questions. That needs both domain expertise and technical expertise.

Things that sounded interesting:

Designing Laws using Open Data

@johnlsheridan pointed out that the least interesting thing to do with legislation is to publish it and went on to share some fascinating insights into the building blocks of statute law. It sounds like the slippery language used in legislation boils down to a small number of design patterns built with simple building blocks, such as a duty along with a claim right, and so on.

Knowing these building blocks makes it easier to get the gist of what laws are trying to achieve, helps navigate statutes, and could give policy makers a more reliable way to effect a goal.

For example, it’s easier to make sense of the legislation covering supply of gas, and it’s possible to identify where there may be problems. The gas regulator has a duty to protect the interests of consumers by promoting competition, but that’s a weak duty without a clear claim right to enforce it.

John also demonstrated a tool for exploring how the language used in legislation has changed over time, for example how the use of “shall” has declined and been replaced by “is to be”.
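
As a rough illustration of that kind of comparison (this is not the tool John demonstrated), a phrase count over plain-text copies of legislation could be sketched in shell. The function name count_phrase is hypothetical, and it assumes a grep that supports -o, which GNU and BSD grep both do.

```shell
# Count case-insensitive, whole-word occurrences of a phrase in a text file.
# count_phrase is a hypothetical helper, not part of any real tool.
count_phrase() {
    phrase="$1"
    file="$2"
    # -o prints each match on its own line, -i ignores case, -w matches whole
    # words only, -F treats the phrase as a fixed string rather than a regex.
    # "|| true" stops a zero-match result being treated as a failure.
    { grep -oiwF -e "$phrase" "$file" || true; } | wc -l | tr -d ' '
}
```

Running it for “shall” and “is to be” against one text file per year (file names of your choosing) would give a crude view of the trend, although raw counts take no account of how much legislation was passed in each year.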

Augmented Reality for Walkers

My choice of Android tablet was largely based on what might work reasonably well for maps and augmented reality, so I seized this opportunity!

Nick Whitelegg described the Hikar Android app he’s been working on, which is intended to help hikers follow paths by overlaying map data on a live camera feed.

The data is a combination of OpenStreetMap mapping data and Ordnance Survey height data, which is downloaded and cached as tiles around your current location. OpenGL is used to overlay a 3D view of the map data on the live camera feed, using the Android sensor APIs to detect the device’s rotation.
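
To give a feel for what “tiles around your current location” can mean, here is a minimal sketch using the standard OpenStreetMap “slippy map” tile numbering. That scheme is an assumption for illustration only; I don’t know how Hikar actually divides up its data, and the function names tile_for and tiles_around are made up.

```shell
# Print the x and y numbers of the tile containing a latitude/longitude at a
# given zoom level, using the OpenStreetMap slippy map scheme (an assumption).
tile_for() {
    awk -v lat="$1" -v lon="$2" -v zoom="$3" 'BEGIN {
        pi = atan2(0, -1)
        n = 2 ^ zoom                      # number of tiles per side at this zoom
        lat_rad = lat * pi / 180
        x = int((lon + 180) / 360 * n)
        # awk has no tan(), so use sin/cos instead
        y = int((1 - log(sin(lat_rad) / cos(lat_rad) + 1 / cos(lat_rad)) / pi) / 2 * n)
        print x, y
    }'
}

# Print the 3x3 block of tiles centred on a location as zoom/x/y.
# Tiles at the very edge of the map are not wrapped or clipped here.
tiles_around() {
    zoom="$3"
    centre=$(tile_for "$1" "$2" "$zoom")
    cx=${centre% *}
    cy=${centre#* }
    for dy in -1 0 1; do
        for dx in -1 0 1; do
            echo "$zoom/$((cx + dx))/$((cy + dy))"
        done
    done
}
```

Calling tiles_around with a latitude, longitude and zoom level lists the nine tiles an app might fetch and cache before overlaying anything on the camera feed.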

I’ve just downloaded and installed Hikar and, while my tablet is a tad slow, it works really well. I live somewhere flat and boring but the height data made a noticeable difference when Nick demonstrated the app in hilly Winchester.

Still to come: Day 2!

Why doesn’t Eclipse/Installation Manager work on Linux?

For the next time I’m grumbling about yet more incompatibilities causing problems with Eclipse on Linux, adding the following properties to the bottom of the launcher configuration file seems to help:


For example, edit the install.ini file for Installation Manager, which is where I first encountered problems after updating Red Hat Enterprise Linux last year. The problem appears to be due to incompatible GTK and Cairo versions, and there’s a related IBM technote for Installation Manager on RHEL 6.6.
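
If the same change needs repeating after every reinstall, something like the following sketch could append a line to the ini file only when it isn’t already there. The function name append_ini_line is hypothetical, and the property in the usage note is just a placeholder, not one of the actual properties.

```shell
# Append a line to an ini file unless an identical line is already present.
# append_ini_line is a hypothetical helper; the line to add is up to the caller.
append_ini_line() {
    ini_file="$1"
    line="$2"
    # -q quiet, -x whole-line match, -F fixed string rather than regex
    if ! grep -qxF -e "$line" "$ini_file"; then
        # Make sure the file ends with a newline before appending
        if [ -n "$(tail -c 1 "$ini_file")" ]; then
            printf '\n' >> "$ini_file"
        fi
        printf '%s\n' "$line" >> "$ini_file"
    fi
}
```

For example, append_ini_line install.ini "-Dsome.placeholder.property=value" could be run repeatedly without adding duplicate lines.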

Unfortunately, while that was enough to get Installation Manager working, the Eclipse IDE still seems somewhat unstable. I’ve been exporting the following environment variable for a while but, based on the SDK Known Issues wiki page, maybe that doesn’t make any difference with recent versions:


Another suggestion I’ve seen recently, related to tooltips, is to export another environment variable, although so far I haven’t tried it:

export GRE_HOME=/dev/null

Need to experiment a bit more with those last two, and see if I can narrow down whether there are any other real problems.

Hadoop as a service

It’s been a fun year learning new stuff, and along the way Andy Piper helped out with a bite sized architectural debate while I was experimenting with a Hadoop service on Bluemix. Having a short lived/disposable memory I thought it would be worth posting the discussion here for future reference…

@jtonline: Still pondering how a hadoop buildpack might compare to a hadoop service

@andypiper: @jtonline why would you want a buildpack for Hadoop – surely data store = service (broadly) not runtime. #cloudfoundry

@jtonline: @andypiper hmm, maybe, but you want to bring the processing to the data don’t you? Currently seems like services will hold big data in silos

@jtonline: @andypiper for example, I might want to use one of the address verification services from my map reduce job. I’m probably missing something.

@andypiper: @jtonline multiple services can be bound to multiple apps. And you can call jobs in those services from those apps.

@andypiper: @jtonline PivotalHD ships as a service in PivotalCF – obviously you may need data access libraries in the buildpack for the app.

@jtonline: @andypiper not convinced hadoop is just a data store. Do I need apps on runtime to kick off oozie jobs with details of other services?

@andypiper: @jtonline the runtime/service debate on CF has been a long one but I think fairly clean/clear. I’d see Hadoop as a shared resource.

@andypiper: @jtonline bear in mind buildpack -> droplet -> runnable containerised app instance.

@jtonline: @andypiper agreed. Maybe what I’m missing is an easy way to wire services together?

@andypiper: @jtonline yeah maybe – you end up with apps acting as service coordinators I guess.

@andypiper: @jtonline coupled with the fact that apps are intentionally short-lived and best stateless… interesting architectural debate :-)

@andypiper: @jtonline (for “short-lived” read “disposable” my bad)

It should be an interesting 2015.

BigInsights Quicker Start

I’ve been taking a break from Liberty and JAX-RS recently to start tinkering with IBM’s BigInsights Hadoop distribution. To make things easier/more interesting my first attempts were using the Analytics for Hadoop service on Bluemix. In case it helps anyone, here’s what I ended up with before needing to install BigInsights myself:

And this is the script I used to upload data in the video (unfortunately I didn’t have any luck using the HttpFS API):



curl -iv --user ${BIUSER}:${BIPASSWORD} --insecure -X POST "${BIURI}/user/${BIUSER}/sample-data?isfile=false"

curl -iv --user ${BIUSER}:${BIPASSWORD} --insecure -X POST "${BIURI}/user/${BIUSER}/sample-data/orgdata.unl" --header "Content-Type:application/octet-stream" --header "Transfer-Encoding:chunked" -T "orgdata.unl"

curl -iv --user ${BIUSER}:${BIPASSWORD} --insecure -X POST "${BIURI}/user/${BIUSER}/sample-data/persondata.unl" --header "Content-Type:application/octet-stream" --header "Transfer-Encoding:chunked" -T "persondata.unl"
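
With more than a couple of files, the repetition could be wrapped in a function. This is only a sketch: upload_url and upload_file are made-up names, and it assumes BIURI, BIUSER and BIPASSWORD are already set in the environment, as the commands above do.

```shell
# Build the target URL for a local file in the sample-data directory.
# Assumes BIURI and BIUSER are set, as in the commands above.
upload_url() {
    printf '%s/user/%s/sample-data/%s\n' "$BIURI" "$BIUSER" "$(basename "$1")"
}

# Upload one local file using the same curl options as the commands above.
# --insecure skips TLS certificate verification, as the original script does.
upload_file() {
    curl -iv --user "${BIUSER}:${BIPASSWORD}" --insecure -X POST "$(upload_url "$1")" \
        --header "Content-Type:application/octet-stream" \
        --header "Transfer-Encoding:chunked" \
        -T "$1"
}
```

A loop over the .unl files calling upload_file for each one would then replace the two individual upload commands.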


  • Since recording the demo, Bluemix has added a United Kingdom region; however, it looks like the Analytics for Hadoop service is currently only available in the US South region.
  • There is also now a BigInsights service on Bluemix which allows you to provision multi-node Hadoop clusters.