And now for something completely different

After almost 10 years in Master Data Management, most of them spent with the rather lovely view below, I’ve moved on to Watson.


I can’t quite believe I stayed in the same department that long but there were plenty of fresh challenges along the way, and no shortage of people inside and outside IBM to keep it interesting.

I’ve been particularly lucky to have had so much support building up the MDM Developers community, which should be in safe hands to continue growing in the future. (If you’re interested in MDM and haven’t attended one of the live tech talk sessions, I would definitely recommend trying one. There are recordings of all the previous events on YouTube; Dany’s OSGi talk is a great example.)

If my first day in Watson is anything to go by, the next challenge is going to be far from dull!

BigInsights Quicker Start

I’ve been taking a break from Liberty and JAX-RS recently to start tinkering with IBM’s BigInsights Hadoop distribution. To make things easier/more interesting, my first attempts used the Analytics for Hadoop service on Bluemix. In case it helps anyone, here’s what I ended up with before needing to install BigInsights myself:

And this is the script I used to upload data in the video (unfortunately I didn’t have any luck using the HttpFS API):



curl -iv --user "${BIUSER}:${BIPASSWORD}" --insecure -X POST "${BIURI}/user/${BIUSER}/sample-data?isfile=false"

curl -iv --user "${BIUSER}:${BIPASSWORD}" --insecure -X POST "${BIURI}/user/${BIUSER}/sample-data/orgdata.unl" --header "Content-Type:application/octet-stream" --header "Transfer-Encoding:chunked" -T "orgdata.unl"

curl -iv --user "${BIUSER}:${BIPASSWORD}" --insecure -X POST "${BIURI}/user/${BIUSER}/sample-data/persondata.unl" --header "Content-Type:application/octet-stream" --header "Transfer-Encoding:chunked" -T "persondata.unl"
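If you have more than a couple of files, the same three calls can be wrapped in a small function. This is only a sketch that reuses the curl options from the commands above: the function names and the DRY_RUN switch are my own additions for illustration, and BIURI, BIUSER and BIPASSWORD are assumed to be set exactly as before.

```shell
#!/usr/bin/env bash
# Sketch only: upload_samples, run and DRY_RUN are hypothetical names added
# for this example. BIURI, BIUSER and BIPASSWORD must already be set.

# Run a command, or just print it when DRY_RUN=1.
# Note that a dry run prints the password as part of the command line.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Create the sample-data directory, then upload each file given as an argument.
upload_samples() {
  local target="${BIURI}/user/${BIUSER}/sample-data"
  local file

  # Create the target directory first (isfile=false, as in the original call).
  run curl -iv --user "${BIUSER}:${BIPASSWORD}" --insecure -X POST \
    "${target}?isfile=false" || return 1

  # Then stream each data file into it with the same headers as before.
  for file in "$@"; do
    run curl -iv --user "${BIUSER}:${BIPASSWORD}" --insecure -X POST \
      "${target}/$(basename "$file")" \
      --header "Content-Type:application/octet-stream" \
      --header "Transfer-Encoding:chunked" \
      -T "$file" || return 1
  done
}
```

For example, `DRY_RUN=1 upload_samples orgdata.unl persondata.unl` prints the three curl commands without contacting the service, which is a handy way to check the URLs before running it for real.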


  • Since I recorded the demo, Bluemix has added a United Kingdom region. However, it looks like the Analytics for Hadoop service is currently only available in the US South region.
  • There is also now a BigInsights service on Bluemix, which allows you to provision multi-node Hadoop clusters.

Java dumps

I recently had to debug a problem with the MDM Workbench where exporting a tailoring project for Information Server didn’t do anything. In fact it didn’t even report any problems!

Unfortunately the code in question likes to put a brave face on things and just reports that everything was OK, even when something goes wrong. This was the perfect opportunity to try out some of the diagnostic tools available for the IBM Java runtime, which I’ve been meaning to try for ages. I had an idea where the problem was likely to be but to find out for sure I started the workbench using the following command line:

eclipsec -vmargs -Xdump:system:events=catch,filter=java/lang/AbstractMethodError#com/ibm/mdm/tools/export/infoserver/job/MDMDatabaseDAO.queryDatabaseWithoutFilter

Sure enough the failing export produced a dump which I could check using the Memory Analyzer tool. You can get the IBM version via IBM Support Assistant but it’s probably easier to get the standard Eclipse Memory Analyzer and add the required IBM plugins from the DTFJ update site.

I’m fortunate enough to work in Hursley so I could pester someone who works on IBM Java runtime diagnostics, but there’s also a helpful article on developerWorks with details of how to trigger dumps, and how to run queries using OQL:

Debugging from dumps: Diagnose more than memory leaks with Memory Analyzer

So, mystery solved: if you have an Oracle database and want to export tailoring projects for Information Server, make sure you set up the database connection with a more recent JDBC driver than the default.
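The -Xdump option I used is easier to read when broken into its parts: the dump type (system), the event (catch), and a filter made up of the exception class, a ‘#’, and then a class and method, all using slashes rather than dots in the package names. Here is a minimal sketch of a helper that assembles the same string; the function name is hypothetical and it only covers this one form of the option.

```shell
#!/usr/bin/env bash
# Sketch only: xdump_catch_filter is a hypothetical helper name.
# Arguments:
#   $1  exception class, dotted form (e.g. java.lang.AbstractMethodError)
#   $2  class to filter on, dotted form
#   $3  method name in that class
xdump_catch_filter() {
  local exception class method
  # The filter uses slash-separated names, so convert dots to slashes.
  exception="$(printf '%s' "$1" | tr '.' '/')"
  class="$(printf '%s' "$2" | tr '.' '/')"
  method="$3"
  printf '%s\n' "-Xdump:system:events=catch,filter=${exception}#${class}.${method}"
}
```

Used like this, it reproduces the command line from earlier: `eclipsec -vmargs "$(xdump_catch_filter java.lang.AbstractMethodError com.ibm.mdm.tools.export.infoserver.job.MDMDatabaseDAO queryDatabaseWithoutFilter)"`.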

Getting a handle on social MDM

Anyway, I recently spotted an MDM enhancement request, “Improve Better support for social handle support”, and it seemed odd that there wasn’t already something in the data model that could do a better job than using misc values. There are probably several options but I think this is what I’d do…

Add a new “Social Network” contact method category, and associated contact method types, for example: “Twitter”, “LinkedIn”, etc. Here’s what those look like in the Business Admin UI:



Now you can just add social network contact methods in the same way as you would for telephone numbers and email addresses, which means you get all the standard functionality you’re likely to need.

For example, here’s what a getPerson response looks like with my Twitter and LinkedIn details:

<?xml version="1.0" encoding="UTF-8"?>
<TCRMService xmlns="" xmlns:xsi="" xsi:schemaLocation=" MDMDomains.xsd">
                <DisplayName>James Taylor</DisplayName>
                <CreatedDate>2013-11-03 07:10:40.909</CreatedDate>
                <PartyLastUpdateDate>2013-11-03 07:10:41.175</PartyLastUpdateDate>
                <PersonLastUpdateDate>2013-11-03 07:10:41.767</PersonLastUpdateDate>
                    <StartDate>2013-11-03 07:13:47.966</StartDate>
                    <AddressGroupLastUpdateDate>2013-11-03 07:14:17.854</AddressGroupLastUpdateDate>
                    <LocationGroupLastUpdateDate>2013-11-03 07:14:17.839</LocationGroupLastUpdateDate>
                        <AddressLineOne>IBM UK Ltd</AddressLineOne>
                        <AddressLineTwo>Hursley Park</AddressLineTwo>
                        <ZipPostalCode>SO21 2JN</ZipPostalCode>
                        <CountryValue>Great Britain and N Ireland</CountryValue>
                        <AddressLastUpdateDate>2013-11-03 07:14:17.839</AddressLastUpdateDate>
                    <StartDate>2013-11-03 07:17:24.762</StartDate>
                    <ContactMethodGroupLastUpdateDate>2013-11-03 07:17:24.778</ContactMethodGroupLastUpdateDate>
                    <LocationGroupLastUpdateDate>2013-11-03 07:17:24.762</LocationGroupLastUpdateDate>
                        <ContactMethodValue>Social Network</ContactMethodValue>
                        <ContactMethodLastUpdateDate>2013-11-03 07:17:24.762</ContactMethodLastUpdateDate>
                    <StartDate>2013-11-03 07:12:03.523</StartDate>
                    <ContactMethodGroupLastUpdateDate>2013-11-03 07:12:03.57</ContactMethodGroupLastUpdateDate>
                    <LocationGroupLastUpdateDate>2013-11-03 07:12:03.523</LocationGroupLastUpdateDate>
                        <ContactMethodValue>Social Network</ContactMethodValue>
                        <ContactMethodLastUpdateDate>2013-11-03 07:12:03.289</ContactMethodLastUpdateDate>
                    <StartDate>2013-11-03 07:10:41.986</StartDate>
                    <PersonNameLastUpdateDate>2013-11-03 07:10:41.986</PersonNameLastUpdateDate>
                    <LastUpdatedDate>2013-11-03 07:10:41.986</LastUpdatedDate>

Does that sound sensible? Are there any enhancements you would suggest? For example, I wonder about standardization: I put an ‘@’ on my Twitter ID, but I can easily imagine several variations ending up in there. I’ll leave that as an exercise for another day!
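To give a feel for what that standardization might involve, here is a rough sketch of a rule that reduces a few likely variations (with or without the ‘@’, a pasted profile URL, stray spaces, mixed case) to one form. It is not anything MDM provides: the function name and the choice of a lowercase ‘@name’ as the standard form are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Sketch only: normalize_twitter_handle is a hypothetical helper, and the
# lowercase "@name" target form is an assumption, not an MDM rule.
normalize_twitter_handle() {
  local handle="$1"

  # Remove whitespace and fold to lower case (Twitter names are case-insensitive).
  handle="$(printf '%s' "$handle" | tr -d '[:space:]' | tr '[:upper:]' '[:lower:]')"

  # Strip a pasted profile URL prefix, if there is one.
  handle="${handle#http://}"
  handle="${handle#https://}"
  handle="${handle#www.}"
  handle="${handle#twitter.com/}"

  # Drop a leading '@' and a trailing '/'.
  handle="${handle#@}"
  handle="${handle%/}"

  # Nothing left means there was no usable handle.
  if [ -z "$handle" ]; then
    return 1
  fi

  printf '@%s\n' "$handle"
}
```

With this rule, `example_user`, ` @Example_User ` and `https://twitter.com/Example_User/` would all end up stored as the same value.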

Check out the MDM Developers community for much more useful MDM related posts, forums and other resources.

Getting Technical: Resources for MDM Developers

One reason for the lack of new posts here is that I’ve been attempting to write my first article for the Mastering Data Management blog. After several false starts, and falling back to old school pencil and paper to get going, it’s finally done! So, for anyone interested in master data management, here is the very latest Mastering Data Management front page hot off the press:

Getting Technical: Resources for MDM Developers

Master Data Management links: March

It’s been a while since I last pulled together a few MDM related links, and I haven’t done it in March before, so here are a few sites I’ve been keeping an eye on lately. There’s a loose theme this month as well; these are a few of the MDM communities that are out there.

First up is a brand new group on LinkedIn. Created by Henrik at the end of February, this group already has 125 members and has immediately sparked some interesting discussions. (The group may be new but Henrik has been blogging for a while.)

There are plenty of other MDM groups on LinkedIn, and I’m a member of a few others, but so far none have really stood out. I’m hoping Multi-Domain MDM maintains momentum after its good start.

The next link is a community I’ve been a member of for a bit longer, in fact it featured in my first MDM links post! Dan was one of the earliest MDM bloggers I discovered, and is still posting on the Hub Designs Blog, so I’m not surprised this community is still going strong.

Blogging provided strong foundations for both these communities, and in some respects you don’t really need a purpose built ‘community’ site. An active blog with plenty of posts and discussions is just as good, and I’ve recently been enjoying this one:

And finally, communities are all about people, and that’s what Twitter is good at. Lists make it easy to share people worth following and luckily Dan has one already, which saved me creating one myself!

Are there any other MDM communities you’d recommend?

Happy New Year!

Following Dan Power and Crysta Anderson’s lead, I’m going to kick off the new year with a look back at the most popular posts from 2010. So with barely a pause and not even a drum roll, the winners are…

1. My second CurrentCost development board circuit

Way out ahead at number one is the only circuit board I’ve completed and put to regular use. Still working fine, apart from a brief pause when the batteries ran out. Kind of regretting replacing the batteries just in time for the recent spell of cold weather!

2. Master Information Hub: Getting Started

Not a close second, but still respectably ahead of the pack, this post is one I regularly point people to the first time they use the MDM Workbench. Hopefully it’s helped a few people out this year.

3. New clock radio

Leading the pack is this surprise entry to the top ten. Unlike some Joggler owners, I still use it fairly regularly and, apart from the occasional experiment, I’m still using the O2 software it came with. I did give Jolicloud another go yesterday, to see whether a little Bluetooth keyboard helps; nice, but just not quite fast enough to switch permanently. Might give MeeGo a try next.

4. Get off my hashtag

Had a really interesting chat at the last homecamp about tagging, so this is a subject I’m likely to return to this year.

5. Weather Underground + Mashup Hub + Pachube = orb food

Maybe it’s just me but I get quite excited about the potential that this kind of data mashup has. Perhaps it’s because I’ve seen what you can do with enterprise data and software like Message Broker; now imagine the possibilities with open data and simple ways for anyone to manipulate it. (That’s not manipulation in the political sense of course!)

6. Master Information Hub: Social Bookmark Services

This follows on from the number 2 post, while the third in the series has some catching up to do and didn’t make the top 10. I also have some catching up to do; I hope to get to the next instalment early this year.

7. Liberal Democrats can’t win here

Politicians, gotta love ’em. I wonder how these graphs will look if we get proportional representation for the next election.

8. Home Easy ambient orb

All soldered together but not yet receiving that lovely data from the number 5 post. I’m currently pondering whether to just hard code things ‘for now’ or hack some more so that the three orbs could be programmed using the BlinkM sequencer.

9. Digital House Arrest

Politicians again. Really. Very. Annoying. I never did get a reply to my last letter to my MP, Chris Huhne.

10. Manifesto

Given that all politicians seem to be as bad as each other, I was half tempted to stand as a RON (reopen nominations) candidate. Anyone else up for a For The Win party next time?!

Highly commended: It takes two

Not actually in the top ten but this post about Hedge End twinning deserves an honourable mention for the great comments about Frome’s twins.

Happy new year!