Hadoop as a service


It’s been a fun year learning new stuff, and along the way Andy Piper helped out with a bite-sized architectural debate while I was experimenting with a Hadoop service on Bluemix. Having a short-lived/disposable memory, I thought it would be worth posting the discussion here for future reference…

@jtonline: Still pondering how a hadoop buildpack might compare to a hadoop service

@andypiper: @jtonline why would you want a buildpack for Hadoop – surely data store = service (broadly) not runtime. #cloudfoundry

@jtonline: @andypiper hmm, maybe, but you want to bring the processing to the data don’t you? Currently seems like services will hold big data in silos

@jtonline: @andypiper for example, I might want to use one of the address verification services from my map reduce job. I’m probably missing something.

@andypiper: @jtonline multiple services can be bound to multiple apps. And you can call jobs in those services from those apps.

@andypiper: @jtonline PivotalHD ships as a service in PivotalCF – obviously you may need data access libraries in the buildpack for the app.

@jtonline: @andypiper not convinced hadoop is just a data store. Do I need apps on runtime to kick off oozie jobs with details of other services?

@andypiper: @jtonline the runtime/service debate on CF has been a long one but I think fairly clean/clear. I’d see Hadoop as a shared resource.

@andypiper: @jtonline bear in mind buildpack -> droplet -> runnable containerised app instance.

@jtonline: @andypiper agreed. Maybe what I’m missing is an easy way to wire services together?

@andypiper: @jtonline yeah maybe – you end up with apps acting as service coordinators I guess.

@andypiper: @jtonline coupled with the fact that apps are intentionally short-lived and best stateless… interesting architectural debate :-)

@andypiper: @jtonline (for “short-lived” read “disposable” my bad)
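
To make the “apps acting as service coordinators” idea a little more concrete, here is a minimal sketch of an app reading the credentials of two bound services from the Cloud Foundry VCAP_SERVICES environment variable and combining them into one job configuration. The service labels (“hadoop”, “address-verification”), the credential keys and the URLs are all made up for illustration; a real service would publish its own names, and actually submitting the job is left out.

```python
import json
import os


def bound_credentials(service_label, env=None):
    """Return the credentials of every bound instance of one service type."""
    env = os.environ if env is None else env
    # VCAP_SERVICES is a JSON object mapping a service label to a list of bound instances.
    services = json.loads(env.get("VCAP_SERVICES", "{}"))
    return [instance.get("credentials", {}) for instance in services.get(service_label, [])]


def build_job_config(env=None):
    """Combine details of two bound services into a single job configuration."""
    hadoop = bound_credentials("hadoop", env)  # hypothetical service label
    verifier = bound_credentials("address-verification", env)  # hypothetical service label
    if not hadoop or not verifier:
        raise RuntimeError("both services must be bound to the app")
    return {
        "oozie_url": hadoop[0]["oozie_url"],  # hypothetical credential key
        "verifier_url": verifier[0]["url"],  # hypothetical credential key
    }


# Hypothetical bindings, shaped like the JSON that Cloud Foundry provides.
example_env = {
    "VCAP_SERVICES": json.dumps({
        "hadoop": [{
            "name": "my-hadoop",
            "label": "hadoop",
            "credentials": {"oozie_url": "https://hadoop.example.com/oozie"},
        }],
        "address-verification": [{
            "name": "my-verifier",
            "label": "address-verification",
            "credentials": {"url": "https://verify.example.com"},
        }],
    })
}

print(build_job_config(example_env))
```

The app itself holds no state here: everything it needs comes from the bindings, which fits with apps being disposable.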

It should be an interesting 2015.

Getting a handle on social MDM


Since this is the first work related post for a while, it’s probably a good idea to drop in the usual disclaimer as a reminder: “The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.”

Anyway, I recently spotted an MDM enhancement request, Improve Better support for social handle support, and it seemed odd that there wasn’t already something in the data model that could do a better job than using misc values. There are probably several options but I think this is what I’d do…

Add a new “Social Network” contact method category, and associated contact method types, for example: “Twitter”, “LinkedIn”, etc. Here’s what those look like in the Business Admin UI:

cdcontmethcat

cdcontmethtp

Now you can just add social network contact methods in the same way as you would for telephone numbers and email addresses, which means you get all the standard functionality you’re likely to need.

For example, here’s a getPerson response with my Twitter and LinkedIn details:

<?xml version="1.0" encoding="UTF-8"?>
<TCRMService xmlns="http://www.ibm.com/mdm/schema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.ibm.com/mdm/schema MDMDomains.xsd">
    <ResponseControl>
        <ResultCode>SUCCESS</ResultCode>
        <ServiceTime>17</ServiceTime>
        <DWLControl>
            <requesterName>mdmadmin</requesterName>
            <requesterLanguage>100</requesterLanguage>
            <requesterLocale>en</requesterLocale>
            <userRole>mdm_admin</userRole>
            <requesterTimeZone>EST5EDT</requesterTimeZone>
            <requestID>247353</requestID>
        </DWLControl>
    </ResponseControl>
    <TxResponse>
        <RequestType>getPerson</RequestType>
        <TxResult>
            <ResultCode>SUCCESS</ResultCode>
        </TxResult>
        <ResponseObject>
            <TCRMPersonBObj>
                <PartyId>531938348064117624</PartyId>
                <DisplayName>James Taylor</DisplayName>
                <PartyType>P</PartyType>
                <CreatedDate>2013-11-03 07:10:40.909</CreatedDate>
                <PartyLastUpdateDate>2013-11-03 07:10:41.175</PartyLastUpdateDate>
                <PartyLastUpdateUser>mdmadmin</PartyLastUpdateUser>
                <PartyLastUpdateTxId>153838348064091041</PartyLastUpdateTxId>
                <PersonPartyId>531938348064117624</PersonPartyId>
                <PartyActiveIndicator>Y</PartyActiveIndicator>
                <PersonLastUpdateDate>2013-11-03 07:10:41.767</PersonLastUpdateDate>
                <PersonLastUpdateUser>mdmadmin</PersonLastUpdateUser>
                <PersonLastUpdateTxId>153838348064091041</PersonLastUpdateTxId>
                <TCRMPartyAddressBObj>
                    <PartyAddressIdPK>537638348082796792</PartyAddressIdPK>
                    <PartyId>531938348064117624</PartyId>
                    <AddressId>539338348085784022</AddressId>
                    <AddressUsageType>3</AddressUsageType>
                    <AddressUsageValue>Business</AddressUsageValue>
                    <StartDate>2013-11-03 07:13:47.966</StartDate>
                    <PreferredAddressIndicator>Y</PreferredAddressIndicator>
                    <AddressGroupLastUpdateDate>2013-11-03 07:14:17.854</AddressGroupLastUpdateDate>
                    <AddressGroupLastUpdateUser>mdmadmin</AddressGroupLastUpdateUser>
                    <AddressGroupLastUpdateTxId>537038348085749779</AddressGroupLastUpdateTxId>
                    <LocationGroupLastUpdateDate>2013-11-03 07:14:17.839</LocationGroupLastUpdateDate>
                    <LocationGroupLastUpdateUser>mdmadmin</LocationGroupLastUpdateUser>
                    <LocationGroupLastUpdateTxId>537038348085749779</LocationGroupLastUpdateTxId>
                    <TCRMAddressBObj>
                        <AddressIdPK>539338348085784022</AddressIdPK>
                        <ResidenceType>11</ResidenceType>
                        <ResidenceValue>Office</ResidenceValue>
                        <AddressLineOne>IBM UK Ltd</AddressLineOne>
                        <AddressLineTwo>Hursley Park</AddressLineTwo>
                        <City>Winchester</City>
                        <ZipPostalCode>SO21 2JN</ZipPostalCode>
                        <CountryType>183</CountryType>
                        <CountryValue>Great Britain and N Ireland</CountryValue>
                        <AddressLastUpdateDate>2013-11-03 07:14:17.839</AddressLastUpdateDate>
                        <AddressLastUpdateUser>mdmadmin</AddressLastUpdateUser>
                        <AddressLastUpdateTxId>537038348085749779</AddressLastUpdateTxId>
                    </TCRMAddressBObj>
                </TCRMPartyAddressBObj>
                <TCRMPartyContactMethodBObj>
                    <PartyContactMethodIdPK>533238348104476375</PartyContactMethodIdPK>
                    <PartyId>531938348064117624</PartyId>
                    <ContactMethodId>534438348104476393</ContactMethodId>
                    <ContactMethodUsageType>10</ContactMethodUsageType>
                    <ContactMethodUsageValue>LinkedIn</ContactMethodUsageValue>
                    <SolicitationIndicator>N</SolicitationIndicator>
                    <StartDate>2013-11-03 07:17:24.762</StartDate>
                    <ContactMethodGroupLastUpdateDate>2013-11-03 07:17:24.778</ContactMethodGroupLastUpdateDate>
                    <ContactMethodGroupLastUpdateUser>mdmadmin</ContactMethodGroupLastUpdateUser>
                    <ContactMethodGroupLastUpdateTxId>535838348104476350</ContactMethodGroupLastUpdateTxId>
                    <LocationGroupLastUpdateDate>2013-11-03 07:17:24.762</LocationGroupLastUpdateDate>
                    <LocationGroupLastUpdateUser>mdmadmin</LocationGroupLastUpdateUser>
                    <LocationGroupLastUpdateTxId>535838348104476350</LocationGroupLastUpdateTxId>
                    <TCRMContactMethodBObj>
                        <ContactMethodIdPK>534438348104476393</ContactMethodIdPK>
                        <ReferenceNumber>http://www.linkedin.com/in/taylorjm</ReferenceNumber>
                        <ContactMethodType>3</ContactMethodType>
                        <ContactMethodValue>Social Network</ContactMethodValue>
                        <ContactMethodLastUpdateDate>2013-11-03 07:17:24.762</ContactMethodLastUpdateDate>
                        <ContactMethodLastUpdateUser>mdmadmin</ContactMethodLastUpdateUser>
                        <ContactMethodLastUpdateTxId>535838348104476350</ContactMethodLastUpdateTxId>
                    </TCRMContactMethodBObj>
                </TCRMPartyContactMethodBObj>
                <TCRMPartyContactMethodBObj>
                    <PartyContactMethodIdPK>539138348072352465</PartyContactMethodIdPK>
                    <PartyId>531938348064117624</PartyId>
                    <ContactMethodId>532838348072329035</ContactMethodId>
                    <ContactMethodUsageType>9</ContactMethodUsageType>
                    <ContactMethodUsageValue>Twitter</ContactMethodUsageValue>
                    <PreferredContactMethodIndicator>Y</PreferredContactMethodIndicator>
                    <StartDate>2013-11-03 07:12:03.523</StartDate>
                    <ContactMethodGroupLastUpdateDate>2013-11-03 07:12:03.57</ContactMethodGroupLastUpdateDate>
                    <ContactMethodGroupLastUpdateUser>mdmadmin</ContactMethodGroupLastUpdateUser>
                    <ContactMethodGroupLastUpdateTxId>536538348072325964</ContactMethodGroupLastUpdateTxId>
                    <LocationGroupLastUpdateDate>2013-11-03 07:12:03.523</LocationGroupLastUpdateDate>
                    <LocationGroupLastUpdateUser>mdmadmin</LocationGroupLastUpdateUser>
                    <LocationGroupLastUpdateTxId>536538348072325964</LocationGroupLastUpdateTxId>
                    <TCRMContactMethodBObj>
                        <ContactMethodIdPK>532838348072329035</ContactMethodIdPK>
                        <ReferenceNumber>@jtonline</ReferenceNumber>
                        <ContactMethodType>3</ContactMethodType>
                        <ContactMethodValue>Social Network</ContactMethodValue>
                        <ContactMethodLastUpdateDate>2013-11-03 07:12:03.289</ContactMethodLastUpdateDate>
                        <ContactMethodLastUpdateUser>mdmadmin</ContactMethodLastUpdateUser>
                        <ContactMethodLastUpdateTxId>536538348072325964</ContactMethodLastUpdateTxId>
                    </TCRMContactMethodBObj>
                </TCRMPartyContactMethodBObj>
                <TCRMPersonNameBObj>
                    <PersonNameIdPK>533538348064198718</PersonNameIdPK>
                    <NameUsageType>7</NameUsageType>
                    <NameUsageValue>Preferred</NameUsageValue>
                    <PrefixType>14</PrefixType>
                    <PrefixValue>Mr.</PrefixValue>
                    <GivenNameOne>James</GivenNameOne>
                    <StdGivenNameOne>JAMES</StdGivenNameOne>
                    <LastName>Taylor</LastName>
                    <StdLastName>TAYLOR</StdLastName>
                    <PersonPartyId>531938348064117624</PersonPartyId>
                    <StartDate>2013-11-03 07:10:41.986</StartDate>
                    <PersonNameLastUpdateDate>2013-11-03 07:10:41.986</PersonNameLastUpdateDate>
                    <PersonNameLastUpdateUser>mdmadmin</PersonNameLastUpdateUser>
                    <PersonNameLastUpdateTxId>153838348064091041</PersonNameLastUpdateTxId>
                    <LastUpdatedBy>mdmadmin</LastUpdatedBy>
                    <LastUpdatedDate>2013-11-03 07:10:41.986</LastUpdatedDate>
                </TCRMPersonNameBObj>
                <DWLStatus>
                    <Status>0</Status>
                </DWLStatus>
            </TCRMPersonBObj>
        </ResponseObject>
    </TxResponse>
</TCRMService>
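
One benefit of using standard contact methods is that a client can pick the social handles out of the response with ordinary XML tooling. The following is only a sketch using Python’s standard library: the element names and namespace come from the response above, the embedded sample is a cut-down copy of it, and the function name social_handles is my own invention.

```python
import xml.etree.ElementTree as ET

# Namespace used by the getPerson response.
NS = {"mdm": "http://www.ibm.com/mdm/schema"}

# Cut-down copy of the response above, keeping only the elements used here.
SAMPLE = """
<TCRMService xmlns="http://www.ibm.com/mdm/schema">
  <TxResponse>
    <ResponseObject>
      <TCRMPersonBObj>
        <TCRMPartyContactMethodBObj>
          <ContactMethodUsageValue>LinkedIn</ContactMethodUsageValue>
          <TCRMContactMethodBObj>
            <ReferenceNumber>http://www.linkedin.com/in/taylorjm</ReferenceNumber>
            <ContactMethodValue>Social Network</ContactMethodValue>
          </TCRMContactMethodBObj>
        </TCRMPartyContactMethodBObj>
        <TCRMPartyContactMethodBObj>
          <ContactMethodUsageValue>Twitter</ContactMethodUsageValue>
          <TCRMContactMethodBObj>
            <ReferenceNumber>@jtonline</ReferenceNumber>
            <ContactMethodValue>Social Network</ContactMethodValue>
          </TCRMContactMethodBObj>
        </TCRMPartyContactMethodBObj>
      </TCRMPersonBObj>
    </ResponseObject>
  </TxResponse>
</TCRMService>
"""


def social_handles(xml_text, category="Social Network"):
    """Map each usage value (Twitter, LinkedIn, ...) to its reference number."""
    root = ET.fromstring(xml_text)
    handles = {}
    for method in root.iterfind(".//mdm:TCRMPartyContactMethodBObj", NS):
        contact = "mdm:TCRMContactMethodBObj/"
        # Skip telephone, email and any other non-social contact methods.
        if method.findtext(contact + "mdm:ContactMethodValue", namespaces=NS) != category:
            continue
        usage = method.findtext("mdm:ContactMethodUsageValue", namespaces=NS)
        handles[usage] = method.findtext(contact + "mdm:ReferenceNumber", namespaces=NS)
    return handles


print(social_handles(SAMPLE))
```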

Does that sound sensible? Are there any enhancements? For example, I wonder about standardization: I put an ‘@’ on my Twitter ID, but I can easily imagine several variations ending up in there. I’ll leave that as an exercise for another day!
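
As a starting point for that exercise, this is roughly the kind of standardization I have in mind, sketched in Python rather than as an MDM standardizer. It assumes the stored form should be a lowercase handle with a leading ‘@’ (my choice, not anything MDM mandates), and that Twitter handles are at most 15 letters, digits or underscores. The function name is hypothetical.

```python
import re

# Twitter handles: 1 to 15 letters, digits or underscores.
_HANDLE = re.compile(r"[A-Za-z0-9_]{1,15}")
# Optional profile URL prefix that people might paste instead of a bare handle.
_URL_PREFIX = re.compile(r"^(?:https?://)?(?:www\.)?twitter\.com/", re.IGNORECASE)


def normalize_twitter_handle(raw):
    """Reduce common variations of a Twitter ID to one form, e.g. '@jtonline'."""
    value = _URL_PREFIX.sub("", raw.strip())
    value = value.lstrip("@").rstrip("/")
    if not _HANDLE.fullmatch(value):
        raise ValueError(f"not a recognisable Twitter handle: {raw!r}")
    # Handles are case-insensitive, so lowercase them for consistent matching.
    return "@" + value.lower()


for variation in ["@jtonline", "JTOnline", " https://twitter.com/jtonline/ "]:
    print(normalize_twitter_handle(variation))
```

All three variations come out as the same value, which is what you’d want before any matching or duplicate detection.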

Check out the MDM Developers community for much more useful MDM related posts, forums and other resources.

Master Data Management links: March


It’s been a while since I last pulled together a few MDM related links, and I haven’t done it in March before, so here are a few sites I’ve been keeping an eye on lately. There’s a loose theme this month as well; these are a few of the MDM communities that are out there.

First up is a brand new group on LinkedIn. Created by Henrik at the end of February, this group already has 125 members and has immediately sparked some interesting discussions. (The group may be new but Henrik has been blogging for a while.)

There are plenty of other MDM groups on LinkedIn, and I’m a member of a few others, but so far none have really stood out. I’m hoping Multi-Domain MDM maintains momentum after its good start.

The next link is a community I’ve been a member of for a bit longer, in fact it featured in my first MDM links post! Dan was one of the earliest MDM bloggers I discovered, and is still posting on the Hub Designs Blog, so I’m not surprised this community is still going strong.

Blogging provided strong foundations for both these communities, and in some respects you don’t really need a purpose built ‘community’ site. An active blog with plenty of posts and discussions is just as good, and I’ve recently been enjoying this one:

And finally, communities are all about people, and that’s what twitter is good at. Lists make it easy to share people worth following and luckily Dan has one already, which saved me creating one myself!

Are there any other MDM communities you’d recommend?

New business cards


I get easily distracted when I’m attempting to tidy up, so when I found a pile of business card sized bits of paper I ended up making some DIY business cards instead. (If I had to choose between an old fashioned printing press and the latest/greatest 3D printer, I’m not sure which I’d go for… can I have both please?)

All ready for any future Tuesday Tweetup or Winchester Web events I make it to… except I’ll have probably lost them under a pile of stuff that needs tidying up by then!

Get off my hashtag


I’m still not completely convinced by hashtags on twitter. On the plus side, they can make following what people are saying about a show (#bbcrevolution) or an event (#iod2010) easier. On the other hand, these are a few random thoughts about the downside to hashtags…

Hashtags are common property, which is not a problem when people are cooperating to join threads of conversation, but it’s easy to see how that could #fail:

  • I might think #iod is a great tag for Information on Demand, but there are plenty of others who think it means something else.
  • An ‘official’ hashtag can avoid some of the confusion but, even if you manage to stake your claim to something unique enough, you can’t control it. While #bbcrevolution was talking about denial of service attacks, I was thinking about how easy it would be for anyone to launch a denial of hashtag attack. Not to mention when marketing tags get hijacked.
  • If there isn’t an official or obvious hashtag for something, it’s easy to end up with multiple hashtags. A bit of discussion can usually get things on track, but it always seems a little odd talking about the tag, rather than the subject of the tag.

And finally, the situation that got me thinking about hashtags in the first place: when different people use the same tag for almost the same thing, especially if one of those uses is much noisier than the other(s). In this case, tweets about the Current Cost meter and automated Current Cost meter readings have both used the #currentcost tag. Not a huge problem, except it would be easy to miss interesting information if the tag was swamped by even a few tweetjects posting meter readings:

“seems a shame that @mmnHouse is inserting the #currentcost hashtag to their house temp and elec reading. creates major noise” @yellowpark

“thinking we need a new hashtag for #CurrentCost stuff: one for bots and noisy automated stuff, another for discussion. what do people think?” @dalelane

“@dalelane How about #CurrentCostData ? I agree my searches are becoming muddled with people’s bots, and not information on #currentcost” @cumbers

“@dalelane Agreed – It frustrates me no end having countless #currentcost tweets popping up all day!” @markphelan

“thinking we should use a specific hashtag for tweeting #currentcost data to avoid creating noise. any suggestions? #ccdata ?” @yellowpark

“Moving from #currentcost hashtag to #mymeter with a data format for auto graphing.Join in discussion at http://is.gd/7b0si (via @ScaredyCat)” @stuartpoulton

So one solution is to agree on uses for #currentcost, #currentcostbot, #ccdata, #mymeter, etc. which is likely to work reasonably well for the Current Cost audience, but it may not be as practical for every situation.

Alternatively, as well as being able to mute retweets, it would be handy to be able to mute selected people using a hashtag in a way you’re not interested in. Even better if lists could be muted: if I could mute any tweets containing the #currentcost hashtag from anyone in my @jtonline/tweetject list, this problem goes away.
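
The list-based mute is simple enough to sketch. This is not a real twitter feature or API, just an illustration of the filtering rule: hide a tweet only when its author is in the muted list and it uses the muted hashtag. The Tweet class, the account names and the tweet text below are all invented.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Tweet:
    author: str
    text: str


def hashtags(text):
    """Return the lowercased hashtags in a tweet, without the leading '#'."""
    return {tag.lower() for tag in re.findall(r"#(\w+)", text)}


def visible(tweets, muted_tag, muted_authors):
    """Drop tweets that use muted_tag and come from an author in muted_authors."""
    tag = muted_tag.lstrip("#").lower()
    muted = {author.lstrip("@").lower() for author in muted_authors}
    return [
        tweet for tweet in tweets
        if not (tweet.author.lstrip("@").lower() in muted and tag in hashtags(tweet.text))
    ]


# Invented timeline: one bot and one person using the same tag.
timeline = [
    Tweet("@meterbot", "Power 350W, temp 19C #currentcost"),
    Tweet("@someone", "Anyone graphing their #CurrentCost data yet?"),
    Tweet("@meterbot", "Back online after a restart"),
    Tweet("@meterbot", "Power 410W #currentcostbot"),
]

for tweet in visible(timeline, "#currentcost", ["@meterbot"]):
    print(tweet.author, tweet.text)
```

Only the first tweet is hidden: the discussion tweet survives because its author isn’t in the list, and the bot’s other tweets survive because they don’t use that exact tag.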

So, a fairly random collection of #thoughts on #hashtags. What are yours?

Pic and Mix


Unfortunately decorating the bathroom is higher up the to do list than blogging at the moment — I’d rather be blogging as I’ve yet to gas myself typing on a computer! — so I haven’t had a chance to mention some cool and interesting things that have been cluttering up my list of open browser tabs. While I wait for the paint fumes to subside before going to bed, here are a few of them, in no particular order…

First from the Mix and Mash Blog, and giving this post its title, Pic and Mix project from Kent County Council: I wonder if Eastleigh do anything similar.

From John’s Random Musings, Exposing your WebSphere logs as ATOM feeds: definitely want to give this a try with MDM Server.

From knolleary.net, Twitterlogue: wish twitter had been around when I was in New Zealand. Brilliant.

From developerWorks, Leverage DataPower SOA Appliances to extend InfoSphere Master Data Management Server security capabilities: looks interesting but I haven’t had a chance to read it in detail yet.

And finally, also from developerWorks, two new articles for the user interface generator:

The IET gets sociable


I recently read Yes, we can twitter while catching up on some E&T reading. Probably the most interesting bit for me was seeing @TheIET is also on twitter, so I tore off the bottom of the page with the link on it to check out later. Web 0.1 bookmarking then; I still like reading on paper.

The IET twitter account doesn’t look like it’s progressed beyond getting their brand on there; they don’t follow anyone, have a surprisingly small 168 followers and don’t seem to be talking to anyone. Still, hopefully it’s just a small beginning and, amongst the links to their web site, I did spot a press release about the launch of the new IET social networking site! That news somehow passed me by until now, so I’ve been investigating to see what it offers.

To start with, I have yet another profile, which is not a big surprise. Earlier today I was scratching my head over a spiced up developerWorks profile. The new site has a bookmarking service which, for anyone in the IET new to such things, is great. While I already use delicious for my own bookmarks, IET Discover combines bookmarks with groups, in what looks like quite a similar way to Lotus Connections. There’s already a good selection of groups, although I’ve not found any that appear that active yet. Groups have always been a bit of a mystery to me in things like Facebook, never quite fulfilling their apparent potential, mostly ending up as little more than a way to tag yourself as being interested in something.

Talking of tagging, from what I can tell on first look, I can tag my own profile, but other people can’t tag me, which seems like a missed opportunity. I think there’s much more value in tagging other people. In networks where you can tag yourself, I tend to have a poor attempt to start with, and then never return to keep the tags up-to-date.

And finally, I can watch people… except so far I’ve not found anyone to watch. I’m guessing it’s much like adding people to your delicious network.

Overall, it’s an interesting foray into the world of social networking. Like LinkedIn, it has a more professional focus, but it feels more limited by association with a single professional body. With recent homecamp, arduino and related projects in mind, I joined the electronic circuits group, but there are already more established social networks around those topics, whether IET members or not. Having said that, I think there is a place for more focused social networks. For example, I’m a big fan of developerWorks, where I’ve been trying to get some momentum for a community around the MDM Workbench, which is after all a pretty niche topic. So IET Discover looks interesting, and it has the potential to get me more involved in the IET. Time will tell how it turns out… maybe @TheIET will share their view…