It’s been a fun year learning new stuff, and along the way Andy Piper helped out with a bite-sized architectural debate while I was experimenting with a Hadoop service on Bluemix. Having a short-lived/disposable memory, I thought it would be worth posting the discussion here for future reference…
@jtonline: Still pondering how a hadoop buildpack might compare to a hadoop service
@andypiper: @jtonline why would you want a buildpack for Hadoop – surely data store = service (broadly) not runtime. #cloudfoundry
@jtonline: @andypiper hmm, maybe, but you want to bring the processing to the data don’t you? Currently seems like services will hold big data in silos
@jtonline: @andypiper for example, I might want to use one of the address verification services from my map reduce job. I’m probably missing something.
@andypiper: @jtonline multiple services can be bound to multiple apps. And you can call jobs in those services from those apps.
@andypiper: @jtonline PivotalHD ships as a service in PivotalCF – obviously you may need data access libraries in the buildpack for the app.
@jtonline: @andypiper not convinced hadoop is just a data store. Do I need apps on runtime to kick off oozie jobs with details of other services?
@andypiper: @jtonline the runtime/service debate on CF has been a long one but I think fairly clean/clear. I’d see Hadoop as a shared resource.
@andypiper: @jtonline bear in mind buildpack -> droplet -> runnable containerised app instance.
@jtonline: @andypiper agreed. Maybe what I’m missing is an easy way to wire services together?
@andypiper: @jtonline yeah maybe – you end up with apps acting as service coordinators I guess.
@andypiper: @jtonline coupled with the fact that apps are intentionally short-lived and best stateless… interesting architectural debate :-)
@andypiper: @jtonline (for “short-lived” read “disposable” my bad)
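For my own future reference, here is a rough sketch of what “apps acting as service coordinators” might look like in practice. Cloud Foundry hands an app the details of its bound services as JSON in the VCAP_SERVICES environment variable, so a small stateless app could read the credentials for several services and pass them on when it kicks off a job. The service labels, instance names and URLs below are made up for illustration (they are not real Bluemix offerings), and the step of actually submitting a job is left out because it depends entirely on the service.

```python
import json
import os


def bound_services(env=None):
    """Return {instance_name: {"label": ..., "credentials": ...}} for each bound service.

    Reads the VCAP_SERVICES JSON that Cloud Foundry provides to the app.
    Passing `env` explicitly makes the function easy to try outside CF.
    """
    env = os.environ if env is None else env
    services = json.loads(env.get("VCAP_SERVICES", "{}"))
    found = {}
    for label, instances in services.items():
        for instance in instances:
            found[instance["name"]] = {
                "label": label,
                "credentials": instance.get("credentials", {}),
            }
    return found


# Hypothetical bindings: both labels, names and URLs are assumptions.
example_env = {
    "VCAP_SERVICES": json.dumps({
        "hadoop-service": [{
            "name": "my-hadoop",
            "label": "hadoop-service",
            "credentials": {"url": "https://hadoop.example.invalid"},
        }],
        "address-verification": [{
            "name": "my-address-check",
            "label": "address-verification",
            "credentials": {"url": "https://addresses.example.invalid"},
        }],
    })
}

if __name__ == "__main__":
    # The coordinating app would hand these details to whatever job it starts.
    for name, details in bound_services(example_env).items():
        print(name, details["label"], details["credentials"].get("url"))
```

The app itself stays disposable: it holds no state of its own and simply looks up the bindings again each time it starts.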
It should be an interesting 2015.