I’ve been taking a break from Liberty and JAX-RS recently to start tinkering with IBM’s BigInsights Hadoop distribution. To make things easier (and more interesting), my first attempts used the Analytics for Hadoop service on Bluemix. In case it helps anyone, here’s what I ended up with before needing to install BigInsights myself:
And this is the script I used to upload data in the video (unfortunately I didn’t have any luck using the HttpFS API):
#!/bin/sh
BIUSER=biblumix
BIPASSWORD=password
BIURI=https://hostname:8443/data/controller/dfs

# Create the target directory (isfile=false)
curl -iv --user "${BIUSER}:${BIPASSWORD}" --insecure -X POST "${BIURI}/user/${BIUSER}/sample-data?isfile=false"

# Upload the two data files into that directory
curl -iv --user "${BIUSER}:${BIPASSWORD}" --insecure -X POST "${BIURI}/user/${BIUSER}/sample-data/orgdata.unl" --header "Content-Type:application/octet-stream" --header "Transfer-Encoding:chunked" -T "orgdata.unl"
curl -iv --user "${BIUSER}:${BIPASSWORD}" --insecure -X POST "${BIURI}/user/${BIUSER}/sample-data/persondata.unl" --header "Content-Type:application/octet-stream" --header "Transfer-Encoding:chunked" -T "persondata.unl"
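If you have more than a couple of files, the same requests can be wrapped in a loop. This is only a rough sketch: `upload_url` and `upload_file` are helper names made up here for illustration, the hostname, user and password are the same placeholders as in the script above, and it assumes the endpoint keeps behaving the way it did for those two uploads.

```shell
#!/bin/sh
# Same placeholder values as the original script.
BIUSER=biblumix
BIPASSWORD=password
BIURI=https://hostname:8443/data/controller/dfs
TARGET_DIR="/user/${BIUSER}/sample-data"

# Hypothetical helper: print the upload URL for a local file,
# using only the file's base name under TARGET_DIR.
upload_url() {
  printf '%s%s/%s\n' "${BIURI}" "${TARGET_DIR}" "$(basename "$1")"
}

# Hypothetical helper: upload one file with the same curl options
# the original script uses.
upload_file() {
  curl -iv --user "${BIUSER}:${BIPASSWORD}" --insecure -X POST "$(upload_url "$1")" \
    --header "Content-Type:application/octet-stream" \
    --header "Transfer-Encoding:chunked" -T "$1"
}

# Only talk to the server when files are named on the command line.
if [ "$#" -gt 0 ]; then
  # Create the target directory first, as in the original script.
  curl -iv --user "${BIUSER}:${BIPASSWORD}" --insecure -X POST "${BIURI}${TARGET_DIR}?isfile=false"
  for f in "$@"; do
    upload_file "$f"
  done
fi
```

Saved as, say, `upload.sh`, it would be run as `sh upload.sh orgdata.unl persondata.unl`. Run without arguments, it does nothing.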
Notes:
- Since recording the demo, Bluemix has added a United Kingdom region; however, it looks like the Analytics for Hadoop service is currently only available in the US South region.
- There is also now a BigInsights service on Bluemix, which allows you to provision multi-node Hadoop clusters.