Tuesday, August 9, 2016

H2O Upgrade: Detailed Steps

H2O Upgrade: Detailed Steps

I wanted to upgrade both R and Python to the latest version of H2O (as of Aug 9th 2016). Here are the exact steps, and I think you will find them relevant even if you only need to update one or the other of those clients. Remember if following this at a later date, that you should follow the spirit of it, rather than copy-and-pasting: all the version numbers will have changed. This was with Linux Mint, but it should apply equally well to all other Linux distros.
Make sure you first close any R or Python clients that are using the H2O library; and separately shutdown H2O if it is still running after that.

The First Time

The below instructions are all for upgrading, which means I know I have all the dependencies in place. If this is your first H2O install, well I’d first recommend you buy my new book: Practical Machine Learning with H2O, published by O’Reilly… But it is not out as I type this! So let me know if you are interested, and I’ll not just let you know when it is available, but also send the best discount code I can find.
As a quick guide, from R I recommend you use CRAN
install.packages("h2o")
and from Python I recommend you use pip:
pip install h2o
Both these approaches get all the dependences for you; you may end up back a version or two from the very latest, but it won’t matter.

The Download

cd /usr/local/src/
wget http://download.h2o.ai/versions/h2o-3.10.0.3.zip
unzip h2o-3.10.0.3.zip
It made a “h2o-3.10.0.3” directory, with python and R sub-directories.

R

I installed the h2o the first time as root, so I will continue to do that, hence the sudo:
cd /usr/local/src/h2o-3.10.0.3/R/
sudo R
Then:
remove.packages("h2o")
install.packages("h2o_3.10.0.3.tar.gz")
Then ctrl-d to exit.

Python

cd /usr/local/src/h2o-3.10.0.3/python/
sudo pip uninstall h2o
sudo pip install -U h2o-3.10.0.3-py2.py3-none-any.whl 
(The -U means upgrade any dependencies; the first time I forgot it, and ended up with some very weird errors when trying to do anything in Python.)

The Test

I started RStudio, and ran:
library(h2o)
h2o.init(nthreads=-1)
as.h2o(iris)
I then started ipython and ran:
import h2o,pandas
h2o.init()
iris = h2o.get_frame("iris")
print(iris)
As well as making sure the data arrived, I’m also checking the h2o.init() call in both cases said the cluster version was “3.10.0.3”.

AWS Scripts

If you use the AWS scripts (https://github.com/h2oai/h2o-3/tree/master/ec2) and want to make sure EC2 instances start with exactly the same version as you have installed locally, the file to edit is h2o-cluster-download-h2o.sh. (If not using those scripts, just skip this section.)
First find the h2oBranch= line and set it to “rel-turing” (notice the “g” on the end - there is also a version without the “g”!). Then comment out the two curl calls that follow, and instead set version to be whatever you have above, and build to be the last digit in the version number. So, for 3.10.0.3, I set:
h2oBranch=rel-turing

#echo "Fetching latest build number for branch ${h2oBranch}..."
#curl --silent -o latest https://h2o-release.s3.amazonaws.com/h2o/${h2oBranch}/latest
h2oBuild=3

#echo "Fetching full version number for build ${h2oBuild}..."
#curl --silent -o project_version https://h2o-release.s3.amazonaws.com/h2o/${h2oBranch}/${h2oBuild}/project_version
h2oVersion=3.10.0.3
The rest of that script, and the other EC2 scripts, can be left untouched.

Summary

Well that was easy! No excuses! Having said that, I recommend you upgrade cautiously - I have seen some hard-to-test-for regressions (e.g. model learning no longer scaling as well over a cluster) when grabbing the latest version.

Monday, March 21, 2016

WebGL: RWD, Mobile-First And Progressive Enhancement

WebGL: adapting to the device

This is a rather under-studied area, but it is going to become more important as WebGL is increasingly used to make websites. This article is a summary of my thoughts and learnings on the topic, so far.

I should say that my context is using WebGL for things other than games. Informational websites, educational apps, data visualization, etc., etc.

Please use the comments to add related links you can recommend; or just to disagree.

What is RWD?

Responsive Web Design. The idea is that, rather than make a mobile version of your website and a separate desktop version of your website, you make a single version in such a way that it will adapt and be viewable on all devices, from mobile phones (both portrait and landscape), through tablets, to desktop computers.

Mobile-First? Progressive Enhancement?

Mobile First is the idea that you first make your site work on the smallest screen, and the device with least capability. Then for the larger screens you add more sections.

This is contrast to starting at the other end: make beautiful graphics designed for a FullHD desktop monitor, using both mouse and keyboard, then hiding and removing things as you move to the smaller devices.

Just remember Mobile-First is a guideline, not a rule. If you end up with a desktop site where the user is getting frustrated by having to simulate touch gestures with their mouse, then you’ve missed the point.

It Is Hard!

RWD for a complex website can get rather hard. On toy examples it all seems nice and simple. But then add some forms. Make it multilingual. Add some CSS transitions and animations. Add user uploaded text or images. Then just as you start to crack under the strain of all the combinations that need testing, the fatal blow: the client insists on Internet Explorer 8 being supported.

But if you thought RWD and UX for normal websites was hard, then WebGL/3D takes it to a whole new dimension…

Progressive Enhancement In 3D

Progressive enhancement can be obvious things like using lower-polygon models and lower-resolution textures, or adding/removing shadows (see below).

But it can also be quite subtle things: in a 3D game presentation I did recently, the main avatar had an “idle” state animation: his chest moved up and down. But this requires redrawing the whole screen 60 times a second; without that idle animation the screen only needs to be redrawn when the character moves. Removing the idle animation can extend mobile battery life by an order of magnitude.

And that can lead to political issues. If you’ve ever seen a designer throw a tantrum just because you re-saved his graphics as 80% quality jpegs, think about what will happen if two-thirds of the design budget, and over three-quarters of the designer’s time went on making those subtle animations, and you’ve just switched them off for most of your users.

By the way, it is also about the difference between zero and one continuous animations. Remember an always-on “flyover” effect counts as animation. An arcade game where the user’s character is constantly being moved around the screen does too. So, once one effect requires constantly re-drawing the scene, the extra load of adding those little avatar animations will be negligible.

Lower-Poly Models

I mentioned this in passing above. Be aware that it is often more involved than with 2D images, where using Gimp/Photoshop to turn the 800x600 image to 320x240 and as lower quality can be automated. In fact you may end up with doubling your designer costs, if they have to make two versions.

If the motivation for low-poly is to reduce download time, you could consider running a sub-surf modifier once the data has been downloaded. Or describing the shape with a spline and dynamically extrude it.

If the motivation is to reduce the number of polygons to reduce CPU/GPU effort, again consider the extrude approach but using different splines, and/or different bevels.

Shadows

Adding shadows increases the realism of a 3D scene, but adds more CPU/GPU effort. Also more programmer effort: you need to specify which objects cast shadows, which objects receive shadows, and which light sources cast shadows. (All libraries I mentioned in Comparison Of Three WebGL Libraries handle shadows in this way.)

For many data visualization tasks, shadows are unnecessary, and could even get in the way. Even for games they are usually an optional extra. But in some applications the sense of depth that shadows give can really improve the user experience (UX).

If you have a fixed viewing angle on your 3D scene, and fixed lighting, you can use pre-made shadows: these are simply a 2D graphic that holds the shadow.

VR

With virtual reality headsets you will be updating two displays, and it has to be at a very high refresh rate, so it is demanding on hardware.

But virtual reality is well-suited for progressive enhancement: just make sure your website or application is fully usable without it, but if a user has the headset they are able to experience a deeper immersion.

Controls In 3D

Your standard web page didn’t need much controlling: up/down was the only axis of movement, and being able to click a link. Touch-only mobile devices could adapt easily: drag up/down, and tap to follow a link.

Mouseover hints are usually done as progressive enhancements, meaning they are not available to people not using a device with a mouse. (Meaning in mobile apps I often have no idea what all the different icons do…)

If your WebGL involves the user navigating around a 3D world, the four arrow keys can be a very natural approach. But there is no common convention on a touch-only device. Some games show a semi-transparent joystick control on top, so you press that for the 4 directions. Others have you touch the left/right halves of the screen to steer left and right, and perhaps you move at a constant speed.

Another approach is to touch/click the point you want to move to, and have your app choose the route, and animate following it.

Zoom is an interesting one, as the approach for standard web sites can generally be used for 3D too. There are two conventions on mobile: the pinch to grow/shrink, or double-tap to zoom a fixed distance (and double-tap to restore). With a mouse, the scroll-wheel, while holding down ctrl, zooms. With only a keyboard, ctrl and plus/minus, with ctrl and zero to restore to default zoom.

Friday, March 18, 2016

Timestamp helper in Handlebars

Handlebars is a widely-used templating language for web pages. In a nutshell, the variables to insert go between {{ and }}. Easy. It offers a few bits of logic, such as if/else clauses, for-each loops, etc. But, just as usefully, Handlebars allows you to add helper functions of your own.
In this article I will show a nice little Handlebars helper to format datestamps and timestamps. Its raison d’etre is its support for multiple languages and timezones. The simplest use case (assuming birthday is their birthday in some common format):
<p>Your birthday is on {{timestamp birthday}}.</p>
It builds on top of sugar.js’s Date enhancements; I was going to do this article without using them, to keep it focused, but that would have made it unreasonably complex.
There are two ways to configure it: with global variables, or with per-tag options. For most applications, setting the globals once will be best. Here are the globals it expects to find:
  • tzOffset: the number of seconds your timezone is ahead of UTC. E.g. if in Japan, then tzOffset = 9*3600. If in the U.K. this is either 0 or 3600 depending on if it is summer time or not.
  • lang: The user-interface language, e.g. “en” for English, “ja” for Japanese, etc.
(By the way, if setting lang to something other than “en”, you will also need to have included locale support into sugar.js for the languages you are supporting - this is easy, see the sugar.js customize page, and check Date Locales.)
The default timestamp format is the one built-in to sugar.js for your specified language. All these configuration options (the two above, and format) can be overridden when using the tag. E.g. if start is the timestamp of when an online event starts, you could write:
<p>The live streaming will start at
{{timestamp start tzOffset=0}} UTC,
which is {{timestamp start tzOffset=32400}}
in Tokyo and {{timestamp start tzOffset=-25200}}
in San Francisco.</p>
Here is the basic version:
Handlebars.registerHelper('timestamp', function(t, options){
var offset = options.hash.tzOffset;
if(!offset)offset = tzOffset;

if(!Object.isDate(t)){
    if(!t)return "";
    if(Object.isString(t))t = Date.create(t + "+0000").setUTC(true).addSeconds(offset);
    else t = Date.create(t*1000).setUTC(true).addSeconds(offset);
    }
else t = t.clone().addSeconds(offset);

if(!t.isValid())return "";

var code = options.hash.lang;
if(!code)code = lang;   //Use global as default

var format = options.hash.format ? options.hash.format : '';
return t.format(format, lang);
});
The first two-thirds of the function turn t into a Date object, coping whether it was already a Date object, or a string (in UTC, and in any common format the Date.create() can cope with), or a number (in which case it is seconds since Jan 1st 1970 UTC). However, be careful if giving a pre-made Date object: make sure it was the time in UTC and specifies that is in UTC.
The rest of the function just chooses the language and format, and returns the formatted date string.
If you were paying attention you would have noticed t stores a lie. E.g. for 5pm BST, t would be given as 4pm UTC. We then turn it into a date that claims to be 5pm UTC. Basically this is to stop format() being too clever, and adjusting for local browser time. (This trick is so you can show a date in a browser for something other than the user’s local timezone.)
But it does mean that if you include any of the timezone specifiers in your format string, they will wrongly claim it is UTC. {{timestamp
theDeadline format="{HH}:{mm} {tz}" }} will output 17:00 +0000.
To allow you to explicitly specify the true timezone, here is an enhanced version:
Handlebars.registerHelper('timestamp', function(t, options){
var offset = options.hash.tzOffset;
if(!offset)offset = tzOffset;   //Use global as default
if(!Object.isDate(t)){
    if(!t)return "";
    if(Object.isString(t))t = Date.create(t + "+0000").setUTC(true).addSeconds(offset);
    else t = Date.create(t*1000).setUTC(true).addSeconds(offset);
    }
else t = t.clone().addSeconds(offset);
if(!t.isValid())return "";

var code = options.hash.lang;
if(!code)code = lang;   //Use global as default

var format = options.hash.format ? options.hash.format : '';
var s = t.format(format, lang);
if(options.hash.appendTZ)s+=tzString;
if(options.hash.append)s+=options.hash.append;
return s;
});
(the only change is to add a couple of lines near the end)
Now if you specify appendTZ=true then it will append the global tzString. Alternatively you can append any text you want by specifying append. So, our earlier example becomes one of these:
{{timestamp theDeadline format="{HH}:{mm}" appendTZ=true}}
{{timestamp theDeadline format="{HH}:{mm}" append="BST"}}
{{timestamp theDeadline format="{HH}:{mm}" append=theDeadlineTimezone}}
The first one assumes a global tzString is set. The second one hard-codes the timezone, which is unlikely to be the case; the third one is the same idea but getting timezone from another variable.
VERSION INFO: The above code is for sugar.js v.1.5.0, which is the latest version at the time of writing, and likely to be so for a while. If you need it for sugar.js 1.4.x then please change all occurrences of setUTC(true) to utc().

Wednesday, March 9, 2016

Comparison Of Three WebGL Libraries

Comparison Of Three WebGL Libraries

For many people, WebGL is a technology for making browser-based games, but I am more interested in all the other uses: data visualization, data presentation, making web sites look fantastic, new and interesting user experience (UX), etc. (I have spent many years using Flash for similar things.)

What is WebGL?

WebGL is an API to allow browsers to use a GPU to speed up 2D and 3D graphics; you write in a mix of JavaScript and a shader language. Because it is low-level and complex I recommend against writing in raw WebGL; use a library instead.

It is supported on just about any popular OS/browser combination, including working on tablets and mobile phones. Your device does not need to have a dedicated GPU to run WebGL.

What libraries are there?

There are actually quite a few choices, but for this article I will focus on the three libraries I have made (non-trivial) WebGL applications with:
The first two are fairly low-level (Babylon.JS has a few more abstractions built-in), meaning you will be thinking in terms of vertices, faces, 3D coordinates, cameras, lighting, etc. A 3D graphics background will be useful. Superpowers is higher-level, but more focused on games development. Some Blender (or equivalent) skills will also come in handy, whichever library you go for.

Three.js And Its Resources


Three.JS is the most established WebGL library, with some published books, many demos (http://threejs.org/, https://stemkoski.github.io/Three.js/ and others), even a Udacity course.

However it has scant regard for backwards compatibility, meaning that frequently the code in the published books (or the source code of older demos and tutorials) will not work with the latest library version. It has a relatively aggressive developer community, who think that having an uncommented demo of a feature counts as documentation.

It uses the MIT license (the most liberal open source - fine for commercial use), hosted on github; bug reports to github, but support questions to StackOverflow’s [three.js] tag.

Babylon.js And Its Resources

Babylon.JS is now two years old, and was developed at Microsoft in France, though it is open source (Apache license, so fine for commercial use). It is primarily intended for making games, but is flexible enough for other work.

Like Three.JS, it has plenty of demos, and again they are often undocumented. There is an active web forum; explanations and experiments there often link to the Babylon Playground, which is a live coding editor. There is also a very useful eight hour training video course (free), presented by the two David’s who created Babylon.JS. (There is a just-released book, https://www.packtpub.com/game-development/babylonjs-essentials, but I’ve not seen it, so cannot comment.)

Superpowers And Its Resources

Superpowers is a bit different: it is a gaming system, with its own IDE. It is very new, only released as open source (ISC license, which is basically the nice liberal MIT license again) in the middle of January 2016, though appears to have a year’s closed development behind it. (The IDE is cross-platform; it has been running nicely for me on Linux, I’ve not tried it on other platforms.)

Some of the initial batch of demos and games have been released on GitHub (kind of as open-source - the licenses are a bit vague, especially regarding re-use of assets), which has been my main source of learning. A few tutorials have also appeared recently (GameFromScratch.com, and https://itch.io/board/11494/tutorials-guides).

What grabbed my attention was the quality and completeness of the Fat Kevin game, combined with the fact that I could download all source and assets for it, to learn from. (The Discover Superpowers demo is similar, but simpler, so easier to learn from.)

Support is through forums on itch.io, with separate English and French sections. This requires yet another user account; I find it a shame they didn’t use StackOverflow, Github, or at least HTML5 Game Devs (as Babylon did). I’d not heard of itch.io (“an open marketplace for independent digital creators with a focus on independent video games”) before, but I think their choice tells you how they see Superpowers being used.

The coding language is TypeScript, basically JavaScript 1.8 plus types; it is worth specifying those types, as then the IDE’s helpful autocomplete can work. Note that Superpowers is closely tied to the IDE - you need to be clicking and dragging things; doing everything in code is not realistic (though this might just be the style of the initial few games). Superpowers is built on Three.JS, but I’m not seeing anything exposed, so I don’t think you can take a Three.JS example and use it directly.

Conclusion

Which library to choose? I suggest you try out the demos for each of these, and choose the library that has demos that cover all the things you want to do. If the choice comes down to Three.JS vs. Babylon.JS, and you cannot find a killer reason to choose one over the other, this is because it doesn’t really matter, they can each do 95+% of what the other can: follow your hunch, choose one or the other, and dive in and learn it.

Finally I should say that WebGL for website development is hard: your programmer(s) will need 3D experience, as will your graphic designer(s). If you are using RWD/mobile-first to target both mobile and desktop, it is even more complex. My company, QQ Trend Ltd. can help (contact me at dc [at] qqtrend.com).