Tuesday, November 19, 2013

input type=numeric fix for Android 4.x

Gosh, RWD (responsive web design) combined with targeting desktop, iOS and Android browsers all at once is hard work. I thought I was there when I got my  <input type="number" />  box working in all of Firefox, Chrome, Opera and Android 2.3, to my satisfaction. (Not identical behaviour: the up/down buttons only appear in Chrome and Opera, but that is the principle of progressive enhancement, and good enough.) I had to add a hack for Android 2.3, which puts things one pixel out of alignment in the desktop browsers at 400px or less. I can live with that, assuming desktop users won't be at 400px or less.

It worked in iOS7 (iPhone). Yeah and Phew!

On Android 4.2, native browser, there is this gap on the right of the input boxes!?!? It is just wide enough for the up/down buttons, but they are not drawn. Grrr....
(Some googling found someone on 4.0 with the same problem; other than that it seems almost unknown!)

Let me cut straight to the fix:

@media (max-width: 800px) {
  input[type=number]::-webkit-outer-spin-button {
    -webkit-appearance: none;
    margin: 0;
  }
}
All of that just came to me in a dream, and it worked first time. (No, actually, I got it from this StackOverflow answer.)

The media query is crude, as it means desktop Chrome won't show the buttons if you make the browser narrower than that, whereas a tablet with higher screen resolution will still show the gap. So, I'll probably switch to using user-agent detection later on.
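If and when I do switch to user-agent detection, it might look something like this rough sketch (plain JavaScript; the regex and the "android" class name are my own guesses, nothing official):

```javascript
// Hypothetical sketch: tag the document when the UA looks like Android,
// then scope the spin-button CSS to ".android" instead of the media query.
function isAndroidUA(ua) {
    return /Android/.test(ua);
}
// In the page, something like:
// if (isAndroidUA(navigator.userAgent)) {
//     document.documentElement.className += " android";
// }
```

The CSS rule then becomes .android input[type=number]::-webkit-outer-spin-button { ... }, so desktop Chrome keeps its buttons at any width.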

Tuesday, November 12, 2013

I just discovered a new way to loop through months, up to the current month. I used to have horrible code like this:

    for($y = 2009; $y <= date("Y"); ++$y)   //Start year assumed, for illustration
      for($m = 1; $m <= 12; ++$m){
        if($y==date("Y") && $m>=date("m"))continue;
        $t = strtotime("$y-$m-01");
        echo date("M_Y",$t)."\n";
      }

If you want the number of days in a month, take $t and do $days_in_month = date("t",$t);

Here is my new solution:

$start = strtotime("2009-01-15");
$now = time();
for($t = $start; $t <= $now; $t += 86400*30.4375){  //~one month in seconds (increment reconstructed)
    echo date("M_Y",$t)."\n";
}

I.e. choose a day in the middle of the month, and do everything in seconds. Fewer calls to date() and shorter code. Obvious in hindsight!

Thursday, November 7, 2013

Saving downloaded files in SlimerJS (and Casper and Phantom)

It seems a common request is to be able to see not just the HTML of the main page that PhantomJS/SlimerJS are downloading, but also all the other files (images, CSS, JavaScript, fonts, etc.) that are being fetched. You can use onResourceReceived to see them being fetched, but not their body.

The situation with PhantomJS is a bit confusing: I believe there is a patch to allow this, but it hasn't been applied yet. There is also a download API being proposed (or possibly already implemented), but that appears to be for the special case of files that have a Content-Disposition: attachment header. (?)

In SlimerJS it is possible to use response.body inside the onResourceReceived handler. However, to prevent using too much memory, it captures nothing by default: you first have to assign an array of regexes to page.captureContent, saying which files to capture. Each regex is applied to the mime-type. In the example code below I use /.*/ to mean "get everything"; using [/^image\/.+$/] would get just images, etc.
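To illustrate how an array of regexes filters mime-types (plain JavaScript, runnable outside SlimerJS; shouldCapture is just a hypothetical helper, not part of the SlimerJS API):

```javascript
// Each regex in the captureContent array is tested against the mime-type;
// a response is captured if any of them match.
var imagesOnly = [/^image\/.+$/];
function shouldCapture(filters, mimeType) {
    return filters.some(function(re){ return re.test(mimeType); });
}
console.log(shouldCapture(imagesOnly, "image/png"));  // true
console.log(shouldCapture(imagesOnly, "text/css"));   // false
console.log(shouldCapture([/.*/], "text/css"));       // true
```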

The code sample below downloads and saves all files. It is complete; you just have to edit the url at the top.

var url="http://...";

var fs=require('fs');
var page = require('webpage').create();

page.captureContent = [ /.*/ ];

page.onResourceReceived = function(response) {
//console.log('Response (#' + response.id + ', stage "' + response.stage + '"): ' + JSON.stringify(response));
if(response.stage!="end" || !response.bodySize)return;

var matches = response.url.match(/[/]([^/]+)$/);
if(!matches)return;    //E.g. a url ending in "/"
var fname = "contents/"+matches[1];    //NB: the contents/ directory must already exist

console.log("Saving "+response.bodySize+" bytes to "+fname);
fs.write(fname, response.body, 'wb');
};

page.onResourceRequested = function(requestData, networkRequest) {
//console.log('Request (#' + requestData.id + '): ' + JSON.stringify(requestData));
};

page.open(url, function(status){
phantom.exit();
});

It is verbose in that it says what it is saving. If you want it much more verbose, to see what other information is passing back and forth, there are two logging lines commented out.

WARNING: this works in SlimerJS 0.9 (and should work in 0.8.x), but the API may change in future (to keep in sync with PhantomJS).

Thursday, October 24, 2013

Why does JavaScript's LocalStorage behave like SessionStorage?

I added LocalStorage to a web page. It seemed to be working well: when I reloaded the page, it found and used the existing data. Closing the window while another browser window was still open, then loading the URL again, also worked fine: it found the previous data.

But when I closed all browser windows and restarted, the data was gone. In other words, it appeared to be storing the data per-session, not indefinitely as should be happening. This was happening in both Firefox and Chrome!

It turns out it was configuration, in each browser! See below for the solutions for each. What is frustrating is that neither browser gave any error message; I'm not even sure there is a way to query that the storage is going to end up session-only. I didn't even expect LocalStorage to have the same privacy policy as cookies: cookies get sent back to the server, whereas LocalStorage does not.
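As a partial workaround you can at least probe that writes work at all; as far as I know there is no way to detect the "cleared on exit" configuration. A sketch (the function name is my own):

```javascript
// Returns false if the storage object rejects writes (e.g. storage
// disabled); cannot detect data that will be wiped when the browser exits.
function storageWritable(storage) {
    try {
        storage.setItem("__probe", "1");
        var ok = (storage.getItem("__probe") === "1");
        storage.removeItem("__probe");
        return ok;
    } catch (e) {
        return false;
    }
}
// In the browser: storageWritable(window.localStorage)
```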

Anyway, to the fixes:


In Firefox Preferences, Privacy tab, I had both "Accept cookies from sites" and accept "third-party cookies" checked. But under 3rd party cookies I had "Keep Until I Close Firefox". When I changed it to "Keep until they expire" then LocalStorage started working.

This is pretty strange, as it sounds like LocalStorage is considered a 3rd party cookie.

By the way, in about:config, there is also dom.storage.enabled. I had this set to true. If it was false that would be another reason it would not work (though I think then it would not work when just pressing reload either).


In Chrome, settings, advanced settings, content settings: under cookies I had "Keep local data only until I quit browser"
One solution is to change that to "Allow local data to be set (recommended)". The alternative is to add an "allow" exception for the domain name in question.

Tuesday, October 15, 2013

SlimerJS: getting it to work with self-signed HTTPS

SlimerJS (as of 0.8.3) lacks the commandline options of PhantomJS to say "relax about bad certificates". Unfortunately the self-signed SSL certificate, that developers typically use during development, counts as a bad certificate.

Here are the steps needed to handle this:

1. slimerjs --createprofile AllowSSL
  Make a note of the directory it has created.
  (You can call your new profile anything, "AllowSSL" is just for example.)

2. Go to normal desktop Firefox, browse to the URL in question, see the complaint, add it as a security exception.
  Chances are, if you have been testing your website already, that you've already done this and you can skip this step.

3. Go to your Firefox profile, and look for the file called "cert_override.txt". Copy that to the directory you created in step 1.

4. Have a look at the copy you just made of "cert_override.txt".
  If it only has the entry you added in step 2, you are done.
  Otherwise, remove the entries you don't want.
  (The file format is easy: one certificate per line.)

5. Now when you need to run slimerjs you must run it with the "-P AllowSSL" commandline parameter.
  E.g. "slimerjs -P AllowSSL httpstest.js"

If you are using SlimerJS with CasperJS (requires CasperJS 1.1 or later), do the same, e.g.
   casperjs test --engine=slimerjs -P AllowSSL tests_involving_https.js

Monday, October 14, 2013

Cursed Closures In Javascript Loops

I hit this so many times: I scratch my head for a while, then spit out "Closures!" like it is a really nasty curse, one that could cause your grandmother to faint.

I then waste half an hour trying to remember how to get around them, struggling to squint at the various StackOverflow answers to see how they relate to my own loop. So, here is a step-by-step example, that hopefully will make sense to me next time I hit this.

I'm using CasperJS here, but that is not too important. Just consider those lines as "something that gets executed later but is using local variables".
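Stripped of CasperJS, the underlying trap can be shown in a few lines of plain JavaScript:

```javascript
// All three callbacks share the single loop variable i, which is already
// 3 by the time any of them runs:
var fns = [];
for (var i = 0; i < 3; ++i) {
    fns.push(function(){ return i; });
}
console.log(fns.map(function(f){ return f(); }));   // [3, 3, 3]

// Wrapping the loop body in an immediately-invoked function gives each
// pass its own copy of the value:
var fns2 = [];
for (var j = 0; j < 3; ++j) (function(a){
    fns2.push(function(){ return a; });
})(j);
console.log(fns2.map(function(f){ return f(); }));  // [0, 1, 2]
```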

Here is the before:
var url="http://example.com/";
var A=['a','b','c'];
var B=['','99'];

for(var ix1 = 0;ix1 < A.length;++ix1){
    for(var ix2 = 0;ix2 < B.length;++ix2){
        var label = A[ix1] + "-" + B[ix2];
        casper.test.begin(label, {
            test:function(test){ runTheTests(test,url,B[ix2],A[ix1]); }
            });
    }
}
And here is the intermediate stage:
var url="http://example.com/";
var A=['a','b','c'];
var B=['','99'];

for(var ix1 = 0;ix1 < A.length;++ix1){
  for(var ix2 = 0;ix2 < B.length;++ix2)(function f(){
    var label = A[ix1] + "-" + B[ix2];
    casper.test.begin(label, {
      test:function(test){ runTheTests(test,url,B[ix2],A[ix1]); }
      });
    })();
}
Then you need to pass into that new function anything outside it that changes on each pass of the loop, i.e. anything involving ix1 or ix2. It ends up looking like this:
var url="http://example.com/";
var A=['a','b','c'];
var B=['','99'];

for(var ix1 = 0;ix1 < A.length;++ix1){
  for(var ix2 = 0;ix2 < B.length;++ix2)(function f(a,b){
    var label = a + "-" + b;
    casper.test.begin(label, {
      test:function(test){ runTheTests(test,url,b,a); }
      });
    })(A[ix1],B[ix2]);
}

Thursday, October 3, 2013

Backing-up a bunch of small files to a remote server

I have a directory, containing lots of files, and I want an off-site, secure backup.

Even though the remote server might be a dedicated server that only I know root password for, I still don't trust it. Because of the recent NSA revelations I no longer consider myself paranoid. Thanks guys, I can look at myself in the mirror again!

As a final restriction, I don't want to have to make any temp files locally: disk space is tight, and the files can get very big.

Here we go:

tar cvf - MY_FOLDER/ | gpg -c --passphrase XXX | ssh REMOTE_SERVER 'cat > ~/MYFOLDER.tar.gpg'

(the bits in capitals are the things you replace.)

  • The "v" in "tar cvf" means verbose. Once you are happy it is working you will want to use "tar cf" instead.
  • The passphrase has to be given on the commandline because stdin is being used for the data! A better way is to put the passphrase in another file: --passphrase-file passfile.txt. However, note this is only "better" on multi-user machines; on a single-user machine there is no real difference.
  • I'm using symmetric encryption. You could encrypt with your key pair, in which case the middle bit changes to: gpg -e -r PERSON (and then you won't need to specify the passphrase).
  • In my case REMOTE_SERVER is an alias to an entry in ~/.ssh/config. If you are not using that approach, you'll need to specify username, port number, identity file, etc. By the way, I'm not sure this method will work with password login, only keypair login, because stdin is being used for the data.
  • Any previous MYFOLDER.tar.gpg gets replaced on the remote server. So, if the connection gets lost halfway during the upload then you've lost your previous backup. I suggest using a datestamp in the filename, or something like that.
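For example, a datestamped variant of the backup command might look like this (same placeholders as above; the filename pattern is just my suggestion):

```shell
# Each run writes a new file on the remote server, so a failed upload
# cannot clobber the previous backup. MY_FOLDER, XXX and REMOTE_SERVER
# are placeholders, as before.
FNAME="MYFOLDER-$(date +%Y%m%d).tar.gpg"
tar cf - MY_FOLDER/ | gpg -c --passphrase XXX | ssh REMOTE_SERVER "cat > ~/$FNAME"
```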
What about getting the data back?

ssh REMOTE_SERVER 'cat ~/MYFOLDER.tar.gpg' | gpg -d --passphrase XXX | tar xf -

You should now have a directory called MYFOLDER, with all your files exactly as they were.

Outstanding questions

Is it possible to use this approach in conjunction with Amazon S3, Google Drive, Rackspace cloud files, or similar storage providers? E.g. 100GB mounted as a Rackspace drive is $15/month (plus the compute instance of course, but I already have that), whereas 100GB as cloud files is $10/month, or $5/month on google drive. ($9.50/month on S3, or $1/month for glacier storage). Up to 15x cheaper: that is quite an incentive.

2013-10-08 Update: The implicit first half of that question is: is there a way to stream stdout to the remote drive (whether using scp or a specific commandline tool).
For Amazon S3 the answer is a clear "no": http://stackoverflow.com/q/11747703/841830 (the size has to be known in advance).
For Google Drive the answer is maybe. There is a way to mount google drive with FUSE: https://github.com/jcline/fuse-google-drive   It looks very complicated, describes itself as alpha, and the URL for the tutorial is a 404.
For Rackspace Cloud Files (and this should cover all OpenCloud providers), you can use curl to stream data! See "Chunked Transfer Encoding" in the Cloud Files developer guide. HOWEVER, note that there is a 5GB limit on a file. That is a show-stopper for me. (Though by using a custom script instead of "ssh REMOTE_SERVER 'cat > ~/MYFOLDER.tar.gpg'", I could track bytes transferred and start a new connection and file name at the 5GB point, so there is still hope. Probably only 10 lines of PHP would do it. But if I'm going to do that, I could just as easily buffer, say, 512MB in memory at a time, and use S3.)

NOTE: Because I've not found an ideal solution yet, I never even got to the implicit second part of the question, which is if the need to "cat" on the remote server side will cause problems. I think not, but need to try it to be sure.

Friday, September 6, 2013

PHP: A jump in performance for free

Not often I just post a link to another blog, but Lorna Jane has some very interesting numbers on PHP versions:

But what is even more interesting is the benchmarks of each PHP version which I'm going to reproduce here:
  • 5.2.17: 3.77 seconds
  • 5.3.23: 2.63 seconds
  • 5.4.15: 1.98 seconds
  • 5.5RC1: 2.11 seconds

Assuming that benchmark accurately reflects your own application, your code on PHP 5.4 will run in 75% of the time it previously took on 5.3. (PHP 5.5 appears to take about 80% of the 5.3 time, i.e. slightly slower than 5.4; not sure if the slowdown was due to working with a release candidate rather than the final release.)

(PHP 5.5 stops supporting Windows XP and Windows 2003, and adds nothing that caught my interest. PHP 5.4 on the other hand had some interesting features, and now I see a worthwhile speed-up I will target that for new projects.)

Saturday, August 31, 2013

Connecting Android 4.x devices to Linux (including Ubuntu 10.04)

The Problem

Android 2.x is easy: plug 'n' play with Linux (it mounts as USB mass storage).
But in Android 4.x that component was removed, and you have to use "MTP" to connect. And most (all?) Linux distros have no built-in support for MTP.

Since writing this, what I actually do is let the device be detected as a camera. Then I open the "Pictures" folder and just copy everything into there: movies, PDFs, etc., it does not seem to matter. So I've given up organizing stuff intelligently, for the convenience of not having to mess around with the steps below.

The Solution

1. Download go-mtpfs.x86_64 from http://hanwen.home.xs4all.nl/public/software/go-mtpfs/
2. Make it executable ( chmod 777 go-mtpfs.x86_64 )
3. Make a mount point directory. E.g. mkdir nexus7  (may have to be root depending on permissions of the parent directory)
4. Plugin Android 4.x device. Make sure it is using MTP.
5. Mount:  sudo ./go-mtpfs.x86_64 nexus7
6. View:  sudo nautilus nexus7    (you could also use the commandline for your file actions: sudo bash )
7. To stop I did three things: close nautilus, then press ctrl-c to stop the go-mtpfs program, then sudo umount nexus7


That is using a pre-compiled binary, that you are going to run as root. Be afraid, be very afraid.
However it is an open source project and you can compile it yourself:  https://github.com/hanwen/go-mtpfs/
You need to install the "go" language, and probably some other stuff.

I used the pre-compiled binary, the 28 Jun 2013 version; here is its md5sum:
  $ md5sum go-mtpfs.x86_64
  bdc71b5e92dbabfa6d33c03a8a7f0e09  go-mtpfs.x86_64 

(This is meaningless without some trusted source telling us it is correct, as it may already have been hacked; but if you get a different value and it still says 28 Jun 2013, I'd be suspicious...)


This worked on Ubuntu 10.04, doing nothing extra than described above.
It should also work on Ubuntu 12.x

It is slow. A 10MB file in 3.7 seconds seems typical. A 435MB and a 480MB file took 4m55s and 5m43s respectively (copied at the same time). A huge file averaged 2.2MB/sec.

To make sure you are using MTP, go to settings, then Storage, then press the icon with three vertical dots, and choose "USB computer connection". The choices are MTP and PTP. PTP works out of the box with Linux, but only shows you the photographs directory.  That is perfect for plugging in to a printer at your local 7-11 or Fujifilm shop (or local equivalent if you are not in Japan), as it does not expose any other files but your photos. But not useful for anything else.

I suspect you always have to run go-mtpfs as root, but there is probably a way to make the mount point visible to normal users, not just root. For me it was easier to run nautilus as root than to work out how. Comments welcome!

If you want an alternative, or just more background information, this page looks good: http://bernaerts.dyndns.org/linux/74-ubuntu/247-ubuntu-automount-nexus7-mtp

Monday, July 8, 2013

Android and iPhone OS version breakdown

Here are two useful links on the breakdown of actual mobile users for each of Android and iPhone:

The reason I love these stats is that: a) they are meaningful, and b) they are reliable. They are meaningful because they record active users (people who access the Google Play Store and the Apple App Store, respectively), as opposed to being based on sales of handsets, or on stats from some website that is focused on a particular group (e.g. Apple developers!!). And they are reliable because almost all active users will have visited these stores at least once a month, and the survey periods are 14 days, so we're talking a boat-load of people. Big numbers mean reliable stats (all else being equal - see the previous comment about the likely lack of bias).

The big disappointment is both Google and Apple choose to show percentages, not actual numbers. And no breakdown by country of the user. (If you know how to get those stats, please let me know!!!)

I saw a thread on Slashdot saying that these stats show the Android handset market is much more fragmented than Apple's. The stats do in fact show that 94% of people who visit the App Store have upgraded to iOS 6. There is some bias there, as I think they might be strongly encouraged to when they visit the App Store. (Just checked: it constantly reminds you to update, even if you don't visit the App Store.) But, anyway, if the above theory that practically all active users visit the App Store at least once a month is true, then it is useful information for a developer (i.e. testing on iOS 6 should be sufficient).

Android users, on the other hand, get upgraded by their network provider? We still need to test on each of Android 2.3, 4.0 and 4.1.

But even more important for a developer, if you want to do any animation or special effects, is knowing the hardware power of the devices. The Android 2.3 stats suggest that about a third of users are still on relatively old, single-core hardware, and that the other two-thirds of users probably have a bit more firepower. For Apple users I've no idea of the breakdown. (Again, information donations are very welcome!)

Thursday, July 4, 2013

Some updating line charts in d3

There are so many lovely d3 demos... but I think this might be due more to the creativity and stubbornness of the people who have chosen to use d3 than to some innate quality of the library.
(d3 is a charting library that claims it is not a charting library; instead it likes to see itself as an alternative to jQuery, I think.)

As a case in point, there is this page showing three beautiful examples:

In particular, scroll down to the bottom: it tracks how much scrolling you've done on the page, with time on the x-axis. But the script for this cleverness is about 110 lines, and it is quite opaque. To put that comment in context: I've read the O'Reilly d3 book cover-to-cover, then played with every d3 example in the book.

By the way, another example of scrolling charts, this one with multiple charts overlaid: http://lyngbaek.com/real-time-stream-graph.html

Here is a similar chart done in Rickshaw (which is another library built on top of d3): http://code.shutterstock.com/rickshaw/examples/fixed.html

And another example done in Rickshaw, that is quite heavy-duty: http://code.shutterstock.com/rickshaw/examples/extensions.html

Sunday, June 16, 2013

RUnit and callback functions and scope

I have a function (called something() in the below example code) that takes a callback function as a parameter. To be flexible it can take the callback as a Function object, or as a character string, or as a list (which keeps the function name and the parameters together).

But I hit a problem when trying to write a unit test for something(). It works with a global function as the callback (such as mean in the below code), but not for a locally defined function (thingy in the below code):
thingy=function(x){ mean(x) }


something( list("mean",x) )
something("thingy",x)    #Fails
something( list("thingy",x) )   #Fails
The problem appears to be that functions in a test file are all loaded into a special environment. And, when I thought about it, that makes sense: unit test functions are supposed to be small and independent, and I don't want to be defining a function at the same level as the unit tests.

I tried a few things, but here is the one that worked:
assign("thingy",function(x){ mean(x) }, envir=globalenv())

something( list("mean",x) )
something( list("thingy",x) )

remove("thingy", envir=globalenv())
So, we explicitly put our thingy() in the global environment, where everyone can find it and make use of it. And we define it inside the test function, no earlier. Removing it at the end is optional, but Murphy's Law states that if we don't then it will end up clashing with something, late on a Friday afternoon, in six months' time, when we've forgotten about this code.

Monday, April 8, 2013

PhantomJS: POST, auth and timeouts

I recently discovered PhantomJS, which (for me) is an alternative to Selenium, with two key differences that (again, for me) make it very useful:
  1. It is headless, meaning you don't see anything graphical. That means I can run it on a server, from the commandline, without needing X installed. It also means it causes less load.
  2. It embeds webkit, rather than attempting to interface with many browsers, and control them as a user would.
The second point allows me to POST to a URL, which is great for testing how web services work in a real browser. Selenium refuses to offer this because it is not something a user can do with a browser. (The workaround in Selenium was to make a temporary page that uses an AJAX call to POST to the URL, then does something with what is returned.)

There are two things that PhantomJS makes difficult, which I will show techniques for here. The first is that authorization is kind-of-broken. The second is timeouts for requests that never finish (e.g. an http streaming web service). But, first, the basic example, without auth or timeouts, and using GET:
var page=require('webpage').create();
var callback=function(status){
    if (status=='success')console.log(page.content);  //Or whatever processing you want
    else console.log('Failed to load.');
    phantom.exit();
    };
var url="http://example.com/something?name=value";
page.open(url,callback);
Now here is the same code, with basic auth and a five-second time-out added:
var page=require('webpage').create();
page.customHeaders={'Authorization': 'Basic '+btoa('username:password')};
var callback=function(status){
    window.clearTimeout(timer);
    if (status=='success' || status=='timedout')console.log(page.content);
    else console.log('Failed to load.');
    phantom.exit();
    };
var timer=window.setTimeout(callback,5000,'timedout');
var url="http://example.com/something?name=value";
page.open(url,callback);
Don't use page.settings.userName = 'username'; page.settings.password = 'password'; because it has a bug as of PhantomJS 1.9.0 (it uses two connections for GET requests, and doesn't work at all for POST requests). Instead make your own basic auth header, as shown here (thanks to Igor Semenko, on the PhantomJS mailing list, for this trick).

For the time-out code I still call the same callback, but pass a status of "timedout" instead of "success" (so the callback could react differently, if timedout was a bad thing - here I treat them the same). So, if the URL finishes loading within 5000ms, then callback is called (by the page.open() call) with status equal to "success". If it has not finished within 5000ms then callback is called (by the javascript timer), with status equal to "timedout".

I explicitly clear the timer immediately when entering callback(). This is not really necessary, as we're about to shutdown (the phantom.exit() call) anyway. But it feels safer because otherwise callback() might be called twice (i.e. if the page loaded in exactly 5000ms); the more computation being done in callback(), especially if asynchronous, the more this might occur. (Well to be precise: that catches the case when page loads in just under 5000ms and triggers the callback before the timer does. But, if the timer gets in first, and then the page loads in just over 5000ms, and callback computation takes a while, then we may still get two calls. I think calling page.close() in callback() might prevent this, but that is untested.)
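If you want to make the double call impossible rather than just unlikely, a once-only wrapper does it (this is my own addition, not part of the scripts above):

```javascript
// Whichever of the timer and page.open() fires first wins; any later
// call is silently ignored, so the shutdown logic runs exactly once.
function once(fn) {
    var done = false;
    return function(status){
        if (done) return;
        done = true;
        fn(status);
    };
}
// Usage: wrap the callback before handing it to setTimeout/page.open:
// var callback = once(function(status){ /* ... */ phantom.exit(); });
var demo = once(function(status){ console.log("ran with: " + status); });
demo("success");
demo("timedout");  // ignored
```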

Finally, here is the same code using POST instead of GET:
var page=require('webpage').create();
page.customHeaders={'Authorization': 'Basic '+btoa('username:password')};
var callback=function(status){
    window.clearTimeout(timer);
    if (status=='success' || status=='timedout')console.log(page.content);
    else console.log('Failed to load.');
    phantom.exit();
    };
var timer=window.setTimeout(callback,5000,'timedout');
var url="http://example.com/something";
var data="name=value";
page.open(url,'post',data,callback);
The differences are the extra data variable, and passing 'post' and the data into page.open(). It couldn't be easier!

Monday, February 4, 2013

Making sense of vfsStream

vfsStream is a very cool PHP library that abstracts the file system. Its primary use case is in unit tests, as a way to mock file system activity, and it integrates nicely with PHPUnit, even being mentioned in the PHPUnit manual.

Sadly the documentation is a bit lacking, so this article will try to provide some middle ground between the unexplained usage examples you'll find in a few places, and the dry API docs. (In fact, skip the API docs completely: the source is more understandable.)

I'm going to show this completely outside PHPUnit, as that clouds the issues. Let's start with the include we need:
   require_once 'vfsStream/vfsStream.php';

(Aside: all tests here are being done with version 0.12.0, installed using pear.)

Here is our minimal example:

   vfsStream::setup('logs');
   file_put_contents('vfs://logs/test.log',"Hello");

The first line creates our filesystem.
The second line creates a file called "test.log", and puts a string inside it.

There is something I want to emphasize here, as it confused me no end. We have not created a directory called "logs". We have created a virtual file system called "logs". We have no sub-directories at all. test.log lives in the root of the filesystem.

Here is our troubleshooting tool:
   require_once 'vfsStream/visitor/vfsStreamStructureVisitor.php';
   print_r(vfsStream::inspect(new vfsStreamStructureVisitor())->getStructure());

It outputs:

    Array
    (
        [logs] => Array
            (
                [test.log] => Hello
            )

    )

(Yes, I know it still looks like logs is a directory name there too. It isn't.)

Another way to troubleshoot this is using PHP functions directly:

    $dir=dir("vfs://logs");
    while(($entry=$dir->read())!==false)echo $entry."\n";

This outputs:

    test.log
Note: you cannot use "vfs://" as a url to get the root. This is like trying to access the root of a website with "http://". You need to use "http://example.com/" and you need to use "vfs://logs" for a virtual filesystem. As I said, I found this confusing, so I prefer to rewrite my above examples as follows (this also shows the complete code):

    require_once 'vfsStream/vfsStream.php';
    require_once 'vfsStream/visitor/vfsStreamStructureVisitor.php';
    vfsStream::setup('_virtualroot_');
    mkdir('vfs://_virtualroot_/logs');
    file_put_contents('vfs://_virtualroot_/logs/test.log',"Hello");
    print_r(vfsStream::inspect(new vfsStreamStructureVisitor())->getStructure());
    $dir=dir('vfs://_virtualroot_/logs');
    while(($entry=$dir->read())!==false)echo $entry."\n";


This time we do have a directory called "logs", which is in the root of a virtual file system called "_virtualroot_". You may hate that approach for its horrible verbosity. Me? I like it.

A few random other points about vfsStream:
  • chdir("vfs://logs") does not work. The Known Issues page lists a few more.
  • vfsStream::url("logs/test.log") simply returns "vfs://logs/test.log".
  • fopen($fname,"at") fails. The "t" is not supported. You have to change your code to use "a" or "ab" (aside: "ab" is better, as "a" could work differently on different systems).
  • Calling vfsStream::setup  a second time removes the previous filesystem, even if you use a different virtual filesystem name.
  • I've not worked out how to use the git version. It may be that my above examples do not work with the very latest version. If I work it out, and that is the case, I'll post additional examples.