Showing posts with label testing. Show all posts

Friday, March 31, 2017

The Seven Day A Year Bug

I’ll cut straight to the chase: when you use d.setMonth(m - 1) in JavaScript, always set the optional second parameter.

What’s that, you didn’t know there was one? Neither did I until earlier today. It allows you to set the date. Cute, I thought at the time, minor time-saver, but hardly worth complicating an API for.

Ooh, how wrong I was. Let me take you back to when it happened. Friday, March 31st….

After a long coding session, I did a check-in. And then ran all unit tests. That is strange, three failing, but in code I hadn’t touched all day. I peer at the code, but it looks correct - it was to do with dates, specifically months, and I was correctly subtracting 1.

Aside: JavaScript dates inherit C’s approach of counting months from 0. In the first draft of this blog post I used a more judgemental phrase than “approach”. But to be fair, it was a 1970s design decision, and the world was different back then. Google “1970s men fashion”.

So, back to the test failures. I start using “git checkout xxxx” to go back to earlier versions, to see exactly when it broke. Running all tests every time. I know something fishy is going on, by the time I’ve gone back 10 days and the tests still fail. I am fairly sure I ran all tests yesterday, and I am certain it hasn’t been 10 days.

Timezones?! Unlikely, but we did put the clocks forward last weekend. But a quick test refutes that. (TZ=XXX mocha . will run your unit tests in timezone XXX.)

So, out of ideas, I litter the failing code with console.log lines, to find out what is going on.

Here is what is happening. I initialize a Date object to the current date (to set the current year), then call setMonth(). I don’t use the day, so don’t explicitly set it. I was calling setMonth(8), expecting to see “September”, but the unit test was being given “October”. Where it gets interesting is that the default date today is March 31st. In other words, when I set month to September the date object becomes “September 31st”, which isn’t allowed. So it automatically changes it to October 1st.

You hopefully see where the title of this piece comes from now? If I was setting a date in February I would have discovered the bug two days earlier, and if my unit test had chosen October instead of September, the bug would never have been detected. If I’d thought, “ah, I’ll run them Monday”, the bug would not have been discovered until someone used the code in production on May 31st. I’d have processed their bug report on June 1st and told them, “can’t reproduce it”. And they’d have gone, “Oh, you’re right, neither can I now.”
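To make the "seven days" concrete, here is a rough sketch that walks through every day of a year and counts the days on which the one-argument setMonth() call would land in the wrong month. The name countRolloverDays is invented for this illustration, and using noon rather than midnight is an assumption made to stay clear of daylight-saving oddities.

```javascript
// Count the days in `year` on which d.setMonth(targetMonth), called on
// "today", ends up in a different month. targetMonth is 0-based (8 = September).
// countRolloverDays is a hypothetical helper, not part of any library.
function countRolloverDays(year, targetMonth) {
  var count = 0;
  // Walk every day of the year at noon local time.
  for (var d = new Date(year, 0, 1, 12); d.getFullYear() === year; d.setDate(d.getDate() + 1)) {
    var probe = new Date(d.getTime());
    probe.setMonth(targetMonth);          // the buggy call: day of month is left as-is
    if (probe.getMonth() !== targetMonth) count++;
  }
  return count;
}

console.log(countRolloverDays(2017, 8));  // September: 7 (every 31st of the year)
console.log(countRolloverDays(2017, 9));  // October: 0 (it has 31 days, so never)
console.log(countRolloverDays(2017, 1));  // February: 29 (every 29th, 30th and 31st)
```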

To conclude with a happy ending, I changed all occurrences of d.setMonth(m - 1) into d.setMonth(m-1, 1), and the test failures all went away. I also changed all occurrences of d.setMonth(m-1);d.setDate(v) (where v is the day of the month) into: d.setMonth(m-1, v) not because it is shorter and I can impress people with my knowledge of JavaScript API calls, but because two separate calls was a bug that I simply didn’t have a unit test for.
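As a small sketch of the before and after, with "today" pinned to March 31st, 2017 so it behaves the same whenever you run it. The function names buggySetMonth and fixedSetMonth are made up for this example; m is the human month number (1 = January), as in the text above.

```javascript
var MONTHS = ['January', 'February', 'March', 'April', 'May', 'June', 'July',
  'August', 'September', 'October', 'November', 'December'];

// Hypothetical helper: the original call, with the day inherited from "today".
function buggySetMonth(today, m) {
  var d = new Date(today.getTime());
  d.setMonth(m - 1);
  return MONTHS[d.getMonth()];
}

// Hypothetical helper: the fix, with the second parameter pinning the day to the 1st.
function fixedSetMonth(today, m) {
  var d = new Date(today.getTime());
  d.setMonth(m - 1, 1);
  return MONTHS[d.getMonth()];
}

var today = new Date(2017, 2, 31, 12);   // March 31st, 2017, noon local time
console.log(buggySetMonth(today, 9));    // October
console.log(fixedSetMonth(today, 9));    // September
```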

But writing that unit test can wait until Monday.

Wednesday, June 25, 2014

Casper, d3, jquery and clicking

I spent (wasted? invested?) an awful lot of time yesterday on trying to use CasperJS to click a link inside an SVG diagram that had been made by d3.js. I’ll start this article by showing my Foolish Mistake, but then will document what I did learn.

Foolishness

Normally, in CasperJS, we write this.click('#myButton'), where “#myButton” can be just about any CSS selector.
Aside: it is normally this.click() rather than casper.click() because it is normally in the handler function of a casper.waitXXX() function.
This wasn’t working for me trying to click an SVG <g> tag, that starts a search running. I was taking a screenshot 0.5s later to check it had worked, and it showed the page had not changed.
And that turned out to be foolish. The click was working, the search was working, d3 and SVG had nothing to do with anything. The problem was simply that no search results were found, and so it never moved to the next page. When I changed the search string, suddenly click() was working.
So my troubleshooting turned out to be barking up the wrong tree. But, still, I did learn a few things from it.

Using evaluate() With jQuery

Instead of calling this.click() you can do something like this:
this.evaluate(function(){
  $('#myButton').click();
  });
This runs JavaScript from within the browser’s context. In this case I use jQuery. This works just as well as using Casper’s click() outside the evaluate().
Here there is no advantage. But, by being in a different scope, we have extra flexibility: we could call other functions, or add new event handlers, etc, etc.

Using evaluate() With d3

Here was another of my attempts, but this one does not work:
this.evaluate(function(){
  d3.select('#myButton').click();
  });
The reason: a d3 selection does not have a ready-made click() function.

Making events happen

this.evaluate(function(){
  var evt = document.createEvent("MouseEvents");
  evt.initMouseEvent("click", true, true, window,
    0, 0, 0, 0, 0, false, false, false, false, 0, null);
  return d3.select('#myButton').node().dispatchEvent(evt);
  });
This is how you do a click with d3 (you could use this approach with jQuery too (see a helper function), but are unlikely to ever need to). The first lines create a (simulated) mouse click. The final line sends that event to the DOM item of interest.
Aside: dispatchEvent() returns false if any of the event handlers cancelled it, true otherwise.
I learned this here; that answer also says this should have worked:
this.evaluate(function(){
  d3.select('#myButton').on("click")();
  });
This definitely does not work for me. Why? It is actually a cheat, trying to find and call the click handler for the button. In my case the click handler was attached to the parent object (a <g>) not the object I was calling select on. (Also, because it is a cheat, this approach does not work when you’ve attached multiple handlers.)
By the way, if not using jQuery or d3, you can use querySelector() and do it this way:
this.evaluate(function(){
  var evt = document.createEvent("MouseEvents");
  evt.initMouseEvent("click", true, true, window,
    0, 0, 0, 0, 0, false, false, false, false, 0, null);
  return document.querySelector('#myButton').dispatchEvent(evt);
  });

Summary

The real lesson here, for me, was when something doesn’t work, make sure you are judging success in the right way!
Beyond that, it turns out there are a whole host of ways to click a button in a CasperJS script. Use the simplest when you can, bear the others in mind for special occasions.

Sunday, March 2, 2014

Clearing cookies between CasperJS tests

I spent absolutely ages searching for how to do this, and it turns out to be really easy: phantom.clearCookies();

One page I found suggested casper.clear() would do it. I'm not fully sure what it does clear, but cookies are not covered.

In context here is what it looks like:

  var username = ..., password = ...;

  (function f(label,url){
    casper.test.begin(label, {
      test:function(test){ runTheTests(test,label,url,username,password); }
    });
  })("first", "http://127.0.0.1/first");

  (function f(label,url){
    casper.test.begin(label, {
      test:function(test){ runTheTests(test,label,url,username,password); }
    });
  })("second", "http://127.0.0.1/second");

  function runTheTests(test,label,url,username,password){
    casper.start();
    casper.clear();
    phantom.clearCookies();

    casper.thenOpen(url,function(){
      console.log('Loaded:'+url);
    });

    ...Other actions and tests here...

    casper.then(function(){
      test.done();
    });

    casper.run();
  }

I.e. the first test does a successful login. The second test was supposed to then see a different login page, but in fact it was still logged in. Adding phantom.clearCookies() did the trick.


Tuesday, October 15, 2013

SlimerJS: getting it to work with self-signed HTTPS

SlimerJS (as of 0.8.3) lacks the commandline options of PhantomJS to say "relax about bad certificates". Unfortunately the self-signed SSL certificates that developers typically use during development count as bad certificates.

Here are the steps needed to handle this:

1. slimerjs --createprofile AllowSSL
  Make a note of the directory it has created.
  (You can call your new profile anything, "AllowSSL" is just for example.)

2. Go to normal desktop Firefox, browse to the URL in question, see the complaint, add it as a security exception.
  Chances are, if you have been testing your website already, that you've already done this and you can skip this step.

3. Go to your Firefox profile, and look for the file called "cert_override.txt". Copy that to the directory you created in step 1.

4. Have a look at the copy you just made of "cert_override.txt".
  If it only has the entry you added in step 2, you are done.
  Otherwise, remove the entries you don't want.
  (The file format is easy: one certificate per line.)

5. Now when you need to run slimerjs you must run it with the "-P AllowSSL" commandline parameter (i.e. the profile name you chose in step 1).
  E.g. "slimerjs -P AllowSSL httpstest.js"

If you are using SlimerJS with CasperJS (requires CasperJS 1.1 or later), do the same, e.g.
   casperjs test --engine=slimerjs -P AllowSSL tests_involving_https.js


Sunday, June 16, 2013

RUnit and callback functions and scope

I have a function (called something() in the below example code) that takes a callback function as a parameter. To be flexible it can take the callback as a Function object, or as a character string, or as a list (which keeps the function name and the parameters together).

But I hit a problem when trying to write a unit test for something(). It works with a global function as the callback (such as mean in the below code), but not for a locally defined function (thingy in the below code):
thingy=function(x){ mean(x) }

test.something=function(){
x=1:10

something(mean,x)
something("mean",x)
something( list("mean",x) )
something(thingy,x)
something("thingy",x)    #Fails
something( list("thingy",x) )   #Fails
}
The problem appears to be that functions in a test file are all loaded in a special environment. And, when I thought about it, that makes sense: unit test functions are supposed to be small and independent, and I don't want to be defining a function at the same level as the unit tests.

I tried a few things, but here is the one that worked:
test.something=function(){ 
assign("thingy",function(x){ mean(x) }, envir=globalenv())
 
x=1:10

something(mean,x)
something("mean",x)
something( list("mean",x) )
something(thingy,x)
something("thingy",x)
something( list("thingy",x) )

remove("thingy", envir=globalenv())
}
So, we explicitly put our thingy() in the global environment, where everyone can find it and make use of it. And we define it inside the test function, no earlier. Removing it at the end is optional, but Murphy's Law states that if we don't then it will end up clashing with something. Late on a Friday afternoon, in six months' time, when we've forgotten about this code.

Monday, April 8, 2013

PhantomJS: POST, auth and timeouts

I recently discovered PhantomJS, which (for me) is an alternative to Selenium, with two key differences that (again, for me) make it very useful:
  1. It is headless, meaning you don't see anything graphical. That means I can run it on a server, from the commandline, without needing X installed. It also means it causes less load.
  2. It embeds webkit, rather than attempting to interface with many browsers, and control them as a user would.
The second point allows me to POST to a URL, which is great for testing how web services work in a real browser. Selenium refuses to offer this because it is not something a user can do with a browser. (The workaround in Selenium was to make a temporary page that uses an AJAX call to POST to the URL, then does something with what is returned.)

There are two things that PhantomJS makes difficult, which I will show techniques for here. The first is that authorization is kind-of-broken. The second is timeouts for requests that never finish (e.g. an http streaming web service). But, first, the basic example, without auth or timeouts, and using GET:
var page=require('webpage').create();
var callback=function(status){
    if (status=='success'){
        console.log(page.plainText);
        }else{
        console.log('Failed to load.');
        }
    phantom.exit();
    };
var url="http://example.com/something?name=value";
page.open(url,callback);
Now here is the same code with basic auth (the page.customHeaders line), and a five second time-out (the clearTimeout() line, the extra 'timedout' status check, and the setTimeout() line):
var page=require('webpage').create();
page.customHeaders={'Authorization': 'Basic '+btoa('username:password')};
var callback=function(status){
    if(timer)window.clearTimeout(timer);
    if (status=='success' || status=='timedout') {
        console.log(page.plainText);
        }else{
        console.log('Failed to load.');
        }
    phantom.exit();
    };
var timer=window.setTimeout(callback,5000,'timedout');
var url="http://example.com/something?name=value";
page.open(url,callback);
Don't use page.settings.userName = 'username';page.settings.password = 'password'; because it has a bug as of PhantomJS 1.9.0 (it uses two connections for GET requests and doesn't work at all for POST requests). Instead make your own basic auth header as shown here (thanks to Igor Semenko, on the PhantomJS mailing list for this trick).
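For reference, the header value is nothing more than the word "Basic", a space, and the base64 encoding of "username:password". A minimal sketch follows; basicAuthHeader is a hypothetical helper name (not part of PhantomJS), and the fallback to Node's Buffer is only there, as an assumption, so the snippet also runs where btoa() is not defined.

```javascript
// Build the value of an HTTP Basic "Authorization" header.
// basicAuthHeader is a hypothetical helper, not a PhantomJS function.
function basicAuthHeader(username, password) {
  var raw = username + ':' + password;
  // Use btoa() where it exists (as in the script above); otherwise
  // fall back to Node's Buffer (an assumption for running outside a browser).
  var encoded = (typeof btoa === 'function')
    ? btoa(raw)
    : Buffer.from(raw, 'binary').toString('base64');
  return 'Basic ' + encoded;
}

console.log(basicAuthHeader('username', 'password'));
// Basic dXNlcm5hbWU6cGFzc3dvcmQ=
```

The result is what would be assigned to the 'Authorization' key of page.customHeaders.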

For the time-out code I still call the same callback, but pass a status of "timedout" instead of "success" (so the callback could react differently, if timedout was a bad thing - here I treat them the same). So, if the URL finishes loading within 5000ms, then callback is called (by the page.open() call) with status equal to "success". If it has not finished within 5000ms then callback is called (by the javascript timer), with status equal to "timedout".

I explicitly clear the timer immediately when entering callback(). This is not really necessary, as we're about to shutdown (the phantom.exit() call) anyway. But it feels safer because otherwise callback() might be called twice (i.e. if the page loaded in exactly 5000ms); the more computation being done in callback(), especially if asynchronous, the more this might occur. (Well to be precise: that catches the case when page loads in just under 5000ms and triggers the callback before the timer does. But, if the timer gets in first, and then the page loads in just over 5000ms, and callback computation takes a while, then we may still get two calls. I think calling page.close() in callback() might prevent this, but that is untested.)
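One possible way to rule out the double call altogether is to wrap the callback so that only its first invocation does anything. This is a sketch only: once is a hypothetical helper written for this example (it is not part of PhantomJS), and I have not tried it against a real streaming page.

```javascript
// Wrap a function so that only the first call runs it; later calls are ignored.
// "once" is a hypothetical helper, not part of PhantomJS.
function once(fn) {
  var called = false;
  return function () {
    if (called) return undefined;
    called = true;
    return fn.apply(this, arguments);
  };
}

// Simulate the race: the timer wins, then the page load reports in late.
var calls = [];
var callback = once(function (status) {
  calls.push(status);
});

callback('timedout');  // the timer fires first
callback('success');   // the late page load is ignored
console.log(calls);    // [ 'timedout' ]
```

In the scripts above, the same wrapped function would be handed to both window.setTimeout() and page.open().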

Finally, here is the same code using POST instead of GET:
var page=require('webpage').create();
page.customHeaders={'Authorization': 'Basic '+btoa('username:password')};
var callback=function(status){
    if(timer)window.clearTimeout(timer);
    if (status=='success' || status=='timedout') {
        console.log(page.plainText);
        }else{
        console.log('Failed to load.');
        }
    phantom.exit();
    };
var timer=window.setTimeout(callback,5000,'timedout');
var url="http://example.com/something";
var data="name=value";
page.open(url,'post',data,callback);
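The differences are all in the last three lines: the query string moves out of the URL and into data, and page.open() is given 'post' and data as extra parameters. It couldn't be easier!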

Monday, February 4, 2013

Making sense of vfsStream

vfsStream is a very cool PHP library that abstracts the file system. Its primary use case is in unit tests, as a way to mock file system activity, and it integrates nicely with PHPUnit, even being mentioned in the PHPUnit manual.

Sadly the documentation is a bit lacking, so this article will try to provide some middle ground between the unexplained usage example you'll find in a few places and the dry API docs. (In fact skip the API docs completely - the source is more understandable.)

I'm going to show this completely outside PHPUnit, as that clouds the issues. Let's start with the include we need:
   require_once 'vfsStream/vfsStream.php';

(Aside: all tests here are being done with version 0.12.0, installed using pear.)

Here is our minimal example:

   vfsStream::setup('logs');
   file_put_contents(vfsStream::url('logs/test.log'),"Hello\n");


The first line creates our filesystem.
The second line creates a file called "test.log", and puts a string inside it.

There is something I want to emphasize here, as it confused me no end. We have not created a directory called "logs". We have created a virtual file system called "logs". We have no sub-directories at all. test.log lives in the root of the filesystem.

Here is our troubleshooting tool:
   require_once 'vfsStream/visitor/vfsStreamStructureVisitor.php';
   print_r(vfsStream::inspect(new vfsStreamStructureVisitor())->getStructure());

It outputs:
    Array
    (
        [logs] => Array
            (
                [test.log] => Hello
            )
    )


(Yes, I know it still looks like logs is a directory name there too. It isn't.)

Another way to troubleshoot this is using PHP functions directly:
    $dir=dir("vfs://logs");
    while(($entry=$dir->read())!==false)echo $entry."\n";


This outputs:
    test.log

Note: you cannot use "vfs://" as a url to get the root. This is like trying to access the root of a website with "http://". You need to use "http://example.com/" and you need to use "vfs://logs" for a virtual filesystem. As I said, I found this confusing, so I prefer to rewrite my above examples as follows (this also shows the complete code):

   
    <?php
    require_once 'vfsStream/vfsStream.php';
    require_once 'vfsStream/visitor/vfsStreamStructureVisitor.php';
    vfsStream::setup('_virtualroot_',null,array('logs'=>array()));
    file_put_contents(vfsStream::url('_virtualroot_/logs/test.log'),"Hello\n");
    print_r(vfsStream::inspect(new vfsStreamStructureVisitor())->getStructure());
    $dir=dir("vfs://_virtualroot_");
    while(($entry=$dir->read())!==false)echo $entry."\n";
    ?>

This time we do have a directory called "logs", which is in the root of a virtual file system called "_virtualroot_". You may hate that approach for its horrible verbosity. Me? I like it.

A few random other points about vfsStream:
  • chdir("vfs://logs") does not work. The Known Issues page lists a few more.
  • vfsStream::url("logs/test.log") simply returns "vfs://logs/test.log".
  • fopen($fname,"at") fails. The "t" is not supported. You have to change your code to use "a" or "ab" (aside: "ab" is better, as "a" could work differently on different systems).
  • Calling vfsStream::setup a second time removes the previous filesystem, even if you use a different virtual filesystem name.
  • I've not worked out how to use the git version. It may be that my above examples do not work with the very latest version. If I work it out, and that is the case, I'll post additional examples.

Sunday, July 1, 2012

Mock The Socket, in PHP

I wanted to put a unit test around some PHP code that uses a socket, and of course hit a problem: how do I control what a call to fgets returns? You see, in PHP, you cannot replace one of the built-in functions: you get told "Fatal error: Cannot redeclare fgets() ...".

Rename Then Override Then Rename Again!


I asked on StackOverflow, not expecting much response, but almost immediately got told about rename_function(). Wow! I'd never heard of that before. The challenge then was that this is in the apd extension, which was last released in 2004 and does not support PHP 5.3. I've put instructions on how to get it installed on the StackOverflow question so I won't repeat them here.

The next challenge you'll meet is that naively using rename_function to move a function out of the way fails. You still get told "Fatal error: Cannot redeclare fgets() ..." ?! You need to use override_function to replace its behaviour. All Done? Not quite, what you discover next is that you can only override one function. Eh?! But all is not lost: the comments in the marvellous PHP manual described the solution, which goes like:
  1. Use override_function to define the new behaviour
  2. Use rename_function to give a better name to the old, original function.
However, when you go to restore a function (see further down), it turns out that does not work. What you actually need to do is:
  1. Use rename_function to give a name to the old, original function, so we can find it later.
  2. Use override_function to define the new behaviour
  3. Use rename_function to give a dummy name to __overridden__
You do those three steps for each function you want to replace. Here is a complete example that shows how to override fgets and feof to return strings from a global array. NOTE: this is a simplistic example; I should really be overriding fopen and fclose too (they'd be set to do nothing).


$GLOBALS['fgets_strings']=array(
    "Line 1",
    "Line 2",
    "Line 3",
    );

rename_function('fgets','real_fgets');
override_function('fgets','$handle,$length=null','return array_shift($GLOBALS["fgets_strings"])."\n";');
rename_function("__overridden__", 'dummy_fgets'); 

rename_function('feof','real_feof');
override_function('feof','$handle','return count($GLOBALS["fgets_strings"])==0;');
rename_function("__overridden__", 'dummy_feof');

$fname="rename_test.php";
$fp=fopen($fname,"r");

if($fp)while(!feof($fp)){
    echo fgets($fp);
    }
fclose($fp);


Mock The Sock(et)


So, what about the original challenge, to unit test a socket function? Here is some very minimal code to request a page from a web server; let's pretend we want to test this code:

$fp=fsockopen('127.0.0.1',80);

fwrite($fp,"GET / HTTP/1.1\r\n");
fwrite($fp,"Host: 127.0.0.1\r\n");
fwrite($fp,"\r\n");

if($fp)while(!feof($fp)){
    echo fgets($fp);
    }

fclose($fp);

To take control of its behaviour, we prepend the following block; the above code does not have to be touched at all.

rename_function('fwrite','real_fwrite');
override_function('fwrite','$fp,$s','');
rename_function("__overridden__", 'dummy_fwrite');

rename_function('fsockopen','real_fsockopen');
override_function('fsockopen',
    '$hostname,$port=-1,&$errno=null,&$errstr=null,$timeout=null',
    'return fopen("socket_mock_contents.txt","r");'
    );
rename_function("__overridden__", 'dummy_fsockopen');

I.e. We replace the call to fsockopen with a call to fopen, and tell it to read our special file. (If you don't want to use an external file, and instead want a fully self-contained test, with the contents in a string, you could use phpstringstream or, if you don't want Yet Another Dependency, you could write your own, as the code is fairly short and straightforward.)

The other thing to note about the above code is that we had to replace fwrite as well. This is needed because we're creating a read-only stream to be the stand-in for a read-write stream. If you are using other functions (e.g. ftell or stream_set_blocking) you will need to consider if those functions need a mock version too.


The Fly In The Ointment

The thing about replacing a global function is that you're replacing a global function, as in, globally! Any other code that calls that function is going to call your mock version. Maybe we can get away with this with fsockopen, but it becomes quite a major problem if you are replacing things like fgets or fwrite, in a PHPUnit unit test, as PHPUnit is quite likely to call those functions itself!

So, we want to restore the functions once we're done with them? It is very easy to get this wrong and get a segfault. You must use rename_function before using override_function, the first time. The following code has been tested and restores the behaviour of the above example:

override_function('fwrite','$fp,$s','return real_fwrite($fp,$s);');
rename_function("__overridden__", 'dummy2_fwrite');

override_function('fsockopen','$hostname,$port','return real_fsockopen($hostname,$port);');
rename_function("__overridden__", 'dummy2_fsockopen');

Notice how we still need to deal with __overridden__ each time.

Food For Thought

There is another approach, which might be more robust. It involves checking inside each overridden function if this is the stream you want to be falsifying data for, and whenever it isn't you call the original versions. Here I'll show how to just do that with the fwrite function:

$GLOBALS["mock_fp"]=null;

rename_function('fwrite','real_fwrite');
override_function('fwrite','$fp,$s','
if($fp==$GLOBALS["mock_fp"]){echo "Skipping a fwrite.\n";return;}   //Do nothing
return real_fwrite($fp,$s);
');
rename_function("__overridden__", 'dummy_fwrite');

rename_function('fsockopen','real_fsockopen');
override_function('fsockopen','$hostname,$port=-1,&$errno=null,&$errstr=null,$timeout=null',
    'return $GLOBALS["mock_fp"]=fopen("socket_mock_contents.txt","r");');
rename_function("__overridden__", 'dummy_fsockopen');


Then we can test it, as follows:
$fp=fsockopen('127.0.0.1',80);

fwrite($fp,"GET / HTTP/1.1\r\n");
fwrite($fp,"Host: 127.0.0.1\r\n");
fwrite($fp,"\r\n");

if($fp)while(!feof($fp)){
    echo fgets($fp);
    }

fclose($fp);

$fp=fopen("tmp.txt","w");
fwrite($fp,"My output\n");
fclose($fp);

Monday, May 28, 2012

php-webdriver bindings for selenium: how to add time-outs

Not all webpages finish loading. In particular I've a page that keeps streaming data back to the client, and never finishes. (For instance it might be used from an ajax call.) I want to test this from Selenium, but have been hitting problems. The main problem is Selenium's get() function, which is used to fetch a fresh URL, does not return until the page has finished loading [1]. In my case that meant never, and so my test script locked up!

However all is not lost; you can specify a page load timeout. It is hidden in the protocol docs, but I've added it to the php webdriver library I use (v0.9). See the three functions below [2]; just paste them in to the bottom of WebDriver.php.

I also needed one bug fix in WebDriver.php's public function get($url). It currently ends with:
    $response=curl_exec($session);

Just after that line you should add this:
    return $this->extractValueFromJsonResponse($response);


The time-out, and that bug fix, can be used like this:

require_once "/usr/local/src/selenium/php-webdriver-bindings-0.9.0/phpwebdriver/WebDriver.php";
$webdriver = new WebDriver("localhost", "4444");
$webdriver->connect("firefox");
$webdriver->setPageLoadTimeout(2000);   //2 seconds
$url="http://example.com/forever.php"; //A page that never finishes loading
$obj=$webdriver->get($url);
if($obj===null){
    $current_url=$webdriver->getCurrentUrl();
    if(!$current_url){
        //Selenium-server not running
        }
    else{
        //It worked! (it completed loading in under two seconds)
        }
    }
elseif($obj->class=='org.openqa.selenium.TimeoutException'){
    //It timed out
    }
elseif($obj->class=='org.openqa.selenium.remote.UnreachableBrowserException'){
    //Browser was closed (or selenium-server was shutdown)
    }
else{
    echo "FAILED:";print_r($obj);
    }

This is useful stuff. There is still one problem left for me: I wanted to load two seconds' worth of data and then look at it. But I cannot. The browser refuses to listen to selenium while it is loading a page! So though get() returned control to my script after two seconds, I cannot do anything with that control (except close the browser window), because the URL is still actually loading. And it will do that forever!! (I've played with an interesting alternative approach, which also fails, but suggests that a solution is possible. But that is out of the scope of this post, which is to show how to add the time limit functions to php-webdriver-bindings.)

[1]: This is browser-specific behaviour, not by Selenium design. Firefox and Chrome, at least, behave this way.


[2]: Consider this code released, with no warranty, under the MIT license, and permission granted to use in the php-webdriver-bindings project with no attribution required.

    /**
     * Set wait for a page to load.
     *
     * This timeout is for the get() function. (Firefox and Chrome, at least, won't return from get()
     * until a page is fully loaded.  If remote server is streaming content, they would never return
     * without this time-out.)
     *
     * @param Number $timeout Number of milliseconds to wait.
     * @author Darren Cook, 2012
     * @internal http://code.google.com/p/selenium/wiki/JsonWireProtocol#/session/:sessionId/timeouts
     */
    public function setPageLoadTimeout($timeout) {
        $request = $this->requestURL . "/timeouts";       
        $session = $this->curlInit($request);
        $args = array('type'=>'page load', 'ms' => $timeout);
        $jsonData = json_encode($args);
        $this->preparePOST($session, $jsonData);
        curl_exec($session);       
    }

    /**
     * Set wait for a script to finish.
     *
     * @param Number $timeout Number of milliseconds to wait.
     * @author Darren Cook, 2012
     * @internal http://code.google.com/p/selenium/wiki/JsonWireProtocol#/session/:sessionId/timeouts
     */
    public function setAsyncScriptTimeout($timeout) {
        $request = $this->requestURL . "/timeouts";       
        $session = $this->curlInit($request);
        $args = array('type'=>'script', 'ms' => $timeout);
        $jsonData = json_encode($args);
        $this->preparePOST($session, $jsonData);
        curl_exec($session);       
    }

    /**
     * Set implicit wait.
     *
     * This is for waiting for page elements to appear. Not useful for scripts or
     * waiting for the initial get() call to time out.
     *
     * @param Number $timeout Number of milliseconds to wait.
     * @author Darren Cook, 2012
     * @internal http://code.google.com/p/selenium/wiki/JsonWireProtocol#/session/:sessionId/timeouts
     */
    public function setImplicitWaitTimeout($timeout) {
        $request = $this->requestURL . "/timeouts";       
        $session = $this->curlInit($request);
        $args = array('type'=>'implicit', 'ms' => $timeout);
        $jsonData = json_encode($args);
        $this->preparePOST($session, $jsonData);
        curl_exec($session);       
    }

Tuesday, August 9, 2011

Debugging Regexes

Cue: dramatic music. There I was under pressure, enemy fire going off in all directions, and my unit test had started complaining. The test regex was 552 characters long, the text being tested was almost as long, and each run of the unit test takes 30 seconds. Talk about finding a needle in a haystack. James Bond only had to choose between cutting the red or the blue wire. He had it easy.

But I lived to tell the tale. Playing the Bond Girl in this scenario was http://www.regextester.com/ (I actually used version 2 which, though alpha, worked fine).

It still wasn't smooth sailing. The above site assumes the regex is surrounded by /.../ but mine wasn't. So, first I had my unit test output the regex, then I escaped it correctly for use with /.../ then pasted it into the Regex Tester. I also pasted in the text to test. It should match; it doesn't. So I put the cursor at the end of my regex and deleted a few characters at a time. After deleting about two-thirds (!!) of it, finally the text turned red and I had a match. I could see exactly where the match stopped and realize what was missing in my regex. I fixed the regex (simultaneously in the Regex Tester and in my unit test script) and repeated. I had to delete back to almost the same point. It took half a dozen passes before the whole regex matched.

The code looks to be open source javascript. So maybe I will hack on it, to automate the above process (my better Bond Girl, if you like): I would give the regex, the target text, say I expect a match, and it will find the longest regex that matches and show me how much of the target text got matched. (Ah, it uses ajax requests to back-end PHP for the preg and ereg versions, and that code is not available; but at least I could do this for javascript regexes.)
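For JavaScript regexes, the automation described above might look something like the following. It is a rough sketch, not the Regex Tester's code: longestMatchingPrefix is a name invented here, and chopping characters off the end is only an approximation, because a shortened regex can be invalid (those are skipped) or can mean something slightly different from the original.

```javascript
// Find the longest prefix of `pattern` that compiles and matches `text`,
// mimicking the "delete from the end until it matches" process.
// longestMatchingPrefix is a hypothetical helper, not part of any library.
function longestMatchingPrefix(pattern, text) {
  for (var len = pattern.length; len > 0; len--) {
    var prefix = pattern.slice(0, len);
    var re;
    try {
      re = new RegExp(prefix);
    } catch (e) {
      continue;  // not a valid regex on its own (e.g. an unclosed group)
    }
    var m = re.exec(text);
    if (m) {
      // Report which part of the regex worked and how much text it covered.
      return { regex: prefix, matched: m[0], index: m.index };
    }
  }
  return null;   // not even the first character matches
}

var result = longestMatchingPrefix('abc\\d+xyz', 'abc123xy!');
console.log(result.regex);    // abc\d+xy
console.log(result.matched);  // abc123xy
```

Here the text ends in "xy!" rather than "xyz", so the output points straight at the final "z" of the regex as the place where the match breaks down.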

Enough with poking around inside today's Bond Girl. Down the cocktail, jacket on, back to the field...

Wednesday, October 14, 2009

Control Firefox from PHP?

Internet Explorer can be controlled from a COM object interface, and therefore from PHP (i.e. any scripting language that has COM support).

But is there a way to get script control of Firefox? Ideally I'm looking for a platform-independent solution, and something I can use from PHP. Google is not helping (PHP's dominance as a server-side technology overwhelms the client-side related hits).

Here is my dream PHP script:

$firefox = new FirefoxInstance();
$firefox->set_url("http://dcook.org/work/");
$firefox->wait_until_fully_loaded();
$links = $firefox->get_links();
$found = false;
foreach ($links as $id => $info) {
    if ($info['text'] == "MLSN") {
        $firefox->click_link($id);
        $found = true;
        break;
    }
}
if (!$found) echo "MLSN link is missing...\n";

Or:

$form=$firefox->get_form("login");
$form->set("username","guest");
$form->set("password","guest");
$form->submit();

Et cetera. I.e. I'm talking about operating Firefox the same way a user does; I know I can grab the raw HTML, parse it, etc. all from PHP, but that doesn't test a web site the same way clicking links in a browser does. Especially web pages with JavaScript, iframes, AJAX, etc.

(I'd heard of XPCOM but, if I've understood it correctly, it is a library to build Firefox and its extensions, not something to control Firefox? It also has no PHP bindings.)

BTW, going back to controlling IE from the COM interface, I don't suppose anyone has seen a detailed tutorial on how to use it to fill out and submit forms? I only ever see simple examples of how to set the URL, but I believe full control should be possible.
Dec 16th 2009 UPDATE: This came to the top of my to-do list so I read the MSDN docs on the Internet Explorer COM object, and now it is my understanding that I cannot manipulate and submit forms via the COM object. None of the example usage even hinted at doing this.

Sunday, February 15, 2009

PHP comments from PHP!

php|architect December 2008 edition's most interesting article was on phpdoc. I'm a fanatic when it comes to the javadoc style of commenting. When I create a new function the first thing I do is write function somename(){}, then the second thing I do is write /** */ above it.

If I'm refactoring then I might copy and paste in some code straightaway, but usually the third thing I do is document. I describe what the function does, er, will do, and usually define the @return tag (if I've not worked out what data, if any, it should return then I've got ahead of myself - how can I even name a function if I don't know what it produces??). The @param tags I usually leave until after I've written and tested the function, as they can change a lot (i.e. I need an algorithm before I know what inputs my algorithm takes).

During the writing of the function body I will usually jump up to the comment block and add an @internal comment (describing the implementation), or a @todo comment ("Need to add error-checking on call to somefunc()" is a very common one!).

To quote from the php|a article: "What's really exciting, though, is that since PHP 5.1, the Reflection classes can also supply the subject's 'doc comment'."

See http://php.net/reflection

The php|a article introduced a base class for automatically creating setters for class vars. My own interest is in somehow using it for contract-based programming (also called DBC: Design By Contract).

I believe the Java implementation of DBC uses docblock entries, but with a pre-compiler. Pre-compilers suck: being able to edit and run the same source code file is a big bonus, and maybe reflection of docblocks enables that?

Of course using comments is a poor cousin of DBC built into the language itself (especially if it can do compile-time analysis). But, for some reason I've not yet grasped, language designers just don't understand how important it is. They limp along with lint and asserts instead.

Back to the docblocks, I've not thought out the details; does PHP provide some hook that can be called when each function is entered and exited? If not I guess you'd have to write VALIDATE() as the first line of each function, and a RETURN() function everywhere you use return, which would be rather intrusive.

Either way, it could then look for @paramvalidate tags, which could describe the valid data to be found in parameters. It could look for @invariant tags in the class doc block to check the object is valid on both entry and exit.

Did I mention DBC finds loads of bugs, without you having to write a single unit test? All you have to do is document properly. Which, of course, you are going to do anyway if you are a professional programmer.

But did you realize DBC can also be used for optimization, by providing hints to the optimizer? As a concrete example (I'm thinking about C++ here) you have two input variables both described as being in the range 1..100. When they are multiplied together the range is 1..10000. Therefore there is no need to consider overflow. If you then divide by that product, a divide-by-zero error is impossible. I'm not a compiler writer, and haven't written assembly in about 15 years, but I'm sure the above information, or similar, can help.
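To make that arithmetic concrete, here is a toy sketch of the range reasoning, written in JavaScript simply because it is quick to run. It is not how any real compiler does it, and the function names (multiplyRanges, fitsInInt32, mayBeZero) are made up for the illustration. I'm assuming a signed 32-bit integer as the type that might overflow.

```javascript
// Toy range propagation. A range is { lo, hi }, both ends inclusive.
// All names here are hypothetical, for illustration only.
function multiplyRanges(a, b) {
  // The extremes of a product always occur at the corners of the two ranges.
  const corners = [a.lo * b.lo, a.lo * b.hi, a.hi * b.lo, a.hi * b.hi];
  return { lo: Math.min(...corners), hi: Math.max(...corners) };
}

// True if every value in the range fits in a signed 32-bit integer.
function fitsInInt32(r) {
  return r.lo >= -(2 ** 31) && r.hi <= 2 ** 31 - 1;
}

// True if the range contains zero, i.e. dividing by it could fail.
function mayBeZero(r) {
  return r.lo <= 0 && r.hi >= 0;
}

const x = { lo: 1, hi: 100 };
const y = { lo: 1, hi: 100 };
const product = multiplyRanges(x, y);

console.log(product);              // { lo: 1, hi: 10000 }
console.log(fitsInInt32(product)); // true: no overflow check needed
console.log(mayBeZero(product));   // false: dividing by it is safe
```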

As another example, that also applies to PHP, static analysis of possible values can tell you that for an if/elseif/elseif/else block only the second elseif block can ever be reached. This can be both reported as a compiler warning and all the conditional stuff thrown away.
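A crude way to see this in action, assuming a single integer variable with a small declared range, is to try every allowed value and record which branch it lands in. A real analyser would reason symbolically rather than enumerate, and reachableBranches is a name I have invented for this sketch.

```javascript
// Brute-force sketch (hypothetical helper): returns the indexes of the
// branches that some value in the range can reach. Index 0 is the "if",
// 1 and 2 are the "elseif"s, and conditions.length stands for the "else".
function reachableBranches(range, conditions) {
  const reached = new Set();
  for (let v = range.lo; v <= range.hi; v++) {
    // findIndex returns -1 when no condition holds: that is the else branch.
    const i = conditions.findIndex((cond) => cond(v));
    reached.add(i === -1 ? conditions.length : i);
  }
  return [...reached].sort((p, q) => p - q);
}

// if (x < 1) ... elseif (x > 100) ... elseif (x <= 100) ... else ...
const conditions = [(x) => x < 1, (x) => x > 100, (x) => x <= 100];
const reached = reachableBranches({ lo: 1, hi: 100 }, conditions);

console.log(reached); // [ 2 ]
```

With x documented as 1..100, only the second elseif is ever reached, so the other three branches and all the tests could be dropped.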

But that is all a pipe dream: it requires DBC to be implemented in the language itself, and I've drifted from the point of this article. Which was: a PHP script can read its own comments! Cool!