Monday, May 28, 2012

php-webdriver bindings for selenium: how to add time-outs

Not all webpages finish loading. In particular, I have a page that keeps streaming data back to the client and never finishes. (For instance, it might be used from an ajax call.) I want to test this from Selenium, but have been hitting problems. The main problem is that Selenium's get() function, which is used to fetch a fresh URL, does not return until the page has finished loading [1]. In my case that meant never, and so my test script locked up!

However, all is not lost: you can specify a page load timeout. It is hidden in the protocol docs, but I've added it to the php webdriver library I use (v0.9). See the three functions below [2]; just paste them into the bottom of the WebDriver class in WebDriver.php.

I also needed one bug fix in WebDriver.php's public function get($url). It currently ends with:
    $response=curl_exec($session);

Just after that line you should add this:
    return $this->extractValueFromJsonResponse($response);
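
To see why that return matters, here is a minimal stand-alone sketch of how the "value" field can be pulled out of a JSON wire protocol reply. The function name decodeWireValue() and the two sample replies are illustrative assumptions, not code from php-webdriver-bindings; the library's own extractValueFromJsonResponse() may differ in detail.

```php
<?php
// Hypothetical helper (not part of php-webdriver-bindings): decode a raw
// JSON wire protocol reply and return its "value" field.
function decodeWireValue($rawResponse) {
    $decoded = json_decode($rawResponse);
    if (!is_object($decoded) || !property_exists($decoded, 'value')) {
        return null; // empty or malformed reply
    }
    return $decoded->value;
}

// Assumed sample replies, shaped like those described in the protocol docs.
$ok = '{"sessionId":"abc","status":0,"value":null}';
$timedOut = '{"sessionId":"abc","status":21,"value":'
          . '{"message":"Timed out waiting for page load.",'
          . '"class":"org.openqa.selenium.TimeoutException"}}';

var_dump(decodeWireValue($ok));               // null value: nothing went wrong
echo decodeWireValue($timedOut)->class, "\n"; // the exception class name
```

Without the added return, get() hands back nothing at all, so a caller has no way to tell these two replies apart.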


The time-out, and that bug fix, can be used like this:

require_once "/usr/local/src/selenium/php-webdriver-bindings-0.9.0/phpwebdriver/WebDriver.php";
$webdriver = new WebDriver("localhost", "4444");
$webdriver->connect("firefox");
$webdriver->setPageLoadTimeout(2000);   //2 seconds
$url="http://example.com/forever.php"; //A page that never finishes loading
$obj=$webdriver->get($url);
if($obj===null){
    $current_url=$webdriver->getCurrentUrl();
    if(!$current_url){
        //Selenium-server not running
        }
    else{
        //It worked! (it completed loading in under two seconds)
        }
    }
elseif($obj->class=='org.openqa.selenium.TimeoutException'){
    //It timed out
    }
elseif($obj->class=='org.openqa.selenium.remote.UnreachableBrowserException'){
    //Browser was closed (or selenium-server was shutdown)
    }
else{
    echo "FAILED:";print_r($obj);
    }
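
If the branching above is needed in more than one test, it could be wrapped in a small function. This is only a sketch: classifyGetResult() and the labels it returns are names I made up for illustration, and the stdClass object stands in for what get() returns so that the example runs without a Selenium server.

```php
<?php
// Hypothetical helper: turn the value returned by get() into a short label.
// $currentUrl is whatever getCurrentUrl() returned; it is only consulted
// when $result is null.
function classifyGetResult($result, $currentUrl) {
    if ($result === null) {
        return $currentUrl ? 'loaded' : 'server-down';
    }
    $class = (is_object($result) && isset($result->class)) ? $result->class : '';
    switch ($class) {
        case 'org.openqa.selenium.TimeoutException':
            return 'timeout';
        case 'org.openqa.selenium.remote.UnreachableBrowserException':
            return 'browser-gone';
        default:
            return 'unknown';
    }
}

// Stand-in object so the sketch runs without a Selenium server.
$timeout = new stdClass();
$timeout->class = 'org.openqa.selenium.TimeoutException';

echo classifyGetResult(null, 'http://example.com/forever.php'), "\n"; // loaded
echo classifyGetResult(null, false), "\n";                             // server-down
echo classifyGetResult($timeout, null), "\n";                          // timeout
```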

This is useful stuff. There is still one problem left for me: I wanted to load two seconds' worth of data and then look at it. But I cannot. The browser refuses to listen to Selenium while it is loading a page! So although get() returned control to my script after two seconds, I cannot do anything with that control (except close the browser window), because the URL is still loading. And it will do that forever! (I've played with an interesting alternative approach, which also fails but suggests that a solution is possible. That is outside the scope of this post, which is to show how to add the time limit functions to php-webdriver-bindings.)

[1]: This is browser-specific behaviour, not by Selenium design. Firefox and Chrome, at least, behave this way.


[2]: Consider this code released, with no warranty, under the MIT license, and permission granted to use in the php-webdriver-bindings project with no attribution required.

    /**
     * Set wait for a page to load.
     *
     * This timeout is for the get() function. (Firefox and Chrome, at least, won't return from get()
     * until a page is fully loaded.  If remote server is streaming content, they would never return
     * without this time-out.)
     *
     * @param Number $timeout Number of milliseconds to wait.
     * @author Darren Cook, 2012
     * @internal http://code.google.com/p/selenium/wiki/JsonWireProtocol#/session/:sessionId/timeouts
     */
    public function setPageLoadTimeout($timeout) {
        $request = $this->requestURL . "/timeouts";       
        $session = $this->curlInit($request);
        $args = array('type'=>'page load', 'ms' => $timeout);
        $jsonData = json_encode($args);
        $this->preparePOST($session, $jsonData);
        curl_exec($session);       
    }

    /**
     * Set wait for a script to finish.
     *
     * @param Number $timeout Number of milliseconds to wait.
     * @author Darren Cook, 2012
     * @internal http://code.google.com/p/selenium/wiki/JsonWireProtocol#/session/:sessionId/timeouts
     */
    public function setAsyncScriptTimeout($timeout) {
        $request = $this->requestURL . "/timeouts";       
        $session = $this->curlInit($request);
        $args = array('type'=>'script', 'ms' => $timeout);
        $jsonData = json_encode($args);
        $this->preparePOST($session, $jsonData);
        curl_exec($session);       
    }

    /**
     * Set implicit wait.
     *
     * This is for waiting for page elements to appear. Not useful for scripts or
     * waiting for the initial get() call to time out.
     *
     * @param Number $timeout Number of milliseconds to wait.
     * @author Darren Cook, 2012
     * @internal http://code.google.com/p/selenium/wiki/JsonWireProtocol#/session/:sessionId/timeouts
     */
    public function setImplicitWaitTimeout($timeout) {
        $request = $this->requestURL . "/timeouts";       
        $session = $this->curlInit($request);
        $args = array('type'=>'implicit', 'ms' => $timeout);
        $jsonData = json_encode($args);
        $this->preparePOST($session, $jsonData);
        curl_exec($session);       
    }
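
The three functions above differ only in the 'type' they send, so the request body could be built in one place. The following is a sketch under that assumption; buildTimeoutPayload() is a hypothetical name, not something in php-webdriver-bindings, and it only builds the JSON string rather than sending it.

```php
<?php
// Hypothetical helper: build the JSON body for a POST to
// /session/:sessionId/timeouts. It does not send anything.
function buildTimeoutPayload($type, $ms) {
    // The three types used by the functions above.
    $allowed = array('page load', 'script', 'implicit');
    if (!in_array($type, $allowed, true)) {
        throw new InvalidArgumentException("Unknown timeout type: $type");
    }
    if (!is_numeric($ms) || $ms < 0) {
        throw new InvalidArgumentException("Timeout must be a non-negative number of milliseconds");
    }
    return json_encode(array('type' => $type, 'ms' => (int)$ms));
}

echo buildTimeoutPayload('page load', 2000), "\n"; // {"type":"page load","ms":2000}
```

Each setter would then pass the returned string to preparePOST(), as the originals do with $jsonData.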

Friday, May 18, 2012

EC2 and Windows: a match made in... Hell

Henry Ford is famous for his lack of flexibility over the Model T: You can have it in any colour you want, as long as it is black.

I think Amazon have taken a leaf out of his book when offering their Windows instances: You can have any size disk you want, as long as it is 30GB.

Don't believe me? Go on, try automating the creation of a Windows machine with a 100GB boot disk. Or creating one from the web interface, without any post-configuration steps in Windows itself.

(I'll save you the Google: here are the AWS engineers telling you the multiple steps needed to achieve that. The instructions are different depending on the exact version of Windows.)

In fact, try automating anything to do with Windows configuration, using EC2. You can't. It can't be, won't be, scripted. You always have to log on afterwards (going through the EC2 Console to get your almost-impossible-to-type password) to do something. Usually quite a tedious and time-consuming something.

The bottom line: Windows is not designed for the cloud.



...and yet, some of my clients, and some of my potential clients, insist on trying to use Windows anyway. The cloud is where they feel they should be, so people want to move their legacy apps there. Whenever I ask them how they do it, or why they do it, they seem to find a justification. It is the cloud; look, we're scaling. We're faster. It works! Like a man let out of prison, and running free in the meadow... only he still has the manacles and chains from his time doing time. Am I carrying my metaphor too far by wishing people would stop and take the Linux axe to the manacles before rushing off to the meadow?

Are you an expert at automating Windows on EC2? Please post a comment showing off your knowledge. I'm willing to learn, and will edit this article if you convince me it can be done :-)