Do you smoke test?

over 12 years ago

I broke Discourse yesterday.

Not in the “it’s just a little bit broken” sense. Instead, in the “absolutely everything is broken and every page returns a 500 error code” sense.

We already do so much to ensure Discourse always works, in the true Eric Ries lean-startup, Rails best-practice way.

We have 1800 specs that exercise a huge portion of the Ruby code.

We have 80 or so tests that test some of the JavaScript (we need more, lots more).

We constantly deploy to our test servers as soon as a commit happens. Only after this succeeds will we allow deploys to production.

Once a deploy to test happens we get a group chat message informing us it succeeded with a clear link to click (that takes us to our staging environment).

Nonetheless, somehow I managed to mess it all up and deploy a junk build.

### What happened?

The Rails asset pipeline is a bit of a Rubik’s Cube. For example, if in your master layout you have <%= asset_path "my_asset" %> it may work fine in your dev and test environments. However, if you forgot to set a magic switch to pre-compile the asset, well… everything breaks in production. In my case, I had it all mostly wired up and working in dev, but a missing .js extension meant that I just was not close enough.

I clicked build, everything passed, and I forgot to check the test site.

This is my fault, it’s 100% my fault.

Often when we hit these kinds of issues, as developers, we love assigning blame. Blame points my way … don’t do it again … move along, nothing more to see.

That is not good enough.

### What is good enough?

I would like to follow a simple practice at Discourse. If you break production for any reason, we should make sure an automated system catches that kind of break the next time around.

If you break production, the only thing you are allowed to work on should be the system that stops that kind of break in future.

### What kind of system can prevent a totally broken build from getting out there?

The trivial thing to do is simply make an HTTP request to staging (our non-customer-facing production clone) and ensure it comes back with a 200 code. Trivial to add.
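As a rough illustration of that trivial check, here is a minimal sketch written for Node.js rather than PhantomJS. The names `isHealthy` and `checkStatus` and the URL in the usage comment are hypothetical and are not part of the Discourse code base.

```javascript
// Minimal sketch (Node.js): request a page and report whether it returned 200.
// isHealthy, checkStatus and the example URL are hypothetical names.
var http = require('http');
var https = require('https');

// A deploy counts as healthy only if the page answers with exactly 200.
function isHealthy(statusCode) {
  return statusCode === 200;
}

// Requests `url` and calls back with (error, healthy, statusCode).
function checkStatus(url, callback) {
  var client = url.indexOf('https:') === 0 ? https : http;
  var req = client.get(url, function (res) {
    res.resume(); // discard the body, only the status code matters here
    callback(null, isHealthy(res.statusCode), res.statusCode);
  });
  req.on('error', function (err) {
    callback(err, false, null);
  });
}

// Usage (assumed staging URL):
// checkStatus('http://staging.server/', function (err, healthy) {
//   process.exit(healthy ? 0 : 1);
// });
```

Note that `http.get` does not follow redirects, so under this sketch a 302 counts as a failure; whether that is the right call depends on how the staging site is set up.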

However, that is not really good enough.

I would like to know that the pages are all rendered properly. At least 3 key pages to start off with. The home page, a topic page and a user page.

What makes Discourse particularly tricky is that it is an Ember.js app. You only get to see the “real” page after a pile of JavaScript work happens. Simply downloading the content and testing it is not going to cut it.

Back in the old days we would use Selenium for these kinds of tests. The trouble is that it’s not really easy to automate and involves a fairly complex setup.

These days people mostly use PhantomJS, a headless WebKit browser.

Now, if you are planning on using PhantomJS I would strongly recommend using a framework like CasperJS to lean on. It does a lot of the messy work for you. For my initial humble test I decided to write it all by hand. There were quite a few reasons.

I wanted to know how the underlying APIs work. I needed a bunch of special hacks to get it to test in a particular way with special magic delays. I did not want to bring another complex install process into the open source project.

I ended up with this test:

/*global phantom:true */

console.log('Starting Smoke Test');
var system = require('system');

if(system.args.length !== 2) {
  console.log("expecting phantomjs {smoke_test.js} {base_url}");
  phantom.exit(1);
}

var page = require('webpage').create();

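// Polls fn inside the page every 50ms until it returns a truthy value or
// timeout (in ms) elapses, then calls after(true) or after(false).
page.waitFor = function(desc, fn, timeout, after) {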
  var check,start;

  start = +new Date();
  check = function() {
    var r;

    try {
      r = page.evaluate(fn);
    } catch(err) {
      // ignore and retry on the next poll
    }

    var diff = (+new Date()) - start;

    if(r) {
      console.log("PASSED: " + desc + " " + diff + "ms" );
      after(true);
    } else {
      if(diff > timeout) {
        console.log("FAILED: " + desc + " " + diff + "ms");
        after(false);
      } else {
        setTimeout(check, 50);
      }
    }
  };

  check();
};


var actions = [];

var test = function(desc, fn) {
  actions.push({test: fn, desc: desc});
};

var navigate = function(desc, fn) {
  actions.push({navigate: fn, desc: desc});
};

var run = function(){
  var allPassed = true;
  var done = function() {
    if(allPassed) {
      console.log("ALL PASSED");
    } else {
      console.log("SMOKE TEST FAILED");
    }
    phantom.exit(allPassed ? 0 : 1); // non-zero exit code lets the caller detect a failed run
  };

  var performNextAction = function(){
    if(actions.length === 0) {
      done();
    }
    else{
      var action = actions.shift();
      if(action.test) {
        page.waitFor(action.desc, action.test, 10000, function(success){
          allPassed = allPassed && success;
          performNextAction();
        });
      } 
      else if(action.navigate) {
        console.log("NAVIGATE: " + action.desc);
        page.evaluate(action.navigate);
        performNextAction();
      }
    }
  };

  performNextAction();
};

page.runTests = function(){

  test("at least one topic shows up", function() {
    return jQuery('#topic-list tbody tr').length > 0;
  });

  test("expect a log in button", function(){
    return jQuery('.current-username .btn').text() === 'Log In';
  });

  navigate("navigate to first topic", function(){
    Em.run.later(function(){
      jQuery('.main-link a:first').click();
    }, 500);
  });

  test("at least one post body", function(){
    return jQuery('.topic-post').length > 0;
  });
  
  navigate("navigate to first user", function(){
    // for whatever reason the clicks do not respond at the beginning
    Em.run.later(function(){
      jQuery('.topic-meta-data a:first').focus().click();
    },500);
  });

  test("has about me section",function(){
    return jQuery('.about-me').length === 1;
  });

  run();
};

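page.open(system.args[1], function (status) {
    // status is 'success' or 'fail'; bail out early if the page never loaded
    if (status !== 'success') {
        console.log("FAILED: could not open " + system.args[1]);
        phantom.exit(1);
        return;
    }
    console.log("Opened " + system.args[1]);
    page.runTests();
});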

Now… after we deploy staging we run rake smoke:test URL=http://staging.server and get a result.
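How the rake task interprets the result is not shown here. One option, sketched below under that assumption, is to capture the script's console output and summarize the `PASSED:`, `FAILED:` and `ALL PASSED` lines it prints. The helper name `summarizeSmokeOutput` is hypothetical and not part of Discourse.

```javascript
// Minimal sketch: summarize the console output printed by the smoke test.
// summarizeSmokeOutput is a hypothetical helper, not part of Discourse.
function summarizeSmokeOutput(output) {
  var passed = 0;
  var failed = [];
  var sawAllPassed = false;

  output.split('\n').forEach(function (rawLine) {
    var line = rawLine.replace(/\s+$/, ''); // drop trailing whitespace and \r

    if (line.indexOf('PASSED: ') === 0) {
      passed += 1;
    } else if (line.indexOf('FAILED: ') === 0) {
      // keep the description so the failure can be reported
      failed.push(line.slice('FAILED: '.length));
    } else if (line === 'ALL PASSED') {
      sawAllPassed = true;
    }
  });

  return {
    passed: passed,
    failed: failed,
    // only trust a run that finished and reported no failures
    ok: sawAllPassed && failed.length === 0
  };
}
```

Requiring the final `ALL PASSED` line means a run that crashes or is cut off halfway is treated as a failure rather than a silent pass.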

Amazingly, less than a day after I wrote it, it already caught another junk build.

This is a start; I imagine that in a few months we will have a much more extensive smoke test process.

That said, if you do not have any kind of smoke test process I would strongly recommend exploring PhantomJS. Getting something basic up is a matter of hours.

Posted by: Sam

Comments

Michael_Sarchet over 12 years ago

Sam,

Thanks for the post. The links to phantom and casper are exactly what I have been looking for to build some automated testing for a very knockout heavy application I wrote.

Ryan_Williams over 12 years ago

There was a company here in Portland, BrowserMob, that did automated Selenium testing. It was acquired, but it looks like the service is still available.

http://www.neustar.biz/enterprise/web-performance/what-is-website-monitoring

Doesn’t necessarily help for staging if it’s behind a firewall, but it’s an option anyway for production apps that don’t render “real” pages via straight http, as you say.

Alexander_Overvoorde over 12 years ago

I’m not familiar with these JavaScript frameworks, but using tests to solve this seems like a massive hack. Isn’t the point of a testing environment to be as close to production as possible to find exactly these problems? Production deployment sounds flawed if you need tests to check for missing files and incorrect extensions. These developer mistakes should be causing problems in testing already, making these additional tests seem like a massive hack.

Sam Saffron over 12 years ago

I think my wording was a bit unclear, staging is a test production clone

Gerart over 12 years ago

you should have a look at http://casperjs.org

Sam Saffron over 12 years ago

you do realise I linked to it

Christian over 12 years ago

What about using Selenium IDE to record the tests, then export them as RSpec or Unit tests and run them in an automated way from then on? That doesn’t require more setup than installing Phantom, Casper and writing the test code by hand…

Luis over 12 years ago

Have you tried capybara using poltergeist as the web driver, with phantomjs behind it? It lets you write your front end spec tests in ruby and is very simple to use. It works wonders with our team. Also, as part of a deploy script we do a sanity test to check that the deploy pushed something that works. If the sanity check fails you roll back automatically.

Sam Saffron over 12 years ago

no, have not, will give capybara a shot

Harri_Paavola over 12 years ago

Since PhantomJS comes with webdriver support, I would go with Selenium2 Library for Robot Framework. This way your tests will be much clearer, almost plain English. So the first test would be like

Page Should Contain Element css=#topic-list tbody tr

Sam Saffron over 12 years ago

interesting, will have a look at it

Ross_Patterson over 12 years ago

Check out GhostDriver (http://blog.ivandemarino.me/2012/12/04/Finally-GhostDriver-1-0-0), the PhantomJS implementation of the Selenium/WebDriver API. Really simple – 34 lines of PhantomJS JavaScript to test the Google search page, comments included.

Tmk85 over 12 years ago

I recommend using SauceLabs. They have a great variety of browsers and operating systems. You don’t have to worry about maintaining and configuring selenium servers yourself. Well worth the price.

Alfasin over 12 years ago

I know it’s a matter of “self preference” but I would rather initialize var allPassed to “false”. Second, I have two “stupid” questions: 1. since the product is written in ruby (which I’m not familiar with), why not write the tests in ruby as well? 2. I have little experience with selenium, but it sounds ideal for such tests. Why don’t you want to use it, besides the fact that it’s “old” (which I call “mature” ;)?

Sam Saffron over 12 years ago

Getting selenium headless is totally doable, just tricky (see “Does Selenium support headless browser testing?” on Stack Overflow). phantom is easier to get going and less fragile because it’s driving its own webkit

capybara is an option in Ruby, however we are a heavily mixed platform, a very large amount of Discourse is JavaScript including some server side components

Eslam_El_Husseiny over 12 years ago

as i understand it, you are smoke testing from a development perspective, which is cool, but i’m trying to figure out if smoke testing also applies to configuration management, i.e. if you are using infrastructure as code, e.g. the puppet framework, in deployment, can i smoke test the deployment?

Sam Saffron almost 12 years ago

The only place we use this technique is smoke testing production data. During our jenkins build we clone the production environment and test it, before proceeding with the deployment into staging.

Damián José over 10 years ago

I found this comment a bit confusing. Selenium is not a browser, it is a framework to automate browsers/pages. It will be headless or not depending on the browser you are using. PhantomJS is a headless browser, and as there is a webdriver implementation for it (ghostdriver), you can use PhantomJS through Selenium2 and you will have an automated testing system using Selenium and a headless browser. Selenium and PhantomJS are completely different things, and in fact they can work together.

Sam Saffron over 10 years ago

My point though is that selenium is more hairy to setup than phantom which is far simpler to wire up for testing.
