Simple WSGI A/B testing - Swab

What is A/B testing?

A/B testing is a way of comparing two versions of a web page against each other to see which performs better for your visitors. You might test changes to your website copy, visual design or user interface.

When you run an A/B test experiment you need to tell Swab what variants you have, and what goals you want to optimize for. Swab will then randomly assign visitors to each variant and keep track of how many times each variant is shown, along with how many of those visits resulted in a conversion.

Using this data, Swab can show you the conversion rate for each variant along with some basic statistics to help you decide whether there is a meaningful difference between the versions.

Setting up a Swab instance

Swab needs a directory where it can save the data files it uses for tracking trial and conversion data:

from swab import Swab
s = Swab('/tmp/.swab-test-data')

Then you need to tell swab about the experiments you want to run, the variants available and the name of the conversion goal:

s.add_experiment('button-color', ['red', 'blue'], 'signup')

Finally you need to wrap your WSGI app in swab’s middleware:

application = s.middleware(application)
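
Putting these steps together, a minimal setup might look like the following sketch (hello_world is a placeholder application):

from swab import Swab

def hello_world(environ, start_response):
    # Placeholder WSGI application standing in for your real app
    start_response('200 OK', [('Content-Type', 'text/html')])
    return [b'<html><body>Hello!</body></html>']

s = Swab('/tmp/.swab-test-data')
s.add_experiment('button-color', ['red', 'blue'], 'signup')
application = s.middleware(hello_world)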

Integrating swab in your app

Swab provides a number of functions that you can call from your application code:

show_variant(environ, experiment, record=False, variant=None)

Return the variant name to show for the current request. In the above example, a call to show_variant(environ, 'button-color') would return either 'red' or 'blue'.
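
For example, a page handler might branch on the returned variant (the markup here is purely illustrative):

from swab import show_variant

def signup_page(environ, start_response):
    # Returns 'red' or 'blue' for the button-color experiment
    variant = show_variant(environ, 'button-color')
    body = '<button style="background: %s">Sign up</button>' % variant
    start_response('200 OK', [('Content-Type', 'text/html')])
    return [body.encode('utf-8')]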

record_trial_tag(environ, experiment)

Return the HTML tag for a javascript beacon that should be placed in the page you are testing. The tag causes the user’s browser to load a referenced javascript file, triggering swab to record a trial for the given experiment.

If you only have a single experiment running on the requested page and have previously called show_variant, you can safely omit the experiment name, as in the sketch below.
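
A sketch of embedding the beacon, assuming the page body is assembled as a string:

from swab import show_variant, record_trial_tag

def landing_page(environ, start_response):
    variant = show_variant(environ, 'button-color')
    # Experiment name omitted: inferred from the show_variant call above
    tag = record_trial_tag(environ)
    body = '<html><body><p>Showing the %s variant</p>%s</body></html>' % (variant, tag)
    start_response('200 OK', [('Content-Type', 'text/html')])
    return [body.encode('utf-8')]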

record_trial(environ, experiment)

If you don’t want to use the javascript beacon to track trials, you can call record_trial directly. The javascript beacon method is preferred as it is unlikely to be triggered by bots.

If you only have a single experiment running on the requested page and have previously called show_variant, you can safely omit the experiment name.
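
The equivalent using a server-side trial record instead of the javascript beacon (a sketch, with an illustrative handler name):

from swab import show_variant, record_trial

def landing_page_no_js(environ, start_response):
    variant = show_variant(environ, 'button-color')
    # Experiment name omitted: inferred from the show_variant call above
    record_trial(environ)
    start_response('200 OK', [('Content-Type', 'text/html')])
    return [('Showing the %s variant' % variant).encode('utf-8')]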

record_goal(environ, goal, experiment=None)

Record a goal conversion for the named experiment. If experiment is omitted, the conversion is recorded for every experiment linked to the named goal.
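
For example, the handler for a completed signup might record the conversion like this (handler name is illustrative; the goal name follows the earlier example):

from swab import record_goal

def signup_complete(environ, start_response):
    # Records a 'signup' conversion for every experiment linked to that goal
    record_goal(environ, 'signup')
    start_response('200 OK', [('Content-Type', 'text/html')])
    return [b'Thanks for signing up!']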

Viewing results

Test results are available at the URL /swab/results (the /swab prefix is the default wsgi_mountpoint, which can be changed when creating the Swab instance).

Caching

Swab automatically adds a Cache-Control: no-cache response header if show_variant or record_trial was called during the request. This helps prevent proxies from caching your test variants. It will also remove any other cache-related headers (e.g. ETag or Last-Modified). If you don’t want this behaviour, pass cache_control=False when calling Swab.middleware().

Viewing the variants

To test your competing pages, append ‘?swab.<experiment-name>=<variant-name>’ to URLs to force any given variant to be shown.
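
For example, with the button-color experiment defined earlier, the blue variant can be forced with a URL like this (the hostname and path are illustrative):

http://example.com/signup?swab.button-color=blue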

Basic design

Each visitor is assigned an identity which is persisted by means of a cookie. The identity is a base64-encoded, randomly generated byte sequence. This identity is used as the seed for an RNG, which assigns visitors to test groups.
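
Conceptually the assignment works like the following sketch; this is a simplification for illustration, not Swab's actual code, and it ignores weighted variants:

import random

def assign_variant(swabid, experiment, variants):
    # Seeding with identity + experiment name means the same visitor always
    # sees the same variant, while assignments stay independent across experiments
    rng = random.Random(swabid + experiment)
    return rng.choice(variants)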

Every time a test is shown, a line is appended to a file at <datadir>/<experiment>/<variant>/__all__. This is triggered by calling record_trial.

Every time a goal is recorded (triggered by calling record_goal), a line is appended to a file at <datadir>/<experiment>/<variant>/<goal>.

Each log line has the format <timestamp>:<identity>\n.

No file locking is used: it is assumed that this will be run on a system where each line is smaller than the filesystem block size, allowing us to avoid this overhead. The lines may become interleaved, but there should be no risk of corruption even with multiple simultaneous writes. See http://www.perlmonks.org/?node_id=486488 for a discussion of the issue.
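
A sketch of the append pattern described above (a hypothetical helper, not Swab's exact code; the timestamp format here is illustrative):

import time

def append_log_line(path, identity):
    # One short line per record; appends well below the filesystem block size
    # are not corrupted by concurrent writers on POSIX systems
    line = '%d:%s\n' % (int(time.time()), identity)
    with open(path, 'a') as f:
        f.write(line)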

Changelog

0.2.2 (released 2018-02-23)

  • Bugfix: fix for exception triggered when a bot visits a page containing record_trial_tag

0.2.1 (released 2018-02-23)

  • Bugfix: fixed link rendering on test results page

0.2.0 (released 2018-02-23)

  • Compatibility with python 3
  • Allow the application to force a variant when calling show_variant
  • Improved JS snippet no longer blocks browser rendering
  • No longer records duplicate trials if show_variant is called twice
  • Allow experiments to customize the swabid generation strategy - useful if you want to deterministically seed the RNG based on some request attribute.
  • Allow weighted variants: add_experiment('foo', 'AAAB') will show variant A 75% of the time.
  • Include bayesian results calculation based on http://www.evanmiller.org/bayesian-ab-testing.html#binary_ab_implementation
  • Better caching: only sets cookies on pages where an experiment is invoked
  • record_trial_tag can now infer the experiment name from a previous call to show_variant: less duplicated code when running an experiment.
  • Results now show results per visitor by default

Version 0.1.3

  • Added a javascript beacon to record tests (helps exclude bots)
  • Better exclusion of bots on server side too
  • Record trial app won’t raise an error if the experiment name doesn’t exist
  • Removed debug flag, the ability to force a variant is now always present
  • Strip HTTP caching headers if an experiment has been invoked during the request
  • Improved accuracy of conversion tracking
  • Cookie path can be specified in middleware configuration

Version 0.1.2

  • Minor bugfixes

Version 0.1.1

  • Bugfix for ZeroDivisionErrors when no data has been collected

Version 0.1

  • Initial release

API documentation

class swab.Swab(datadir, wsgi_mountpoint='/swab')[source]

Simple WSGI A/B testing

collect_experiment_data(dedupe=False)[source]

Collect experiment data from the log files

Return a dictionary of:

{<experiment>: {
    'goals': [goal1, goal2, ...],
    'variants': {
        'v1': {
            'trials': 1062,
            'goals': {
                'goal1': {'conversions': 43, 'rate': 0.0405},
                'goal2': {'conversions': 29, 'rate': 0.0273},
            }
        },
        ...
    }
},
...
}
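
A brief sketch of consuming this structure, assuming the configured Swab instance s from earlier:

data = s.collect_experiment_data(dedupe=True)
for experiment, info in data.items():
    for variant, stats in info['variants'].items():
        for goal, goal_stats in stats['goals'].items():
            print(experiment, variant, goal, goal_stats['rate'])
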
experiments = None

Mapping of {<experiment name>: <Experiment object>}

experiments_by_goal = None

Mapping of {<goal name>: [<Experiment object>, …]}

middleware(app, cookie_domain=None, cookie_path=None, cache_control=True)[source]

Middleware that sets a random identity cookie for tracking users.

The identity can be overwritten by setting environ['swab.id'] before start_response is called. On egress this middleware will then reset the cookie if required.

Parameters:
  • app – The WSGI application to wrap
  • cookie_domain – The domain to use when setting cookies. If None this will not be set and the browser will default to the domain used for the request.
  • cookie_path – The path to use when setting the identity cookie.
  • cache_control – If True, replace the upstream application’s cache control headers for any request where show_variant is invoked.

swab.count_entries(path, dedupe=True)[source]

Count the number of entries in path.

Parameters:
  • dedupe – If True, dedupe entries so only one conversion is counted per identity.
swab.cumulative_normal_distribution(z)[source]

Return the confidence level obtained from the cumulative normal distribution for the given z-score.

See http://abtester.com/calculator/ and http://www.sitmo.com/doc/Calculating_the_Cumulative_Normal_Distribution
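
For reference, the cumulative normal distribution can be expressed via the error function; a minimal sketch (not necessarily how Swab computes it):

import math

def cumulative_normal(z):
    # Phi(z) = 0.5 * (1 + erf(z / sqrt(2)))
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))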

swab.generate_id(urandom=<built-in function urandom>, encode=<function b64encode>)[source]

Return a unique id

swab.get_identities(path)[source]

Return a Counter for identity entries in path

swab.get_rng(environ, experiment, swabid)[source]

Return a random number generator with a fixed seed based on the current session’s swabid

swab.get_seed_from_bytes(s)[source]

Given a byte string, return an RNG seed value

swab.is_bot(environ, is_bot_ua=<built-in method match of _sre.SRE_Pattern object>)[source]

Return True if the request is from a bot. Uses rather simplistic tests based on user agent and header signatures, but should still catch most well-behaved bots.

swab.is_bot_ua()

The match method of a compiled regular expression, used by is_bot to detect bot user agent strings.

swab.makedir(path)[source]

Create a directory at path. Unlike os.makedirs, this does not raise an error if path already exists.

swab.record_goal(environ, goal, experiment=None)[source]

Record a goal conversion by adding a record to the file at <datadir>/<experiment>/<variant>/<goal>.

If experiment is not specified, all experiments linked to the named goal are looked up.

This doesn’t use any file locking, but we should be safe on any posix system as we are appending each time to the file. See http://www.perlmonks.org/?node_id=486488 for a discussion of the issue.

swab.show_variant(environ, experiment, record=False, variant=None)[source]

Return the variant name that environ is assigned to within experiment.

If record is true, write a line to the log file indicating that the variant was shown. (No deduping is done: the log line is always written, so a page calling show_variant might record multiple trials on reloads etc.)

Parameters:
  • experiment – Name of the experiment
  • environ – WSGI environ
  • record – If True, record a trial for the experiment in the log file
  • variant – Force the named variant. Use this if your application needs to choose the variant based on some other criterion (e.g. SEO A/B testing, where you assign the variant based on the URL)

swab.zscore(p, n, pc, nc)[source]

Calculate the z-score of probability p over n tests, compared to control probability pc over nc tests.

See http://20bits.com/articles/statistical-analysis-and-ab-testing/.
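
The two-proportion z-score described in the linked article can be sketched as follows (Swab's own implementation may differ in detail):

import math

def two_proportion_zscore(p, n, pc, nc):
    # Standard error of the difference between the two conversion rates
    se = math.sqrt(p * (1 - p) / n + pc * (1 - pc) / nc)
    return (p - pc) / se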
