Flea: WSGI testing

Overview of testing with flea

The Agent class provides a user agent that drives a WSGI application:

>>> from flea import Agent
>>> agent = Agent(my_wsgi_app)

You can now use this agent to navigate your WSGI application by…

…making GET requests:

>>> r = agent.get('/my-page')

…making POST requests:

>>> r = r.post('/contact', data={'message': 'your father smells of elderberries'})

…clicking links:

>>> # Click on a link with content 'foo'
>>> r = r.click("foo")

>>> # Click a link matching a regular expression
>>> import re
>>> r = r.click(re.compile('f.*o'))

>>> # Find a link using a CSS selector
>>> r = r("a#mylink").click()

>>> # Or an XPath expression
>>> r = r("//a[@id='mylink']").click()

…and submitting forms:

>>> r = r("form[name=login-form]").fill(username='me', password='123').submit()
>>> r = r("form[name=contact] button[name=send]").submit()
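The examples above assume a WSGI callable named my_wsgi_app, which this document does not define. A minimal sketch of such an application, using only the standard library, might look like the following. The routes, the markup and the call_app helper are hypothetical and exist only so the snippet can be run without flea installed.

```python
from wsgiref.util import setup_testing_defaults


def my_wsgi_app(environ, start_response):
    """Hypothetical stand-in for the application under test."""
    if environ.get("PATH_INFO") == "/my-page":
        status = "200 OK"
        body = b'<html><body><a id="mylink" href="/other">foo</a></body></html>'
    else:
        status = "404 Not Found"
        body = b"<html><body>not found</body></html>"
    start_response(status, [("Content-Type", "text/html; charset=UTF-8"),
                            ("Content-Length", str(len(body)))])
    return [body]


def call_app(app, path):
    """Call a WSGI app directly (no flea involved); return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the mandatory WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_app(my_wsgi_app, "/my-page")
print(status)
```

Passing a callable like this to Agent(...) should then give the get() and click() calls above something to operate on.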

Finding elements

There are several methods for traversing the DOM. The simplest is usually to use CSS selectors:

>>> r.css("a.highlighted")
<ResultWrapper ...>

For more complex requirements you can also use XPath, with find() or with dictionary-style access:

>>> r.find("//a[@class='highlighted']")
<ResultWrapper ...>
>>> r["//a[@class='highlighted']"]
<ResultWrapper ...>

You can also call the Agent directly, passing either an XPath expression or a CSS selector. Flea will autosense the expression type:

>>> r("a.highlighted")
<ResultWrapper ...>
>>> r("//a[@class='highlighted']")
<ResultWrapper ...>

If an expression could be interpreted as both a valid XPath and CSS selector, flea defaults to ‘css’. You can force an expression to be interpreted as one or the other by passing a flavor argument:

>>> r("a.highlighted", 'css')
<ResultWrapper ...>
>>> r("//a[@class='highlighted']", 'xpath')
<ResultWrapper ...>
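To see what the XPath expression used above selects, independently of flea, the following sketch runs a similar query with the standard library's ElementTree. This is only an illustration: flea evaluates XPath through lxml, whereas ElementTree supports a limited XPath subset, and the sample markup is made up.

```python
import xml.etree.ElementTree as ET

# Made-up document with two highlighted links and one plain link
doc = ET.fromstring(
    "<html><body>"
    '<a class="highlighted" href="/one">one</a>'
    '<a href="/two">two</a>'
    '<a class="highlighted" href="/three">three</a>'
    "</body></html>"
)

# ElementTree needs a leading '.' for descendant searches from the root
matches = doc.findall(".//a[@class='highlighted']")
hrefs = [a.get("href") for a in matches]
print(hrefs)  # ['/one', '/three']
```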

Filling and submitting forms

Although you can fill fields by altering the necessary DOM properties: checked (checkboxes, radio buttons), selected (select options), text (textareas) and value for other input types, it’s usually more convenient to use the fill() method, which presents a common interface to all control types.

When you fill in form fields, the underlying DOM is updated. This makes it really easy to check your form is correctly filled while developing tests:

>>> from fresco.response import Response
>>> app = Response([
...     '<html>'
...     '<form>'
...         '<input type="text" name="subject" />'
...         '<textarea name="message"/>'
...     '</form>'
...     '</html>'
... ]).buffered()
>>> r = Agent(app).get('/')
>>> r('form').fill(subject='hello', message='how are you?')
<...>

>>> # Display the updated HTML
>>> # You could also use r.serve() to interact with the completed form in a web browser
>>> r('form').html()
'<form><input type="text" name="subject" value="hello"><textarea name="message">how are you?</textarea></form>'

fill() will raise an exception if you ask it to fill in a field that does not exist in the form. fill_sloppy() does not have this restriction, and will ignore any fields it can’t find.

Text inputs and textareas:

>>> app = Response([
...     '<html>'
...     '<form>'
...         '<input type="text" name="subject" />'
...         '<textarea name="message"/>'
...     '</form>'
...     '</html>'
... ]).buffered()
>>> r = Agent(app).get('/')
>>> print(r.html())
<html><body><form><input type="text" name="subject"><textarea name="message"></textarea></form></body></html>
>>> r('input[name=subject]').fill('hello')
<...>
>>> r('textarea[name=message]').fill('world')
<...>
>>> r('form').submit_data()
[('subject', 'hello'), ('message', 'world')]

Checkboxes:

>>> app = Response([
...     '<html>'
...     '<form>'
...         '<input type="checkbox" name="opt-in" value="yes" />'
...         '<input type="checkbox" name="items" value="one" />'
...         '<input type="checkbox" name="items" value="two" />'
...         '<input type="checkbox" name="items" value="three" />'
...     '</form>'
...     '</html>'
... ])
>>> r = Agent(app).get('/')
>>> r('input[name=opt-in]').fill(True)
<...>
>>> r('input[name=items]').fill(['two', 'three'])
<...>
>>> r('form').submit_data()
[('opt-in', 'yes'), ('items', 'two'), ('items', 'three')]
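submit_data() returns a list of (name, value) pairs, with repeated names for multi-valued controls. Assuming the form is submitted with the usual application/x-www-form-urlencoded encoding, the standard library shows what those pairs look like on the wire. This sketch does not call flea.

```python
from urllib.parse import parse_qsl, urlencode

# The pairs shown in the checkbox example above
pairs = [("opt-in", "yes"), ("items", "two"), ("items", "three")]

encoded = urlencode(pairs)
print(encoded)  # opt-in=yes&items=two&items=three

# Decoding gives back the same ordered pairs, repeated names included
decoded = parse_qsl(encoded)
```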

Radio buttons:

>>> app = Response([
...     '<html>'
...     '<form>'
...         '<input type="radio" name="item" value="one" />'
...         '<input type="radio" name="item" value="two" />'
...         '<input type="radio" name="item" value="three" />'
...     '</form>'
...     '</html>'
... ])
>>> r = Agent(app).get('/')
>>> r('input[name=item]').fill('two')
<...>
>>> r('form').submit_data()
[('item', 'two')]

Select boxes:

>>> app = Response([
...     '<html>'
...     '<form>'
...         '<select name="icecream">'
...             '<option value="strawberry">strawberry</option>'
...             '<option value="vanilla">vanilla</option>'
...         '</select>'
...         '<select name="cake" multiple="">'
...             '<option value="chocolate">chocolate</option>'
...             '<option value="ginger">ginger</option>'
...             '<option value="coffee">coffee</option>'
...         '</select>'
...     '</form>'
...     '</html>'
... ]).buffered()
>>> r = Agent(app).get('/')
>>> r('select[name="icecream"]').fill('strawberry')
<...>
>>> r('select[name="cake"]').fill(['chocolate', 'coffee'])
<...>
>>> r('form').submit_data()
[('icecream', 'strawberry'), ('cake', 'chocolate'), ('cake', 'coffee')]

The helper functions flea.html.first(), flea.html.last(), flea.html.by_index() and flea.html.random_choice() choose an option from a select box or radio/checkbox group when passed to fill():

>>> from flea import first, by_index
>>> agent = Agent(app).get('/')
>>> agent('form').fill(icecream=first, cake=by_index(1))
<...>
>>> agent('form').submit_data()
[('icecream', 'strawberry'), ('cake', 'ginger')]
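The signatures first(el) and by_index(n) in the API reference suggest that first is used directly as a chooser while by_index is a factory that returns one. That reading is an assumption. The sketch below models the idea on plain lists, with hypothetical names (pick_first and so on) to avoid implying that this is flea's implementation, which operates on HTML elements.

```python
import random


def pick_first(options):
    """Choose the first available option."""
    return options[0]


def pick_last(options):
    """Choose the last available option."""
    return options[-1]


def pick_by_index(n):
    """Return a chooser that selects the option at zero-based index n."""
    def choose(options):
        return options[n]
    return choose


def pick_random(options):
    """Choose any one of the available options."""
    return random.choice(options)


icecream = ["strawberry", "vanilla"]
cake = ["chocolate", "ginger", "coffee"]
chosen = {"icecream": pick_first(icecream), "cake": pick_by_index(1)(cake)}
print(chosen)  # {'icecream': 'strawberry', 'cake': 'ginger'}
```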

File uploads

To test file upload fields, you must pass a tuple of (filename, content-type, data) to fill(). The data part can either be a string:

>>> r = Agent(Response([
...         '<html>'
...         '<form name="upload" action="/" enctype="multipart/form-data">'
...                 '<input type="file" name="image"/>'
...         '</form>'
...         '</html>'
... ])).get('/')
>>> r("input[name=image]").fill(('icon.png', 'image/png', 'testdata'))
<...>

Or a file-like object:

>>> from io import StringIO
>>> r("input[name=image]").fill(('icon.png', 'image/png', StringIO('aaabbbccc')))
<...>
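When building the (filename, content-type, data) tuple, the content type can be derived from the filename with the standard library's mimetypes module. The upload_tuple helper below is hypothetical and not part of flea. It only prepares a value of the shape that fill() is described as accepting.

```python
import io
import mimetypes


def upload_tuple(filename, data):
    """Hypothetical helper: build a (filename, content-type, data) tuple."""
    content_type, _encoding = mimetypes.guess_type(filename)
    # Fall back to a generic type when the extension is not recognised
    return (filename, content_type or "application/octet-stream", data)


as_string = upload_tuple("icon.png", "testdata")
as_file = upload_tuple("icon.png", io.StringIO("aaabbbccc"))
print(as_string)  # ('icon.png', 'image/png', 'testdata')
```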

Filling forms in a single call

The fill() method, when called on a form element, is a useful shortcut to filling in an entire form with a single call. Keyword arguments are used to populate input controls by id or name:

>>> r = Agent(Response([
...     '<html>'
...         '<form name="login-form">'
...             '<input type="text" name="username"/>'
...             '<input type="text" name="password"/>'
...         '</form>'
...     '</html>'
... ]).buffered()).get('/')
>>> r = r["//form[@name='login-form']"].fill(
...     username='fred',
...     password='secret'
... ).submit()

XPath or CSS selector expressions may be used for fields whose names can’t be represented as Python identifiers, or when you need more control over exactly which fields are selected:

>>> r = r("form[name=login-form]").fill(('.//input[1]', 'fred'),
...                                     ('.//input[2]', 'secret')).submit()

HTTP redirects

HTTP redirect responses (301 or 302) are followed by default. To check for a redirect explicitly, pass follow=False when making the request. All methods that make a request (click, submit, get, post and so on) accept this parameter.

To follow a redirect manually:

>>> r = Agent(testapp).get('/')
>>> r = r("form[name=register]").submit(follow=False)
>>> r.request.path
'/register'
>>> r.response.status_code
302
>>> r = r.follow()
>>> r.request.path
'/'
>>> r.response.status_code
200

Querying WSGI application responses

Checking the content of the request

>>> print(r.environ)
{...}
>>> r.request.path_info
'/index.html'

request is a fresco.request.Request object, and all attributes of that class are available to examine.

Checking the content of the response

>>> r = Agent(Response('tomato')).get('/')
>>> assert r.content_type == 'text/html; charset=UTF-8'
>>> assert r.status == '200 OK'
>>> assert r.status_code == 200
>>> assert 'tomato' in r.body

You can also query the response directly, via response. This is a fresco.response.Response object, and all attributes of that class are available.

By default, responses are checked for a successful status code (2xx or 3xx), and an exception is raised for any other status code. If you want to bypass this checking, use the check_status argument:

>>> def myapp(environ, start_response):
...     start_response('500 Error', [('Content-Type', 'text/plain')])
...     return [b'Sorry, an error occurred']
...
>>> Agent(myapp).get('/')
Traceback (most recent call last):
...
BadStatusError: GET '/' returned HTTP status '500 Error'
>>> Agent(myapp).get('/', check_status=False)
<Agent '/'>
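The rule described above (2xx and 3xx count as success, anything else is an error) can be expressed as a small predicate. This is a sketch of the rule as stated in the text, not flea's actual code, and the function name status_is_success is hypothetical.

```python
def status_is_success(status):
    """Return True for 2xx and 3xx status lines such as '200 OK'."""
    code = int(status.split(None, 1)[0])
    return 200 <= code < 400


for status in ("200 OK", "302 Found", "404 Not Found", "500 Error"):
    print(status, status_is_success(status))
```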

Testing JSON APIs

Flea has a few methods to help write tests for JSON API endpoints:

>>> r = Agent(Response.json({'fruit': 'tomato', 'color': 'red'})).get('/')
>>> assert r.json['fruit'] == 'tomato'
>>> r.post_json('/fruits', {'fruit': 'aubergine', 'color': 'purple'})
<Agent ...>
>>> r.put_json('/fruits/tomato', {'fruit': 'tomato', 'color': 'green'})
<Agent ...>
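For context, a JSON endpoint of the kind that post_json() and put_json() would exercise can be written as a plain WSGI callable. The sketch below is a hypothetical application plus a stdlib-only helper that posts JSON to it directly. It illustrates the request and response shapes involved and does not use flea.

```python
import io
import json
from wsgiref.util import setup_testing_defaults


def fruits_app(environ, start_response):
    """Hypothetical JSON endpoint: echoes the decoded request payload."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    raw = environ["wsgi.input"].read(length)
    payload = json.loads(raw.decode("utf-8")) if raw else None
    body = json.dumps({"received": payload}).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(body)))])
    return [body]


def send_json(app, path, data):
    """POST JSON-encoded data straight to a WSGI app (no flea involved)."""
    raw = json.dumps(data).encode("utf-8")
    environ = {}
    setup_testing_defaults(environ)
    environ.update({
        "REQUEST_METHOD": "POST",
        "PATH_INFO": path,
        "CONTENT_TYPE": "application/json",
        "CONTENT_LENGTH": str(len(raw)),
        "wsgi.input": io.BytesIO(raw),
    })
    statuses = []

    def start_response(status, headers, exc_info=None):
        statuses.append(status)

    body = b"".join(app(environ, start_response))
    return statuses[0], json.loads(body.decode("utf-8"))


status, reply = send_json(fruits_app, "/fruits",
                          {"fruit": "aubergine", "color": "purple"})
print(status, reply["received"]["fruit"])  # 200 OK aubergine
```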

Checking returned content

The body property contains the response body returned by the server, decoded to a string:

>>> r = Agent(Response(["<html><p><strong>How now</strong> brown cow</p></html>"])).get('/')
>>> assert 'cow' in r.body

Any element selected via an xpath query has various helper methods useful for inspecting the document.

body is decoded according to the content type supplied in the response. For the raw response body, use body_bytes instead.

The striptags() method returns only the text node descendants of an HTML response. Whitespace is normalized (newlines, tabs and consecutive spaces are reduced to a single space character) in order to make comparisons more reliable in the face of formatting changes to HTML output.

>>> r = Agent(Response(["<html><p><strong>How now</strong> brown cow</p></html>"])).get('/')
>>> r.striptags()
'How now brown cow'

Checking if strings are present in an HTML element

>>> assert 'cow' in r('p')

Accessing the html of selected elements

>>> r('//p[1]').html()
'<p><strong>How now</strong> brown cow</p>'

Note that this is the html parsed and reconstructed by lxml, so is unlikely to be the literal HTML emitted by your application - use body for that.

Accessing textual content of selected elements

striptags() removes all HTML tags and normalizes whitespace to make string comparisons easier:

>>> r = Agent(Response([
... """
...     <html>
...         <p>
...             <strong>How now</strong>
...             brown
...             cow
...          </p>
...    </html>
... """])).get('/')
>>> r('//p[1]').striptags()
' How now brown cow '
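The whitespace normalization described above can be approximated with the standard library: collect the text nodes, then collapse each run of whitespace to a single space. This is a sketch of the technique, not flea's implementation (flea works on the lxml tree), and strip_tags is a hypothetical name.

```python
import re
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the text nodes of a document, ignoring all tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags(html):
    collector = TextCollector()
    collector.feed(html)
    collector.close()  # flush any buffered text
    # Newlines, tabs and consecutive spaces all become a single space
    return re.sub(r"\s+", " ", "".join(collector.chunks))


fragment = """<p>
            <strong>How now</strong>
            brown
            cow
         </p>"""
text = strip_tags(fragment)
print(repr(text))  # ' How now brown cow '
```

The leading and trailing spaces survive because the whitespace just inside the p element is reduced to one space rather than removed, which matches the output shown above.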

WSGI environ

Flea sets the key flea.testing in the WSGI environment so that WSGI applications can sense if they are in a test environment.

This app will say “testing, testing” when called by flea, otherwise it says “hello!”:

>>> def app(environ, start_response):
...     start_response('200 OK', [('Content-Type', 'text/plain')])
...     if environ.get('flea.testing'):
...          return [b'testing, testing']
...     return [b'hello!']
...
>>> Agent(app).get('/').body
'testing, testing'
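To see both branches of the application above, it can be called directly with and without the flea.testing key. The call helper is hypothetical, and the value True is an assumption: the text only says that flea sets the key, not what value it stores.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    if environ.get('flea.testing'):
        return [b'testing, testing']
    return [b'hello!']


def call(wsgi_app, extra_environ=None):
    """Call a WSGI app directly and return the response body as bytes."""
    environ = {}
    setup_testing_defaults(environ)
    environ.update(extra_environ or {})
    chunks = wsgi_app(environ, lambda status, headers, exc_info=None: None)
    return b"".join(chunks)


plain = call(app)                            # as a normal server would call it
flagged = call(app, {"flea.testing": True})  # assumed value, see above
print(plain, flagged)
```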

Inspecting and interacting with a web browser

Flea gives you two methods for viewing the application under test in a web browser.

The showbrowser() method opens a web browser and displays the content of the most recently loaded request:

>>> r.get('/').showbrowser()

The serve() method starts an HTTP server running your WSGI application and opens a web browser at the location corresponding to the most recent request. For example, the following code causes a web browser to open at http://localhost:8080/foobar:

>>> r.get('/foobar').serve()

If you want to change the default hostname and port for the webserver you must specify these when first initializing the Agent object:

>>> r = Agent(my_wsgi_app, host='192.168.1.1', port='8888')
>>> r.get('/foobar').serve()

Now the web browser would be opened at http://192.168.1.1:8888/foobar.

One final note: the first request to the application is handled by relaying the most recent response received to the web browser, including any cookies previously set by the application. Also, if any methods have been called that access the lxml representation of an HTML response – eg finding elements by an XPath query or filling form fields – then the lxml document in its current state will be converted to a string and served to the browser, meaning that while the document should be logically equivalent, it will no longer be a byte-for-byte copy of the response content received from the WSGI application.

This only applies to the first request, and ensures that the web browser receives a copy of the page as currently in memory, with any form fields filled in and with any cookies set so that you can pick up in your web browser exactly where the Agent object left off.

API documentation

class flea.agent.Agent(app, environ=None, response=None, cookies=None, history=None, validate_wsgi=True, host='localhost', port='8080', loglevel=None, logger=None, original_environ=None, environ_defaults=None, close_response=True)

An Agent object provides a user agent for the WSGI application under test.

Key methods and properties:

  • get(path), post(path), post_multipart - create get/post requests for the WSGI application and return a new Agent object

  • request, response - the Fresco request and response objects associated with the last WSGI request.

  • body - the response body as a string

  • body_bytes - the response body as a byte string

  • lxml - the lxml representation of the response body (only applicable for HTML responses)

  • reset() - reset the Agent object to its initial state, discarding any form field values

  • find() (or dictionary-style access) - evaluate the given xpath expression against the current response body and return a ResultWrapper object.

app

The original wsgi application

property body

The response body as a string

Return type

str

property body_bytes

The response body as a byte string

Return type

bytes

checkpoint(name)

Checkpoint the history at the current location. The current agent state can later be retrieved with agent.history[name].

click(linkspec, flavor='auto', ignorecase=True, index=0, follow=True, check_status=True, **kwargs)

Click the link matched by linkspec. See findlinks() for a description of the link-finding parameters.

Parameters
  • linkspec – specification of the link to be clicked

  • flavor – if css, linkspec must be a CSS selector returning one or more links; if xpath, linkspec must be an XPath expression returning one or more links; any other value will be passed to findlinks().

  • ignorecase – (see findlinks())

  • index – index of the link to click in the case of multiple matching links

property content_type

The response Content-Type header value

css(selector)

Return elements matching the given CSS selector (see lxml.cssselect for documentation on the CSSSelector class).

find(path, namespaces=None, **kwargs)

Return elements matching the given xpath expression.

If the xpath selects a list of elements a ResultWrapper object is returned.

If the xpath selects any other type (eg a string attribute value), the result of the query is returned directly.

For convenience, the EXSLT regular expression namespace (http://exslt.org/regular-expressions) is prebound to the prefix re.

findlinks() returns a ResultWrapper of links matched by linkspec.

Parameters
  • linkspec – specification of the links to be found

  • ignorecase – if True, the link search will be case insensitive

  • flavor – one of auto, text, contains, startswith, re

The flavor parameter is interpreted according to the following rules:

  • if auto: detect links based on the following criteria:

    • if linkspec is a regular expression or otherwise has a search method, this will be used to match links.

    • if linkspec is callable, each link will be tested against it in turn, and the first link that returns True will be selected.

    • otherwise contains matching will be used

  • if text: match links where the text of the link is linkspec

  • if contains: match links where the link text contains linkspec

  • if startswith: match links where the link text starts with linkspec

  • if re: match links where the text of the link matches linkspec

follow()

If response has a 30x status code, fetch (GET) the redirect target. No entry is recorded in the agent’s history list.

follow_all()

If response has a 30x status code, fetch (GET) the redirect target, until a non-redirect code is received. No entries are recorded in the agent’s history list.

html() → str

Return an HTML representation of the (html) response’s root element

property json

The response body decoded as a JSON object

property lxml

The response HTML decoded into an lxml tree

make_environ(REQUEST_METHOD='GET', PATH_INFO='', wsgi_input=b'', **kwargs)

Return a dictionary suitable for use as the WSGI environ.

PATH_INFO must be URL encoded. As a convenience it may also contain a query string portion which will be used as the QUERY_STRING WSGI variable.
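The convenience described above, where a query string is accepted as part of PATH_INFO, can be sketched as a split on the first '?'. The helper name split_path_info is hypothetical, and the percent-decoding of the path follows the usual WSGI convention rather than anything stated here about flea's internals.

```python
from urllib.parse import unquote


def split_path_info(path_info):
    """Split 'path?query' into WSGI-style PATH_INFO and QUERY_STRING."""
    path, _sep, query = path_info.partition("?")
    return {
        # Assumption: the path is percent-decoded, the query is left encoded
        "PATH_INFO": unquote(path),
        "QUERY_STRING": query,
    }


with_query = split_path_info("/search%20page?q=a+b")
without_query = split_path_info("/plain")
print(with_query)  # {'PATH_INFO': '/search page', 'QUERY_STRING': 'q=a+b'}
```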

new_session()

Return a new Agent with all cookies deleted. This gives an easy way to test session expiry.

post_json(path, data, ajax=False, *args, **kwargs)

POST JSON-encoded data to the application.

Parameters

ajax – if True, an ‘X-Requested-With: XMLHttpRequest’ header will be added

post_multipart(PATH_INFO='/', data=None, charset='UTF-8', files=None, *args, **kwargs)

POST a request to the given URI using multipart/form-data encoding.

Parameters
  • PATH_INFO – The path to request from the application. This must be a URL encoded string.

  • data – POST data to be sent to the application, must be either a dict or list of (name, value) tuples.

  • charset – Encoding used for string values.

  • files – list of (name, filename, content_type, data) tuples. data may be either a byte string, iterator or file-like object.

pretty() → str

Return a pretty-printed string representation of the (html) response body

Return type

str

put_json(path, data, ajax=False, *args, **kwargs)

PUT JSON-encoded data to the application.

Parameters

ajax – if True, an ‘X-Requested-With: XMLHttpRequest’ header will be added

reload(follow=True, check_status=True)

Reload the current page, if necessary re-posting any data.

Form fields that have been filled in on the loaded page will be refilled on the reloaded page, provided that the reloaded page has exactly the same fields present in the same order.

reset()

Reset the lxml document, abandoning any changes made

response_class

alias of fresco.response.Response

serve(open_in_browser=True)

Start an HTTP server for the application under test.

The host/port used for the HTTP server is determined by the host and port arguments to the Agent constructor.

The initial page rendered to the browser will be the currently loaded document (in its current state, so if changes have been made, eg form fields filled, these will be present in the HTML served to the browser). Any cookies the Agent has stored are also forwarded to the browser.

Subsequent requests from the browser are then proxied directly to the WSGI application under test.

showbrowser()

Open the current page in a web browser

start_response(status, headers, exc_info=None)

No-op implementation.

property status

The server response status, as a string (eg 200 OK)

property status_code

The server response status, as an integer (eg 404)

striptags() → str

Return the (html) response’s root element, with all tags stripped out, leaving only the textual content. Normalizes all sequences of whitespace to a single space.

Use this for simple text comparisons when testing for document content

xpath(path, namespaces=None, **kwargs)

Return elements matching the given xpath expression.

If the xpath selects a list of elements a ResultWrapper object is returned.

If the xpath selects any other type (eg a string attribute value), the result of the query is returned directly.

For convenience, the EXSLT regular expression namespace (http://exslt.org/regular-expressions) is prebound to the prefix re.

exception flea.exceptions.BadStatusError

Raised when a non-success HTTP response is found (eg 404 or 500)

exception flea.exceptions.NotARedirectError

Raised when an attempt is made to call follow() on a non-redirected response

flea.html.by_index(n)

Select the nth option from a select box or set of checkboxes/radio buttons

flea.html.first(el)

Select the first option from a select box or set of checkboxes/radio buttons

flea.html.last(el)

Select the last option from a select box or set of checkboxes/radio buttons

flea.html.random_choice(el)

Select a randomly chosen option from a select box or set of checkboxes/radio buttons

flea.util.base_url(environ)

Return the base URL for the request (ie everything up to SCRIPT_NAME; PATH_INFO and QUERY_STRING are not included)

flea.util.is_html(response)

Return True if the response content-type header indicates an (X)HTML content part.

flea.util.normalize_host(scheme, host)

Normalize the host part of the URL

flea.util.parse_cookies(response, samesite_pattern=re.compile('samesite\\s*=\\s*(?:lax|strict|none);?', re.IGNORECASE))

Return a Cookie.BaseCookie object populated from cookies parsed from the response object
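A rough idea of what parsing cookies into a BaseCookie-style object involves, using the standard library's http.cookies module and the samesite_pattern default from the signature above. Stripping the SameSite attribute before parsing is an inference from that default argument. The parse_set_cookie helper is hypothetical and takes a raw Set-Cookie header value rather than a response object.

```python
import re
from http.cookies import SimpleCookie

# Same regular expression as the samesite_pattern default shown above
samesite_pattern = re.compile(r"samesite\s*=\s*(?:lax|strict|none);?",
                              re.IGNORECASE)


def parse_set_cookie(header_value):
    """Parse one Set-Cookie header value into a SimpleCookie."""
    # Assumption: the SameSite attribute is removed before parsing
    cleaned = samesite_pattern.sub("", header_value)
    jar = SimpleCookie()
    jar.load(cleaned)
    return jar


jar = parse_set_cookie("session=abc123; Path=/; SameSite=Lax")
print(jar["session"].value, jar["session"]["path"])  # abc123 /
```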

flea.util.url_join_same_server(baseurl, url)

Join two urls which are on the same server. The resulting URI will have the protocol and netloc portions removed. If the resulting URI has a different protocol/netloc then a ValueError will be raised.

>>> from flea.util import url_join_same_server
>>> url_join_same_server('http://localhost/foo', 'bar')
'/bar'
>>> url_join_same_server('http://localhost/foo',
...                      'http://localhost/bar')
'/bar'
>>> url_join_same_server('http://localhost/rhubarb/custard/',
...                      '../')
'/rhubarb/'
>>> url_join_same_server('http://localhost/foo',
...                      'http://example.org/bar')
Traceback (most recent call last):
  ...
ValueError: URI links to another server: http://example.org/bar

flea.util.urlencode_wrapper(data, encoding)

Wrap stdlib urlencode to:

  • handle fresco style multidict arguments

  • encode unicode strings in the specified charset

Parameters
  • data – Data to urlencode as a string, dict or multidict.

  • encoding – String encoding to use

Returns

An encoded str object (a byte string under python 2)

class flea.wsgi.PassStateWSGIApp(testagent, initial_path)

A WSGI application that replays the Agent’s cookies and currently loaded response to the downstream UA on the first request, and thereafter proxies requests to the agent’s associated WSGI application.

Used by Agent.serve.