
PHP Aspis uses character-level taint tracking, i.e. it tracks the taint of each string character individually. PHP Aspis can track multiple independent, user-provided taint categories. A taint category is a generic way of defining how an application is supposed to sanitise data and how PHP Aspis should enforce that the application always sanitises data before they are used. Each taint category is defined as a set of sanitisation functions and a set of guarded sinks. A sanitisation function is called by the application to transform untrusted user data so that they cannot be used for a particular type of injection attack. Guarded sinks are functions that protect data flow to sensitive sink functions. When a call to a sink function is made, PHP Aspis invokes the guard with references to the parameters passed to the sink function.
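As a rough illustration of the idea, the following sketch models one taint category for cross-site scripting. It is not PHP Aspis code: the names $xssCategory, sketchXssGuard and sketchGuardedPrint are hypothetical, and the layout of a protected value (the string at index 0, one boolean taint flag per byte at index 1) is an assumption made for the example. The guard receives the sink parameter by reference and escapes only the bytes that are still tainted.

```php
<?php
// Hypothetical sketch of one taint category (XSS). All names here are
// illustrative and are not PHP Aspis identifiers.
// Assumed layout of a protected value: [0] = string, [1] = one boolean
// per byte (true = tainted).
$xssCategory = [
    // Sanitisers are listed for completeness; they are not exercised here.
    'sanitisers' => ['htmlentities', 'htmlspecialchars'],
    // Each sink name maps to the guard that protects it.
    'sinks'      => ['print' => 'sketchXssGuard'],
];

// Guard: receives the sink argument by reference and escapes only the
// bytes that are still tainted, leaving trusted markup untouched.
function sketchXssGuard(array &$value): void
{
    $escape = ['&' => '&amp;', '<' => '&lt;', '>' => '&gt;',
               '"' => '&quot;', "'" => '&#039;'];
    $out = '';
    for ($i = 0; $i < strlen($value[0]); $i++) {
        $ch = $value[0][$i];
        $out .= $value[1][$i] ? ($escape[$ch] ?? $ch) : $ch;
    }
    // After the guard has run, the value is safe for this sink.
    $value = [$out, array_fill(0, strlen($out), false)];
}

// Stand-in for a guarded sink call: run the category's guard, then
// return the string that would have been printed.
function sketchGuardedPrint(array $category, array $value): string
{
    $guard = $category['sinks']['print'];
    $guard($value);
    return $value[0];
}

// Trusted "<b>" and "</b>" surround a user-supplied (tainted) "<i>".
$page = ['<b><i></b>', [false, false, false, true, true, true,
                        false, false, false, false]];
$html = sketchGuardedPrint($xssCategory, $page);
// $html is now '<b>&lt;i&gt;</b>'
```

Because taint is kept per character, the trusted markup written by the application passes through unchanged, while the user-supplied fragment is neutralised.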
In each transformed PHP script, PHP Aspis inserts initialisation code that scans the superglobal arrays to identify the HTTP request data, replaces all submitted values with their Aspis-enclosed counterparts and marks user-submitted values as fully tainted. As a result, all initial values are Aspis-protected in the transformed script. Then, all program statements and expressions are transformed to operate on Aspis-protected values, propagate their taint correctly and return Aspis-protected values. For example, the function AspisConcat() replaces all string concatenation operations (e.g. interpolation in double-quoted strings or the concatenation operator .) and returns an Aspis-protected result. Control statements are similarly transformed to access the enclosed original values directly.
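The two steps described above, wrapping request data and propagating taint through concatenation, can be sketched as follows. This is a simplified model and not the PHP Aspis implementation: sketchProtect and sketchConcat are hypothetical helpers (the latter only plays the role that AspisConcat() plays in transformed code), $request stands in for a superglobal such as $_GET, and the per-byte taint array at index 1 is an assumed layout.

```php
<?php
// Assumed layout of a protected value: [0] = string, [1] = per-byte
// taint flags. Both helpers below are hypothetical.

// Wrap a scalar (or every element of an array) as a protected value
// whose bytes are all tainted or all untainted.
function sketchProtect($value, bool $tainted)
{
    if (is_array($value)) {
        return array_map(fn($v) => sketchProtect($v, $tainted), $value);
    }
    $s = (string) $value;
    return [$s, array_fill(0, strlen($s), $tainted)];
}

// Concatenate two protected values: join the strings and join the
// taint flags in the same order, so each byte keeps its own taint.
function sketchConcat(array $a, array $b): array
{
    return [$a[0] . $b[0], array_merge($a[1], $b[1])];
}

// Stand-in for $_GET as it might look at the start of a request.
$request = ['name' => '<script>'];

// Initialisation step: every submitted value becomes fully tainted.
$get = sketchProtect($request, true);

// Transformed form of: $greeting = 'Hello ' . $_GET['name'];
$greeting = sketchConcat(sketchProtect('Hello ', false), $get['name']);
```

After the concatenation, the six bytes of the literal remain untainted and the eight bytes that came from the request remain tainted.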
Built-in PHP functions cannot operate on Aspis-protected values and do not propagate taint meta-data. PHP Aspis uses interceptor functions to intercept calls to them and attach wrappers for taint propagation. Dynamic PHP features such as variable variables and runtime code generation are also supported.
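To illustrate what such an interceptor might look like, the sketch below wraps the built-in substr(). The name sketchSubstr and the per-byte taint array at index 1 are assumptions for this example, not PHP Aspis identifiers. The wrapper calls the built-in on the plain string and applies the same slice to the taint flags; it assumes the offsets are within range.

```php
<?php
// Hypothetical interceptor for the built-in substr().
// Assumed layout of a protected value: [0] = string, [1] = per-byte
// taint flags. Offsets are assumed to be within range.
function sketchSubstr(array $s, int $start, int $length): array
{
    return [
        // The built-in only ever sees the plain string.
        substr($s[0], $start, $length),
        // The wrapper applies the same slice to the taint meta-data.
        array_slice($s[1], $start, $length),
    ];
}

// Bytes 'c' and 'd' are tainted; the rest are trusted.
$value = ['abcdef', [false, false, true, true, false, false]];
$part  = sketchSubstr($value, 1, 3);
// $part is ['bcd', [false, true, true]]
```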
The taint-tracking transformations generate taint tracking code that handles Aspis-protected values. For example, a tracking function that changes the case of a string parameter $p expects to find the actual string in $p[0]. Such a function can no longer be called directly from non-tracking code with a simple string for its parameter. Instead, PHP Aspis requires additional transformations to intercept this call and automatically convert $p to an Aspis-protected value, which is marked as fully untainted.
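The case-changing example above can be sketched like this. The names sketchTrackingUpper and sketchCallFromNonTracking are hypothetical, the taint array at index 1 is an assumed layout, and unwrapping the return value for the non-tracking caller is an assumption of this sketch rather than something stated above.

```php
<?php
// Tracking code: expects a protected value with the string in $p[0].
// Assumed layout: [1] holds per-byte taint flags, which are preserved
// because changing case does not change the byte length here.
function sketchTrackingUpper(array $p): array
{
    return [strtoupper($p[0]), $p[1]];
}

// Hypothetical adapter for calls that arrive from non-tracking code
// with a plain string: wrap the argument as fully untainted, call the
// tracking function, and hand back a plain string (an assumption).
function sketchCallFromNonTracking(callable $trackingFn, string $arg): string
{
    $protected = [$arg, array_fill(0, strlen($arg), false)];
    $result = $trackingFn($protected);
    return $result[0];
}

// Non-tracking code can keep passing and receiving ordinary strings.
$shout = sketchCallFromNonTracking('sketchTrackingUpper', 'hello');
// $shout is 'HELLO'
```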
Compatibility transformations make changes to both tracking and non-tracking code. These changes alter the data that are exchanged between a tracking context and a non-tracking context, i.e. data exchanged between functions, classes and code in the global scope. For example, PHP Aspis transforms all cross-context function calls: when tracking code calls a function in a non-tracking context, the taint is removed from the parameters before the call and the return value is Aspis-protected again afterwards. Similar support is provided for global variables, accesses to the superglobal arrays and for code generated at runtime.
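A cross-context call from tracking code into non-tracking code might be modelled as below. This is a sketch under stated assumptions: sketchLegacyGreet and sketchCallNonTracking are hypothetical names, the taint array at index 1 is an assumed layout, and marking the re-protected return value as untainted is a choice made for this example.

```php
<?php
// Non-tracking code: works on plain strings and knows nothing of taint.
function sketchLegacyGreet(string $name): string
{
    return 'Hi ' . $name;
}

// Hypothetical adapter for a call from tracking into non-tracking code.
// Assumed layout of a protected value: [0] = string, [1] = taint flags.
function sketchCallNonTracking(callable $fn, array ...$args): array
{
    // Strip the taint: pass only the enclosed plain values.
    $plain = array_map(fn(array $a) => $a[0], $args);
    $result = (string) $fn(...$plain);
    // Protect the result again; this sketch marks it untainted, since
    // the non-tracking code did not propagate any taint meta-data.
    return [$result, array_fill(0, strlen($result), false)];
}

$name   = ['bob', [true, true, true]];   // tainted user input
$answer = sketchCallNonTracking('sketchLegacyGreet', $name);
// $answer[0] is 'Hi bob'; its taint information has been lost
```

The example also shows the cost of crossing the boundary: the taint carried by the parameter does not survive the call.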
The USENIX WebApps paper, which contains more details on the design of PHP Aspis, can be found here.
The PHP Aspis source code is available on GitHub. You can try PHP Aspis and see how it performs with your PHP applications. Detailed instructions on how to use it are included with the source. Please don't forget to share your experiences and suggestions with us.
This work was supported by grants EP/F042469 and EP/F044216 ("SmartFlow: Extendable Event-Based Middleware") from the UK Engineering and Physical Sciences Research Council (EPSRC).