Search results may be wrong when option values contain special chars
Overview
When
- a custom field has one of the multi-value HTML types (Multi-Select, Checkbox), and
- one of the options for the field has a machine value containing "+" or another character that has special meaning in MySQL regular expressions,
using Advanced Search to search for that option may return incorrect results.
Reproduction steps
- Create a multi-value custom field and an option group where some of the options' values contain regex special characters. For example:
CRM.api3('OptionGroup', 'create', {
"name": "how_i_excel",
"title": "How I Excel",
"data_type": "String",
"is_active": 1,
"api.OptionValue.create": [{
"value": "fun drunk",
"label": "I'm Great at Beer Pong"
}, {
"value": "a+ student",
"label": "I Get Good Grades"
}, {
"value": "red[ish] hair",
"label": "I Have Beautiful Auburn Locks"
}]
}).then(function(result) {
console.log(result);
CRM.api3('CustomGroup', 'create', {
"title": "Stuff About Me",
"extends": "Contact",
"api.CustomField.create": {
"label": "Why I'm Awesome",
"data_type": "String",
"html_type": "CheckBox",
"option_group_id": "how_i_excel",
"is_searchable": 1,
"is_active": 1
}
}).then(function(result) {
console.log(result)
}, function(error) {
console.log(error)
});
}, function(error) {
console.log(error)
});
- Create or edit some contacts with the different option values you set up in the previous step.
- Use Advanced Search to search for the different option values.
Current behaviour
Searching for the option whose machine value is "a+ student" or "red[ish] hair" returns zero results, or the wrong results.
Expected behaviour
Searching for an option should return exactly the set of records whose custom value contains that option, regardless of what characters are used to store the option value.
Environment information
- CiviCRM: Master, but this issue may go back to 4.6.6 or thereabouts
- Database: MariaDB 10.4 (equivalent to MySQL 5.7)
Comments
The relevant bit of code is in CRM_Core_BAO_CustomQuery->where().
We have at least a couple options:
- Prohibit option values from containing regex special characters (plus sign +, asterisk *, square brackets [], curly braces {}, pipe | dot/period ., etc. Parentheses are already handled.) This seems infeasible to me, since option values can be created through the UI but may also be created through the API or other means. Also it doesn't handle option values that may have already been created.
- Escape the special characters during search. Although there is no specific function in PHP or our codebase for escaping MySQL regular expressions, it would not be too much trouble to create.
Question: what would be an appropriate test for this?