Add REGEXP/NOT REGEXP operators to API4
We have a few SearchKit use-cases that require regular expressions in
where, so it would be nice for API4 and SearchKit to support MySQL's
REGEXP operator. I have a working patch to add this feature here.
One detail I got stuck on is that for typical Civi deployments (i.e. using the default collation), MySQL's
REGEXP is case-insensitive, which is not what you would usually expect when dealing with regular expressions. In Search Builder, where we also also offer the
REGEXP operator, we've chosen to force case-sensitivity via
BINARY; I feel like while that might make sense for a user-facing component like Search Builder, it's not clear that this is the right default for a more low-level component like API4 - but you could make that argument for SearchKit, so it would be nice for API4 to at least support case-sensitivity when explicitly requested.
->addWhere('first_name', 'REGEXP', ['^A', 'c']) // case-sensitive ->addWhere('first_name', 'REGEXP', ['^A', 'i']) // case-insensitive // optionally, also support the simpler notation ->addWhere('first_name', 'REGEXP', '^A']) // use defaults (i.e. rely on collation)
The above could be added after the fact in a backwards-compatible way, so adding the operator in the simpler version from my patch (and following the collation's case-sensitivity by default) would also be an option.
REGEXP_LIKE()is only available in MySQL 8+, so initially we'd have to implement this using
BINARYand only support the
cvalue for the last operand. Once our minimum requirements are bumped to MySQL 8 (in a few decades
😬), we could switch the implementation over to
REGEXP_LIKE()and support other values.