We need to move away from the usage of mcrypt code to a more modern way of handling crypto within the civicrm code base. Also there have been desire to have more fields other than just the SMTP password field encrypted within the database e.g. Payment Processor Passwords.
Problems:
Not easily able to determine what Crypto method has been used (Mcrypt or Base64 encoded only) to save the SMTP password.
Not easily to assess which extensions have had fields encrypted (e.g. know that Sparkpost has encrypted their API key but other than that unsure)
Relies on CIVICRM_SITE_KEY (which is maybe over-using that value).
Linked to Item 2, is there is no way for the API/DAO/Extensions to know what fields have been encrypted so to decrypt code appropriately.
Potential ideas:
Create a setting which stores the encryption class and use a factory / hooks to allow for extensions to customise which crypto library to use within CiviCRM
Have a hook / use some level of meta data to determine if a field is to be encrypted
Discussion Questions:
Should we completely break CRM_Utils_Crypt class as we re-work the encryption within the app
How should we handle switching between Crypto libraries?
Should extensions be able to use their own Crypto Library whilst leaving rest of the App to use a system specific Crypto?
Do we want to allow for civicrm.settings.php defines / environment variable overrides for encrypted fields / data points?
Should we completely break CRM_Utils_Crypt class as we re-work the encryption within the app
Grepping universe, there are 4 public extensions that rely on it. I think it makes sense to deprecate it - ie it can continue to exist and to behave the same way, but some other service/utility will
do the main work going forward.
Do we want to allow for civicrm.settings.php defines / environment variable overrides for encrypted fields / data points?
I'm a fan of config files. But, to follow this through, if one does this, then there's a follow-up choice -- one of these:
(a) Leave the UI as-is. The user experiences quirky behavior (e.g. they enter a new credential, but it's never activated).
(b) Update the web UI to tell the user when an override is in effect. Prevent them from editing the credential. This requires some way to programmatically inspect for overrides.
(It's similar to the issue in Civi::settings(), where it lets you inspect the default vs explicit vs mandatory values.)
(c) Remove the UI and make the config-file the only way to store credentials.
If the override functionality is supported by core, then I think we'd have to go (b). But if it's via extension, then I guess (a) or (c) are acceptable?
How should we handle switching between Crypto libraries?
Should extensions be able to use their own Crypto Library whilst leaving rest of the App to use a system specific Crypto?
IMHO, the crypto library issues (eg "mcrypt was deprecated, so everyone's gotta use something else"; eg "our server has mcrypt but not libsodium" or vice-versa) are a symptom but not the real problem. It would be much easier to manage crypto changes (libraries, hooks, extensions, etc) if we first deal with another lurking problem.
The lurking problem is the data-model for key-management - every encryption/decryption operation uses a singular key/cipher/blockmode. There's a certain "KISS" allure to storing all encrypted values with one key, but I think of that as the "80% solution" in the 80/20 split - so now we're dealing with 20% issues, eg
When you change the PHP/web-server, you get access to different PECL/crypt functionality. The configuration will have to go through some adaptation/transition during which old crypto and new crypto coexist.
Similarly, if you have extensions which manipulate crypto, then the system will again transition from an old to a new.
Many practitioners believe it's best-practice to rotate keys periodically. Of course, this again means that you multiple keys.
Granted, for credentials stored in Civi, at any given moment, the odds are that you only have one active key. (That's why 80% solution is 80%...) But at some moments you have 2 keys. And, over time, the system may go through a dozen (and your backups/snapshots will reflect this evolution). You could potentially try to be clever and design just for the transitional moment where there's 2
keys... but wouldn't that be short sighted? N-key support allows more use-cases. In the big picture, nearly all crypto-enabled apps support multiple keys. (In PGP/GPG, you have a key database, and each key has an ID. In SSH, you can choose different key files (ssh -i ...). In Apache/Nginx, you can choose a different key for each vhost. The same goes for most GUIs that work with these
systems - eg Jenkins/Github/Gitlab UIs and browser certificate UIs all let you register multiple keys/credentials.)
To work with multiple encryption keys, you need some way to say, "Item #123 is encrypted using key #456 and cihper xyz". For a field like civicrm_mail_settings.password (IMAP credentials), you might store the crypto params alongside password. Either:
Two columns: Ex: In civicrm_mail_settings, have the columns password (ciphertext; the encrypted password) and key_id (identifier for the crypto-system).
Serialized column (encrypted token): Ex: In civicrm_mail_settings, the password stores a serialized record which describes how the password is encrypted. For example, the token #456,xxxxxxxx could mean that it used key #456 to produce ciphertext xxxxxxxx. And plain,xxxxxxxx would mean that it's stored in plain-text without encryption.
Given the current shape of things, I expect the serialized column (encrypted token) is easier to layer-in.
If you want to enable different crypto parameters (e.g. transitioning from Civi's RJ256-ECB rendition on mcrypt to, say, AES256-CBC in any other library), then you just need a different flag on the token. (Ex: rj256-ecb-sitekey vs aes256-cbc-credkey). Similarly, if you have an extension which wants to replace the cryptosystem, it can define its own flag (my-crypto-{$keyid}).
The follow-up question will be: what, exactly, should the token and key-id/cryptosystem-id look like? (e.g. it might use a delimited list, serialize(), JSON, or
JWT/JWE.)
Linked to Item 2, is there is no way for the API/DAO/Extensions to know what fields have been encrypted so to decrypt code appropriately.
I would vote to add a simple boolean field to the metadata, e.g. MailSettings.username has is_encryptable=>FALSE and MailSettings.password has is_encryptable=>TRUE.
A related issue: when reading/writing data via API, how do you differentiate between values in their plaintext vs ciphertext form? How does that affect the API DX? Shooting from the hip, here's a possible ruleset that would apply to encryptable fields in APIv4:
For "write" operations (APIv4 create/update/save), you should be able to pass-in plaintext - and have it converted automatically to ciphertext.
For "write" operations (APIv4 create/update/save), you should be able to pass-in ciphertext - and have it stored literally.
For "read" operations (APIv4 get), you can read the ciphertext.
For "read" operations (APIv4 get), you can set an option (decrypt=>TRUE). If your user has permission to use the relevant key, then you get back the decrypted value.
For "query" operations, you cannot filter or order on encryptable columns.
Just to note - in previous weeks, Seamus and I used the security channel for some more discussions and sketching about this -- particularly with respect to the internal API for working with multiple keys and ciphers. (I'll be posting a PR soon-ish.) But I wanted to add some more notes.
For backward-compatibility with existing versions of extensions, CRM_Utils_Crypt per se should be left alone and deprecated. But we still need a transition path. (From core POV, that's specifically for migrating the SMTP password.) We considered the idea of adding support for a crypto-system like CRM_Utils_Crypt (i.e. rj256-ecb using CIVICRM_SITE_KEY and mcrypt). This doesn't really make sense:
The on-disk data outputted by CRM_Utils_Crypt is too ambiguous, and we need the upgrader to change it to something clearer. Even if we wanted to use rj256-ecb+CIVICRM_SITE_KEY... it wouldn't be a drop-in update.
rj256-ecb+CIVICRM_SITE_KEY is a bad crypto-system. Supporting it for new code is going to invite confusion.
I think we're more likely to see this:
The upgrader removes ambiguous encoding of SMTP password.
For sites with mcrypt PECL, that means decrypting.
For sites without mcrypt PECL, it means no change.
If you want encrypted fields generally, then it needs some post-upgrade configuration:
Add new key in civicrm.settings.php
Call an API to rekey data (eg cv api4 System.rekey)
Can someone explain the threat model here? What is to be protected from what threat?
e.g. taking SMTP credentials as an example. What are we protecting those from?
public/anon HTTP (obvs)
logged in unprivileged accounts (CMS permissions)
ever being accessed via HTTP; i.e. admin users can provide a secret but never get it back using the web UI?
someone who has access to the database via SQL (either SQL commands or unencrypted backup.sql)?
someone who has access to the database via filesystem?
someone who has read only access to the webroot? (and can therefore access any .env or config.settings files)
someone who can run PHP (and can therefore access environment variables as well as the above)
someone who has read write access to the webroot? (and therefore could rewrite forms to intercept credentials, or otherwise obtain the decrypted values/private key)
datacentre (or scrap merchant): vm's raw disk can be read
datacentre or anyone with root access: vm's RAM can be read
and who needs to access them, and how would they be able to access decrypted values?
mailer system run by cron (CiviMail etc.) - so is private key stored on disk? Or just in protected memory somehow?
mailer system run triggered by HTTP request ("Send an email") - ditto
I think we're talking about just storing the encrypted value in the database on disk (which is usually the same disk that stores the private keys), so I think we're only talking about 2-3 of the risks above, but I'd need to know more to be able to input usefully.
FWIW, when I first looked at this topic (CRM_Utils_Crypt, the SMTP password encryption, and suggestions to encrypt database content) a few years ago, I had a skeptical reaction. (The phrases "security theater" and "rip it out" may have been typed on my keyboard...) But I came around after thinking more about "defense in depth" and after looking at some of the other content that merits protection.
I think we're talking about just storing the encrypted value in the database
Yeah.
Can someone explain the threat model here? What is to be protected from what threat?
e.g. taking SMTP credentials as an example. What are we protecting those from?
Let's do one threat more verbosely and then consider others in shorter form.
If you look at the history of Civi security issues, you'll find several SQL injections (SQLI)... and there will probably be more in the future. It's quite rare for SQLI in Civi to allow modifying data (INSERT/DELETE)... it's more typical to allow reading data (JOIN/UNION/sub-SELECT). Of course, it's bad if SQLI allows an attacker to read anything and compromise the confidentiality of your email-list. But it's worse if the attacker can also read a bunch of credentials (SMTP, IMAP, Paypal/Auth.net, Civi API, etc) -- because now it escalates to a multifront integrity problem.
Building on the example of SMTP password and your list, I think cases could be classified in a few buckets:
No added protection: Even if the SMTP credential is encrypted, it will not be protected against this kind of attack/vulnerability.
(N1) The software suffers from arbitrary code execution.
(N2) The system RAM is compromised.
Yes protection: If the SMTP credential is encrypted, then it will remain secure in spite of this attack/vulnerability.
(Y1) The software suffers from a SQL injection.
(Y2) The SQL backup file is exposed.
(Y3) A staffer with access to reporting UI or admin UI becomes disgruntled.
Contingent protection: It can be protected in these cases.. if you take additional steps.
(C1) The files in the webroot are web-readable.
(C2) The system backup is exposed.
(C3) An old disk (or a cloud/VM disk-management service) is exposed.
The contingent cases arise because the admin has some influence over (a) where exactly you store the key and (b) which exact files are exposed. For the default case where the credential-key is stored directly in civicrm.settings.php, you're not protected. But there are several plays to change this:
Create a folder (eg /etc/keys) which is excluded from regular backups. In civicrm.settings.php, set define('CIVICRM_CRED_KEY', file_get_contents('/etc/keys/civicrm_cred_key')). This protects C1+C2 and is fairly portable/generalizable.
Use a cloud VM. During startup, the VM copies CIVICRM_CRED_KEY from a credential-service to a temp file in a ramdisk (/tmp/civicrm.key) or to an environment variable. This protects C1+C2+C3 but it's less portable/generalizable.
(More development) Write a connector for a hardware security module. (Unfortunately, PHP doesn't currently have native bindings for this. owncloud has a bridge which evidently works for their use-case. I'd be curious about cumulative affect on execution-time for Civi use-cases.) This is the strongest protection, and it covers C1+C2+C3, but it's more expensive and has more technical risk.
Overall, I think one should look at this style of database encryption as a matter of "defense in depth". It's not some magic spice that makes all vulnerabilities go away, and it's not effective against Arbitrary Code Execution. However, I do think the cases in "Yes protected" are real/useful for both sophisticated+non-sophisticated admins (and, of course, really paranoid folks can go after C1/C2/C3).
Also I think there is an issue with the fact that the current implementation uses a unique system that isn't easily transferrable and also uses functions that are deprecated in php. I agree with Tim's comments above. I tend to think that the SMTP cred isn't the most important but there are others we may want to protect that are more important.
Looks like an encrypted value is attempted to be written to a text field without first being text encoded.
UPDATE civicrm_setting SET value = .... WHERE id = 28 [nativecode=1366 ** Incorrect string value: '\xBAB\xC0i\x1C\x0D...' for column civicrm.civicrm_setting.value at row 1]
Thank you for raising an issue to help improve CiviCRM. As you may know, this issue has not had any activity for quite some time, so we have closed it.
We would like you to help us to determine if this issue should be re-opened:
If this issue was reporting a bug, can you attempt to reproduce it on a latest version of CiviCRM?
If this issue was proposing a new feature, can you verify if the feature proposal is still relevant? Did it get the concept-approved label? Have other people also shown interest? Could it be implemented as an extension?
If the answer to either question is yes, please feel free to comment or re-open the issue. Please also consider:
Is it something that you could help implement, either by sending a patch or hiring someone who can?
Thank you for your help and contributions to CiviCRM.
P.S. This is an automated message, see infra/gitlab issue 20. We understand that automatic responses are annoying, but given the number of open issues as the project evolves, we need a bit of help to triage and prioritise the most relevant issues.