Re-Thinking our Crypto implementation

The other thing is that we we may need to handle fields and settings separately perhaps? Maybe we just focus on settings for the moment?

Should we completely break CRM_Utils_Crypt class as we re-work the encryption within the app

Grepping universe, there are 4 public extensions that rely on it. I think it makes sense to deprecate it - ie it can continue to exist and to behave the same way, but some other service/utility will do the main work going forward.

Do we want to allow for civicrm.settings.php defines / environment variable overrides for encrypted fields / data points?

I'm a fan of config files. But, to follow this through, if one does this, then there's a follow-up choice -- one of these:

(a) Leave the UI as-is. The user experiences quirky behavior (e.g. they enter a new credential, but it's never activated).
(b) Update the web UI to tell the user when an override is in effect. Prevent them from editing the credential. This requires some way to programmatically inspect for overrides. (It's similar to the issue in Civi::settings(), where it lets you inspect the default vs explicit vs mandatory values.)
(c) Remove the UI and make the config-file the only way to store credentials.

If the override functionality is supported by core, then I think we'd have to go (b). But if it's via extension, then I guess (a) or (c) are acceptable?

How should we handle switching between Crypto libraries?

Should extensions be able to use their own Crypto Library whilst leaving rest of the App to use a system specific Crypto?

IMHO, the crypto library issues (eg "mcrypt was deprecated, so everyone's gotta use something else"; eg "our server has mcrypt but not libsodium" or vice-versa) are a symptom but not the real problem. It would be much easier to manage crypto changes (libraries, hooks, extensions, etc) if we first deal with another lurking problem.

The lurking problem is the data-model for key-management - every encryption/decryption operation uses a singular key/cipher/blockmode. There's a certain "KISS" allure to storing all encrypted values with one key, but I think of that as the "80% solution" in the 80/20 split - so now we're dealing with 20% issues, eg

When you change the PHP/web-server, you get access to different PECL/crypt functionality. The configuration will have to go through some adaptation/transition during which old crypto and new crypto coexist.
Similarly, if you have extensions which manipulate crypto, then the system will again transition from an old to a new.
Many practitioners believe it's best-practice to rotate keys periodically. Of course, this again means that you multiple keys.

Granted, for credentials stored in Civi, at any given moment, the odds are that you only have one active key. (That's why 80% solution is 80%...) But at some moments you have 2 keys. And, over time, the system may go through a dozen (and your backups/snapshots will reflect this evolution). You could potentially try to be clever and design just for the transitional moment where there's 2 keys... but wouldn't that be short sighted? N-key support allows more use-cases. In the big picture, nearly all crypto-enabled apps support multiple keys. (In PGP/GPG, you have a key database, and each key has an ID. In SSH, you can choose different key files (ssh -i ...). In Apache/Nginx, you can choose a different key for each vhost. The same goes for most GUIs that work with these systems - eg Jenkins/Github/Gitlab UIs and browser certificate UIs all let you register multiple keys/credentials.)

To work with multiple encryption keys, you need some way to say, "Item #123 is encrypted using key #456 and cihper xyz". For a field like civicrm_mail_settings.password (IMAP credentials), you might store the crypto params alongside password. Either:

Two columns: Ex: In civicrm_mail_settings, have the columns password (ciphertext; the encrypted password) and key_id (identifier for the crypto-system).
Serialized column (encrypted token): Ex: In civicrm_mail_settings, the password stores a serialized record which describes how the password is encrypted. For example, the token #456,xxxxxxxx could mean that it used key #456 to produce ciphertext xxxxxxxx. And plain,xxxxxxxx would mean that it's stored in plain-text without encryption.

Given the current shape of things, I expect the serialized column (encrypted token) is easier to layer-in.

If you want to enable different crypto parameters (e.g. transitioning from Civi's RJ256-ECB rendition on mcrypt to, say, AES256-CBC in any other library), then you just need a different flag on the token. (Ex: rj256-ecb-sitekey vs aes256-cbc-credkey). Similarly, if you have an extension which wants to replace the cryptosystem, it can define its own flag (my-crypto-{$keyid}).

The follow-up question will be: what, exactly, should the token and key-id/cryptosystem-id look like? (e.g. it might use a delimited list, serialize(), JSON, or JWT/JWE.)

Linked to Item 2, is there is no way for the API/DAO/Extensions to know what fields have been encrypted so to decrypt code appropriately.

I would vote to add a simple boolean field to the metadata, e.g. MailSettings.username has is_encryptable=>FALSE and MailSettings.password has is_encryptable=>TRUE.

A related issue: when reading/writing data via API, how do you differentiate between values in their plaintext vs ciphertext form? How does that affect the API DX? Shooting from the hip, here's a possible ruleset that would apply to encryptable fields in APIv4:

For "write" operations (APIv4 create/update/save), you should be able to pass-in plaintext - and have it converted automatically to ciphertext.
For "write" operations (APIv4 create/update/save), you should be able to pass-in ciphertext - and have it stored literally.
For "read" operations (APIv4 get), you can read the ciphertext.
For "read" operations (APIv4 get), you can set an option (decrypt=>TRUE). If your user has permission to use the relevant key, then you get back the decrypted value.
For "query" operations, you cannot filter or order on encryptable columns.

changed the description

added Doing git:civicrm-core sig:security-improvement type:proposal labels

added sig:compatibility sig:maintainability labels

Moved from security/core to dev/core because:

There's no specific, exploitable vulnerability.
This is more a matter of defense-in-depth and maintainability/compatibility.
Patches for this issue are likely to touch several areas, so it's best to target a bigger release rather than a point-release.

Just to note - in previous weeks, Seamus and I used the security channel for some more discussions and sketching about this -- particularly with respect to the internal API for working with multiple keys and ciphers. (I'll be posting a PR soon-ish.) But I wanted to add some more notes.

For backward-compatibility with existing versions of extensions, CRM_Utils_Crypt per se should be left alone and deprecated. But we still need a transition path. (From core POV, that's specifically for migrating the SMTP password.) We considered the idea of adding support for a crypto-system like CRM_Utils_Crypt (i.e. rj256-ecb using CIVICRM_SITE_KEY and mcrypt). This doesn't really make sense:

The on-disk data outputted by CRM_Utils_Crypt is too ambiguous, and we need the upgrader to change it to something clearer. Even if we wanted to use rj256-ecb+CIVICRM_SITE_KEY... it wouldn't be a drop-in update.
rj256-ecb+CIVICRM_SITE_KEY is a bad crypto-system. Supporting it for new code is going to invite confusion.

I think we're more likely to see this:

The upgrader removes ambiguous encoding of SMTP password.
- For sites with mcrypt PECL, that means decrypting.
- For sites without mcrypt PECL, it means no change.
If you want encrypted fields generally, then it needs some post-upgrade configuration:
- Add new key in civicrm.settings.php
- Call an API to rekey data (eg cv api4 System.rekey)

Can someone explain the threat model here? What is to be protected from what threat?

e.g. taking SMTP credentials as an example. What are we protecting those from?

public/anon HTTP (obvs)
logged in unprivileged accounts (CMS permissions)
ever being accessed via HTTP; i.e. admin users can provide a secret but never get it back using the web UI?
someone who has access to the database via SQL (either SQL commands or unencrypted backup.sql)?
someone who has access to the database via filesystem?
someone who has read only access to the webroot? (and can therefore access any .env or config.settings files)
someone who can run PHP (and can therefore access environment variables as well as the above)
someone who has read write access to the webroot? (and therefore could rewrite forms to intercept credentials, or otherwise obtain the decrypted values/private key)
datacentre (or scrap merchant): vm's raw disk can be read
datacentre or anyone with root access: vm's RAM can be read

and who needs to access them, and how would they be able to access decrypted values?

mailer system run by cron (CiviMail etc.) - so is private key stored on disk? Or just in protected memory somehow?
mailer system run triggered by HTTP request ("Send an email") - ditto

I think we're talking about just storing the encrypted value in the database on disk (which is usually the same disk that stores the private keys), so I think we're only talking about 2-3 of the risks above, but I'd need to know more to be able to input usefully.

</impostersyndrome>

Excellent comment.

FWIW, when I first looked at this topic (CRM_Utils_Crypt, the SMTP password encryption, and suggestions to encrypt database content) a few years ago, I had a skeptical reaction. (The phrases "security theater" and "rip it out" may have been typed on my keyboard...) But I came around after thinking more about "defense in depth" and after looking at some of the other content that merits protection.

I think we're talking about just storing the encrypted value in the database

Yeah.

Can someone explain the threat model here? What is to be protected from what threat?

e.g. taking SMTP credentials as an example. What are we protecting those from?

Let's do one threat more verbosely and then consider others in shorter form.

If you look at the history of Civi security issues, you'll find several SQL injections (SQLI)... and there will probably be more in the future. It's quite rare for SQLI in Civi to allow modifying data (INSERT/DELETE)... it's more typical to allow reading data (JOIN/UNION/sub-SELECT). Of course, it's bad if SQLI allows an attacker to read anything and compromise the confidentiality of your email-list. But it's worse if the attacker can also read a bunch of credentials (SMTP, IMAP, Paypal/Auth.net, Civi API, etc) -- because now it escalates to a multifront integrity problem.

Building on the example of SMTP password and your list, I think cases could be classified in a few buckets:

No added protection: Even if the SMTP credential is encrypted, it will not be protected against this kind of attack/vulnerability.
- (N1) The software suffers from arbitrary code execution.
- (N2) The system RAM is compromised.
Yes protection: If the SMTP credential is encrypted, then it will remain secure in spite of this attack/vulnerability.
- (Y1) The software suffers from a SQL injection.
- (Y2) The SQL backup file is exposed.
- (Y3) A staffer with access to reporting UI or admin UI becomes disgruntled.
Contingent protection: It can be protected in these cases.. if you take additional steps.
- (C1) The files in the webroot are web-readable.
- (C2) The system backup is exposed.
- (C3) An old disk (or a cloud/VM disk-management service) is exposed.

The contingent cases arise because the admin has some influence over (a) where exactly you store the key and (b) which exact files are exposed. For the default case where the credential-key is stored directly in civicrm.settings.php, you're not protected. But there are several plays to change this:

Create a folder (eg /etc/keys) which is excluded from regular backups. In civicrm.settings.php, set define('CIVICRM_CRED_KEY', file_get_contents('/etc/keys/civicrm_cred_key')). This protects C1+C2 and is fairly portable/generalizable.
Use a cloud VM. During startup, the VM copies CIVICRM_CRED_KEY from a credential-service to a temp file in a ramdisk (/tmp/civicrm.key) or to an environment variable. This protects C1+C2+C3 but it's less portable/generalizable.
(More development) Write a connector for a hardware security module. (Unfortunately, PHP doesn't currently have native bindings for this. owncloud has a bridge which evidently works for their use-case. I'd be curious about cumulative affect on execution-time for Civi use-cases.) This is the strongest protection, and it covers C1+C2+C3, but it's more expensive and has more technical risk.

Overall, I think one should look at this style of database encryption as a matter of "defense in depth". It's not some magic spice that makes all vulnerabilities go away, and it's not effective against Arbitrary Code Execution. However, I do think the cases in "Yes protected" are real/useful for both sophisticated+non-sophisticated admins (and, of course, really paranoid folks can go after C1/C2/C3).

Also I think there is an issue with the fact that the current implementation uses a unique system that isn't easily transferrable and also uses functions that are deprecated in php. I agree with Tim's comments above. I tend to think that the SMTP cred isn't the most important but there are others we may want to protect that are more important.

added triaged label

A few open PRs:

19236 - Add crypto.registry and crypto.token services for defining keys and cipher-suites
19251 - Add System.rotateKey API and hook_civicrm_cryptoRotateKeys
19239 - Migrate SMTP password from CRM_Utils_Crypt to crypto.token

mentioned in commit 281eacd8

mentioned in commit 8d3452c4

mentioned in commit 5b394c8f

mentioned in commit 27fe36e7

mentioned in commit 8d54915f

mentioned in commit d463c543

mentioned in commit f702c7f5

mentioned in commit 7c5110c3

mentioned in commit 23fa0118

mentioned in commit 3fef2e21

mentioned in commit 1c492419

mentioned in commit fe4d2bdc

mentioned in commit b124f74b

mentioned in commit 05ec6dbd

mentioned in commit f1aeb0ba

mentioned in commit 8daf06d6

mentioned in commit 03bfb187

mentioned in commit ebbf2a61

mentioned in commit aa7a6575

mentioned in commit 2ce5684a

mentioned in commit 71110177

mentioned in commit 519eb0b8

mentioned in commit wordpress@21f7ec25

mentioned in commit wordpress@8eb75a31

The PRs to generate CIVICRM_CRED_KEYS during installation were/are:

added epic label

mentioned in commit fce178b6

mentioned in commit dbc5c4fc

mentioned in commit 29d6ba90

mentioned in commit 2b0517d2

mentioned in commit 94d3a8b6

mentioned in commit bde2689e

mentioned in commit 9cb1e984

mentioned in commit b5029578

mentioned in commit 23f9d2df

mentioned in commit fd797f4f

mentioned in commit 87facec5

mentioned in commit 6ce2d8b9

mentioned in commit ad217a5a

mentioned in commit 473bf71a

mentioned in commit 28102c55

mentioned in issue mail#59 (moved)

First site upgraded to 5.34.

migrateSmtpPasswords function caused upgrade errors.

[nativecode=1366 ** Incorrect string value: '\xBAB\xC0i\x1C\x0D...'

Looks like an encrypted value is attempted to be written to a text field without first being text encoded.

UPDATE civicrm_setting SET value = .... WHERE id = 28 [nativecode=1366 ** Incorrect string value: '\xBAB\xC0i\x1C\x0D...' for column civicrm.civicrm_setting.value at row 1]

Seems to be a encoding issue.

Also posted in MM, https://chat.civicrm.org/civicrm/channels/dev/hjgxyu1kxtr9zkizc1uz8mskmc

Agileware Ref: CIVICRM-1673

migrateSmtpPasswords function caused upgrade errors.

Fixed by https://github.com/civicrm/civicrm-core/pull/19711 and in the 5.34.alpha1 upgrade script

mentioned in issue #2470 (closed)

Cross-referencing, I added an issue to add a system check for the presence of CIVICRM_CRED_KEYS in #2470 (closed).

Hi,

Thank you for raising an issue to help improve CiviCRM. As you may know, this issue has not had any activity for quite some time, so we have closed it.

We would like you to help us to determine if this issue should be re-opened:

If this issue was reporting a bug, can you attempt to reproduce it on a latest version of CiviCRM?
If this issue was proposing a new feature, can you verify if the feature proposal is still relevant? Did it get the concept-approved label? Have other people also shown interest? Could it be implemented as an extension?

If the answer to either question is yes, please feel free to comment or re-open the issue. Please also consider:

Is it something that you could help implement, either by sending a patch or hiring someone who can?
Would you be willing to fund the work, either through a CiviCRM partner, or the Core Team?

Thank you for your help and contributions to CiviCRM.

P.S. This is an automated message, see infra/gitlab issue 20. We understand that automatic responses are annoying, but given the number of open issues as the project evolves, we need a bit of help to triage and prioritise the most relevant issues.

closed

Re-Thinking our Crypto implementation

Child items ...

Activity