Boot API/protocol for third-party modules should be upgrade-safe
Overview
This is a generalized response to #1846 (closed). #1846 is a specific example (re: 5.27 and WooCommerce) of a class of bug which has arisen with different versions and different modules over the years.
Example use-case
1. Install CiviCRM version $x
2. Install a CMS plugin or module which uses CiviCRM (In the above report, it was WooCommerce. In other reports, it's been Drupal Views. But it could be others.)
3. Configure the plugin or module in a way that is pervasive on the system. (Ex: It adds a block in the left-hand bar on every page, or it performs some logic whenever a user logs in, or it adds records to global menu/cache/data-structure.)
4. Download the new code for CiviCRM version $x+1
5. Open the CMS/Civi web UI
6. Navigate to Civi's DB upgrade screen
The key issue for this use-case comes in at step 5 (opening the CMS/Civi web UI). The plugin or module is active, and it is pervasive, and it needs to use Civi APIs. It probably calls civicrm_initialize() and then uses some service or data from Civi. But we're in the middle of an upgrade process: the new code is in place while the DB schema is still out-of-date. Even if you call civicrm_initialize(), one cannot guarantee that the system is actually working.
Current approach
The behavior depends on the details of the versions ($x vs $x+1) and of the customization. In many cases, things work fine - there's no pervasive customization, or the old+new DB schemas are "close enough".
But sometimes there are problems. This would be reported as an upgrade bug. ("I downloaded the latest version, and I can't run the upgrade, and things are crashy.") It cannot be reproduced in a clean environment - you have to install a customization and configure it suitably. Historically, the process has been like:
- Attempt the upgrade on a real/complex site
- Discover the problem
- Either:
  - Identify and disable the customization; re-attempt the upgrade; re-enable the customization.
  - Report the issue; patch civicrm-core with a narrow work-around; and issue a point-release.
Either way, this is a reactive posture that leaves us systemically vulnerable to this class of issue and creates a poor impression. The key drivers (e.g. the list of third-party plugins; the list of upgrades; etc) are all open-ended. There is no simple, realistic, long-term verification protocol.
Analysis
For any module/plugin which needs to bootstrap Civi (e.g. civicrm_initialize()), there should be an explicit check/indication of whether Civi is in a working state.
As an example, suppose we have a Drupal module:
// Show a block with a horoscope based on the logged-in user's date of birth
function horoscope_block_view($delta) {
  if ($delta === 1) {
    // Boot Civi. This caller assumes that Civi is fully usable afterwards.
    civicrm_initialize();
    $dob = civicrm_api3('Contact', 'getvalue', ['id' => '@user_contact_id', 'return' => 'birth_date']);
    $msg = horoscope_lookup($dob);
    return ['subject' => t('Horoscope'), 'content' => htmlentities($msg)];
  }
}
I would submit that the problem here revolves around the contract for civicrm_initialize() - this caller has assumed that the post-condition of civicrm_initialize() is... to have an initialized Civi. But, if we're working toward a DB upgrade, then that condition cannot be met.
Proposed behavior
The correct thing is for this block to fail gracefully, e.g.
// Show a block with a horoscope based on the logged-in user's date of birth
function horoscope_block_view($delta) {
  if ($delta === 1) {
    // A 'partial' boot means Civi services may not work (e.g. a DB upgrade is pending).
    if (civicrm_initialize() === 'partial') {
      return ['subject' => t('Horoscope'), 'content' => t('We are having trouble communing with the astral spirits. Please upgrade Civi and come back later.')];
    }
    $dob = civicrm_api3('Contact', 'getvalue', ['id' => '@user_contact_id', 'return' => 'birth_date']);
    $msg = horoscope_lookup($dob);
    return ['subject' => t('Horoscope'), 'content' => htmlentities($msg)];
  }
}
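One detail worth noting for callers: 'partial' is a non-empty string, so it is truthy in PHP, and a loose check such as `if (civicrm_initialize())` would treat a partial boot as a success. Below is a minimal sketch of a strict check, assuming the tri-state return value proposed here. The helper name `horoscope_civi_is_ready()` is hypothetical, and it takes the boot result as a parameter so the example runs without Civi installed.

```php
<?php
/**
 * Hypothetical helper: decide whether the caller may use Civi services.
 *
 * @param bool|string $bootResult
 *   The value returned by civicrm_initialize() under the proposed contract
 *   (an assumption; this contract is only a proposal).
 * @return bool
 *   TRUE only when Civi reported a full boot.
 */
function horoscope_civi_is_ready($bootResult) {
  // Strict comparison: 'partial' is truthy, so `if ($bootResult)` would be wrong.
  return $bootResult === TRUE;
}

var_dump(horoscope_civi_is_ready(TRUE));      // bool(true)
var_dump(horoscope_civi_is_ready('partial')); // bool(false)
var_dump(horoscope_civi_is_ready(FALSE));     // bool(false)
```

A module could call this as `horoscope_civi_is_ready(civicrm_initialize())` and fall back to a placeholder message otherwise.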
For this to work, Civi's side of the boot protocol has to provide information about how well it has booted. Here's a possible contract (this is an incremental revision of the current contract; but it's possible something else is better):
/**
 * @return bool|string
 *   FALSE: Indicates that Civi is not booted at all. For example, this may happen if it hasn't been installed.
 *   TRUE: Indicates that Civi is fully booted. You should expect most services to work.
 *   'partial': Indicates that Civi may not be fully available. Most "application" logic should gracefully fail.
 *     However, some cases (e.g. `drush civicrm-upgrade-db`) might still continue with execution.
 */
function civicrm_initialize();
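To make the three outcomes concrete, here is one way the boot code might map what it knows onto that return value. This is only a sketch under stated assumptions: the function name `example_boot_status()` and its parameters are hypothetical, and it assumes that "partial" can be detected by comparing the version of the downloaded code with the version recorded in the database. The real detection logic may need other signals.

```php
<?php
/**
 * Hypothetical sketch: map boot facts onto the proposed return values.
 *
 * @param bool $isInstalled
 *   Whether Civi appears to be installed at all (assumed input).
 * @param string $codeVersion
 *   Version of the code on disk, e.g. '5.28.0' (assumed input).
 * @param string $dbVersion
 *   Version recorded in the database, e.g. '5.27.0' (assumed input).
 * @return bool|string
 *   FALSE, TRUE, or 'partial', as in the proposed contract.
 */
function example_boot_status($isInstalled, $codeVersion, $dbVersion) {
  if (!$isInstalled) {
    // Nothing to boot.
    return FALSE;
  }
  if (version_compare($dbVersion, $codeVersion, '<')) {
    // New code, old schema: a DB upgrade is pending.
    return 'partial';
  }
  return TRUE;
}

var_dump(example_boot_status(TRUE, '5.28.0', '5.27.0'));
var_dump(example_boot_status(TRUE, '5.28.0', '5.28.0'));
```

With these inputs, the first call reports 'partial' (the schema lags the code) and the second reports a full boot.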
Comments
Addressing this would involve a few steps/phases:
- Implement the updated boot protocol (e.g. patch civicrm_initialize())
- Update any bundled / widely-used / semi-official modules to be compliant
- Update dev docs
- Encourage other downstream authors to follow the protocol