Enriched Deletion
You might call this a "minority opinion" - I imagine few would, at first glance, come to the same approach. However, I want to get it on paper because I think it's a deeper solution - and the issue may recur.
Background / Context
CiviCRM is a database application which tracks related data. For example, an Activity may be related to a Case and three Contacts. This creates the question of referential integrity. For example, if one of those three Contacts is deleted, then the relation (aka "foreign key" or "reference") between the Activity and the deleted Contact becomes nonsensical, and you must do something to make the data sensible again.
CiviCRM builds on top of MySQL, and MySQL provides one mechanism to deal with this: ON DELETE. When declaring the schema for Activity and Contact, you can set a rule to resolve this nonsensical situation by either (a) putting a blank (null) value into the reference or (b) deleting the record which has the invalid reference. This process can be thought of as cascading or propagating. (Conceptually, the deletion of a Contact could trigger the deletion of an Activity, which could trigger the deletion of a custom dataset for that activity.)
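To make the two ON DELETE outcomes concrete, here is a minimal sketch. It assumes SQLite (via Python's standard library) as a stand-in for MySQL, and the table and column names are invented for illustration; they are not CiviCRM's real schema.

```python
import sqlite3


def demo():
    """Delete one contact and report what the ON DELETE rules left behind."""
    # Assumption: an in-memory SQLite database stands in for MySQL.
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys once this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE contact (id INTEGER PRIMARY KEY, name TEXT);

        -- (a) blank out the reference when the contact goes away
        CREATE TABLE note (
            id INTEGER PRIMARY KEY,
            contact_id INTEGER REFERENCES contact(id) ON DELETE SET NULL
        );

        -- (b) delete the referencing record when the contact goes away
        CREATE TABLE activity (
            id INTEGER PRIMARY KEY,
            contact_id INTEGER REFERENCES contact(id) ON DELETE CASCADE
        );

        -- the cascade propagates: custom data follows its activity
        CREATE TABLE activity_custom (
            id INTEGER PRIMARY KEY,
            activity_id INTEGER REFERENCES activity(id) ON DELETE CASCADE
        );

        INSERT INTO contact VALUES (1, 'Alice');
        INSERT INTO note VALUES (100, 1);
        INSERT INTO activity VALUES (200, 1);
        INSERT INTO activity_custom VALUES (300, 200);
    """)

    conn.execute("DELETE FROM contact WHERE id = 1")

    result = {
        "note_contact_id": conn.execute(
            "SELECT contact_id FROM note WHERE id = 100").fetchone()[0],
        "activities": conn.execute(
            "SELECT COUNT(*) FROM activity").fetchone()[0],
        "activity_custom_rows": conn.execute(
            "SELECT COUNT(*) FROM activity_custom").fetchone()[0],
    }
    conn.close()
    return result
```

In this sketch the note survives with an empty reference, while the activity and its custom data are removed by the chained cascade.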
CiviCRM also builds on top of PHP. It has hooks and events, which leads to another mechanism: by subscribing to some event (e.g. hook_civicrm_pre), you can take some action before or after deletion. This mechanism can also provide the cascading or propagating effect. Compared to MySQL, this empowers a developer to define more nuanced cascading rules.
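The hook-based mechanism can be sketched in a few lines. This is not CiviCRM's PHP hook API; it is a toy event dispatcher in Python, and every name in it (subscribe, dispatch, pre_delete, the in-memory data) is hypothetical. The listener shows the kind of nuance a fixed schema rule cannot express: drop the deleted contact from each activity, but delete the activity only when no contacts remain on it.

```python
from collections import defaultdict

# Hypothetical in-memory data: each activity lists the contacts it involves.
contacts = {1: "Alice", 2: "Bob"}
activities = {
    10: {"subject": "Phone call", "contact_ids": {1, 2}},
    11: {"subject": "Meeting", "contact_ids": {1}},
}

_listeners = defaultdict(list)


def subscribe(event, callback):
    """Register a callback for a named event."""
    _listeners[event].append(callback)


def dispatch(event, **payload):
    """Call every callback registered for the event, in order."""
    for callback in _listeners[event]:
        callback(**payload)


def delete_contact(contact_id):
    """Fire the pre-delete event, then remove the contact itself."""
    dispatch("pre_delete", entity="Contact", entity_id=contact_id)
    del contacts[contact_id]


def cascade_to_activities(entity, entity_id):
    """Nuanced rule: unlink the contact; delete activities left with nobody."""
    if entity != "Contact":
        return
    for activity_id in list(activities):
        contact_ids = activities[activity_id]["contact_ids"]
        if entity_id not in contact_ids:
            continue
        contact_ids.discard(entity_id)
        if not contact_ids:
            del activities[activity_id]


subscribe("pre_delete", cascade_to_activities)
delete_contact(1)
```

After the call, the shared phone call survives with only Bob attached, and the meeting that involved only Alice is gone.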
Basic Concept
The basic concept of "Enriched Deletion" (for want of a better name) can be seen by comparing these two screens. First, we have a normal confirmation dialog that you might show to a user before deleting an important record:
With enriched deletion, the user (or agent; more about the expanded view in a moment) has visibility into, and discretion over, the consequences of the deletion:
They have the ability to see what data is affected by their action -- and to decide what the impact will be.
Expanded Concept
The cascade policy is what you see in the table above -- a list of relations and the rules about how to handle each stale reference (delete the record, set to null, block deletion/raise an error, etc).
The default cascade policy is, well, the default. It is determined programmatically. It is mediated via event/hook, and the site administrator can make overrides/customizations.
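One way to picture how the default cascade policy gets assembled is as three layers applied in order. The sketch below is an assumption throughout: the names, the Rule values, the field names, and especially the precedence (programmatic defaults first, then hook listeners, then site-admin overrides last) are illustrative rather than an agreed design.

```python
from enum import Enum


class Rule(Enum):
    DELETE = "delete"
    SET_NULL = "set_null"
    BLOCK = "block"


# Hypothetical programmatic defaults: for each deletable entity, the rule
# applied to every (referencing entity, field) relation that points at it.
PROGRAMMATIC_DEFAULTS = {
    "Contact": {
        ("Contribution", "contact_id"): Rule.DELETE,
        ("Activity", "source_contact_id"): Rule.SET_NULL,
    },
}


def default_cascade_policy(entity, listeners=(), admin_overrides=None):
    """Build the default policy for deleting one kind of entity."""
    # Layer 1: start from a copy of the programmatic defaults.
    policy = dict(PROGRAMMATIC_DEFAULTS.get(entity, {}))
    # Layer 2: each hook listener may adjust the policy in place.
    for listener in listeners:
        listener(entity, policy)
    # Layer 3 (assumed to win): overrides chosen by the site administrator.
    policy.update(admin_overrides or {})
    return policy


def protect_contributions(entity, policy):
    """Example listener, as an extension might register it."""
    if entity == "Contact":
        policy[("Contribution", "contact_id")] = Rule.BLOCK


policy = default_cascade_policy(
    "Contact",
    listeners=[protect_contributions],
    admin_overrides={("Activity", "source_contact_id"): Rule.DELETE},
)
```

Here the extension turns contribution handling into a hard block, and the administrator chooses to delete activities rather than blank out their contact.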
A screen such as the above is a thin wrapper around an API. All standard API entities would support two actions:
- delete: This API performs the deletion. In the absence of suitable instruction, it obeys the default cascade policy. If the agent has suitable permission (e.g. manage rich deletion -- for want of a better name), and if the API call specifies some cascade option, then the inputs will take precedence over the defaults.
- deletePreview (for want of a better name): This API reports about the related entities that would be deleted - and the cascade options that are valid for them. (If the user has the permission manage rich deletion, then all options are valid. If they lack permission, then options are locked-in to the default cascade policy.)
Rationale
Rare is the user who clicks "Delete" while thinking, "Please, I hope the next screen is more complicated!" So why would you add this extra panel to the "Delete" action?
Regardless of this GitLab filing, some cascade policy already exists. It must exist. The question is its form, content, comprehension, and maintenance. Who can influence it?
- In the early/original CiviCRM, the cascade policy is specified via xml/schema/**.xml (e.g. xml/schema/Contribute/Contribution.xml specifies that the contact_id relation has a policy of <onDelete>CASCADE</onDelete>).
- In CiviCRM with hooks/events, a third-party developer can potentially use hook_civicrm_pre to refine the cascade policy. It is not certain if this is actually done or if the interface works well for the purpose, but the basic idea is there.
However, there is no visibility into this behavior for a user. Whatever the cascade policy is, there will be elements which feel a bit grey or uncertain. Users will be in the dark about this - they won't even think about the extended implications of a delete until after it matters. The existence of extensions heightens the mystery - a user isn't in a position to know what extensions are, and an admin isn't in a position to say how each extension does (or does not) adjust the policy.
CiviCRM is a modular, multi-organization, "customizable/off-the-shelf" system. Consequently, it is difficult to make one set of judgments for grey-area questions -- especially when there's an open set of customizations for each site. Even if you do understand the policy, programmatically customizing the policy is an expensive proposition.
The basic concept of "Enriched Deletion" is to formalize the cascade policy in a way that realistically allows and reconciles influence from (a) developers, (b) site admins, and (c) backend users.
Alternatives
- Hard-code one cascade policy
- Use "on-delete" hooks to allow programmers to customize policies
- Never delete anything - all deletions should be reversible.
  - Comment: I'm not sure this changes the basic issue about the sensibility of the data - for example, suppose a user "deletes" a contact who has a case. Does that case remain visible? Do its activities appear in reporting about case activities? Whether the deletion mechanism is hard or soft, there is still some kind of policy/effect on the case and activities ascribed to the contact.