Discussion on how to track merged contacts
Why? I'm raising this ticket because we don't have an agreed standard on how to track the history of merged contacts. Knowing which contacts were merged into a given contact can be important for reasons such as:
- When a donor requests that their data be deleted from the database there may be personal information retained in the database for that contact under merged contact records, which should also be tracked and deleted
- When displaying log data for a contact such as their address history some of that data may be against a different contact. (For example the Extended reports extension allows you to create an address history report and optionally add it as a tab on the contact summary. This report currently includes data from merged contacts IF it can determine them - which currently requires you to have hacked core).
- When looking up a contact by id that has since been merged into another contact, it should be easy to move to the retained contact in the UI
Because these various functions take place in different extensions a standard way of tracking merge history makes sense.
Current Behaviour? Currently when a merge takes place 2 activities are created:
- (1) 'Contact Merged'
  - source is the logged in user
  - target is the retained contact
  - deleted contact id is mentioned in the subject text
  - CRM-14792
- (2) 'Contact deleted in merge'
  - source is the logged in user
  - target is the deleted contact
  - both contact ids are mentioned in the subject text
  - activity has the first activity ('Contact Merged') as 'parent_id'
  - CRM-18106
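To make the two-activity trail concrete, here is a minimal sketch of how the contact pair could be pieced together from it. It assumes that 'parent_id' on the second activity points at the first activity (as the "crawl back up the parent id" description under Options suggests). The dictionary keys and the function name `find_merge_pairs` are simplified and hypothetical; this is not the CiviCRM schema or API.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical, simplified activity records (not the real CiviCRM schema).
Activity = Dict[str, Optional[object]]

SAMPLE_ACTIVITIES: List[Activity] = [
    {"id": 1, "activity_type": "Contact Merged",
     "source_contact_id": 9, "target_contact_id": 101, "parent_id": None},
    {"id": 2, "activity_type": "Contact deleted in merge",
     "source_contact_id": 9, "target_contact_id": 202, "parent_id": 1},
    {"id": 3, "activity_type": "Meeting",
     "source_contact_id": 9, "target_contact_id": 101, "parent_id": None},
]


def find_merge_pairs(activities: List[Activity]) -> List[Tuple[object, object]]:
    """Return (deleted_contact_id, retained_contact_id) pairs."""
    by_id = {a["id"]: a for a in activities}
    pairs = []
    for activity in activities:
        if activity["activity_type"] != "Contact deleted in merge":
            continue
        # Step 1: the second activity targets the deleted contact.
        deleted_id = activity["target_contact_id"]
        # Step 2: follow parent_id to the first activity, which targets
        # the retained contact.
        parent = by_id.get(activity.get("parent_id"))
        if parent is None or parent["activity_type"] != "Contact Merged":
            continue  # incomplete trail, skip rather than guess
        pairs.append((deleted_id, parent["target_contact_id"]))
    return pairs


print(find_merge_pairs(SAMPLE_ACTIVITIES))
```

Each pair needs two lookups (the child activity, then its parent), which is the indirection the options below try to hide or remove.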
Options
- The current behaviour DOES provide a path to piece together the contact history, i.e. you need to track through 2 activities to get the contact pair. This seems somewhat slow and obscure but is doable. We could enhance this by a) adding an api, e.g. Contact.getmerges, that would return details of previous merges and could be the standard method for privacy extensions (including GDPR) and for reports/dashlets that display contact history to find previously merged contacts; b) based on the above, adding a prominent link from merged contacts to the contact they were merged into.
- Link the contacts via a single activity, i.e. each of the activities created stores the logged in user as source and a target but leaves 'assignee' unused, so the assignee field could hold the other contact of the pair. This feels very logical and in fact would support a 'many-to-one' merge if we ever implemented it. WMF has implemented this through a core-hack, and the Extended reports extension will present address history encompassing merges if sites implement this method. This was discussed on CRM-21415 and on CRM-18106, and 2 arguments were made against it: a) There is some complexity around viewing an activity assigned to a deleted contact (resolvable IMHO). b) The semantics are odd, i.e. the use of the 'assigned' field in one of the activities doesn't 'feel' like the right language.
  The advantage is that the logic of reconstructing history seems simpler. However, either way the api & link described above seem to make sense as the way to expose the data.
- Using an external extension. This would basically mean a hook could catch the creation of the second activity & crawl back up the parent id to compose the detail of the first activity, and then either update the second activity with the extra detail OR store the data in another format, e.g. a table. Storing in a table potentially gets around weirdness if the deleted contact is fully deleted but we don't want to have to totally rebuild from the log table. It does feel like there is some data duplication in this though, and not 'wanting' to totally rebuild from the log table may or may not be a good argument for that duplication. There is some discomfort with having an extension provide the described api, but I guess extensions like Extended reports don't have to have it as a real dependency: they can just change their behaviour depending on whether it is present.
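Whichever option is chosen, consumers would ask the same two questions: "which contacts were merged into this one?" and "which contact should I be looking at now?". The sketch below illustrates both against a dedicated table, as in the extension option. The table `merge_history`, its columns, and the functions `record_merge`, `get_merges` and `resolve_retained` are all hypothetical; Contact.getmerges is only proposed in this ticket and is not an existing api.

```python
import sqlite3


def create_merge_table(conn: sqlite3.Connection) -> None:
    # Hypothetical table; one row per (deleted, retained) contact pair.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS merge_history ("
        " deleted_contact_id INTEGER NOT NULL,"
        " retained_contact_id INTEGER NOT NULL)"
    )


def record_merge(conn, deleted_contact_id: int, retained_contact_id: int) -> None:
    # What a hook might store once it has composed the pair.
    conn.execute(
        "INSERT INTO merge_history VALUES (?, ?)",
        (deleted_contact_id, retained_contact_id),
    )


def get_merges(conn, contact_id: int) -> list:
    """Ids of every contact merged into contact_id, directly or indirectly."""
    found, seen, queue = [], {contact_id}, [contact_id]
    while queue:
        current = queue.pop(0)
        rows = conn.execute(
            "SELECT deleted_contact_id FROM merge_history"
            " WHERE retained_contact_id = ? ORDER BY deleted_contact_id",
            (current,),
        ).fetchall()
        for (deleted_id,) in rows:
            if deleted_id not in seen:  # guard against cyclic data
                seen.add(deleted_id)
                found.append(deleted_id)
                queue.append(deleted_id)
    return found


def resolve_retained(conn, contact_id: int) -> int:
    """Follow the merge chain from a deleted contact to the surviving one."""
    seen, current = {contact_id}, contact_id
    while True:
        row = conn.execute(
            "SELECT retained_contact_id FROM merge_history"
            " WHERE deleted_contact_id = ?",
            (current,),
        ).fetchone()
        if row is None or row[0] in seen:
            return current
        current = row[0]
        seen.add(current)


conn = sqlite3.connect(":memory:")
create_merge_table(conn)
record_merge(conn, 202, 101)  # contact 202 merged into 101
record_merge(conn, 303, 202)  # earlier, 303 had been merged into 202
print(get_merges(conn, 101), resolve_retained(conn, 303))
```

A privacy extension could use something like `get_merges` to find records that need deleting alongside a contact, and the UI link from a merged contact could use something like `resolve_retained`. The same two lookups could equally be implemented over the activity trail instead of a table.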