Ongoing duplicate contact creation if mismatch in civicrm_uf_match
(I've only seen this on WordPress, but the relevant file is in core, so I'm filing under dev/core).
Summary:
Under certain circumstances involving "crossed wires" in the civicrm_uf_match table, a logged-in user's mere activity in CiviCRM will create a new contact (containing only the user's email address) for each CiviCRM page load (or AJAX call, etc.). Left unchecked, this can lead to thousands of duplicate contacts containing only an identical email address.
Scope of impact:
I've seen this on a handful of WordPress sites over the past 5-7 years. I've not heard others in the community mention it, nor seen an existing issue on l.c.o.
Example steps to reproduce:
This is only one example; there are surely other steps that will get us there. See "Requisite data conditions" below.
These steps are for repro under WordPress. I haven't tried this under other CMSs.
- Create WP user a with email address a@example.com. Observe that this creates a corresponding CiviCRM contact; we'll call this contact C1. Observe that the entry in civicrm_uf_match is correct (i.e. the Summary tab for contact C1 shows the user ID of user a).
- Change the email address for user a to a2@example.com. Observe that the civicrm_uf_match link is preserved (but also that the email address in civicrm_uf_match.uf_name is unchanged).
- Delete contact C1 (permanently or to trash). Observe that the entry in civicrm_uf_match is deleted, and user a still exists.
- Create a new contact (which we'll call C2) with another email address, e.g. aFoo@example.com.
- Use CiviCRM's "Create User Record" feature to create a new WP user for contact C2; specify any username you like, but we'll assume username a2. Observe that the entry in civicrm_uf_match is correct (i.e. the Summary tab for contact C2 shows the user ID of user a2), and that user a2 has email address aFoo@example.com.
- Change contact C2's email address to a2@example.com. Observe that the email address in civicrm_uf_match.uf_name is updated, and the email address for WP user a2 is unchanged.
At this point, you'll have this state of data:
wp_users.ID* | wp_users.user_login | civicrm_uf_match.id* | civicrm_uf_match.contact_id* | civicrm contact primary email | civicrm_uf_match.uf_name | wp_users.user_email |
---|---|---|---|---|---|---|
64 | a | NULL | NULL | NULL | NULL | a2@example.com |
65 | a2 | 162 | 75512 | a2@example.com | a2@example.com | aFoo@example.com |
* IDs in this table are from my real data; yours will differ of course.
- As for permissions, I granted user a the WP Administrator role, which has Administer CiviCRM; you could probably repro with narrower permissions.
- Log in as user a. Perform a CiviCRM search for contacts having email address a2@example.com. Observe the result count N.
- Bad Behavior: Do just about anything in CiviCRM. For example, refresh the search just performed. Observe the result count is >N.
- Repeat the previous step and observe the increasing number of contacts with email address a2@example.com.
Requisite data conditions:
As mentioned above, there are probably many possible repro recipes, but the key is to arrive at this requisite state of data:
Given a specific WP user (e.g. username a, email address a2@example.com), a new contact containing only this email address is created each time this logged-in user takes any action (or presumably almost any action) in CiviCRM, as long as:
- The WP user ID is not represented in civicrm_uf_match.uf_id; and
- The WP user email (a2@example.com) is associated with one or more CiviCRM contacts; in a list of these contacts, sorted by is_primary DESC, contact_id, the first contact in that list is represented in civicrm_uf_match.contact_id; and
- The WP user email (a2@example.com) is represented in civicrm_uf_match.uf_name.
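The three conditions above can be sketched as a single query. This is only a rough illustration, not something taken from core: it runs against an in-memory SQLite database, the table definitions are cut down to the columns named in this issue, the email row ID is made up, and the helper names (build_demo_db, find_at_risk_users) are hypothetical. On a real site the WP table prefix may differ, and the SQL has not been verified against MySQL/MariaDB.

```python
import sqlite3

# Users that meet all three "requisite data conditions" described above.
AT_RISK_QUERY = """
SELECT u.ID, u.user_login, u.user_email
FROM wp_users u
WHERE NOT EXISTS (                      -- condition 1: user ID absent from uf_id
        SELECT 1 FROM civicrm_uf_match m WHERE m.uf_id = u.ID)
  AND (                                 -- condition 2: first contact for this email...
        SELECT e.contact_id
        FROM civicrm_email e
        WHERE e.email = u.user_email
        ORDER BY e.is_primary DESC, e.contact_id
        LIMIT 1
      ) IN (SELECT m.contact_id FROM civicrm_uf_match m)  -- ...is already matched
  AND EXISTS (                          -- condition 3: email present in uf_name
        SELECT 1 FROM civicrm_uf_match m WHERE m.uf_name = u.user_email)
ORDER BY u.ID
"""


def build_demo_db():
    """Build a simplified in-memory copy of the data state shown in the table above."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE wp_users (ID INTEGER PRIMARY KEY, user_login TEXT, user_email TEXT);
        CREATE TABLE civicrm_uf_match (
            id INTEGER PRIMARY KEY, uf_id INTEGER, uf_name TEXT, contact_id INTEGER);
        CREATE TABLE civicrm_email (
            id INTEGER PRIMARY KEY, contact_id INTEGER, email TEXT, is_primary INTEGER);
        INSERT INTO wp_users VALUES (64, 'a', 'a2@example.com');
        INSERT INTO wp_users VALUES (65, 'a2', 'aFoo@example.com');
        INSERT INTO civicrm_uf_match VALUES (162, 65, 'a2@example.com', 75512);
        -- The email row ID (1) is invented for this demo.
        INSERT INTO civicrm_email VALUES (1, 75512, 'a2@example.com', 1);
    """)
    return conn


def find_at_risk_users(conn):
    """Return (ID, user_login, user_email) for users meeting all three conditions."""
    return conn.execute(AT_RISK_QUERY).fetchall()
```

With the demo data, find_at_risk_users(build_demo_db()) should flag only user a (ID 64), which matches the behavior described in this issue.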
Relevant code:
This all seems to be primarily handled by CRM_Core_DAO_UFMatch::synchronizeUFMatch(). The above "requisite data conditions" are a summary of the logical path in that method that leads to the stated problem.
Interestingly, the code seems to be aware of this problem, as in line 284 it defines a variable $msg which would read along the lines of (using values from my data-state above) "Contact ID 162 is a match for WordPress user 64 but has already been matched to 65" -- but that message is never used or logged anywhere.
Workarounds:
My simplest fix has been to update civicrm_uf_match via SQL to link the right user with the right contact, like so (again using values from my real data above):
update civicrm_uf_match set uf_id = 64 where contact_id = 75512;
And then, of course, to remove all of the duplicate contacts.
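As a rough sketch of that workaround, the snippet below applies the same update and then lists the contacts that share the email address but are not the linked contact, as candidates for cleanup. It is an illustration under assumptions: it uses an in-memory SQLite database with simplified tables, the duplicate contact IDs (75600, 75601) and email row IDs are invented, and the helper names are hypothetical. Each candidate should be reviewed before deletion, since this query does not check that a contact contains nothing but the email address.

```python
import sqlite3


def build_workaround_demo_db():
    """Simplified demo data: one mismatched uf_match row plus two invented duplicates."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE civicrm_uf_match (
            id INTEGER PRIMARY KEY, uf_id INTEGER, uf_name TEXT, contact_id INTEGER);
        CREATE TABLE civicrm_email (
            id INTEGER PRIMARY KEY, contact_id INTEGER, email TEXT, is_primary INTEGER);
        INSERT INTO civicrm_uf_match VALUES (162, 65, 'a2@example.com', 75512);
        INSERT INTO civicrm_email VALUES (1, 75512, 'a2@example.com', 1);
        -- Contacts 75600 and 75601 stand in for the email-only duplicates.
        INSERT INTO civicrm_email VALUES (2, 75600, 'a2@example.com', 1);
        INSERT INTO civicrm_email VALUES (3, 75601, 'a2@example.com', 1);
    """)
    return conn


def relink_user(conn, uf_id, contact_id):
    """Point the existing uf_match row for contact_id at the given WP user ID."""
    conn.execute(
        "UPDATE civicrm_uf_match SET uf_id = ? WHERE contact_id = ?",
        (uf_id, contact_id),
    )


def duplicate_candidates(conn, email, keep_contact_id):
    """List other contact IDs that carry the same email address, lowest ID first."""
    rows = conn.execute(
        "SELECT DISTINCT contact_id FROM civicrm_email "
        "WHERE email = ? AND contact_id <> ? ORDER BY contact_id",
        (email, keep_contact_id),
    ).fetchall()
    return [row[0] for row in rows]
```

For example, after relink_user(conn, 64, 75512), calling duplicate_candidates(conn, 'a2@example.com', 75512) on the demo data returns the two invented duplicate IDs.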
I thought Synchronize Users to Contacts might do the trick, but it has no effect on the relevant data, and the bad behavior persists.
(Joinery reference: F#1180, F#1339)