Dedupe performance & hooks

I added the findDuplicates hook quite a few years ago. It's still in use by a client, but I'd be more than happy to rewrite the code to use newer hooks.

For reference, the implementation is:

/**
 * Implements hook_civicrm_findDuplicates().
 *
 * When an online event registration page is submitted, we check for duplicate contacts
 * based on the specific groups listed in the event custom field 'duplicate_if_in_groups'.
 */
function clientname_civicrm_findDuplicates($dedupeParams, &$dedupeResults, $context) {
  // Do we have an event?
  if (empty($context['event_id'])) {
    return;
  }
  try {
    // Resolve the custom field name once; it is used both to request and to read the value.
    $groupField = CRM_Clientname_Utils::getCustomByName('duplicate_if_in_groups', 'event_dedupe_filters');
    $eventParams = [
      'id' => $context['event_id'],
      'return' => $groupField,
    ];
    // Get the group that this event allows duplicate contacts for
    $event = civicrm_api3('Event', 'getsingle', $eventParams);
    $duplicateGroupId = $event[$groupField] ?? NULL;
    // As we are submitting from anonymous event registration form we don't want to check permissions to find matching contacts.
    $dedupeParams['check_permission'] = FALSE;
    // Run the "standard" dedupe routine. This will return one or more contact IDs based on the unsupervised dedupe rule
    $dedupeResults['ids'] = CRM_Dedupe_Finder::dupesByParams($dedupeParams, $dedupeParams['contact_type'], $dedupeParams['rule'], $dedupeParams['excluded_contact_ids'], $dedupeParams['rule_group_id']);
    if (!empty($dedupeResults['ids'])) {
      $duplicateContactIds = [];
      foreach ($dedupeResults['ids'] as $duplicateContactId) {
        // We've got a duplicate contact ID. If that ID is in the specified group we keep it as a duplicate.
        // Otherwise we skip it; if nothing is kept, we return an empty array (no duplicates) and allow the contact to be created again.

        $contactGroups = civicrm_api3('Contact', 'getsingle', [
          'id' => $duplicateContactId,
          'return' => ['group'],
        ]);

        // Loop through each of the groups linked to the contact ID to see if any match our group
        if (!empty($contactGroups['groups'])) {
          $groups = explode(',', $contactGroups['groups']);
          foreach ($groups as $groupId) {
            if ($groupId == $duplicateGroupId) {
              $duplicateContactIds[] = $duplicateContactId;
              break;
            }
          }
        }
      }
      // If we found duplicates this array will contain those IDs, otherwise it will be an empty array.
      $dedupeResults['ids'] = $duplicateContactIds;
    }
    $dedupeResults['handled'] = TRUE;
    return;
  }
  catch (Exception $e) {
    Civi::log()->debug('clientname_civicrm_findDuplicates: ' . $e->getMessage());
    return;
  }
}

It was a fairly specific use case for event registrations: only dedupe if the matching contacts are in certain groups. The handled parameter was important because we didn't (and still don't) have a pattern for calling a hook and returning early when the extension's hook implementation has done the required work. At the time the requirement was "if X, the extension logic should run; otherwise continue with core logic". So something like stopPropagation should be fine.
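
To make the "if X the extension runs, otherwise core continues" idea concrete, here is a minimal, self-contained sketch of how stopPropagation could stand in for the handled flag. Everything in it is an assumption for illustration: the FindDuplicatesEvent class, the dispatchFindDuplicates() function, both listeners and the placeholder contact IDs are hypothetical stand-ins, not CiviCRM's or Symfony's actual classes. It only mirrors the stopPropagation()/isPropagationStopped() pair that Symfony-style event objects provide.

```php
<?php
// Hypothetical stand-in for a dedupe event object (NOT a real CiviCRM class).
final class FindDuplicatesEvent {
  public array $dedupeParams;
  public array $context;
  public array $ids = [];
  private bool $stopped = FALSE;

  public function __construct(array $dedupeParams, array $context) {
    $this->dedupeParams = $dedupeParams;
    $this->context = $context;
  }

  public function stopPropagation(): void {
    $this->stopped = TRUE;
  }

  public function isPropagationStopped(): bool {
    return $this->stopped;
  }
}

/**
 * Hypothetical dispatcher: calls listeners in order until one stops propagation.
 */
function dispatchFindDuplicates(FindDuplicatesEvent $event, array $listeners): FindDuplicatesEvent {
  foreach ($listeners as $listener) {
    $listener($event);
    if ($event->isPropagationStopped()) {
      // A listener has done the required work; skip the remaining ones.
      break;
    }
  }
  return $event;
}

// Extension listener: only takes over when the context has an event ID ("if X").
$extensionListener = function (FindDuplicatesEvent $event): void {
  if (empty($event->context['event_id'])) {
    // Not our case: leave the event untouched so later listeners run.
    return;
  }
  // Placeholder for the group-filtered duplicate IDs.
  $event->ids = [42];
  // Equivalent of $dedupeResults['handled'] = TRUE.
  $event->stopPropagation();
};

// Core fallback listener: runs only if nobody stopped propagation before it.
$coreListener = function (FindDuplicatesEvent $event): void {
  // Placeholder for the standard dedupe result.
  $event->ids = [1, 2, 3];
};
```

With the extension listener registered ahead of the core one, a context containing event_id is answered by the extension alone, and any other context falls through to the core listener. This relies on listener ordering, so a real implementation would need the extension to run at a higher priority than core.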