Add ORDER BY Clause to CRM_Mailing_BAO_MailingRecipients->mailingQuery()
Overview
The SQL query in CRM_Mailing_BAO_MailingRecipients->mailingQuery() helps to segment all recipients of a mailing into smaller Mail Jobs / Batches. The query is iteratively called with a LIMIT and changing OFFSET with the goal of returning the next batch of recipients.
The problem is that without an ORDER BY clause, there is no guarantee that the order will be consistent across multiple queries.
I believe that this could potentially cause recipients to be duplicated or excluded from mailings if the order of results changes between batches.
My suggested solution is to add ORDER BY contact_id ASC, email_id ASC
to the query. (Though, it might suffice to just order by email_id)
Reproduction steps
- Create a Mailing.
- Process the mailing with the process_mailing scheduled task.
- Compare recipients in civicrm_mailing_recipients to recipients in related mail jobs in civicrm_mailing_event_queue.
Note that this problem is likely intermittent and there's no guarantee of reproducing it on any single run. Likelihood of error might also differ across mysql installations (MariaDB vs Percona, etc)
Environment information
- CiviCRM: Master/5.75
- PHP: 8.3
- CMS: Drupal 7.99
- Database: Percona for MySQL 8.0.41-32
- Web Server: Apache/2.4.63