Skip to content

GitLab

  • Projects
  • Groups
  • Snippets
  • Help
    • Loading...
  • Help
    • Help
    • Support
    • Community forum
    • Submit feedback
    • Contribute to GitLab
  • Sign in
C
cloud-native
  • Project overview
    • Project overview
    • Details
    • Activity
  • Issues 15
    • Issues 15
    • List
    • Boards
    • Labels
    • Service Desk
    • Milestones
  • Operations
    • Operations
    • Incidents
  • Analytics
    • Analytics
    • Value Stream
  • Wiki
    • Wiki
  • Snippets
    • Snippets
  • Members
    • Members
  • Activity
  • Create a new issue
  • Issue Boards
Collapse sidebar
  • Development
  • cloud-native
  • Issues
  • #9

Closed
Open
Opened Dec 05, 2017 by davej@davej

Does CiviCRM work well with database replication and clustering?

There have been a few discussions on Mattermost in recent months in which people have described problems running Civi in replicated / cluster environments. In particular, Civi's use of temporary tables has caused issues. E.g.

10 Oct 2017 eileen:

We have been having some replication lag issues & one metric we are investigating is the slaves (only) seem to show a large number of open temp tables on one metric. The odd thing about the replication lag is it seems to get worse the longer since a server restart (presumably just the mysql part needs to be restarted).

10 Oct 2017 jackrabbithanna:

Ok we're dealing with Percona DB

Database Error Code: Statement violates GTID consistency: CREATE TEMPORARY TABLE and DROP TEMPORARY TABLE can only be executed outside transactional context. These statements are also not allowed in a function or trigger because functions and triggers are also considered to be multi-statement transactions., 1787

followed by a php error: Fatal error: Uncaught CRM_Core_Exception: [0: Transaction integrity error: Expected to find active frame thrown

Per https://dev.mysql.com/doc/refman/5.7/en/replication-gtids-restrictions.html the multi-statements transactions are not allowed as they could potentially run using the same global transaction ID (GTID)

this error happens when trying to create a Drupal user, with CiviCRM profile on the form

Further comments from Eileen:

OK - I'll have a read of that - the CiviCRM usage of temp tables is pretty common & in particular ties in with the way mailing recipients are calculated & rolled back. However, I'm also reading concerns that replicated DBs don't like temp tables (although it's not blocked in the version we have)

(we use mixed mode replication)

I think it's kind of a big picture question isn't it? I had thought to move all temp table creations to a single function (so if we want to do something different we can do it in there) - the thing is a lot of them are used to calculate group_contact_cache & those will cause rollback to fail

"A recommended practice when using statement-based or mixed-format replication is to designate a prefix for exclusive use in naming temporary tables that you do not want replicated, then employ a --replicate-wild-ignore-table option to match that prefix. For example, you might give all such tables names beginning with norep (such as norepmytable, norepyourtable, and so on), then use --replicate-wild-ignore-table=norep% to prevent them from being replicated." - we could at least do that for reports etc

if you are using statement based replication or mixed replication then any temp tables used to build real tables have to be replicated

or else the query will fail on the slave

it's an improvement basically 'make civicrm support replication'

it does work with replication but I think that's more because various people have just configured it to do so

but I don't think there has ever been any official support for that or effort to understand the design implications

13 Oct eileen:

... our replication issue DOES seem to be related to the temp tables created for civimail at this stage. ... What we are seeing is the DROP statement is written to the binlog BEFORE the create statement is - somewhat in line with https://www.xaprb.com/blog/2007/01/20/how-to-make-mysql-replication-reliable/

(not really clear from there but it says "As I mentioned, each transaction goes into the binlog in the order in which it commits, not in the order in which it is started, so transactions may execute in a different order on the replica. ")

the binlog is parsed on the slaves & they temp tables mount up

not 100% sure if this will solve all our woes - but not letting civicrm use temp tables to create recipient lists seems to be important

We ran a bunch of Civi sites with MySQL replication several years ago and had a lot of problems with it: replication would fall over periodically and then everything would grind to a near standstill as it was re-sync'd.

As a starting point, it would be useful to gather some data on how many people use replication / clustering with Civi, how frequently problems are encountered, what the most common problems are and whether they have been solved.

To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Assignee
Assign to
None
Milestone
None
Assign milestone
Time tracking
None
Due date
None
Reference: dev/cloud-native#9