Performance - query analysis
Using the methodology described here I have analysed the queries to pipe 7 contributions into CiviCRM and analyse all the queries to see if we can eliminate some. Analysis follows (I will look to address some)
- This is on a server that uses Redis with associated reduction of queries
- Main civi api calls called are Contact.create & contribution.create.
- None of the queries are slow - the focus here is on removing unnecessary queries from high volume code paths
- All 7 contributions were inserted within the total time of one second. We normally process 2000 donations per code run so reducing 1 query per row saves 2000 per run
- 18% of the total queries come from this line of code for this issue. I believe the handling in the create classes likely makes it obsolete - #2039 (closed)
Type of query | Count | proportion queries that are unnecessary |
---|---|---|
Activity wrangling #2057 (closed) | 42 | high |
CiviCRM log | 14 | medium |
greeting queries | 6 | undetermined |
Initialisation | 24 | low |
Insert actions | 97 | Low |
location queries #2039 (closed) | 119 | very high |
metadata | 51 | low |
pcp #2056 (closed) | 14 | high |
uf match queries #2087 (closed) | 14 | medium |
wmf | 14 | low |
Grand Total | 395 |
Visual https://docs.google.com/document/d/1Min5WxmC8O0_MF4nK3LMTXcVZ-WQU_4k1WSJtMeqLl8/edit?usp=sharing Activity wrangling 42
I haven't dug very far but my sense is this is likely about checking for existing activities linked to the contributions. As these are new contributions these checks might be largely avoidable
Examples
timestamp | query | seconds | rows found | columns requested |
---|---|---|---|---|
15/09/20 2:54 | SELECT * FROM civicrm_activity_contact WHERE ( civicrm_activity_contact .activity_id = 112183025 ) AND ( civicrm_activity_contact .contact_id = 46011835 ) AND ( civicrm_activity_contact .record_type_id = 2 ) |
0.000549 | 0 | 0 |
15/09/20 2:54 | SELECT * FROM civicrm_activity_contact WHERE ( civicrm_activity_contact .activity_id = 112183025 ) AND ( civicrm_activity_contact .record_type_id = 1 ) |
0.000606 | 0 | 0 |
15/09/20 2:54 | SELECT * FROM civicrm_activity_contact WHERE ( civicrm_activity_contact .activity_id = 112183025 ) AND ( civicrm_activity_contact .record_type_id = 2 ) |
0.000725 | 0 | 0 |
15/09/20 2:54 | SELECT * FROM civicrm_activity_contact WHERE ( civicrm_activity_contact .activity_id = 112183025 ) AND ( civicrm_activity_contact .record_type_id = 2 ) |
0.000605 | 1 | 1 |
15/09/20 2:54 | SELECT * FROM civicrm_activity_contact WHERE ( civicrm_activity_contact .activity_id = 112183025 ) AND ( civicrm_activity_contact .record_type_id = 3 ) |
0.000636 | 0 | 0 |
CiviCRM Log = 14
I split these out from the main insert queries as my feeling is that we don't need to add to civicrm log when logging is enabled - I'm pretty sure there is another GL on this & it's not quite that black and white
Examples
timestamp | query | seconds | rows found | columns requested |
---|---|---|---|---|
15/09/20 2:54 | INSERT INTO civicrm_log (entity_table , entity_id , data , modified_id , modified_date ) VALUES ('civicrm_activity' , 112183024 , 'Activity created for source=46011834' , 46011834 , 20200909141546 ) |
greeting queries = 6
I haven't done any analysis on these - I suspect we have skipGreeting enabled in our api call so it's only 6
timestamp | query | seconds | rows found | columns requested |
---|---|---|---|---|
15/09/20 2:54 | SELECT contact_a.id as contact_id, contact_a.email_greeting_id as email_greeting_id, contact_a.postal_greeting_id as postal_greeting_id, contact_a.addressee_id as addressee_id, contact_a.addressee_display as addressee_display, contact_a.addressee_custom as addressee_custom, contact_a.email_greeting_display as email_greeting_display, contact_a.email_greeting_custom as email_greeting_custom, contact_a.postal_greeting_display as postal_greeting_display, contact_a.postal_greeting_custom as postal_greeting_custom FROM civicrm_contact contact_a WHERE ( contact_a.id = '46011839' ) LIMIT 0, 25 | 0.000734 | ||
15/09/20 2:54 | SELECT contact_a.id as contact_id, contact_a.email_greeting_id as email_greeting_id, contact_a.postal_greeting_id as postal_greeting_id, contact_a.addressee_id as addressee_id, contact_a.addressee_display as addressee_display, contact_a.addressee_custom as addressee_custom, contact_a.email_greeting_display as email_greeting_display, contact_a.email_greeting_custom as email_greeting_custom, contact_a.postal_greeting_display as postal_greeting_display, contact_a.postal_greeting_custom as postal_greeting_custom FROM civicrm_contact contact_a WHERE ( contact_a.id = '46011840' ) LIMIT 0, 25 | 0.000913 | 1 | 1 |
Initialisation queries 24
These are all pretty necessary
examples
timestamp | query | seconds | rows found | columns requested |
---|---|---|---|---|
15/09/20 2:53 | ( SELECT * FROM civicrm_menu WHERE path in ( 'civicrm' ) AND domain_id = 1 ORDER BY length(path) DESC LIMIT 1 ) UNION ( SELECT * FROM civicrm_menu WHERE path IN ( 'navigation' ) AND domain_id = 1 ) | 0.001571 | 1 | 1 |
15/09/20 2:53 | /*!40101 SET NAMES utf8 */ | |||
15/09/20 2:53 | /*!50503 SET NAMES utf8mb4 */ | |||
15/09/20 2:53 | BEGIN |
Insert actions 97
This is where the work is done - nothing extraneous in the query list here
example
timestamp | query | seconds | rows found | columns requested |
---|---|---|---|---|
15/09/20 2:54 | INSERT INTO civicrm_entity_financial_trxn (entity_table , entity_id , financial_trxn_id , amount ) VALUES ('civicrm_financial_item' , 67488679 , 67491096 , 2.35 ) |
Location queries 119
These seem to be mainly extraneous. 70 of them come from what appears to be an outdated effort to handle is_primary (including 10 on open id which we don't support). At least some of the remaining 39 appear to be variants of this - this is the lowest hanging fruit
examples
timestamp | query | seconds | rows found | columns requested |
---|---|---|---|---|
Location queries | 15/09/20 2:54 | SELECT * FROM civicrm_openid WHERE ( ( is_primary = 0 OR is_primary IS NULL ) ) AND ( civicrm_openid .contact_id = 46011839 ) |
0.000643 | 0 |
Location queries | 15/09/20 2:54 | SELECT * FROM civicrm_openid WHERE ( ( is_primary = 0 OR is_primary IS NULL ) ) AND ( civicrm_openid .contact_id = 46011840 ) |
0.000517 | 0 |
Location queries | 15/09/20 2:54 | SELECT * FROM civicrm_openid WHERE ( ( is_primary = 1 ) ) AND ( civicrm_openid .contact_id = 46011834 ) |
0.001377 | 0 |
Metadata queries 51
This is less of a concern for us in this context as these (almost) all only ran once and none ran once per contribution. There is scope to improve them though & UI users would probably be helped. We would prefer they used the Redis-backed metadata cache
examples
timestamp | query | seconds | rows found | columns requested |
---|---|---|---|---|
15/09/20 2:54 | SELECT a.id as id , a.custom_group_id as custom_group_id , a.name as name , a.label as label , a.data_type as data_type , a.html_type as html_type , a.default_value as default_value , a.is_required as is_required , a.is_searchable as is_searchable , a.is_search_range as is_search_range , a.weight as weight , a.help_pre as help_pre , a.help_post as help_post , a.mask as mask , a.attributes as attributes , a.javascript as javascript , a.is_active as is_active , a.is_view as is_view , a.options_per_line as options_per_line , a.text_length as text_length , a.start_date_years as start_date_years , a.end_date_years as end_date_years , a.date_format as date_format , a.time_format as time_format , a.note_columns as note_columns , a.note_rows as note_rows , a.column_name as column_name , a.option_group_id as option_group_id , a.serialize as serialize , a.filter as filter , a.in_selector as in_selector FROM civicrm_custom_field a INNER JOIN civicrm_custom_group custom_group_id_to_civicrm_custom_group ON a.custom_group_id = custom_group_id_to_civicrm_custom_group.id WHERE (a.name = "original_currency" OR a.label = "original_currency") AND (custom_group_id_to_civicrm_custom_group.name = "contribution_extra" OR custom_group_id_to_civicrm_custom_group.title = "contribution_extra") LIMIT 25 OFFSET 0 |
0.001292 | 1 | 1 |
15/09/20 2:54 | SELECT a.id as id , a.custom_group_id as custom_group_id , a.name as name , a.label as label , a.data_type as data_type , a.html_type as html_type , a.default_value as default_value , a.is_required as is_required , a.is_searchable as is_searchable , a.is_search_range as is_search_range , a.weight as weight , a.help_pre as help_pre , a.help_post as help_post , a.mask as mask , a.attributes as attributes , a.javascript as javascript , a.is_active as is_active , a.is_view as is_view , a.options_per_line as options_per_line , a.text_length as text_length , a.start_date_years as start_date_years , a.end_date_years as end_date_years , a.date_format as date_format , a.time_format as time_format , a.note_columns as note_columns , a.note_rows as note_rows , a.column_name as column_name , a.option_group_id as option_group_id , a.serialize as serialize , a.filter as filter , a.in_selector as in_selector FROM civicrm_custom_field a INNER JOIN civicrm_custom_group custom_group_id_to_civicrm_custom_group ON a.custom_group_id = custom_group_id_to_civicrm_custom_group.id WHERE (a.name = "original_amount" OR a.label = "original_amount") AND (custom_group_id_to_civicrm_custom_group.name = "contribution_extra" OR custom_group_id_to_civicrm_custom_group.title = "contribution_extra") LIMIT 25 OFFSET 0 |
0.001085 | 1 | 1 |
15/09/20 2:54 | SELECT a.id as id , a.custom_group_id as custom_group_id , a.name as name , a.label as label , a.data_type as data_type , a.html_type as html_type , a.default_value as default_value , a.is_required as is_required , a.is_searchable as is_searchable , a.is_search_range as is_search_range , a.weight as weight , a.help_pre as help_pre , a.help_post as help_post , a.mask as mask , a.attributes as attributes , a.javascript as javascript , a.is_active as is_active , a.is_view as is_view , a.options_per_line as options_per_line , a.text_length as text_length , a.start_date_years as start_date_years , a.end_date_years as end_date_years , a.date_format as date_format , a.time_format as time_format , a.note_columns as note_columns , a.note_rows as note_rows , a.column_name as column_name , a.option_group_id as option_group_id , a.serialize as serialize , a.filter as filter , a.in_selector as in_selector FROM civicrm_custom_field a WHERE (a.name = "opt_in" OR a.label = "opt_in") LIMIT 25 OFFSET 0 |
0.001043 | 1 | 1 |
15/09/20 2:54 | SELECT a.id as id , a.entity_table as entity_table , a.entity_id as entity_id , a.account_relationship as account_relationship , a.financial_account_id as financial_account_id FROM civicrm_entity_financial_account a WHERE (a.entity_id = "9") AND (a.entity_table = "civicrm_financial_type") LIMIT 25 OFFSET 0 |
0.000718 | 6 | 6 |
15/09/20 2:54 | SELECT a.id as id , a.financial_account_id as financial_account_id FROM civicrm_entity_financial_account a WHERE (a.entity_table = "civicrm_option_value") AND (a.entity_id = "6342") LIMIT 1 OFFSET 0 |
0.001156 | 1 | 1 |
15/09/20 2:54 | SELECT a.id as id , a.financial_account_id as financial_account_id FROM civicrm_entity_financial_account a WHERE (a.entity_table = "civicrm_option_value") AND (a.entity_id = "844") LIMIT 1 OFFSET 0 |
0.000697 | 1 | 1 |
PCP 14 queries
I haven't done much analysis but I think these should be avoidable on sites without pcps - maybe the same way we no longer have product queries
examples
timestamp | query | seconds | rows found | columns requested |
---|---|---|---|---|
15/09/20 2:54 | SELECT id FROM civicrm_contribution_soft WHERE contribution_id = 49769031 AND pcp_id IS NOT NULL | 0.000624 | 0 | 0 |
15/09/20 2:54 | SELECT id FROM civicrm_contribution_soft WHERE contribution_id = 49769031 AND pcp_id IS NULL | 0.000616 | 0 | 0 |
15/09/20 2:54 | SELECT id FROM civicrm_contribution_soft WHERE contribution_id = 49769032 AND pcp_id IS NOT NULL | 0.000503 | 0 | 0 |
15/09/20 2:54 | SELECT id FROM civicrm_contribution_soft WHERE contribution_id = 49769032 AND pcp_id IS NULL | 0.000562 | 0 | 0 |
uf match queries - 14
These are not relevant to the incoming but not sure if they are avoidable - the duplication might be though
timestamp | query | seconds | rows found | columns requested |
---|---|---|---|---|
15/09/20 2:54 | SELECT * FROM civicrm_uf_match WHERE ( civicrm_uf_match .domain_id = 1 ) AND ( civicrm_uf_match .contact_id = 46011840 ) |
0.000567 | 0 | 0 |
15/09/20 2:54 | SELECT * FROM civicrm_uf_match WHERE ( civicrm_uf_match .domain_id = 1 ) AND ( civicrm_uf_match .contact_id = 46011840 ) |
0.00053 | 0 | 0 |
wmf queries - 14
these are specific to our script