Imagine a world without CodeGen
Update: The lengthy discussion in this issue has resulted in an agreement to replace the schema/CRM/xml files with schema/entityType.php files. A conversion script has been added to Civix. The files that were previously autogenerated from the xml (DAOs, allCoreTables.data, install/uninstall sql) have all been replaced with code that reads directly from the new schema/entityType.php files.
Currently we use CRM_Core_CodeGen to take our schema/xml files and generate DAO.php, install.sql and uninstall.sql files, which have to be periodically regenerated. This is a minor inconvenience for a core developer, a potential gotcha for an extension developer, and a major coordination difficulty across all extensions in universe (whenever any aspect of the generated code needs to change).
But what if we didn't have to generate those files? What if we could read schema information directly from the xml (or potentially a different source)?
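As a rough illustration of reading schema information at runtime, here is a minimal sketch using PHP's SimpleXML extension. The xml snippet is a trimmed-down, hypothetical table declaration, and sketch_read_table_xml() is a made-up helper name, not an existing CiviCRM function.

```php
<?php
// Hypothetical, trimmed-down table declaration (assumption for illustration;
// real schema/xml files carry far more metadata).
$xml = <<<XML
<table>
  <name>civicrm_example</name>
  <field>
    <name>id</name>
    <type>int unsigned</type>
    <required>true</required>
  </field>
  <field>
    <name>title</name>
    <type>varchar</type>
    <length>255</length>
  </field>
</table>
XML;

/**
 * Parse one <table> declaration into a plain array (hypothetical helper).
 */
function sketch_read_table_xml(string $xml): array {
  $table = simplexml_load_string($xml);
  $fields = [];
  foreach ($table->field as $field) {
    $name = (string) $field->name;
    $fields[$name] = [
      'type' => (string) $field->type,
      // A missing <required> element casts to an empty string, i.e. FALSE here.
      'required' => ((string) $field->required) === 'true',
    ];
  }
  return ['table' => (string) $table->name, 'fields' => $fields];
}

$schema = sketch_read_table_xml($xml);
echo $schema['table'], ': ', implode(', ', array_keys($schema['fields'])), "\n";
```

Doing this on every request would be wasteful, so a real implementation would presumably cache the parsed array.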
Current Structure:
| File | Purpose | In Core | In Extensions |
|---|---|---|---|
| schema/xml | Canonical declaration of entity + all metadata | Run setup.sh -g | Run civix generate:entity-boilerplate |
| | Add schema tables | civicrm.mysql | auto_install.sql |
| | Drop schema tables | civicrm_drop.mysql | auto_uninstall.sql |
| | Declare entity's existence | AllCoreTables.data.php | *.entityType.php + entity-types-php mixin |
| CRM/Core/I18n/SchemaStructure.php | Lists localizable table columns | Seems kinda redundant with other metadata? | Doesn't exist |
| CRM_Core_DAO | Base class for all generated DAOs | All DAOs extend this class | Extension DAOs also extend core class (makes change-management across universe difficult) |
| CRM_*_DAO_* | Generated from the xml file | Must be generated | Must be generated |
The generated DAO file (including stuff it inherits from CRM_Core_DAO) serves a variety of purposes:
- OO class that allows a database table to be used like a php object, e.g. $contact = new CRM_Contact_DAO_Contact(); $contact->first_name = 'Bob'; $contact->save();
- Static methods like fields() and indices() which return the data from the xml file in php format.
- Localizes strings, because CodeGen wraps titles and labels in ts().
- A bunch of other random static methods (e.g. disableFullGroupByMode()) which seem like they'd be better-placed in a separate utility class.
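To make the first three purposes concrete, here is a simplified stand-in for a generated DAO. It is a sketch only: Sketch_DAO_Contact, sketch_ts() and toRow() are hypothetical names, nothing here touches a database, and the real CRM_Core_DAO does considerably more.

```php
<?php
// Stand-in for ts(); in CiviCRM it translates the string, here it is a no-op.
function sketch_ts(string $text): string {
  return $text;
}

// Simplified stand-in for a generated DAO. Not the real CRM_Core_DAO.
class Sketch_DAO_Contact {
  public $id;
  public $first_name;

  // Metadata that CodeGen would otherwise copy out of the xml file,
  // with titles wrapped in the translation function.
  public static function fields(): array {
    return [
      'id' => ['name' => 'id', 'type' => 'int', 'title' => sketch_ts('Contact ID')],
      'first_name' => ['name' => 'first_name', 'type' => 'varchar', 'title' => sketch_ts('First Name')],
    ];
  }

  // Collect the values that a real save() would write to the table.
  public function toRow(): array {
    $row = [];
    foreach (array_keys(static::fields()) as $name) {
      if ($this->$name !== NULL) {
        $row[$name] = $this->$name;
      }
    }
    return $row;
  }
}

$contact = new Sketch_DAO_Contact();
$contact->first_name = 'Bob';
print_r($contact->toRow());
```

The point of the sketch is that the object behaviour depends only on the array returned by fields(), so that array could come from somewhere other than generated code.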
New Structure
If we no longer want to generate files and just read from a canonical source, then the main question is, "what should be the canonical source of entity metadata?"
- Stick with XML: Keep the existing xml files but delete the generated stuff. Parse the xml at runtime to get that data.
  - Pro: It's already there, no rewrites needed.
  - Con: Poor DX (developers don't generally enjoy writing XML files).
  - Con: It's very slow (the slowest by far of all the options) so heavy caching would be needed.
  - Con: Without generating php files, another method of i18n string extraction would be needed (such as this).
- DAO Files: Delete the schema/xml files and run everything from the generated DAO files, which going forward will be hand-edited instead of regenerated.
  - Pro: DAO files are already there.
  - Con: Also poor DX (the boilerplate in those files would not be fun to write/edit by hand).
- Somewhere Else: Move schema info to e.g. json files or better-structured PHP files.
  - Pro: DX and performance could be optimized.
  - Con: XML files must be rewritten (could be scripted).
  - Con: Migration management for core and extensions.
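For the "Somewhere Else" option, a better-structured PHP file might look something like the array below, with install/uninstall sql derived from it on demand instead of being generated ahead of time. This is a sketch under assumptions: the array keys and the sketch_*() helpers are illustrative only and are not meant to describe any agreed file format.

```php
<?php
// Hypothetical declaration; the keys are illustrative, not an agreed format.
$entityType = [
  'name' => 'Example',
  'table' => 'civicrm_example',
  'fields' => [
    'id' => ['sql_type' => 'int unsigned', 'required' => TRUE, 'primary' => TRUE],
    'title' => ['sql_type' => 'varchar(255)', 'required' => FALSE],
  ],
];

// Build a CREATE TABLE statement from the array (hypothetical helper).
function sketch_create_table_sql(array $entityType): string {
  $columns = [];
  $primary = [];
  foreach ($entityType['fields'] as $name => $field) {
    $columns[] = "`$name` " . $field['sql_type'] . (!empty($field['required']) ? ' NOT NULL' : '');
    if (!empty($field['primary'])) {
      $primary[] = "`$name`";
    }
  }
  if ($primary) {
    $columns[] = 'PRIMARY KEY (' . implode(', ', $primary) . ')';
  }
  return 'CREATE TABLE `' . $entityType['table'] . "` (\n  " . implode(",\n  ", $columns) . "\n)";
}

// Build the matching DROP TABLE statement (hypothetical helper).
function sketch_drop_table_sql(array $entityType): string {
  return 'DROP TABLE IF EXISTS `' . $entityType['table'] . '`';
}

echo sketch_create_table_sql($entityType), ";\n";
echo sketch_drop_table_sql($entityType), ";\n";
```

Because the declaration is plain PHP, it can be loaded without a parsing step, which is where the performance advantage over xml would come from.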
Supporting Dynamic/Virtual Entities
It's also worth keeping in mind that there are now several types of entities that are dynamic & share a DAO:
- Multi-record custom fields
- ECK entities
- SearchKit materialized displays
The DAO structure doesn't cope with this very well, as the assumption has always been 1-1-1 between table, entity & DAO class. But while we're restructuring things, let's avoid adding more code that makes this assumption. An ideal DAO from the POV of virtual entities would be an object that takes the entity name in its constructor & initializes itself with the corresponding table name, fields, and other metadata.
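One way to picture such a DAO is the sketch below: a single class that looks up its table and fields by entity name. Everything in it is hypothetical, including the class name, the registry array and the example table names. A real version would need to load metadata from wherever custom groups, ECK or SearchKit store it.

```php
<?php
// Hypothetical registry of entity metadata, keyed by entity name.
$sketchRegistry = [
  'Contact' => ['table' => 'civicrm_contact', 'fields' => ['id', 'first_name']],
  'Custom_Pets' => ['table' => 'civicrm_value_pets_1', 'fields' => ['id', 'entity_id', 'pet_name']],
];

// One DAO class shared by every entity; no 1-1-1 assumption.
class Sketch_Generic_DAO {
  private $entity;
  private $meta;
  private $values = [];

  public function __construct(string $entity, array $registry) {
    if (!isset($registry[$entity])) {
      throw new InvalidArgumentException("Unknown entity: $entity");
    }
    $this->entity = $entity;
    $this->meta = $registry[$entity];
  }

  public function getTableName(): string {
    return $this->meta['table'];
  }

  // Only fields declared in the metadata may be set.
  public function __set($name, $value) {
    if (!in_array($name, $this->meta['fields'], TRUE)) {
      throw new InvalidArgumentException("{$this->entity} has no field $name");
    }
    $this->values[$name] = $value;
  }

  public function __get($name) {
    return $this->values[$name] ?? NULL;
  }
}

$pet = new Sketch_Generic_DAO('Custom_Pets', $sketchRegistry);
$pet->pet_name = 'Rex';
echo $pet->getTableName(), "\n";
```

Using magic accessors keeps the familiar $dao->field_name style while letting the set of fields vary per entity.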