Generate XML schema docs from XML schema definition files
Background
CiviCRM uses XML for lots of things like:
- Database schema definition
- Extension metadata
- Menu configuration
- etc...
The acceptable schema for these XML files (i.e. elements, attributes, and values) has historically been poorly documented. When I wrote the Database Schema XML docs, I also uncovered a number of inconsistencies in the XML tags, which demonstrate both (a) our lack of documentation and (b) our inability to validate our own XML.
Research
General
Recently, while improving these docs, I spent some time researching various technologies for machine-readable XML schema specifications. Based on this research I've identified two useful (and competing) tools:
- XML Schema Definition (aka XSD)
- RELAX NG (aka RNG).
Both XSD and RNG allow you to write a schema definition file which describes the acceptable elements, attributes, values, and grammar in a valid XML file. Within the schema definition file, you can even write descriptions and examples.
Having .xsd or .rng files would allow us to do the following:
- Validate our XML
- Provide auto-completion and code-level docs for XML authors using compatible IDEs (e.g. PhpStorm)
- Generate human-readable documentation
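To make the idea of descriptions living inside the schema file more concrete, here is a minimal sketch of an annotated RNG file and a few lines of Python (standard library only) that read the descriptions back out. The schema is a hypothetical, heavily simplified fragment modelled on an extension's info.xml, not CiviCRM's real schema, and `describe` is an illustrative helper name. The `a:documentation` element comes from the RELAX NG DTD Compatibility annotations namespace.

```python
import xml.etree.ElementTree as ET

# Namespaces in ElementTree's "{uri}localname" notation.
RNG = "{http://relaxng.org/ns/structure/1.0}"
DOC = "{http://relaxng.org/ns/compatibility/annotations/1.0}"

# Hypothetical schema for a tiny, simplified subset of info.xml.
SCHEMA = """\
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">
  <start>
    <element name="extension">
      <a:documentation>Root element describing one extension.</a:documentation>
      <attribute name="key">
        <a:documentation>Unique name of the extension.</a:documentation>
      </attribute>
      <element name="name">
        <a:documentation>Human-readable title.</a:documentation>
        <text/>
      </element>
    </element>
  </start>
</grammar>
"""


def describe(schema_xml):
    """Return (kind, name, description) for each element/attribute pattern."""
    root = ET.fromstring(schema_xml)
    found = []
    for node in root.iter():  # document order
        if node.tag in (RNG + "element", RNG + "attribute"):
            kind = node.tag.split("}")[1]  # "element" or "attribute"
            doc = node.find(DOC + "documentation")  # direct child only
            text = doc.text.strip() if doc is not None and doc.text else ""
            found.append((kind, node.get("name"), text))
    return found


for kind, name, text in describe(SCHEMA):
    print(f"{kind} {name}: {text}")
```

The same file that a validator or IDE would consume also carries the prose, which is what makes the doc-generation idea possible.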
Doc generating
Auto-generating documentation is the biggest reason that I'm personally interested in this project, although the other two benefits are awesome as well. I spent quite a bit of time researching tools to auto-generate docs from XML schema, and this is what I found:
Tool | Quality | Comments |
---|---|---|
bitfehler/xs3p | Most promising, but buggy, difficult to embed | |
Oxygen | Proprietary | |
xs3p | Terrible style, hard to comprehend output | |
xml-tools | Web-based, unclear license, no source code published | |
There were some others, but basically the point is: none of the existing tools are suitable for our needs.
Goal
This ticket requests the following steps:
- Identify areas of CiviCRM which utilize XML (some are already listed above)
- Write schema definition files for those XML schemata
- Publish the schema definition files on the Internet with short URLs
- Provide documentation for people editing XML which tells them how to specify an xmlns attribute on the root element using the URL for our schema definition file. This way, their (compatible) IDE will auto-complete and provide in-app docs
- Write a script (from scratch) which generates human-readable documentation like this from the schema definition file.
- Incorporate necessary processes into our documentation to ensure that updates to schema result in auto-generated documentation updates.
- Add functionality to the automated test suite to validate XML files used in core (or delegate this to a new ticket)
- Add functionality to civix to validate XML files used in extensions (or delegate this to a new ticket)
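As a rough illustration of the doc-generating script in the steps above, here is a minimal sketch that turns an RNG file (XML syntax) into Markdown using only Python's standard library. Everything in it is an assumption made for illustration: the schema is a hypothetical, simplified fragment, the helper names (`doc_text`, `attributes_of`, `to_markdown`) are made up, and it only understands named elements whose attributes are declared directly or inside a single `<optional>`. A real script would need to handle far more of RELAX NG (`define`/`ref`, `choice`, data types, and so on).

```python
import xml.etree.ElementTree as ET

RNG = "{http://relaxng.org/ns/structure/1.0}"
DOC = "{http://relaxng.org/ns/compatibility/annotations/1.0}"

# Hypothetical, simplified schema fragment used only for this sketch.
SCHEMA = """\
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         xmlns:a="http://relaxng.org/ns/compatibility/annotations/1.0">
  <start>
    <element name="extension">
      <a:documentation>Root element describing one extension.</a:documentation>
      <attribute name="key">
        <a:documentation>Unique name of the extension.</a:documentation>
      </attribute>
      <optional>
        <attribute name="type">
          <a:documentation>Kind of extension.</a:documentation>
        </attribute>
      </optional>
      <element name="name">
        <a:documentation>Human-readable title.</a:documentation>
        <text/>
      </element>
    </element>
  </start>
</grammar>
"""


def doc_text(node):
    """Return the node's a:documentation text with whitespace collapsed."""
    doc = node.find(DOC + "documentation")
    return " ".join(doc.text.split()) if doc is not None and doc.text else ""


def attributes_of(element):
    """Yield (attribute, required) for attributes declared directly on
    the element, or wrapped in an <optional> directly under it."""
    for child in element:
        if child.tag == RNG + "attribute":
            yield child, True
        elif child.tag == RNG + "optional":
            for inner in child.findall(RNG + "attribute"):
                yield inner, False


def to_markdown(schema_xml):
    """Render one Markdown section per named element in the schema."""
    root = ET.fromstring(schema_xml)
    lines = []
    for element in root.iter(RNG + "element"):
        lines += [f"## `<{element.get('name')}>`", "", doc_text(element)]
        attrs = list(attributes_of(element))
        if attrs:
            lines += ["", "| Attribute | Required | Description |", "|---|---|---|"]
            for attr, required in attrs:
                flag = "yes" if required else "no"
                lines.append(f"| {attr.get('name')} | {flag} | {doc_text(attr)} |")
        lines.append("")
    return "\n".join(lines)


print(to_markdown(SCHEMA))
```

Because the output is plain Markdown text, it could be written to a file and committed alongside hand-written docs, or regenerated whenever the schema changes.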
Recommendations
- Choose RNG over XSD
  - Why? Because it seems much easier to learn, write, and read. It's a newer technology, so its support is not as extensive as XSD's, but from what I can tell it has enough support for us to use
- RNG has two syntax variants (much like Sass). Choose the XML syntax over the compact syntax
  - Why? The compact syntax seems to have a steeper learning curve but a more rewarding experience once you're "over the hump". The XML syntax seems to be a bit easier to get started with. Since we don't do a ton of work in this field, I think it's best that we stick with the simpler one: XML
- Generate markdown (as opposed to generating HTML)
  - Why? I think this will be the best strategy for allowing us to easily embed the docs within our current doc systems. Plus, I think the wider community of people who might be interested in this sort of thing would also appreciate the format being somewhat more portable to their doc systems.
- Follow the format of the info.xml doc page
  - I think this page works pretty well, and it wouldn't be too complicated to generate that markdown