Canonize API for storing translated data

Open Issue created 4 years ago by totten

Goal

Enable richer user experiences which incorporate data-translation. Specifically, provide a CRUD API for administrative applications that need to read/write alternate versions of a string in the database.
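To make the goal concrete, here is a minimal sketch of what a CRUD layer for alternate versions of a string might look like. It is an illustration only, not the proposed APIv4 interface. The class name `TranslationStore`, the table name, and the column names are all hypothetical, and SQLite stands in for MySQL so that the example is self-contained.

```python
import sqlite3
from typing import Dict, Optional


class TranslationStore:
    """Hypothetical CRUD wrapper around a single string table (illustrative only)."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self.conn = conn
        # One row per (record, field, language); column names are assumptions.
        conn.execute(
            """CREATE TABLE IF NOT EXISTS translation (
                   entity_table TEXT NOT NULL,
                   entity_id    INTEGER NOT NULL,
                   entity_field TEXT NOT NULL,
                   language     TEXT NOT NULL,
                   string       TEXT NOT NULL,
                   PRIMARY KEY (entity_table, entity_id, entity_field, language)
               )"""
        )

    def save(self, table: str, entity_id: int, field: str, language: str, string: str) -> None:
        """Create the translation, or overwrite it if one already exists."""
        self.conn.execute(
            "INSERT OR REPLACE INTO translation VALUES (?, ?, ?, ?, ?)",
            (table, entity_id, field, language, string),
        )

    def get(self, table: str, entity_id: int, field: str, language: str) -> Optional[str]:
        """Return the translated string, or None if no translation is stored."""
        row = self.conn.execute(
            "SELECT string FROM translation WHERE entity_table = ? AND entity_id = ?"
            " AND entity_field = ? AND language = ?",
            (table, entity_id, field, language),
        ).fetchone()
        return row[0] if row else None

    def list_versions(self, table: str, entity_id: int, field: str) -> Dict[str, str]:
        """Return every stored language version of one field as {language: string}."""
        rows = self.conn.execute(
            "SELECT language, string FROM translation WHERE entity_table = ?"
            " AND entity_id = ? AND entity_field = ?",
            (table, entity_id, field),
        ).fetchall()
        return dict(rows)

    def delete(self, table: str, entity_id: int, field: str, language: str) -> None:
        """Remove one language version; other languages are untouched."""
        self.conn.execute(
            "DELETE FROM translation WHERE entity_table = ? AND entity_id = ?"
            " AND entity_field = ? AND language = ?",
            (table, entity_id, field, language),
        )


if __name__ == "__main__":
    store = TranslationStore(sqlite3.connect(":memory:"))
    store.save("civicrm_msg_template", 1, "msg_html", "fr_FR", "<p>Bonjour</p>")
    print(store.list_versions("civicrm_msg_template", 1, "msg_html"))
```

Keying each row by record, field, and language means that adding a language adds rows rather than columns, which is the property the schema-manipulation approach lacks.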

Background

This is most immediately motivated by mail#83 (moved), which aims to improve the process and experience of drafting and testing workflow templates. For this case, the string being edited (i.e. civicrm_msg_template.msg_html) is a relatively rich piece of content (with HTML tags, tokens, and Smarty expressions, which in turn may vary based on the context in which the template will be used). The richness of the text implies that the editor should offer more features (token-pickers, syntax-highlighting, ad nauseam). Editing a translation of this content in a generic textbox (as with multilingual UI, Transifex UI, or POEdit) would be difficult and error-prone.
This is intended as a step in support of community/feature-request#26, which is a broad effort (initiated by @ayduns @BjoernE) to re-conceive how the multilingual subsystem works. TLDR: The current multilingual implementation requires significant MySQL schema manipulation. This works for 1-3 languages but does not scale to 10 languages. Resolving it requires changes in the storage/lifecycle of translated data.
Inspired by this discussion, Eileen wrote a proof-of-concept extension: https://github.com/eileenmcnaughton/civi-data-translate. The scope of civi-data-translate mostly, but not perfectly, matches the scope of this filing. It matches insofar as it introduces an APIv4 interface and a MySQL table for strings. It diverges insofar as it specifically touches on MessageTemplate. (The work for MessageTemplate is left as a separate matter.) Its biggest obstacle is dependency-hell: it requires a skilled administrator to maintain a deployment, which disincentivizes development and usage.

Approaches

Working within the limits of available code and capacity, it appears feasible to adapt civi-data-translate to this purpose. Either:

- Move its APIv4 interface and data-storage to core-proper, or...
- Move its APIv4 interface and data-storage to core-extension.

Comments

Having an API to edit the strings would be meaningless if we did not have a data-store.
There is a performance question about using MySQL for a string table. (Most FOSS applications use gettext MO files, which are optimized for fast lookup of static strings. This is how Civi handles translation of its numerous app-strings.) In prior discussions with @BjoernE, @ayduns, et al., we identified this balance:
- There is a difference between administration (browsing/editing strings) and runtime lookup (substituting 1000 strings during a page-load).
- For administration, there is no question about whether the performance of a MySQL string-table would be acceptable. It would be. In fact, many different tools/workflows/stores can be acceptable.
- The performance question is relevant to runtime lookup of heavily used strings. It is not necessarily closed, and it depends on other variables (the number of data-strings, the use-case, the hardware, etc).
- If one does need to optimize lookup, the best known approach is to compile to gettext. To wit: Read strings from whatever source is handy, aggregate them, and write them to a cache folder in *.mo format. (You can see de.systopia.l10nmo as a foray into this approach of blending/merging string sources.)
I was worried about proposing this - specifically, worried that it might conflict with a more optimized dataflow. However, on reflection, I think it is complementary progress. Suppose you wanted to patch l10nmo to include a feed of strings provided by web-based administrators. If each web UI stored strings differently, then you'd probably give up. But if they use the same (shared/documented) string API, then it's easier to pull from there.
- (I mention this as a hypothetical. In practice, some things like mail#83 (moved) can be achieved without this level of optimization. The upshot is that we can bite off a chunk of work here on the API/storage side and make some incremental progress.)
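The "compile to gettext" idea mentioned above (read strings from several sources, aggregate them, write them out in *.mo format) can be sketched as follows. This is a rough illustration under stated assumptions, not how l10nmo or Civi actually does it: the helper names `merge_sources` and `compile_mo` are hypothetical, the sources are plain dictionaries standing in for a string API and an existing catalog, and the writer emits a minimal little-endian MO file with no hash table.

```python
import gettext
import io
import struct
from typing import Dict, Mapping


def merge_sources(*sources: Mapping[str, str]) -> Dict[str, str]:
    """Aggregate {original: translation} maps; later sources override earlier ones."""
    merged: Dict[str, str] = {}
    for source in sources:
        merged.update(source)
    return merged


def compile_mo(messages: Mapping[str, str]) -> bytes:
    """Serialize {msgid: msgstr} into GNU gettext MO bytes."""
    # The empty msgid carries metadata; declaring the charset lets readers decode UTF-8.
    catalog = {"": "Content-Type: text/plain; charset=UTF-8\n"}
    catalog.update(messages)
    # The MO format expects the original strings in sorted order.
    entries = sorted((k.encode("utf-8"), v.encode("utf-8")) for k, v in catalog.items())
    count = len(entries)

    orig_table_offset = 7 * 4                        # right after the 7-word header
    trans_table_offset = orig_table_offset + count * 8
    data_offset = trans_table_offset + count * 8     # string data follows both tables

    index = []   # (length, offset) pairs: all originals first, then all translations
    blob = b""
    for msgid, _ in entries:
        index.append((len(msgid), data_offset + len(blob)))
        blob += msgid + b"\0"
    for _, msgstr in entries:
        index.append((len(msgstr), data_offset + len(blob)))
        blob += msgstr + b"\0"

    # Header: magic, version, string count, table offsets, hash size 0, hash offset 0.
    out = struct.pack("<7I", 0x950412DE, 0, count, orig_table_offset, trans_table_offset, 0, 0)
    for length, offset in index:
        out += struct.pack("<2I", length, offset)
    return out + blob


if __name__ == "__main__":
    shipped = {"Hello": "Bonjour", "Thank you": "Merci"}   # e.g. an existing catalog
    admin_edits = {"Hello": "Salut"}                        # e.g. strings from a web UI
    mo_bytes = compile_mo(merge_sources(shipped, admin_edits))
    # Read the result back with the standard library to check that it is well-formed.
    translations = gettext.GNUTranslations(io.BytesIO(mo_bytes))
    print(translations.gettext("Hello"))
```

In a real deployment the bytes would be written to a cache folder and regenerated when the stored strings change. The point of the sketch is only that a shared string API gives such a compiler one well-defined place to read from.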

Edited 4 years ago by totten
