Define `setLocale()` for languages which aren't "all there"
The Basic Question
Q: What to do if someone activates a locale that isn't "all there"?
To clarify the question:
- "Activate a locale" means "Call `setLocale('xx_XX')`". This could be:
  - Click a button in the web UI to set the language of the screen. (This requires a call to `setLocale(...)`.)
  - Send an email or SMS message in another language. The message includes tokens, and tokens can be localized. (This requires a call to `setLocale(...)`.)
  - Process an API call in an alternate language (such as "Preview mailing in locale X" / "Render message in locale X"). It may respond with localized data. (This requires a call to `setLocale(...)`.)
- "All there" means that a locale is enabled/defined in all localization layers:
  - Is marked as active in `civicrm_option_value` (`languages`; ie communication-preference languages)
  - Is marked as active in `civicrm_setting` (`uiLanguages`; ie web UI languages)
  - Has `ts()` data (file `l10n/xx_XX/civicrm.mo`)
  - (In multilingual deployments) Has SQL columns (aka setting `languageLimit`)
  - Has currency formatting rules
  - Has date formatting rules (afaik, we don't actually track this on a per-locale basis right now)
  - Is supported by Drupal/Joomla/WordPress
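To make "all there" a bit more concrete, here is a minimal sketch of a completeness check. It is standalone PHP, not CiviCRM code: the function `findMissingLayers()`, the layer names, and the sample data are all hypothetical stand-ins for lookups against `civicrm_option_value`, `civicrm_setting`, the `l10n/` folder, and so on.

```php
<?php
// Hypothetical sketch: none of these names are CiviCRM APIs.

/**
 * List the localization layers in which a locale is NOT available.
 *
 * @param string $locale
 *   Ex: 'es_US'
 * @param array $layers
 *   Layer name => list of supported locales. A NULL value means the layer
 *   does not apply (eg SQL columns on a single-language deployment).
 * @return string[]
 *   Names of the layers that lack the locale. Empty means "all there".
 */
function findMissingLayers(string $locale, array $layers): array {
  $missing = [];
  foreach ($layers as $name => $supported) {
    if ($supported !== NULL && !in_array($locale, $supported, TRUE)) {
      $missing[] = $name;
    }
  }
  return $missing;
}

// Sample data (made up for illustration).
$layers = [
  'languages' => ['en_US', 'es_US', 'es_MX'],
  'uiLanguages' => ['en_US', 'es_MX'],
  'ts' => ['en_US', 'es_MX'],
  'languageLimit' => NULL, // not a multilingual deployment in this sample
  'currency' => ['en_US', 'es_MX'],
  'cms' => ['en_US', 'es_US', 'es_MX'],
];

echo implode(', ', findMissingLayers('es_US', $layers)), "\n";
// prints "uiLanguages, ts, currency"
```

In this sample, `es_US` is enabled as a communication-preference language but is missing from three other layers, which is exactly the "not all there" situation.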
Example: A staff member writes emails (`Mailing` or `MessageTemplate`) in a few locales (eg `es_US` and `en_US`). They send email to a user who prefers `es_US`. The email has tokens, which can rely on services like `ts()` and `Civi::format()`. But the system doesn't fully support `es_US` (eg there is no `l10n/es_US/civicrm.mo`). How will the email render? How will the tokens be processed?
The Basic Answer
A: Either switch the locale completely -- or mix-in substitutes from other locales.
Example (continued): If you are rendering an email for `es_US` but lack some resources for `es_US`, then you might switch the request to `en_US`, or you might mix-in elements from `es_MX`.
| | Ideal | Complete Switch | Mixed Locales |
|---|---|---|---|
| tsLocale | es_US | en_US | es_MX |
| dbLocale | es_US | en_US | es_MX |
| Currency | es_US | en_US | en_US |
| Dates | es_US | en_US | es_MX |
When the question matters (broadly)
I imagine that many deployers don't care -- they pick 1 or 2 well-defined locales and do everything in those locales. But it can matter if either...
- ...you define communication materials (`Mailing`, `MessageTemplate`) in alternate languages
- ...your primary audience uses a locale that isn't fully supported (`es_US`, `en_NZ`)
Mixed signals
At a lower level, the existing code has some inconsistent answers:
- When booting Civi, `applyLocale()` checks several settings (from Civi+CMS) and decides which locale to activate.
- In the Civi API, there's an option (`setLanguage()` / `option.language`) to set the active locale. It only lets you choose locales that are configured for multilingual.
- In localization unit-tests, some tests call `setLocale()` for locales that aren't enabled.
- `CRM_Core_I18n::setLocale()` allows you to pass anything (without validation).
At a higher level, I imagine two different attitudes:
- You should only process requests in locales that are fully supported. If a user performs an action in some other locale, then they will inevitably get quirks (eg mail-tokens that render with alien content). Better to make a clean switch to a supported locale rather than to peck at quirks.
- You should allow any locale with a best-effort/closest-match. Even if we don't have all localization resources for all locales, we have lots of pieces that get pretty close, and admins should be allowed to send messages with any locale they choose. Granted, they could send messages with tokens that don't localize properly; it's up to them to write the message and choose suitable tokens.
Exploratory branch
There is an exploratory branch (PR 24174) which currently answers this way:
- The class `Locale` represents the major decisions/data-points for the locale (eg `tsLocale`, `dbLocale`, `moneyFormatter`).
- The function `Locale::negotiate('xx_XX')` is a focal-point -- it takes a preferred locale (`xx_XX`), examines the settings+resources, and creates a concrete `Locale` descriptor.
- `Locale::negotiate()` is influenced by a new setting, `partial_locales` (on/off). If enabled, then it will mix locales. If disabled, it will only use complete locales.
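For discussion purposes, the on/off behaviour could be sketched roughly like this. To be clear, this is NOT the code from the exploratory branch: the function `negotiateLocale()`, its parameters, the explicit candidate list, and the sample data are all invented for illustration. It just reproduces the "Complete Switch" and "Mixed Locales" columns from the `es_US` example.

```php
<?php
// Hypothetical sketch: names and array shapes are assumptions, not the PR's API.

/**
 * @param string[] $candidates
 *   Locales in order of preference. Ex: ['es_US', 'es_MX', 'en_US']
 * @param array $available
 *   Facet name => list of locales that have resources for that facet.
 * @param bool $partialLocales
 *   Stand-in for the partial_locales setting.
 * @return array
 *   Facet name => locale chosen for that facet (NULL if nothing matches).
 */
function negotiateLocale(array $candidates, array $available, bool $partialLocales): array {
  if ($partialLocales) {
    // Mixed locales: each facet independently takes its closest match.
    $pick = function (array $supported) use ($candidates): ?string {
      foreach ($candidates as $locale) {
        if (in_array($locale, $supported, TRUE)) {
          return $locale;
        }
      }
      return NULL;
    };
    return array_map($pick, $available);
  }

  // Complete switch: use the first candidate that every facet supports.
  foreach ($candidates as $locale) {
    $isComplete = TRUE;
    foreach ($available as $supported) {
      if (!in_array($locale, $supported, TRUE)) {
        $isComplete = FALSE;
        break;
      }
    }
    if ($isComplete) {
      return array_fill_keys(array_keys($available), $locale);
    }
  }
  return array_fill_keys(array_keys($available), NULL);
}

// Sample data (made up): es_US has no resources, es_MX lacks currency rules.
$available = [
  'tsLocale' => ['en_US', 'es_MX'],
  'dbLocale' => ['en_US', 'es_MX'],
  'currency' => ['en_US'],
  'dates' => ['en_US', 'es_MX'],
];
$candidates = ['es_US', 'es_MX', 'en_US'];

$mixed = negotiateLocale($candidates, $available, TRUE);
$switched = negotiateLocale($candidates, $available, FALSE);
```

With this sample data, `$mixed` picks `es_MX` for everything except currency (which falls back to `en_US`), while `$switched` uses `en_US` across the board. How the candidate order gets decided is left open here; the sketch simply takes it as an input.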
A couple things in flux:
- `negotiate()` looks at a bunch of settings and resources. Which ones matter? How should each be used?
- Some tests were written on the assumption that `setLocale()` has no validation. Some of these can be easily tweaked to make the test-scenario more valid. But there are some (like `testUiLanguages()` and `testI18nEventCopy()`) which have relevant edge-cases. Need to change the test or change `negotiate()`.
- Are there callers in contrib (`setLocale('xx_XX')`) which will barf if the values are negotiated?
(I may have done a similar sort of braindump on MM a couple weeks ago... but I want something to link back to and add comments on...)