- Overview
- Requirements
- Refreshing POT files
- Publishing POT files to Transifex
- Automated: Pulling new translations from Transifex
- Automated: Push files to the Github repository
- Automated: Compiling MO files
- Historical notes
Overview
In a somewhat reverse order, the life of a translation string in CiviCRM:
- When CiviCRM does a string lookup with ts() to fetch the translation, it must find a matching string in the gettext .mo dictionary for the target language.
- (deprecated) The .mo files are shipped in the localisation files (ex: civicrm-5.x.x-l10n.tar.gz) on the download page.
- We host the .mo files on download.civicrm.org, and the installer fetches the specific files.
- (deprecated) Those translation tar.gz files are generated during the release process of the various CiviCRM archives (CMS-specific + translation).
- The release scripts fetch the .mo files from gcloud, where they were uploaded by a Jenkins job that runs daily. Going from .po to .mo is basically the process of indexing the plain text .po file in a binary format efficient for fast lookups.
- To build the .mo files, the Jenkins job fetches the .po files (source / plain text) from the GitLab translation repository. This repository is updated daily, as part of the same daily Jenkins job. This repo is a safeguard in case Transifex disappears. The repo is also mirrored on GitHub.
- The daily fetch from Transifex basically does a tx pull -a -t (pull all translations) and commits them to GitHub.
Now, for the string to exist on Transifex, we need to run the extraction from the CiviCRM source code. This process is done manually, and is described in more detail below. Short overview:
- A member of the core team has to run the extraction scripts. They may run on multiple CiviCRM versions, depending which are supported (ex: ESR).
- Validate that the scripts ran correctly
- Push the new strings to Transifex
Once the new strings are added to Transifex, translators can go and translate. After 24h, the translation files are updated on gcloud (from the first process above), and admins can manually update their translation files if they want to, or wait for the next CiviCRM release.
Supported CiviCRM versions
- Latest RC
- Latest ESR
Requirements
- A shell environment with buildkit
- Install Transifex Client, e.g. sudo easy_install transifex-client or sudo apt-get install transifex-client
- Install "moreutils" for the sponge utility, e.g. sudo apt-get install moreutils
- Install the PHP command line interface (CLI) and MySQL bindings, e.g. sudo apt-get install php-cli php-mysql
You will also need to check out the git repositories in the following way:
- ~/buildkit/build is where your builds are located
- Clone the translation git repo somewhere (git clone https://lab.civicrm.org/dev/translation.git civicrm-translation)
The translation repository contains:
- bin: scripts for automating some of the tasks
- po/<lang>: translation files for each language
- po/pot/: .pot files are the source translation files (in English, as extracted from the CiviCRM code)
The po files use a different file hierarchy than the default Transifex one.
With Transifex, files are structured in the form of civicrm.<component>/<lang>.po. In the l10n git repository, we use the structure "<lang>/<component>.po". It makes it easier to extract and compile only part of the translation. The magic for making this happen is in the file: l10n/.tx/config.
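To make the difference between the two layouts concrete, here is a minimal sketch of the path mapping. It assumes Transifex-style paths of the form civicrm.<component>/<lang>.po and repository paths of the form po/<lang>/<component>.po. The function name tx_to_repo_path is hypothetical; the real mapping is configured in .tx/config, not done by a script like this.

```shell
#!/bin/sh
# Hypothetical helper: map a Transifex-style path to the repository layout.
# Assumed input:  civicrm.<component>/<lang>.po
# Assumed output: po/<lang>/<component>.po
tx_to_repo_path() {
  tx_path=$1
  resource=${tx_path%%/*}         # e.g. "civicrm.pcp"
  file=${tx_path#*/}              # e.g. "fr_CA.po"
  component=${resource#civicrm.}  # e.g. "pcp"
  lang=${file%.po}                # e.g. "fr_CA"
  printf 'po/%s/%s.po\n' "$lang" "$component"
}

tx_to_repo_path "civicrm.pcp/fr_CA.po"   # prints po/fr_CA/pcp.po
```

Grouping by language first is what allows a single language directory to be extracted or compiled on its own.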
Refreshing POT files
The .pot files are the source translation files that will be sent to Transifex for translation. They contain the source US-English strings.
We should regenerate the .pot files when a new release branch is created (ex: 5.14), that is, when the new release goes RC (to allow translators a bit of time to translate before the official release). If we do not refresh the files, translators will not be able to find the new strings in Transifex.
The version numbers in the following examples (such as 6.2, 5.81 and 5.21) are examples only; use the current RC and ESR versions.
To run the extraction process, run:
cd /path/to/civicrm-translation
./bin/build-unified-pots.sh ~/repositories/civicrm "6.2 5.81" 2>&1 | tee pots.log
You can also run the script on the 'master' branch, but this is not recommended except for testing.
This will (a) make fresh copies of the relevant releases in your temp directory, (b) bootstrap them with minimal config, (c) run GenCode to create all of the DAO files and (d) create the given release’s POT files, afterwards (e) combining the per-release POT files into release-agnostic POT files and (f) putting them in po/pot.
Gettext will emit a warning if it detects an incorrect use of the ts() function. For example, ts("Create: $entity"); is a common mistake. It should be written as ts("Create: %1", [1 => $entity]);. For more information, see the dev-guide Translation for Developers. If you find a mistake, correct it in the CiviCRM code (send a pull-request on GitHub), commit, then re-run the script.
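Before running the full extraction, a rough pre-check can surface the most common form of this mistake. The sketch below is not part of the extraction scripts: the function name find_interpolated_ts is hypothetical, and the pattern only catches double-quoted strings with a variable directly inside a ts() call on a single line, so it can miss cases and report false positives.

```shell
#!/bin/sh
# Hypothetical pre-check: list PHP lines where ts() is called with a
# double-quoted string that contains an interpolated "$variable".
# Output format is "file:line:content" (the /dev/null argument forces
# grep to print file names even when only one file is searched).
find_interpolated_ts() {
  find "$1" -type f -name '*.php' -exec \
    grep -nE '(^|[^A-Za-z0-9_])ts\("[^"]*\$[A-Za-z_]' /dev/null {} +
}

# Usage (the path is an example):
#   find_interpolated_ts ~/buildkit/build
```

Gettext's own warnings during extraction remain the authoritative check; this only shortens the feedback loop.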
If the /tmp partition is too small
The extraction creates a lot of temporary files in /tmp. If you do not have enough space in that directory, you can set the TMPDIR environment variable to use another directory (before running the build-unified-pots.sh script).
For example:
mkdir ~/tmp
export TMPDIR=~/tmp
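To decide whether a different TMPDIR is needed, you can check the free space first. This is a minimal sketch: the helper name free_kb and the 2 GiB threshold are assumptions for illustration, since the amount of space the extraction actually needs is not stated here.

```shell
#!/bin/sh
# Print the free space, in kilobytes, of the filesystem holding a directory.
# "df -P" guarantees one output line per filesystem; column 4 is "Available".
free_kb() {
  df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

TMPDIR=${TMPDIR:-/tmp}
required_kb=2097152   # hypothetical threshold (2 GiB); adjust as needed
if [ "$(free_kb "$TMPDIR")" -lt "$required_kb" ]; then
  echo "Warning: less than $required_kb KB free in $TMPDIR" >&2
fi
```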
Review new translation strings
Review the new strings to make sure that the scripts are working correctly. Accidentally deleting strings from Transifex would be pretty disastrous. It's also worth taking a few minutes to spot obvious typos in new strings.
git status
./bin/diff-check.php | less
./bin/diff-check.php | grep -E '^(# FILE|removed|added|unchanged)'
./bin/diff-check.php | grep -E '^(removed|added|unchanged)' | awk '{foo[$1] += $2} END {for (f in foo) {print f, foo[f]}}'
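The last command is meant to total the counts per category across all files. The sketch below shows that aggregation on sample input. It assumes each summary line starts with a status keyword followed by a number; the exact output format of diff-check.php is not shown here, so treat the sample lines as hypothetical.

```shell
#!/bin/sh
# Sum the second field per status keyword (first field).
# Assumed input format: "<status> <count> ..." on each line.
sum_by_status() {
  awk '{ total[$1] += $2 } END { for (s in total) print s, total[s] }' | sort
}

# Hypothetical sample lines standing in for the diff-check.php summary:
printf '%s\n' 'added 3' 'removed 1' 'added 2' 'unchanged 40' | sum_by_status
# prints:
#   added 5
#   removed 1
#   unchanged 40
```

A large "removed" total is the signal to stop and investigate before committing.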
If it all looks good, commit the changes:
git add po/pot/*
git commit -m "New strings for version 5.xx"
git tag 5.21
git push --tags
Publishing POT files to Transifex
Once the new files are generated, add them to the repository (git add -p if you want to take a look at what changed, git add . otherwise; git commit to commit and git push to push to the remote repository).
Transifex is nice enough to track the repository’s POT files, so the refreshed POT files do not need to be uploaded to Transifex explicitly. If needed, you can run:
tx push -s
See the .tx/config file for more information.
If a new component was added to CiviCRM (i.e. you have a new something.pot that was created), it needs to be added to Transifex. For example:
tx set --auto-local -r civicrm.pcp 'po/<lang>.po' --source-lang en --source-file po/pot/pcp.pot --execute
tx push -s -r civicrm.pcp
tx pull -a -r civicrm.pcp
For more information: http://help.transifex.net/features/client/index.html#auto-local.
You will also have to add an entry in the .tx/config file (you can copy/adapt an entry for another file).
Automated: Pulling new translations from Transifex
Since 2013-01-18, this is done daily on a test node by Jenkins.
Before a release (or whenever we want to recompile MO files) new translations need to be pulled from Transifex. This is done with the Transifex Client; the configuration is already in the repository (in the .tx directory), so a simple tx pull -a should suffice.
After pulling the translations they should be committed to the repository (with git add po; git commit -m 'pull from Transifex'; git push).
Automated: Push files to the Github repository
Since 2013-01-18, this is done daily on a test node by Jenkins.
Once you have finished adding/reviewing the .pot files and pulled the .po translations for each language, you can do a last review and push your changes to GitHub:
git status
git diff
./bin/commit-to-git.sh
git push
The "commit-to-git.sh" script will use the "Last-Translator" as the "author" of the git commit. Although it's not very precise, it gives some idea of who is translating what.
Automated: Compiling MO files
Since 2014-04-29, this is run daily on a test node via Jenkins.
Only languages that are more than 20% translated are added to the official civicrm-l10n.tar.gz file. See: New-official-language for more information.
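One way to estimate a completion percentage is to parse the summary line printed by msgfmt --statistics. This is a sketch under assumptions: the helper name percent_translated is hypothetical, fuzzy entries are counted as not translated, and how the Jenkins job actually measures the 20% threshold is not stated here.

```shell
#!/bin/sh
# Compute the percentage of translated messages from a summary line in the
# format printed by "msgfmt --statistics", for example:
#   "120 translated messages, 30 fuzzy translations, 50 untranslated messages."
# Fuzzy and untranslated entries both count as not translated.
percent_translated() {
  printf '%s\n' "$1" | awk '
    {
      translated = 0; total = 0
      for (i = 1; i < NF; i++) {
        if ($i ~ /^[0-9]+$/) {
          total += $i
          if ($(i + 1) == "translated") translated = $i
        }
      }
      if (total > 0) printf "%d\n", translated * 100 / total; else print 0
    }'
}

percent_translated "120 translated messages, 30 fuzzy translations, 50 untranslated messages."
# prints 60
```

With gettext installed, the summary line can be captured with something like msgfmt --statistics -o /dev/null po/fr/civicrm.po 2>&1 (the file path is an example; msgfmt writes the statistics to standard error).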
The .mo files are available from download.civicrm.org. The directory is not browsable, but you can construct the URL using the form:
https://download.civicrm.org/civicrm-l10n-core/mo/XX/civicrm.mo
... where XX is the language code, such as 'es_ES', 'fr_FR', 'fr_CA'.
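As a small illustration, the URL can be built from the language code as follows. The function name mo_url is hypothetical; the URL pattern is the one given above.

```shell
#!/bin/sh
# Build the download URL of the compiled .mo file for a language code.
mo_url() {
  printf 'https://download.civicrm.org/civicrm-l10n-core/mo/%s/civicrm.mo\n' "$1"
}

for lang in es_ES fr_FR fr_CA; do
  mo_url "$lang"
done

# To download one file (requires curl and network access):
#   curl -fLO "$(mo_url fr_CA)"
```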
Old historical note: It should be fixed by creating a script that uses the xx → xx_YY language mappings found in the xml/templates/languages.tpl file (or the relevant parts of the civicrm_option_value table).
To recompile MO files run the bin/compile-mo-files.sh script. This will create civicrm.mo files for every language:
chastell@devielle:~/CiviCRM/git/l10n$ bin/compile-mo-files.sh
compiling af
compiling ar
…
compiling de
compiling de_CH
compiling el
…
compiling vi
compiling zh_CN
compiling zh_TW
chastell@devielle:~/CiviCRM/git/l10n$ git status
# On branch master
# Untracked files:
# (use "git add <file>..." to include in what will be committed)
#
# po/af/civicrm.mo
# po/ar/civicrm.mo
…
# po/de/civicrm.mo
# po/de_CH/civicrm.mo
…
# po/vi/civicrm.mo
# po/zh_CN/civicrm.mo
# po/zh_TW/civicrm.mo
nothing added to commit but untracked files present (use "git add" to track)
The trick is to move the MO files to their relevant places in the Subversion repository (we keep the MO files in the Subversion repository, as it’s linked via an svn:externals link from our main repository). Unfortunately, Transifex uses the shortened language names where possible (e.g., pl and hi), using the full xx_YY notation only where necessary (so pt is Portuguese and pt_BR is Brazilian Portuguese), while CiviCRM always uses the full xx_YY notation (i.e., pl_PL, hi_IN and pt_PT for Polish, Hindi and Portuguese, respectively).
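The naming difference can be sketched as a lookup from the Transifex code to the CiviCRM code. This is an illustration only: the function name civicrm_locale is hypothetical, and the table contains just the three pairs named above, so a real script would need the complete mapping.

```shell
#!/bin/sh
# Map a Transifex language code to the full xx_YY code used by CiviCRM.
# Only the pairs named in the text are listed. Codes that already contain
# an underscore are passed through; other short codes are reported as unknown.
civicrm_locale() {
  case $1 in
    pl)  echo pl_PL ;;
    hi)  echo hi_IN ;;
    pt)  echo pt_PT ;;
    *_*) echo "$1" ;;
    *)   echo "unknown mapping for: $1" >&2; return 1 ;;
  esac
}

civicrm_locale pl      # prints pl_PL
civicrm_locale pt_BR   # prints pt_BR
```

Failing loudly on an unknown short code is deliberate, since guessing a region would put a file in the wrong directory.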
This means that the compiled MO files need to be carefully moved to their relevant subdirectories in the Subversion repository. A crude solution (assuming the Subversion checkout of http://svn.civicrm.org/l10n is at ../../svn/l10n):
Historical notes
- March 2010: Migration from Pootle to Transifex
- March 2011: One single translation file for multiple versions of CiviCRM (the 3 latest stable versions), instead of having version-specific translations.
- 2013? Extension translation