Help:Translation tutorial

This tutorial explains how to translate Commons files into other languages. Both the file itself and its file description can be translated.

File description

Add a separate description for each language the description is translated into, and include all of the information, including any links, in each one. Each language should look like:

{{en|1=actual content}}
{{fr|1=le contenu en question}}

where "en" is the language being added. The 1= allows including characters such as =. It does not refer to any language numbering.

The template {{Multilingual description}} can also be used (but it is not recommended for file description pages):

{{Multilingual description
|ca=text en català.
|de=deutscher Text.
|en=English text.
|eu=Euskarazko testua
|fr=texte français.
|it=parole italiane.
|ja=日本語の文字
|mk=Текст на македонски.
|ru=Русский текст.
}}

SVG files

SVG files that contain plain text in <text> elements can be easily translated into other languages, and should be marked with either the {{Translation possible}} or {{Translate}} template. To find the translatable strings, open the file in a text editor and search for "<text". For example, File:1929 wall street crash graph.svg was found to have the following text:

<text x="230" y="60" text-anchor="middle">Wall Street Crash on the Dow Jones Industrial Average, 1929</text>
<!-- Y-axis -->
<g id="percentage_numbers" transform="translate(-5 0)" font-size="12"  text-anchor="end">
  <text y="238">100</text>
  <text y="184">200</text>
  <text y="130">300</text>
  <text y="76">400</text>
</g>
<!-- X-axis -->
<g id="x_axis_numbers" transform="translate(0 55)" font-size="12" text-anchor="middle">
  <text y="250">Oct</text>
  <text x="60" y="250">Jan</text>
  <text x="120" y="250">Apr</text>
  <text x="180" y="250">Jul</text>
  <text x="240" y="250">Oct</text>
  <text x="300" y="250">Jan</text>
  <text x="360" y="250">Apr</text>
  <text x="420" y="250">Jul</text>
  <text x="480" y="250">Oct</text>
  <text x="180" y="268" font-weight="bold" font-size="13">1929</text>
  <text x="420" y="268" font-weight="bold" font-size="13">1930</text>
</g>

This has two kinds of items that need to be translated: the title, "Wall Street Crash on the Dow Jones Industrial Average, 1929", and the names of the months of the year.
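The search for "<text" can also be done with a short script. The following is a minimal sketch, assuming the file declares the standard SVG namespace and is passed in as a string without an XML encoding declaration; the function name list_text_strings and the shortened sample are illustrative only.

```python
import xml.etree.ElementTree as ET

# Assumption: the file uses the standard SVG namespace on its root element.
SVG_NS = "{http://www.w3.org/2000/svg}"

def list_text_strings(svg_source: str) -> list[str]:
    """Return the non-empty content of every <text> element, in document order."""
    root = ET.fromstring(svg_source)
    strings = []
    for element in root.iter(SVG_NS + "text"):
        # itertext() also collects text nested in child elements such as <tspan>.
        content = "".join(element.itertext()).strip()
        if content:
            strings.append(content)
    return strings

# Shortened sample based on the excerpt above.
sample = """<svg xmlns="http://www.w3.org/2000/svg">
  <text x="230" y="60" text-anchor="middle">Wall Street Crash on the Dow Jones Industrial Average, 1929</text>
  <g id="x_axis_numbers">
    <text y="250">Oct</text>
    <text x="60" y="250">Jan</text>
  </g>
</svg>"""

for string in list_text_strings(sample):
    print(string)
```

The printed list is the set of strings a translator has to consider; purely numeric labels such as the axis values can usually be left alone.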

There are two choices for translation: either create a separate file for each language, usually named by adding the two-letter language code to the file name, or add multiple languages within the same file using <switch>.

Using a separate file

First, to translate this file into a new language by creating a separate file, copy the source code into a new file whose name adds the two-letter language code for the new language (compare the "-en" in File:Atmosphere layers-en.svg).

Three different end-of-line conventions might be used in the file (classic Mac OS, Unix, and Windows), so if everything appears to be on one line, the file probably uses a different EOL convention than the editor expects. This can be sorted out by clicking on the SVG image, viewing the source code in the browser, and copying and pasting the entire source into the new file. The file needs to be saved with an .svg extension and must use UTF-8 encoding instead of ANSI. The "Unicode" option offered by some Windows editors (typically UTF-16) could be used but has interoperability problems. To keep the .svg extension when saving from a Windows editor, enclose the entire file name in quotes in the save dialog.
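As an alternative to copying through the browser, the line endings and encoding can be normalized with a script. This is a minimal sketch under stated assumptions: the function names are hypothetical, and the caller is assumed to know the encoding of the source file.

```python
from pathlib import Path

def normalize_svg_bytes(raw: bytes, source_encoding: str = "utf-8") -> bytes:
    """Decode the file, convert all line endings to LF, and re-encode as UTF-8."""
    text = raw.decode(source_encoding)
    # Replace Windows (CR LF) first, then any remaining lone CR.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.encode("utf-8")

def save_translated_copy(source: Path, target: Path, source_encoding: str = "utf-8") -> None:
    """Write a normalized UTF-8 copy of `source` to `target` (hypothetical helper)."""
    target.write_bytes(normalize_svg_bytes(source.read_bytes(), source_encoding))

print(normalize_svg_bytes(b"<svg>\r\n<text>Oct</text>\r</svg>\n"))
```

If the file starts with an XML declaration naming another encoding, that declaration would also need to be changed to UTF-8 by hand; the sketch does not do this.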

Translate the title and the names of the months into the new language, save the file, indicate which file it was derived from, and keep whatever license the original used. A public-domain (PD) file, though, can be used to create a file with any license. It is also helpful to translate the description, while keeping the existing description as a separate language entry.

Using the same file

The second choice is to add the new language (and other languages) to the same file using the SVG <switch> element.[1] The code below shows how to translate the title into Arabic, Chinese, French, and German:

<switch transform="translate(230 60)" text-anchor="middle">
  <text systemLanguage="en">Wall Street Crash on the Dow Jones Industrial Average, 1929</text><!-- English -->
  <text systemLanguage="ar">وول ستريت تحطم على مؤشر داو جونز الصناعي المتوسط 1929</text><!-- Arabic -->
  <text systemLanguage="de">Wall-Street-Absturz auf den Dow Jones Industrial Average, 1929</text><!-- German -->
  <text systemLanguage="fr">Krach de Wall Street sur le Dow Jones Industrial Average, 1929</text><!-- French -->
  <text systemLanguage="zh">華爾街股災道瓊斯工業平均指數,1929年</text><!-- Chinese -->
  <text>Wall Street Crash on the Dow Jones Industrial Average, 1929</text><!-- default -->
</switch>

SVG Translate may be used to automatically add switch elements to an SVG file.

Note that since all of the text is at the same location, the x,y was changed to a translate transformation on the switch element to avoid repeating the coordinates in the text elements. Since the text is anchored at the middle, longer or shorter translations will still be centered.
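The structure of such a <switch> block can also be generated programmatically. The following is a minimal sketch, not the way SVG Translate works: the function name build_title_switch is hypothetical, and the elements are created without a namespace on the assumption that the output is pasted into an existing SVG file.

```python
import xml.etree.ElementTree as ET

def build_title_switch(translations: dict[str, str], default: str, x: int, y: int) -> ET.Element:
    """Build a <switch> with one <text> per language and a default <text> as the last child."""
    switch = ET.Element("switch", {
        # The shared position lives on the switch, so no child repeats x and y.
        "transform": f"translate({x} {y})",
        "text-anchor": "middle",
    })
    for lang, label in translations.items():
        text = ET.SubElement(switch, "text", {"systemLanguage": lang})
        text.text = label
    # The child without systemLanguage must come last: it is the fallback.
    fallback = ET.SubElement(switch, "text")
    fallback.text = default
    return switch

switch = build_title_switch(
    {
        "en": "Wall Street Crash on the Dow Jones Industrial Average, 1929",
        "de": "Wall-Street-Absturz auf den Dow Jones Industrial Average, 1929",
        "fr": "Krach de Wall Street sur le Dow Jones Industrial Average, 1929",
    },
    default="Wall Street Crash on the Dow Jones Industrial Average, 1929",
    x=230,
    y=60,
)
print(ET.tostring(switch, encoding="unicode"))
```

Keeping the default as the last child matters because the first matching child of a switch is the one that is rendered.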

These phrases were translated using Google Translate, but a better translation may be obtained by finding the title of the corresponding "Stock market crash of 1929" article in different language wikis. Another way to find translations is to look at the caption used for the image in a particular language wiki.

It is helpful to include a translation for every language wiki that uses the file, but this can be quite time-consuming, and additional translations can always be added later. In this case, the file was used by wikis in 21 different languages.

As with a separate file, the file will need to be saved as UTF-8.

While building the file, it can easily be checked by viewing it in a browser, but there may still be surprises after it is uploaded, because Wikimedia renders SVG files with rsvg rather than in the browser, and some adjustments might be needed. In particular, to get % signs or parentheses to display correctly at the end of text in right-to-left languages such as Hebrew and Arabic, a Unicode LEFT-TO-RIGHT MARK (&#8206; or &lrm;) or RIGHT-TO-LEFT MARK (&#8207; or &rlm;) may be needed. The numeric forms are the safer choice in an SVG file, because the named entities are defined by HTML rather than by XML. For example, to display the title as "Stock Market (1929)" in Arabic, <text>&#8206;(سوق الأوراق المالية (1929</text> is used.

Most languages use a comma for a decimal separator, and a period for a thousands separator, while English, Japanese, Chinese, and some others reverse those. French uses a space for a thousands separator, and Russian prefers a space but can use a period. So even if the default is English, it is easier to make the languages that use a comma for a decimal separator the default for numbers.

English 25,678.921
French  25 678,921
Russian 25 678,921 or 25.678,921
Spanish 25.678,921 (also German)

Becomes

<switch>
  <text systemLanguage="en">25,678.921</text>
  <text systemLanguage="fr,ru">25 678,921</text>
  <text systemLanguage="de,es">25.678,921</text>
  <text>25,678.921</text>
</switch>
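The number variants in such a switch can be produced from one value. The following is a minimal sketch; the function name format_number and the SEPARATORS table are hypothetical and only encode the conventions listed in the table above.

```python
def format_number(value: float, thousands: str, decimal: str, places: int = 3) -> str:
    """Format `value` with the given thousands and decimal separators."""
    english = f"{value:,.{places}f}"  # English style, e.g. "25,678.921"
    integer_part, _, fraction = english.partition(".")
    integer_part = integer_part.replace(",", thousands)
    return integer_part + (decimal + fraction if fraction else "")

# (thousands separator, decimal separator) per language, taken from the table above.
SEPARATORS = {
    "en": (",", "."),
    "fr": (" ", ","),
    "ru": (" ", ","),
    "de": (".", ","),
    "es": (".", ","),
}

for lang, (thousands, decimal) in SEPARATORS.items():
    print(lang, format_number(25678.921, thousands, decimal))
# en 25,678.921
# fr 25 678,921
# ru 25 678,921
# de 25.678,921
# es 25.678,921
```

Languages that produce the same string can then share one <text> element by listing their codes together in systemLanguage, as in the example above.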

Language matching rules

A practical limitation of <switch> comes from how the browser (or rsvg) language setting interacts with SVG's language tag matching rules. Most browsers allow the user to declare a language preference list (used for the Accept-Language field in the HTTP header).[2] For example, the language preference list "en-GB;q=1.0,en;q=0.9,de;q=0.5" (using BCP 47 language tags)[3] says the user prefers British English the most, then any variety of English, and, failing those, would prefer German. The goal is to give the user the best language match, but a few things get in the way with Wikimedia SVG images.

First, Wikimedia does not serve up raw SVG, so the browser preference list does not interact with the display of the image. Instead, SVG files are first converted to a bitmap format by rsvg; Wikimedia serves those bitmap files rather than the SVG. Consequently, the language matching is done by rsvg. Furthermore, the language matching is not done with the browser’s language preference list but rather the |lang=xx argument specified in the [[File:yyyy.svg...]] markup.

Second, in SVG 1.1, there was no concept of ordering language preferences: the first compatible language in the switch list is selected rather than the best match in the entire list.[4] For example, an SVG file may try to present both US and Great Britain variants of English:

<switch>
  <text systemLanguage="en-US">center</text>
  <text systemLanguage="en-GB">centre</text>
  <text>-center-</text>
</switch>

If the user’s browser is set to just "en-GB" (language English, region Great Britain), then the user will get the spelling "centre".[5] Although that gives the desired result in this particular case, it often gives the wrong result in other cases because a browser setting of "en-GB" does not match the more general systemLanguage="en". A browser set to "en-GB" will only match "en-GB" content; it will ignore all other English content. For example, a Canadian user who sets their browser to "en-CA" will get the default "-center-" from the above SVG code.

It is better for most English speakers to set their browser language to include "en" to get the most reasonable behavior. The browser’s "en" will match not only an "en" tag but also any variant of English such as "en-US" or "en-GB". Setting the preference list to "en-GB;q=1.0,en;q=0.9" would be the appropriate way for a user to specify a preference for British over US English. Unfortunately, the SVG 1.1 matching rules do not work that way. In the above example, the "en" will match systemLanguage="en-US", which comes first in the switch, so the user stating a preference for British English will get the US spelling "center". Displaying multiple variations of a single language in one file is therefore problematic.
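The cases above can be checked with a small model of the SVG 1.1 rule quoted in reference 4. This is a minimal sketch, not code from any browser or renderer: the function names are hypothetical, the switch is represented as a list of (systemLanguage, content) pairs, and case-insensitive comparison is an assumption.

```python
from typing import Optional

def tag_matches(user_language: str, system_language: str) -> bool:
    """SVG 1.1 test: the user tag equals a listed tag, or is a prefix of it followed by '-'."""
    user = user_language.lower()
    for candidate in system_language.lower().split(","):
        candidate = candidate.strip()
        if candidate == user or candidate.startswith(user + "-"):
            return True
    return False

def select_child(children: list[tuple[Optional[str], str]], preferences: list[str]) -> Optional[str]:
    """Return the content of the FIRST child that matches any preference (no ranking)."""
    for system_language, content in children:
        if system_language is None:  # child without systemLanguage: always selected
            return content
        if any(tag_matches(pref, system_language) for pref in preferences):
            return content
    return None

children = [("en-US", "center"), ("en-GB", "centre"), (None, "-center-")]

print(select_child(children, ["en-GB"]))        # centre
print(select_child(children, ["en-CA"]))        # -center-
print(select_child(children, ["en-GB", "en"]))  # center
```

The last call shows the problem: the order of the user's preferences plays no role, so "en" matching the first child wins over the exact "en-GB" match further down.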

The semantics of switch may change and follow SMIL allowReorder="yes" rules.[6][7] Such a change would improve language matching but will not impact Wikimedia images until other changes are made. Although some browsers process SVG switch elements with allowReorder="yes", the attribute is not in the SVG standard and will cause SVG validation to fail.

Third, rsvg has a bug: it does not follow SVG langtag semantics. Instead, it compares tags only up to the first hyphen, so it cannot distinguish en-US from en-GB. The bug has serious consequences for Chinese (zh-Hans and zh-Hant) and Serbian (sr-Latn and sr-Cyrl), because rsvg sees only the part of the tag before the hyphen.
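The effect of comparing only up to the first hyphen can be modeled in a few lines. This is a sketch of the behavior as described here, not rsvg's actual code; the function names and the placeholder labels are hypothetical.

```python
from typing import Optional

def primary_subtag(tag: str) -> str:
    """Everything before the first hyphen, lower-cased ("zh-Hant" -> "zh")."""
    return tag.strip().split("-", 1)[0].lower()

def truncated_select(children: list[tuple[Optional[str], str]], requested: str) -> Optional[str]:
    """Pick the first child whose primary subtag equals that of the requested language."""
    for system_language, content in children:
        if system_language is None:  # default child
            return content
        listed = [primary_subtag(tag) for tag in system_language.split(",")]
        if primary_subtag(requested) in listed:
            return content
    return None

children = [
    ("zh-Hans", "Simplified label"),
    ("zh-Hant", "Traditional label"),
    (None, "Default label"),
]

# A request for Traditional Chinese is answered with the Simplified child,
# because both tags reduce to "zh" and the Simplified child comes first.
print(truncated_select(children, "zh-Hant"))  # Simplified label
```

Under this model, whichever script variant appears first in the switch is the only one that can ever be selected.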

Note: Large files (> 256 kb) may need to have the translations appear closer to the beginning of the file.

After the file is uploaded, a new drop-down box will appear listing all of the languages used. Add the template {{Translate|switch=yes}} just above the description heading and remove the {{Translation possible}} template; if {{Translate}} is already used, add "|switch=yes" to it and move it above the description instead. This will create a box showing that the file has multiple translations, with advice on using them.

Using translations

After a new translated file, or a new translation within an existing file, is added, it is helpful to go to each wiki that would benefit from the translated image. If this step is skipped, it could be years before someone notices that the translated image is available.

  • If a new file is used, replace the old image name with the name of the translated image.
  • If the same file is used, with the <switch> element, add |lang=xx where xx is the two-letter language code of that wiki.
    • If the filename is an infobox parameter, so that wikitext [[File:...]] is not directly accessible, append {{!}}lang=xx to the filename.
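The two forms of the lang parameter described in the list above can be written out as follows. This is a minimal sketch; the helper names file_link and infobox_filename are hypothetical and simply assemble the wikitext strings.

```python
def file_link(filename: str, lang: str) -> str:
    """Wikitext for a direct file link that selects one <switch> language."""
    return f"[[File:{filename}|lang={lang}]]"

def infobox_filename(filename: str, lang: str) -> str:
    """Value for an infobox parameter that accepts only a filename.

    {{!}} stands in for the pipe character, which cannot be written
    literally inside a template parameter.
    """
    return filename + "{{!}}lang=" + lang

print(file_link("1929 wall street crash graph.svg", "fr"))
# [[File:1929 wall street crash graph.svg|lang=fr]]
print(infobox_filename("1929 wall street crash graph.svg", "fr"))
# 1929 wall street crash graph.svg{{!}}lang=fr
```

Other image options used on the page, such as a size or a caption, are kept as they are; only the lang parameter is added.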

All new accounts are SUL accounts, so even if you have never edited a particular wiki, you will be logged in to each wiki that you visit. When you edit a wiki whose language you do not understand, it is still helpful to add an edit summary indicating either that a language selection is being added or that the image is being replaced. For this, an automatic translation into the wiki's language is better than a summary in a different language. In the worst case, a common language such as Arabic, Chinese, English, French, Russian, or Spanish (the official UN languages) can be used for the edit summary, but this is not as good.

Alternatively, an image link can be used as the edit summary:

Edit summary:
/*Foo bar*/[[File:Translation arrow.svg]]

This will add a link in the summary to the translation arrow image. The image itself will not appear in the page history.

SVG tools

  • There is a translation tool, ‘SVG Translate’ (by Jarry1250), that can assist with the translation of SVG files, but the tool is still in beta. The tool does not do the translating; the translations are supplied by the user.
  • The Wikimedia inline tool ‘SVGedit.js’ (by Rillke) can also be used.

Other images

Any image other than an SVG needs to be translated by creating a new file, normally with the two-letter language code added to the file name. Photographs should not normally be translated within the image; add a translated caption instead.

Linking to other translations

A gallery or list in the "Other versions" section of the description can be used to link to all of the separate files that were created in different languages. If there are more than a small number of translations, a separate template page can be used for the gallery, normally named Template:Other versions/Name of image; see, for example, File:Atmosphere layers-en.svg and Template:Other versions/Atmosphere layers.

Translation accuracy

As of 26 May 2016, Google Translate offers translations for 104 languages (it uses the code "iw" for Hebrew, "he"), but many of the translations are clunky and need to be corrected by a native speaker. If you notice a translation error that you cannot fix yourself, at least note it so that it can be fixed; there is no reason to ignore errors.

Translation help

References

  1. SVG 1.1 switch element, http://www.w3.org/TR/SVG/struct.html#SwitchElement
  2. http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.4
  3. http://www.ietf.org/rfc/bcp/bcp47.txt
  4. SVG 1.1, § 5.8.5, "The 'systemLanguage' attribute", http://www.w3.org/TR/SVG/struct.html#SwitchElement stating, "Evaluates to 'true' if one of the languages indicated by user preferences exactly equals one of the languages given in the value of this parameter, or if one of the languages indicated by user preferences exactly equals a prefix of one of the languages given in the value of this parameter such that the first tag character following the prefix is '-'."
  5. https://www.w3.org/TR/SVGTiny12/struct.html#SystemLanguageAttribute
  6. https://github.com/w3c/svgwg/issues/136
  7. https://www.w3.org/TR/SMIL/smil-content.html#adef-allowReorder