Commons:Ancient Chinese characters project/Tutorial

From Wikimedia Commons, the free media repository

This is the Commons:Ancient Chinese characters project's tutorial for creating scalable SVG images from ordinary GIF files for the project.

Selecting the source images

  1. Choose a Chinese character, Japanese kanji or Korean hanja that has not yet been converted to SVG, or whose existing SVG could be improved.
  2. Pick one or more of the images for the styles seal, bigseal, bronze and oracle:
  3. Find data and pictures of the character in question at Chinese Etymology (Richard Sears allowed use of his data[1]).
  4. Select the picture you find most interesting in a given category and download it to your computer.
    • For each style, save the selected GIF or PNG image on your computer. To distinguish between them, you may follow this naming convention:
    • oracle => *-oracle.gif / png
    • bronze => *-bronze.gif / png
    • bamboo and silk => *-silk.png
    • bamboo and slip => *-slip.png
    • seal => *-seal.gif / png
    • bigseal => *-bigseal.gif

Please keep the code name of each image, e.g. "J12333" in J12333.gif; it will be needed later.
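The naming convention above can be sketched as a small shell helper. This is only an illustration: acc_name is a hypothetical function name, not part of any project tooling, and the example reuses the code name j14138 that appears later in this tutorial.

```shell
#!/bin/sh
# Build a local file name following the tutorial's naming convention.
# acc_name is a hypothetical helper, not part of any project tooling.
#   $1 = character, $2 = style (oracle, bronze, silk, slip, seal, bigseal)
#   $3 = extension (gif or png), defaults to gif
acc_name() {
    case "$2" in
        oracle|bronze|silk|slip|seal|bigseal)
            printf '%s-%s.%s\n' "$1" "$2" "${3:-gif}" ;;
        *)
            echo "unknown style: $2" >&2
            return 1 ;;
    esac
}

# Example: copy a downloaded image to its conventional name.
# The original j14138.gif is kept, so the code name is not lost.
# cp j14138.gif "$(acc_name 木 oracle)"   # creates 木-oracle.gif
```

Copying rather than renaming keeps the original file, and with it the code name that the upload template asks for.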

Conversion from gif to SVG format

Conversion using Inkscape

Follow these steps to convert the gif image to SVG, e.g. *-seal.gif to *-seal.svg. For more information, please see the detailed picture guide below.

  1. Paste the GIF image file into Inkscape and set the page size to 300px × 300px.
  2. Scale the GIF image uniformly to 300px in height, that is, preserving its original proportions.
  3. Select Path > Trace Bitmap from the menu bar (shortcut: Shift + Alt + B).
  4. Run a Single Scan with either Brightness cutoff = 0.950 or Color quantization = 2.
  5. Retouch the path to make up for the GIF file's low quality and to correct obvious mistakes.
  6. Resize the SVG path to a height of 290px, again uniformly to preserve the original proportions.
  7. Center the SVG path on the page by selecting Object > Align and Distribute from the menu bar (shortcut: Shift + Ctrl + A). Do this both vertically and horizontally.
  8. If everything went well, delete the underlying GIF image, which is still there under the path.
  9. Save your file according to the naming convention, e.g. *-seal.svg, where * is the character you chose in step 1.
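The arithmetic behind steps 6 and 7 can be checked with a short shell sketch. It is not something Inkscape requires you to run; fit_glyph is a hypothetical helper, and it works in whole pixels, so results are rounded down.

```shell
#!/bin/sh
# Compute the uniform scale result and centering offsets for steps 6 and 7.
# fit_glyph is a hypothetical helper; sizes are in px, rounded down to integers.
#   $1 = traced path width, $2 = traced path height
fit_glyph() {
    page=300      # page size from step 1
    target=290    # target height from step 6
    new_w=$(( $1 * target / $2 ))      # width scaled by the same factor as the height
    off_x=$(( (page - new_w) / 2 ))    # left margin once centered horizontally
    off_y=$(( (page - target) / 2 ))   # top margin once centered vertically
    echo "width=$new_w height=$target x=$off_x y=$off_y"
}

fit_glyph 120 150   # prints: width=232 height=290 x=34 y=5
```

For a glyph that is wider than it is tall, the scaled width can exceed the 300px page and the horizontal offset becomes negative; the steps above do not say how to handle that case, so it is left to the editor's judgment.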

Detailed picture guide

[Images: steps described using Inkscape; before post-editing; after post-editing]

Please see this tutorial or click on the thumbnail on the left side for a detailed step-by-step tutorial on how the conversion is done.

Creating SVG files is really important since SVG is scalable, while gif is not. Converting these images to SVG is really easy even with no experience in image manipulation, and Wikimedia benefits greatly from your work. Thank you for your contribution.

After the usual learning curve (about five characters), it will usually take from three to five minutes to convert a single character with shape improvements. Of course complex characters may need a little more attention and simple ones (like the one in the example image above) may need a little less attention.

Post editing

The automated result is generally messy. Sometimes that is good enough; sometimes the picture needs improvement:

  • Add a control point at the end of the character lines, and make sure that it is not an angle point.
  • Add an angle control point where two lines intersect, and make sure the tangents are in the correct directions.
  • Separate the lines where two lines have been merged.

Conversion using potrace

Inkscape is not the only program that can convert raster images to vector images. Another program that gives good results is 'potrace'. 'potrace' can convert BMP images to SVG and is scriptable under Linux, so you can convert the GIF to BMP with ImageMagick's convert and then the BMP to SVG with 'potrace'.

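A sample shell script taking the base GIF file name (without the .gif extension) as its argument follows:

#!/bin/sh
convert "$1.gif" "$1.bmp"   # GIF to BMP with ImageMagick
potrace -s "$1.bmp"         # BMP to SVG; writes "$1.svg"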

The Linux tutorial describes how to automate the task of generating centered SVG files from the GIFs.
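As a rough sketch of what such automation might look like, the function below applies the same two commands to every GIF in a directory. It does not center the result and is not taken from the Linux tutorial; gifs_to_svg and the DRY_RUN variable are hypothetical names, and ImageMagick's convert and potrace are assumed to be on the PATH.

```shell
#!/bin/sh
# Batch sketch: convert every GIF in a directory to SVG via BMP.
# Assumes ImageMagick's convert and potrace are installed and on the PATH.
# gifs_to_svg and DRY_RUN are hypothetical names used for illustration.
#   $1 = directory to scan (defaults to the current directory)
gifs_to_svg() {
    dir=${1:-.}
    for gif in "$dir"/*.gif; do
        [ -e "$gif" ] || continue      # skip when the glob matched nothing
        base=${gif%.gif}               # strip the extension, keep the code name
        if [ -n "$DRY_RUN" ]; then
            # Only print what would be run.
            echo "convert $base.gif $base.bmp && potrace -s $base.bmp"
        else
            # Convert, trace, then remove the intermediate BMP.
            convert "$base.gif" "$base.bmp" &&
                potrace -s "$base.bmp" &&
                rm -f "$base.bmp"
        fi
    done
}

# Usage: preview first, then convert for real.
# DRY_RUN=1; gifs_to_svg ./downloads
# DRY_RUN=;  gifs_to_svg ./downloads
```

Running with DRY_RUN set first is a cheap way to confirm which files would be touched before any image is written.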

Upload to Wikimedia Commons

The ACClicense template

Upload the converted SVG file to Wikimedia Commons and use the ACClicense template as follows:

ACClicense is a "universal" template for the ACC project, with accurate descriptions matching the data (© 2003 Richard Sears) from www.internationalscientific.org.

When you upload an SVG of an ancient Chinese character, please copy and paste the "ACClicense" template in one of the following forms:

  • {{ACClicense|字|oracle|shang|strokes=|component1=}} (for Shang oracle images)
  • {{ACClicense|字|oracle|zhouyuan|strokes=|component1=}} (for Zhouyuan oracle images)
  • {{ACClicense|字|bronze|western|strokes=|component1=}} (for Western Zhou bronzeware images)
  • {{ACClicense|字|bronze|shang|strokes=|component1=}} (for Shang bronzeware images)
  • {{ACClicense|字|bronze|spring|strokes=|component1=}} (for Spring and Autumn bronzeware images)
  • {{ACClicense|字|bronze|warring|strokes=|component1=}} (for Warring States bronzeware images)
  • {{ACClicense|字|slip|chu|strokes=|component1=}} (for Chu slip images)
  • {{ACClicense|字|slip|qin|strokes=|component1=}} (for Qin slip images)
  • {{ACClicense|字|silk|chu|strokes=|component1=}} (for Chu silk images)
  • {{ACClicense|字|silk slip|strokes=|component1=}} (for other silk slips)
  • {{ACClicense|字|seal|strokes=|component1=}} (for Shuowen seal images)
  • {{ACClicense|字|ancient|strokes=|component1=}} (for Shuowen ancient images)
  • {{ACClicense|字|zhou|strokes=|component1=}} (for Shuowen zhou images)
  • {{ACClicense|字|odd|strokes=|component1=}} (for Shuowen odd images)
  • {{ACClicense|字|liushutong|strokes=|component1=}} (for Liushutong / Big seal images)
Parameter | Mandatory? | Meaning
1 | Mandatory | The character being described. This character becomes a category. To avoid that categorization, use cat=r with parameters 6 and strokes.
2 | Mandatory | The type of script. May be one of: oracle / bronze / seal / bigseal.
3 | Mandatory only for oracle and bronze | The time period of unearthed characters (oracle, bronze, slip, silk). Oracle, slip and silk will be auto-filled, so this parameter is just for bronze. May be one of: shang (for Shang oracle or Shang bronzeware); zhouyuan (for Zhouyuan oracle); west / western / western Zhou / Western Zhou / zhouyuan (for Western Zhou); spring / spring and autumn / Spring and Autumn (for Spring and Autumn); war / warring / warring states / Warring States (for Warring States).
4 | Optional | The code name of the former GIF image in Richard Sears' data. It is not required, but welcome if your source is Richard Sears.
5 | Optional | For various comments, if needed. Linguistic information is not welcome, however: other websites provide such data more accurately than we do.
6 (rad) | Suggested * | Enables categorization according to the system of the 214 traditional Kangxi radicals; it can be coded as a three-digit number, e.g. 078 (leading zeroes are not necessary and may be omitted).
7 | Optional | For free comment (Kangxi variants).
8 | Optional | Enables categorization (use only if the character belongs to the 540 ancient Shuowen radicals, or their variants); it should be coded as a three-digit number, e.g. 078.
9 | Optional | For free comment (Shuowen variants).
kaiOrder | Optional | The Kaishu number of a character in the Sinica Database, which can be found in Sinica's URL. It is not required and is only used when the character being described is different from the source.
fontcode | Optional | The code name of the former PNG image in the Sinica Database, which can be found in Sinica's URL. It is not required, but welcome if your source is Sinica.
draw | Optional | Allows a classification of the glyph according to its number of strokes, for comparison and research purposes. See the stroke tutorial if you intend to use it. This feature is automatically disabled for the Liushutong script, so please do not include it when you upload -bigseal.svg files. See also Commons:Maint:ACC:NoDraw.
component1 | Mandatory | Allows a classification of the glyph according to its components, for comparison and research purposes. It should be used for simple or compound characters. If no component is given, the character is indexed in the category:ACC needing decomposition category.
component2 to component6 | Optional | The other components, used for compound characters.
permission | Optional | You can also provide your own free license(s).
strokes | Suggested * | The number of additional strokes; together with the radical number in parameter 6, used by Template:Rcat for categorizing.
cat | Suggested * | Set cat=r to use Template:Rcat for categorizing with parameters 6 and strokes; set cat=n for none. See also Commons:Maint:ACClicense for undefined categories.
var | Optional | Can be used when the character uses another radical variant, e.g. standard rad(66)=攴, var=攵.
i | Suggested * for SVG | Adds the parameters for Template:Igen, e.g. i=I[nkscape]; i=P[otrace]. See also Commons:Maint:ACC:Unspec for undefined SVG tools.

*: it is recommended to provide these parameters.

You do not have to type anything else while uploading your picture as the license and description are automatically added by using this template!

Use the basic upload form, which lets you enter the template above in the "Summary" field.

For the character used in the tutorial this would be:

{{ACClicense|木|oracle||j14138|strokes=4|component1=木}} because this character was found by searching Chinese Etymology for "木" and then using the oracle style picture provided there. The j14138 was the original file name of that picture, j14138.gif.
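If you prepare many uploads, the template line can be assembled mechanically. The following is a minimal sketch under stated assumptions: acc_template is a hypothetical helper name, and it only fills the positional parameters 1 to 4 plus strokes and component1, in the order given in the parameter table.

```shell
#!/bin/sh
# Assemble an ACClicense line from its parts.
# acc_template is a hypothetical helper; it covers only the parameters
# used in the tutorial example.
#   $1 = character, $2 = script type, $3 = period (may be empty),
#   $4 = Sears code name (may be empty), $5 = additional strokes, $6 = component1
acc_template() {
    printf '{{ACClicense|%s|%s|%s|%s|strokes=%s|component1=%s}}\n' \
        "$1" "$2" "$3" "$4" "$5" "$6"
}

acc_template 木 oracle "" j14138 4 木
# prints: {{ACClicense|木|oracle||j14138|strokes=4|component1=木}}
```

An empty third argument yields the two adjacent pipe characters seen in the example, which leaves the period parameter unset.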

Note: {{ACClicense|木|oracle|oracle|j14138||075|strokes=4|component1=木|i=I}} would be correct, because

  • 075 木 shows the traditional Chinese radical 75
  • i=I the SVG image was created with Inkscape

To inhibit categorization, instead of strokes=4 it can be coded strokes=0|cat=r:

  1. the template Rcat will check whether the category (in this case ) exists; if it does not yet exist, the "redlink" categorization does not occur (when the category is later defined, an 'empty' edit-save updates the categorization)
  2. because ACClicense knows the stroke count of all radicals (木 consists of four strokes), only the number of additional strokes must be specified: zero for images of radicals.

-- sarang사랑 12:08, 15 January 2010 (UTC)

Identifying character components

Character components are used to classify the characters in the Category:Ancient Chinese characters by components.

The component1 to component6 parameters are to be chosen this way:

  • Be sure to identify the component on the Ancient Chinese Character, which may be different from the modern one!
  • Most of the time, each character component may be separately indexed. Some exceptions are:
    • If a compound character appears in several (let us say, more than five) ancient characters, it may have a category of its own and be indexed separately (in that case, the compound character category is itself part of its components' categories).
    • If a character is used in only one or two ancient characters, there is no need for a separate category. In that case, the "Misc" component name must be used.
  • If a component cannot be identified, do not skip it but replace it with a "?" component (question mark). Somebody else who knows better may find it later while browsing the Category:ACC containing ? and replace it with the correct value.

References

  1. Richard Sears' Agreement