File:Zipf-tibe-1 Tibetan - Illusion, Vimalakirti, Perception.svg

From Wikimedia Commons, the free media repository

Original file (SVG file, nominally 512 × 504 pixels, file size: 493 KB)

Captions

Zipf Law plots for three Tibetan texts: Play of Mistaken Illusion, Sutra of Vimalakirti,

Summary

Description
English: Zipf's law plot (frequency as a function of frequency rank) for the words in three Tibetan-language texts. For these plots, each Tibetan syllable was considered a separate word. All texts were obtained from the Asian Classics Input Project (ACIP) collection, and are stored in a slight variant of their encoding.

The texts and the word frequency files are:

  • The Play of Mistaken Illusion by Kyabje Trijang Rinpoche (mid 1900s). ACIP item 95306. Sample: LA 'THAMS PAS 'DI SNANG SPRE'U'I GAR BAR MED YON POR BSGYUR BA'I RNAM G [...] LOG GIS NYAG RONG GI SA. File tibe/pmi/tot.1/gud.wfr (original 143289 words, truncated/filtered to 35027 words, N = 1963 distinct).
  • The Sutra of Vimalakirti, from the Kangyur (core of the Tibetan Buddhist Canon), pages 271A-376B. The work is a translation from a ~100 CE Sanskrit original. ACIP item KL0176. Sample: SHES PA CHEN PO YONGS SU SBYANGS PA LAS NGES PAR BYUNG BA SANGS RGYAS [...] NYID MTSUNGS PAR 'JUG PA DE NI GNYIS SU MED PAR 'JUG PA'O. File tibe/vim/tot.1/gud.wfr (original 53287 words, truncated/filtered to 35027 words, N = 1300 distinct).
  • A Commentary to The Commentary on Valid Perception, by Ravigupta, pages 293B-325B and 331A-398A. Part of the classical commentaries (Tengyur) on the Tibetan Buddhist Canon. ACIP item TD4224. Sample: RGYA GAR SKAD DU PRA M'A nA B'ARTI KA BRATTI N'A MA BOD SKAD DU TSAD MA [...] TSOGS PAS BSKYED PA LA 'DRE BA GZHAN DGOS NA NI. File tibe/ccv/tot.1/gud.wfr (original 88620 words, truncated/filtered to 35027 words, N = 846 distinct).
The word frequency files '*/*/*/gud.wfr' are available at the UNICAMP website. The original annotated full texts, before truncation/filtering, are in the companion files */*/org/main.src. The truncated/filtered texts -- one word per line, without punctuation -- are in */*/*/gud.tlw.
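As an illustration of how a rank/frequency table like the one plotted here can be derived from a one-word-per-line file such as gud.tlw, the following is a minimal Python sketch. It is not the author's actual tooling: the function names `read_words` and `rank_frequency`, the UTF-8 encoding, and the tie-breaking rule are all assumptions, and the layout of the gud.wfr files is not described here, so the sketch does not try to reproduce it.

```python
from collections import Counter


def read_words(path):
    """Read a one-word-per-line text file, skipping blank lines.

    Assumption: the file is plain text readable as UTF-8 (the ACIP
    transliteration shown in the samples is ASCII).
    """
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


def rank_frequency(words):
    """Return a list of (rank, count, word), most frequent word first.

    Ties in count are broken alphabetically so the result is
    deterministic; this rule is an assumption of the sketch.
    """
    counts = Counter(words)
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, count, word)
            for rank, (word, count) in enumerate(ordered, start=1)]


# Toy input: the tail of the Vimalakirti sample quoted above, split on
# spaces so that each syllable counts as one word.
sample = "NYID MTSUNGS PAR 'JUG PA DE NI GNYIS SU MED PAR 'JUG PA'O".split()
table = rank_frequency(sample)
for rank, count, word in table[:3]:
    print(rank, count, word)
```

Plotting count against rank on log-log axes then gives a Zipf plot; for a real text one would call `rank_frequency(read_words(path))` with the path of a downloaded gud.tlw file.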
Date
Source Own work
Author Jorge Stolfi

Licensing

I, the copyright holder of this work, hereby publish it under the following license:
This file is licensed under the Creative Commons Attribution-Share Alike 4.0 International license.
You are free:
  • to share – to copy, distribute and transmit the work
  • to remix – to adapt the work
Under the following conditions:
  • attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
  • share alike – If you remix, transform, or build upon the material, you must distribute your contributions under the same or compatible license as the original.

File history


Date/Time | Dimensions | User | Comment
21:56, 15 May 2023 (current) | 512 × 504 (493 KB) | Jorge Stolfi | Rebuilt the file with small changes in dataset, colors
18:21, 9 May 2023 | 512 × 504 (492 KB) | Jorge Stolfi | Uploaded own work with UploadWizard
