File:Frequency of digrams in Ukrainian words.png
Original file (1,252 × 1,252 pixels, file size: 498 KB, MIME type: image/png)
Summary
Description |
English: A dot (.) marks the beginning and the end of a word. The number under each bigram AB is the probability, in percent, that the letter following A is B.
The diagram should be read row by row; each row sums to 100%. The first row gives the probability of each character starting a word. The second row gives the probabilities of the characters that follow "а". The first column gives the probability that the word ends after the character of that row. It is not the probability that a word ends with that character, so the column does not sum to 100%.
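As a small check of the reading rules above, the sketch below builds the same kind of row-normalized table for a toy alphabet and a three-word list. The word list, the reduced alphabet and the helper name `bigram_probabilities` are illustrative assumptions, not part of the script used for the image, and plain Python dictionaries stand in for the torch tensor.

```python
from collections import Counter

def bigram_probabilities(words, alphabet):
    """Return {row_char: {col_char: P(col | row)}}, with '.' as the word boundary."""
    counts = {a: Counter() for a in alphabet}
    for w in words:
        chars = ['.'] + list(w.lower()) + ['.']
        for c1, c2 in zip(chars, chars[1:]):
            counts[c1][c2] += 1
    probs = {}
    for a in alphabet:
        total = sum(counts[a].values())
        # Rows with no observations stay all zero (an assumption of this sketch).
        probs[a] = {b: (counts[a][b] / total if total else 0.0) for b in alphabet}
    return probs

toy_alphabet = ".абв"              # hypothetical reduced alphabet
toy_words = ["баба", "ва", "аб"]   # hypothetical word list
P_toy = bigram_probabilities(toy_words, toy_alphabet)

# Each row is a probability distribution over the next character.
row_sums = {a: sum(P_toy[a].values()) for a in toy_alphabet}
# The first column collects "word ends here" probabilities from different rows.
first_column_sum = sum(P_toy[a]['.'] for a in toy_alphabet)
```

Here every entry of `row_sums` is 1 up to rounding, while `first_column_sum` is 0.5 + 1/3, which illustrates why the first column of the diagram does not add up to 100%.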
# Word list; needs the package wukrainian to be installed
words = open('/usr/share/dict/ukrainian').read().splitlines()

# Alphabet: index 0 ('.') is the word-boundary marker
itos = ".абвгґдеєжзиіїйклмнопрстуфхцчшщьюя'-"
stoi = {s: i for i, s in enumerate(itos)}
nchars = len(itos)

import torch
import random

# N[i, j] counts how often character j follows character i
N = torch.zeros((len(stoi), len(stoi)), dtype=torch.int32)
for w in words:
    chrs = ['.'] + list(w.lower()) + ['.']
    for c1, c2 in zip(chrs, chrs[1:]):
        i1 = stoi[c1]
        i2 = stoi[c2]
        N[i1, i2] += 1

# Normalize each row so that it sums to 1
P = N.float()
P = P / P.sum(1, keepdim=True)

import matplotlib.pyplot as plt
# The next line is a Jupyter/IPython magic; drop it in a plain Python script
%matplotlib inline

# plt.imshow(N)
fig = plt.figure(figsize=(16, 16))
plt.imshow(P, cmap='Blues')
for i in range(nchars):
    for j in range(nchars):
        # Bigram label on top, probability in percent underneath
        chstr = itos[i] + itos[j]
        plt.text(j, i, chstr, ha="center", va="bottom", color='gray')
        plt.text(j, i, '%.1f' % (P[i, j].item()*100.0), ha="center", va="top", color='gray')
plt.axis('off')
fig.savefig('uk_digrams.png', bbox_inches='tight')
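One caveat about the counting loop: the lookup `stoi[c1]` raises `KeyError` for any character that is not listed in `itos`. Whether the dictionary file contains such characters is not stated here. If it does, a pre-filter along the following lines is one option. The helper name `split_by_alphabet` and the sample words are hypothetical.

```python
def split_by_alphabet(words, alphabet):
    """Split words into those fully covered by `alphabet` and the rest."""
    allowed = set(alphabet)
    kept, skipped = [], []
    for w in words:
        # '.' is reserved as the boundary marker, so words containing it are skipped.
        # Empty lines are skipped as well (a choice made for this sketch).
        if w and '.' not in w and set(w.lower()) <= allowed:
            kept.append(w)
        else:
            skipped.append(w)
    return kept, skipped

itos = ".абвгґдеєжзиіїйклмнопрстуфхцчшщьюя'-"
# Hypothetical input: Latin letters and the typographic apostrophe (U+2019)
# are not part of `itos`, so those words end up in `skipped`.
sample = ["Київ", "пів-на-пів", "e-mail", "п\u2019ять", ""]
kept, skipped = split_by_alphabet(sample, itos)
```

Inspecting `skipped` before dropping anything shows how much of the word list would be lost and whether `itos` should be extended instead.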
|
Date | |
Source | Own work |
Author | Bunyk |
Licensing
- You are free:
- to share – to copy, distribute and transmit the work
- to remix – to adapt the work
- Under the following conditions:
- attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- share alike – If you remix, transform, or build upon the material, you must distribute your contributions under the same or compatible license as the original.
File history
Version | Date/Time | Dimensions | User | Comment |
---|---|---|---|---|
current | 18:33, 10 August 2023 | 1,252 × 1,252 (498 KB) | Bunyk | Uploaded own work with UploadWizard |
File usage on Commons
There are no pages that use this file.
File usage on other wikis
The following other wikis use this file:
- Usage on uk.wikipedia.org
Metadata
Software used | |
---|---|
Horizontal resolution | 39.37 dpc |
Vertical resolution | 39.37 dpc |