Prefix-Suffix correlations (%)


The correlation was calculated as follows.

  1. The forms with the prefixes o-, qo-, y- and the suffixes -y, -dy were collected.

  2. For each form i, the number of roots R(i) that occur in that form was calculated.

  3. For each pair of forms (i,j), the number of roots N(i,j) that occur in both forms was calculated.

  4. The correlation was calculated as Corr(i,j) = N(i,j)/Sqrt(R(i)*R(j)) * 100%.
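
The four steps can be sketched in Python as shown here. This is only an illustration, not the original program: the text does not say how a word is split into prefix, root and suffix, so the helper split_form (a hypothetical name) simply strips the longest matching prefix and suffix and refuses to leave an empty root. The function name correlations and the sample words are also assumptions.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

PREFIXES = ("qo", "o", "y")  # longest first, so "qo" wins over "o"
SUFFIXES = ("dy", "y")       # longest first, so "dy" wins over "y"


def split_form(word):
    """Return (root, form label) for a word, e.g. ("kal", "qo***dy").

    Assumption: greedy stripping that never leaves an empty root.
    """
    prefix = next((p for p in PREFIXES
                   if word.startswith(p) and len(word) > len(p)), "")
    rest = word[len(prefix):]
    suffix = next((s for s in SUFFIXES
                   if rest.endswith(s) and len(rest) > len(s)), "")
    root = rest[:len(rest) - len(suffix)]
    return root, f"{prefix}***{suffix}"


def correlations(words):
    """Return {(i, j): Corr(i, j) in percent} for forms sharing a root."""
    # Step 1: collect the set of forms attested for every root.
    forms_by_root = defaultdict(set)
    for word in set(words):
        root, form = split_form(word)
        forms_by_root[root].add(form)

    # Steps 2 and 3: R(i) counts roots per form, N(i, j) roots per pair.
    R = defaultdict(int)
    N = defaultdict(int)
    for forms in forms_by_root.values():
        for form in forms:
            R[form] += 1
            N[form, form] += 1  # diagonal: N(i, i) = R(i)
        for a, b in combinations(sorted(forms), 2):
            N[a, b] += 1
            N[b, a] += 1

    # Step 4: Corr(i, j) = N(i, j) / sqrt(R(i) * R(j)) * 100%.
    return {(a, b): 100.0 * n / sqrt(R[a] * R[b])
            for (a, b), n in N.items()}


if __name__ == "__main__":
    sample = ["kal", "kaly", "okaly", "for", "fory", "qoteedy"]
    for pair, value in sorted(correlations(sample).items()):
        print(pair, round(value))
```

Pairs of forms that never share a root are simply absent from the result and correspond to a correlation of 0.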

Table 1. "Correlation" between prefix-suffix-forms 
Corr(%)      ***   ***dy    ***y    o***  o***dy   o***y   qo*** qo***dy  qo***y    y***  y***dy   y***y
***          100      11      11      21       8       8      19       7       6       7       7      11
***dy         11     100      25       9      28      17       9      26      20      10      18      14
***y          11      25     100       8      16      17       8      16      18       8      12      14
o***          21       9       8     100      13      13      34      11       9      27      12       8
o***dy         8      28      16      13     100      32      14      45      35      11      32      23
o***y          8      17      17      13      32     100      15      31      31      13      21      20
qo***         19       9       8      34      14      15     100      14      14      34      14      10
qo***dy        7      26      16      11      45      31      14     100      39      12      39      27
qo***y         6      20      18       9      35      31      14      39     100      11      28      24
y***           7      10       8      27      11      13      34      12      11     100      16      11
y***dy         7      18      12      12      32      21      14      39      28      16     100      28
y***y         11      14      14       8      23      20      10      27      24      11      28     100

The next table gives the probability that the second form exists, given that the first form exists.

It was calculated as P(j|i)=F(i,j)/H(i),
where H(i) is the number of words having the form i, and
F(i,j) is the number of words having the form i whose root also occurs in the form j.
For example, if we have the words fory, kaly, kaly, kaldy, then F(-y,-dy)=2, H(-y)=3, H(-dy)=1, F(-dy,-y)=1, so P(-dy|-y)=2/3 (67%) and P(-y|-dy)=1/1 (100%).
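
The same counting can be written as a short Python sketch. It assumes that every word occurrence has already been split into a (root, form) pair; how that split was made is not described here, so the function name conditional_probabilities and the input format are illustrative only. The sample data are the four words of the example above.

```python
from collections import Counter, defaultdict


def conditional_probabilities(tokens):
    """Return (H, F, P) for an iterable of (root, form) word occurrences.

    H[i]     number of words having the form i
    F[i, j]  number of words having the form i whose root also has form j
    P[i, j]  P(j|i) = F(i, j) / H(i), in percent
    """
    tokens = list(tokens)

    # Which forms are attested for each root.
    forms_by_root = defaultdict(set)
    for root, form in tokens:
        forms_by_root[root].add(form)

    H = Counter(form for _, form in tokens)

    F = Counter()
    for root, form in tokens:
        for other in forms_by_root[root]:
            F[form, other] += 1  # includes the diagonal, F(i, i) = H(i)

    P = {(i, j): 100.0 * count / H[i] for (i, j), count in F.items()}
    return H, F, P


if __name__ == "__main__":
    # fory, kaly, kaly, kaldy from the example in the text
    words = [("for", "-y"), ("kal", "-y"), ("kal", "-y"), ("kal", "-dy")]
    H, F, P = conditional_probabilities(words)
    print(H["-y"], H["-dy"])                # 3 1
    print(F["-y", "-dy"], F["-dy", "-y"])   # 2 1
    print(round(P["-y", "-dy"]), round(P["-dy", "-y"]))  # 67 100
```

Unlike the correlation, this measure counts word occurrences rather than roots, so it is not symmetric in i and j.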

Table 2. Conditional probabilities of prefix-suffix-forms 

P(%)         ***   ***dy    ***y    o***  o***dy   o***y   qo*** qo***dy  qo***y    y***  y***dy   y***y
***          100      39      52      60      10      17      48       4       7      53       3       8
***dy         61     100      70      27      58      59      21      45      45      31      47      52
***y          52      61     100      14      41      44      10      31      33      15      28      38
o***          81      26      41     100      38      48      74      25      30      60      11      34
o***dy        39      89      81      45     100      86      21      82      79      23      68      60
o***y         41      64      77      43      68     100      32      63      75      31      48      63
qo***         89      33      41      91      38      46     100      32      37      81      19      34
qo***dy       25      95      92      28      96      94      17     100      91      18      88      82
qo***y        37      74      91      39      76      92      33      81     100      35      71      83
y***          78      26      36      73      23      32      66      15      20     100      15      21
y***dy        51      86      77      48      87      81      31      82      73      37     100      67
y***y         44      66      77      31      69      76      25      60      69      31      62     100


1. Words with the prefixes o-, qo- and y- very often also have forms without a prefix or with another prefix.
2. Words with the suffixes -dy and -y very often also have forms without a suffix or with the other suffix.
3. A large number of words do not accept some suffixes and prefixes; such words should be identified.

In fact, some of the roots are "active" and can take many forms.