module documentation

ALINE http://webdocs.cs.ualberta.ca/~kondrak/ Copyright 2002 by Grzegorz Kondrak.

ALINE is an algorithm for aligning phonetic sequences, described in [1]. This module is a port of Kondrak's (2002) ALINE. It provides functions for phonetic sequence alignment and similarity analysis. These are useful in historical linguistics, sociolinguistics and synchronic phonology.

ALINE has parameters that can be tuned for desired output. These parameters are:
- C_skip, C_sub, C_exp, C_vwl
- Salience weights
- Segmental features

In this implementation, some parameters have been changed from their default values as described in [1], in order to replicate published results. All changes are noted in comments.

Example usage

# Get optimal alignment of two phonetic sequences

>>> align('θin', 'tenwis') # doctest: +SKIP
[[('θ', 't'), ('i', 'e'), ('n', 'n'), ('-', 'w'), ('-', 'i'), ('-', 's')]]

[1] G. Kondrak. Algorithms for Language Reconstruction. PhD dissertation, University of Toronto, 2002.

Function align Compute the alignment of two phonetic strings.
Function delta Return weighted sum of difference between P and Q.
Function demo A demonstration of the result of aligning phonetic sequences used in Kondrak's (2002) dissertation.
Function diff Returns difference between phonetic segments P and Q for feature F.
Function R Return relevant features for segment comparison.
Function sigma_exp Returns score of an expansion/compression.
Function sigma_skip Returns score of an indel of P.
Function sigma_sub Returns score of a substitution of P with Q.
Function V Return vowel weight if P is vowel.
Variable C_exp Undocumented
Variable C_skip Undocumented
Variable C_sub Undocumented
Variable C_vwl Undocumented
Variable cognate_data Undocumented
Variable consonants Undocumented
Variable feature_matrix Undocumented
Variable inf Undocumented
Variable R_c Undocumented
Variable R_v Undocumented
Variable salience Undocumented
Variable similarity_matrix Undocumented
Function _retrieve Retrieve the path through the similarity matrix S starting at (i, j).
def align(str1, str2, epsilon=0): (source)

Compute the alignment of two phonetic strings.

(Kondrak 2002: 51)

Parameters
str1, str2: str. Two strings to be aligned.
epsilon: float (0.0 to 1.0). Adjusts threshold similarity score for near-optimal alignments.
Returns
Alignment(s) of str1 and str2
Return type
list(list(tuple(str, str)))
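
The role of epsilon can be illustrated with a much-simplified sketch. It assumes a local-alignment style dynamic program with only substitutions and indels, a made-up scoring function (toy_sub) and a made-up indel score (SKIP); none of these names or values come from the module. Every cell whose score reaches the threshold (1 - epsilon) * max(S) is treated as the end point of an alignment, so a larger epsilon admits more near-optimal alignments.

```python
SKIP = -10  # hypothetical indel score, not the module's C_skip


def toy_sub(p, q):
    # Hypothetical substitution score: reward identical segments only.
    return 10 if p == q else -5


def fill_matrix(a, b):
    # S[i][j] is the best score of an alignment ending at a[i-1], b[j-1].
    S = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            S[i][j] = max(
                S[i - 1][j] + SKIP,                              # skip a[i-1]
                S[i][j - 1] + SKIP,                              # skip b[j-1]
                S[i - 1][j - 1] + toy_sub(a[i - 1], b[j - 1]),   # substitute
                0,                                               # start afresh
            )
    return S


def traceback(S, a, b, i, j):
    # Walk back from (i, j) until a zero cell or a matrix edge is reached.
    out = []
    while i > 0 and j > 0 and S[i][j] > 0:
        if S[i][j] == S[i - 1][j - 1] + toy_sub(a[i - 1], b[j - 1]):
            out.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif S[i][j] == S[i - 1][j] + SKIP:
            out.append((a[i - 1], "-"))
            i -= 1
        else:
            out.append(("-", b[j - 1]))
            j -= 1
    return out[::-1]


def toy_align(a, b, epsilon=0.0):
    S = fill_matrix(a, b)
    T = (1 - epsilon) * max(max(row) for row in S)  # acceptance threshold
    return [
        traceback(S, a, b, i, j)
        for i in range(1, len(a) + 1)
        for j in range(1, len(b) + 1)
        if S[i][j] > 0 and S[i][j] >= T
    ]


print(toy_align("abc", "xabcx"))               # only the best-scoring end cell
print(len(toy_align("abc", "xabcx", 0.5)))     # more end cells pass the threshold
```

The real align additionally scores expansions/compressions and uses the feature-based scores documented below, so its output will differ from this toy version.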
def delta(p, q): (source)

Return weighted sum of difference between P and Q.

(Kondrak 2002: 54)
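
As a rough illustration of the weighted sum, the sketch below multiplies each per-feature difference by that feature's salience and adds the results. The feature values, the salience weights and the list of relevant features are invented for the example and are not the module's data.

```python
# Hypothetical numeric feature values, for illustration only.
features = {
    "p": {"place": 1.0, "voice": 0.0},
    "b": {"place": 1.0, "voice": 1.0},
    "t": {"place": 0.85, "voice": 0.0},
}
# Hypothetical salience weights (the module stores these as ints per feature).
salience = {"place": 40, "voice": 10}
# Hypothetical list of features considered relevant for this comparison.
relevant = ["place", "voice"]


def toy_delta(p, q):
    # Sum of per-feature differences, each scaled by the feature's salience.
    return sum(abs(features[p][f] - features[q][f]) * salience[f] for f in relevant)


print(toy_delta("p", "b"))  # differs only in voicing
print(toy_delta("p", "t"))  # differs only in place
```

A larger salience makes a disagreement in that feature count more heavily against the pair.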

def demo(): (source)

A demonstration of the result of aligning phonetic sequences used in Kondrak's (2002) dissertation.

def diff(p, q, f): (source)

Returns difference between phonetic segments P and Q for feature F.

(Kondrak 2002: 52, 54)
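
One way to picture this, assuming that feature_matrix maps each segment to named feature values and that similarity_matrix maps those values to numbers (as the types listed on this page suggest), is the sketch below. All entries are hypothetical stand-ins for the module's tables.

```python
# Hypothetical lookup from feature values to numbers.
toy_similarity = {
    "bilabial": 1.0, "alveolar": 0.85,
    "stop": 1.0, "fricative": 0.8,
    "plus": 1.0, "minus": 0.0,
}
# Hypothetical per-segment feature descriptions.
toy_features = {
    "p": {"place": "bilabial", "manner": "stop", "voice": "minus"},
    "b": {"place": "bilabial", "manner": "stop", "voice": "plus"},
    "s": {"place": "alveolar", "manner": "fricative", "voice": "minus"},
}


def toy_diff(p, q, f):
    # Absolute difference between the numeric values of feature f for p and q.
    return abs(toy_similarity[toy_features[p][f]] - toy_similarity[toy_features[q][f]])


print(toy_diff("p", "b", "voice"))   # opposite voicing
print(toy_diff("p", "b", "place"))   # same place of articulation
```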

def R(p, q): (source)

Return relevant features for segment comparison.

(Kondrak 2002: 54)
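
A minimal sketch of the selection follows, assuming that the consonant feature list is used whenever at least one of the two segments is a consonant and the vowel feature list otherwise. The segment inventory and feature names here are placeholders, not the module's consonants, R_c and R_v.

```python
# Hypothetical inventory and feature lists.
toy_consonants = ["p", "t", "k", "s"]
toy_R_c = ["place", "manner", "voice"]   # features compared for consonants
toy_R_v = ["high", "back", "round"]      # features compared for vowels


def toy_R(p, q):
    # Assumption: a single consonant in the pair selects the consonant features.
    if p in toy_consonants or q in toy_consonants:
        return toy_R_c
    return toy_R_v


print(toy_R("p", "a"))
print(toy_R("a", "e"))
```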

def sigma_exp(p, q): (source)

Returns score of an expansion/compression.

(Kondrak 2002: 54)

def sigma_skip(p): (source)

Returns score of an indel of P.

(Kondrak 2002: 54)

def sigma_sub(p, q): (source)

Returns score of a substitution of P with Q.

(Kondrak 2002: 54)
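
The three scoring functions can be read together. The sketch below follows the usual statement of the ALINE scores (a constant for indels; the substitution and expansion constants reduced by the feature distance and the vowel weights), but the parameter values, the stand-in distance function and the vowel set are assumptions made for the example.

```python
# Hypothetical parameter values, chosen to keep the arithmetic easy to follow.
C_skip, C_sub, C_exp, C_vwl = -10, 35, 45, 5
toy_vowels = set("aeiou")


def toy_delta(p, q):
    # Stand-in for the feature-based distance: 0 for identical segments.
    return 0 if p == q else 20


def toy_V(p):
    # Vowel weight: vowels are penalised so that consonant matches count more.
    return C_vwl if p in toy_vowels else 0


def toy_sigma_skip(p):
    return C_skip


def toy_sigma_sub(p, q):
    return C_sub - toy_delta(p, q) - toy_V(p) - toy_V(q)


def toy_sigma_exp(p, q):
    # q is a two-segment sequence that p expands into (or is compressed from).
    q1, q2 = q
    return (C_exp - toy_delta(p, q1) - toy_delta(p, q2)
            - toy_V(p) - max(toy_V(q1), toy_V(q2)))


print(toy_sigma_sub("t", "t"), toy_sigma_sub("a", "a"), toy_sigma_sub("t", "d"))
print(toy_sigma_exp("t", "ts"), toy_sigma_skip("t"))
```

With these toy values, identical consonants score higher than identical vowels, which reflects the purpose of the vowel weight.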

def V(p): (source)

Return vowel weight if P is vowel.

(Kondrak 2002: 54)

C_exp: int = (source)

Undocumented

C_skip: int = (source)

Undocumented

C_sub: int = (source)

Undocumented

C_vwl: int = (source)

Undocumented

cognate_data: str = (source)

Undocumented

consonants: list[str] = (source)

Undocumented

feature_matrix: dict = (source)

Undocumented

inf: float = (source)

Undocumented

R_c: list[str] = (source)

Undocumented

R_v: list[str] = (source)

Undocumented

salience: dict[str, int] = (source)

Undocumented

similarity_matrix: dict[str, float] = (source)

Undocumented

def _retrieve(i, j, s, S, T, str1, str2, out): (source)

Retrieve the path through the similarity matrix S starting at (i, j).

Returns
list(tuple(str, str)): Alignment of str1 and str2