Constructor: PCFG(start, productions, calculate_leftcorners)
A probabilistic context-free grammar. A PCFG consists of a start state and a set of productions with probabilities. The set of terminals and nonterminals is implicitly specified by the productions.
PCFG productions use the ProbabilisticProduction class. PCFGs impose the constraint that the productions with any given left-hand side must have probabilities that sum to 1 (allowing for a small margin of error, EPSILON).
If you need efficient key-based access to productions, you can use a subclass to implement it.
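The sum-to-1 constraint can be illustrated with a small standalone sketch. This is not NLTK's implementation: the `check_normalized` helper, the `(lhs, rhs, prob)` tuple layout, and the tolerance value of 0.01 are all assumptions made for illustration.

```python
from collections import defaultdict

# Assumed tolerance for this sketch; the class exposes its own EPSILON constant.
EPSILON = 0.01


def check_normalized(productions, epsilon=EPSILON):
    """Raise ValueError unless probabilities sum to ~1 for every left-hand side.

    `productions` is an iterable of (lhs, rhs, prob) tuples, a hypothetical
    stand-in for ProbabilisticProduction objects.
    """
    totals = defaultdict(float)
    for lhs, _rhs, prob in productions:
        totals[lhs] += prob  # accumulate probability mass per left-hand side
    for lhs, total in totals.items():
        if abs(total - 1.0) > epsilon:
            raise ValueError(f"Productions for {lhs!r} sum to {total}, not 1")
    return dict(totals)


rules = [
    ("S", ("NP", "VP"), 1.0),
    ("NP", ("the", "N"), 0.6),
    ("NP", ("a", "N"), 0.4),
]
totals = check_normalized(rules)
```

Grouping by left-hand side before comparing is what makes the constraint local: each nonterminal defines its own probability distribution over expansions.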
Class Method | fromstring |
Return a probabilistic context-free grammar corresponding to the input string(s). |
Method | __init__ |
Create a new probabilistic context-free grammar from the given start state and set of ProbabilisticProductions. |
Constant | EPSILON |
The acceptable margin of error for checking that productions with a given left-hand side have probabilities that sum to 1. |
Inherited from CFG:
Class Method | binarize |
Convert all non-binary rules into binary rules by introducing new tokens. |
Class Method | eliminate |
Eliminate the start rule in case it appears on a right-hand side. Example: given S -> S0 S1 and S0 -> S1 S, another rule S0_Sigma -> S is added. |
Class Method | remove |
Remove nonlexical unitary rules and convert them to lexical rules. |
Method | __repr__ |
Undocumented |
Method | __str__ |
Undocumented |
Method | check |
Check whether the grammar rules cover the given list of tokens. If not, then raise an exception. |
Method | chomsky |
Return a new grammar that is in Chomsky normal form. Parameter: new_token_padding. |
Method | is |
Return True if all productions are at most binary. Note that there can still be empty and unary productions. |
Method | is |
Return True if the grammar is of Chomsky Normal Form, i.e. all productions are of the form A -> B C, or A -> "s". |
Method | is |
Return True if all productions are of the forms A -> B C, A -> B, or A -> "s". |
Method | is |
Return True if left is a leftcorner of cat, where left can be a terminal or a nonterminal. |
Method | is |
Return True if all productions are lexicalised. |
Method | is |
Return True if there are no empty productions. |
Method | is |
Return True if all lexical rules are "preterminals", that is, unary rules which can be separated in a preprocessing step. |
Method | leftcorner |
Return the set of all nonterminals for which the given category is a left corner. This is the inverse of the leftcorner relation. |
Method | leftcorners |
Return the set of all nonterminals that the given nonterminal can start with, including itself. |
Method | max |
Return the right-hand side length of the longest grammar production. |
Method | min |
Return the right-hand side length of the shortest grammar production. |
Method | productions |
Return the grammar productions, filtered by the left-hand side or the first item in the right-hand side. |
Method | start |
Return the start symbol of the grammar. |
Method | _calculate |
Pre-calculate which form(s) the grammar is in. |
Method | _calculate |
Undocumented |
Method | _calculate |
Undocumented |
Instance Variable | _all |
Undocumented |
Instance Variable | _categories |
Undocumented |
Instance Variable | _empty |
Undocumented |
Instance Variable | _immediate |
Undocumented |
Instance Variable | _immediate |
Undocumented |
Instance Variable | _is |
Undocumented |
Instance Variable | _is |
Undocumented |
Instance Variable | _leftcorner |
Undocumented |
Instance Variable | _leftcorner |
Undocumented |
Instance Variable | _leftcorners |
Undocumented |
Instance Variable | _lexical |
Undocumented |
Instance Variable | _lhs |
Undocumented |
Instance Variable | _max |
Undocumented |
Instance Variable | _min |
Undocumented |
Instance Variable | _productions |
Undocumented |
Instance Variable | _rhs |
Undocumented |
Instance Variable | _start |
Undocumented |
fromstring (overrides nltk.grammar.CFG.fromstring)
Return a probabilistic context-free grammar corresponding to the input string(s).
Parameters | |
input | A grammar, given either as a string or as a list of strings. |
encoding | Undocumented |
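A short usage sketch for `fromstring`, assuming NLTK is installed. The toy grammar is made up for illustration; each alternative carries its probability in square brackets, and the left-hand side of the first production is taken as the start symbol.

```python
from nltk.grammar import PCFG

# Toy grammar invented for this example; probabilities for each
# left-hand side sum to 1.
toy_grammar = PCFG.fromstring("""
    S -> NP VP [1.0]
    NP -> 'the' N [0.6] | 'a' N [0.4]
    N -> 'dog' [0.5] | 'cat' [0.5]
    VP -> 'sleeps' [1.0]
""")

print(toy_grammar.start())
for production in toy_grammar.productions():
    # Each production is a ProbabilisticProduction with a prob() accessor.
    print(production, production.prob())
```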
__init__ (overrides nltk.grammar.CFG.__init__)
Create a new probabilistic context-free grammar from the given start state and set of ProbabilisticProductions.
Parameters | |
start:Nonterminal | The start symbol |
productions:list(Production) | The list of productions that defines the grammar |
calculate_leftcorners | Set to False to skip calculating the leftcorner relation. In that case, some optimized chart parsers won't work. |
Raises | |
ValueError | If the productions for any left-hand side have probabilities that do not sum to a value within EPSILON of 1. |
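The constructor and its ValueError can be exercised directly, as in the sketch below. It assumes NLTK is installed; the two-rule grammars are invented for illustration.

```python
from nltk.grammar import PCFG, Nonterminal, ProbabilisticProduction

S = Nonterminal("S")

# Probabilities for S sum to 1.0, so construction succeeds.
good = [
    ProbabilisticProduction(S, ["a"], prob=0.7),
    ProbabilisticProduction(S, ["b"], prob=0.3),
]
grammar = PCFG(S, good)

# Probabilities for S sum to 0.9, which is outside the allowed margin.
bad = [
    ProbabilisticProduction(S, ["a"], prob=0.7),
    ProbabilisticProduction(S, ["b"], prob=0.2),
]
try:
    PCFG(S, bad)
    error_message = None
except ValueError as err:
    error_message = str(err)

print(error_message)
```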