nltk.tag.senna.SennaChunkTagger

class documentation

class SennaChunkTagger(Senna):

Constructor: SennaChunkTagger(path, encoding)

Undocumented

Method	`__init__`	Undocumented
Method	`bio_to_chunks`	Extracts the chunks in a BIO chunk-tagged sentence.
Method	`tag_sents`	Applies the tag method over a list of sentences. For each sentence, returns a list of (word, tag) tuples, where tag is a BIO chunk tag.

def __init__(self, path, encoding='utf-8'):

Undocumented

def bio_to_chunks(self, tagged_sent, chunk_type):

Extracts the chunks in a BIO chunk-tagged sentence.

>>> from nltk.tag import SennaChunkTagger
>>> chktagger = SennaChunkTagger('/usr/share/senna-v3.0') # doctest: +SKIP
>>> sent = 'What is the airspeed of an unladen swallow ?'.split()
>>> tagged_sent = chktagger.tag(sent) # doctest: +SKIP
>>> tagged_sent # doctest: +SKIP
[('What', 'B-NP'), ('is', 'B-VP'), ('the', 'B-NP'), ('airspeed', 'I-NP'),
('of', 'B-PP'), ('an', 'B-NP'), ('unladen', 'I-NP'), ('swallow', 'I-NP'),
('?', 'O')]
>>> list(chktagger.bio_to_chunks(tagged_sent, chunk_type='NP')) # doctest: +SKIP
[('What', '0'), ('the airspeed', '2-3'), ('an unladen swallow', '5-6-7')]

Parameters
tagged_sent:list(tuple)	A list of (word, BIO chunk tag) tuples, such as the output of tag.
chunk_type:str	The chunk tag that users want to extract, e.g. 'NP' or 'VP'
Returns
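iter(tuple(str))	An iterable of (chunk, indices) tuples for the requested chunk type. chunk is the chunk's words joined by spaces; indices is the chunk's token positions joined by hyphens, e.g. ('the airspeed', '2-3').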
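
The extraction described above can be tried without a Senna installation. The following is a minimal sketch, not NLTK's implementation: `extract_bio_chunks` is a hypothetical stand-alone helper, and it assumes that a `B-` tag always opens a new chunk and an `I-` tag continues the current one. The tagged sentence is copied from the example above.

```python
def extract_bio_chunks(tagged_sent, chunk_type):
    """Yield (chunk_text, index_string) pairs for one chunk type.

    Hypothetical helper for illustration; not part of NLTK.
    """
    words, positions = [], []
    for idx, (word, tag) in enumerate(tagged_sent):
        starts = tag == "B-" + chunk_type
        continues = tag == "I-" + chunk_type
        # Close the open chunk when a new one starts or the tag type changes.
        if words and (starts or not continues):
            yield " ".join(words), "-".join(map(str, positions))
            words, positions = [], []
        if starts or continues:
            words.append(word)
            positions.append(idx)
    # Flush a chunk that runs to the end of the sentence.
    if words:
        yield " ".join(words), "-".join(map(str, positions))


# Tagged sentence taken from the documented example output.
tagged_sent = [
    ("What", "B-NP"), ("is", "B-VP"), ("the", "B-NP"), ("airspeed", "I-NP"),
    ("of", "B-PP"), ("an", "B-NP"), ("unladen", "I-NP"), ("swallow", "I-NP"),
    ("?", "O"),
]

chunks = list(extract_bio_chunks(tagged_sent, "NP"))
print(chunks)
```

The indices are returned as a hyphen-joined string rather than a list of integers, which matches the documented return shape; split on '-' and convert to int if numeric positions are needed.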

def tag_sents(self, sentences):

Applies the tag method over a list of sentences. For each sentence, returns a list of (word, tag) tuples, where tag is a BIO chunk tag.