class documentation

A lazy sequence whose elements are formed by applying a given function to each element in one or more underlying lists. The function is applied lazily -- i.e., when you read a value from the list, LazyMap will calculate that value by applying its function to the underlying lists' value(s). LazyMap is essentially a lazy version of the Python primitive function map. In particular, the following two expressions are equivalent:

>>> from nltk.collections import LazyMap
>>> function = str
>>> sequence = [1,2,3]
>>> list(map(function, sequence))
['1', '2', '3']
>>> list(LazyMap(function, sequence))
['1', '2', '3']

Like the Python 2 map primitive (and unlike Python 3's map, which stops at the shortest input), if the source lists do not have equal size, then the value None will be supplied for the 'missing' elements.
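The padding rule can be illustrated with the standard library alone. The sketch below does not use LazyMap; it assumes itertools.zip_longest as a stand-in for the None-padding behaviour and contrasts it with Python 3's built-in map. The names pair, long_list and short_list are hypothetical.

```python
from itertools import zip_longest


def pair(a, b):
    # Hypothetical two-argument function, used only for illustration.
    return (a, b)


long_list = [1, 2, 3]
short_list = ["x"]

# Python 3's built-in map stops as soon as the shortest input is exhausted.
truncated = list(map(pair, long_list, short_list))

# Padding with None for the 'missing' elements of the shorter list.
padded = [pair(*args) for args in zip_longest(long_list, short_list)]

print(truncated)  # [(1, 'x')]
print(padded)     # [(1, 'x'), (2, None), (3, None)]
```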

Lazy maps can be useful for conserving memory, in cases where individual values take up a lot of space. This is especially true if the underlying list's values are constructed lazily, as is the case with many corpus readers.

A typical example of a use case for this class is performing feature detection on the tokens in a corpus. Since featuresets are encoded as dictionaries, which can take up a lot of memory, using a LazyMap can significantly reduce memory usage when training and running classifiers.
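A minimal sketch of that use case, written without NLTK so that it runs on its own. SketchLazyMap is a hypothetical single-list stand-in, not the library's implementation, and the features function and token list are invented for the example.

```python
class SketchLazyMap:
    """Hypothetical stand-in for a lazy map over one list (not NLTK's code)."""

    def __init__(self, function, values):
        self._func = function
        self._values = values

    def __len__(self):
        return len(self._values)

    def __getitem__(self, index):
        # The function runs only when an element is actually read.
        return self._func(self._values[index])


calls = []


def features(token):
    # Hypothetical feature detector: builds one small dict per token
    # and records the call so that laziness is observable.
    calls.append(token)
    return {"lower": token.lower(), "length": len(token), "is_title": token.istitle()}


tokens = ["The", "quick", "brown", "fox"]
featuresets = SketchLazyMap(features, tokens)

print(len(calls))      # 0: no featureset has been built yet
print(featuresets[1])  # {'lower': 'quick', 'length': 5, 'is_title': False}
print(len(calls))      # 1: only the requested featureset was computed
```

With NLTK installed, LazyMap(features, tokens) from nltk.collections would take the place of SketchLazyMap here.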

Method __getitem__ Return the i-th token in the corpus file underlying this corpus view. Negative indices and spans are both supported.
Method __init__ No summary
Method __len__ Return the number of tokens in the corpus file underlying this corpus view.
Method iterate_from Return an iterator that generates the tokens in the corpus file underlying this corpus view, starting at token number index. If index >= len(self), then this iterator will generate no tokens.
Instance Variable _all_lazy Undocumented
Instance Variable _cache Undocumented
Instance Variable _cache_size Undocumented
Instance Variable _func Undocumented
Instance Variable _lists Undocumented

Inherited from AbstractLazySequence:

Method __add__ Return a list concatenating self with other.
Method __contains__ Return true if this list contains value.
Method __eq__ Undocumented
Method __hash__ No summary
Method __iter__ Return an iterator that generates the tokens in the corpus file underlying this corpus view.
Method __lt__ Undocumented
Method __mul__ Return a list concatenating self with itself count times.
Method __ne__ Undocumented
Method __radd__ Return a list concatenating other with self.
Method __repr__ Return a string representation for this corpus view that is similar to a list's representation; but if it would be more than 60 characters long, it is truncated.
Method __rmul__ Return a list concatenating self with itself count times.
Method count Return the number of times this list contains value.
Method index Return the index of the first occurrence of value in this list that is greater than or equal to start and less than stop. Negative start and stop values are treated like negative slice bounds -- i.e., they count from the end of the list.
Constant _MAX_REPR_SIZE Undocumented

def __getitem__(self, index): (source)

Return the i-th token in the corpus file underlying this corpus view. Negative indices and spans are both supported.

def __init__(self, function, *lists, **config): (source)
Parameters
function: The function that should be applied to elements of lists. It should take as many arguments as there are lists.
*lists: The underlying lists.
cache_size: Determines the size of the cache used by this lazy map. (default=5)
**config: Undocumented
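
How a small result cache of this kind might behave can be sketched without NLTK. CachedLazyMap is a hypothetical stand-in, not the library's implementation; in particular, the eviction policy (drop the oldest entry once the cache is full) is an assumption made for the example.

```python
class CachedLazyMap:
    """Hypothetical lazy map with a bounded result cache (illustration only)."""

    def __init__(self, function, values, cache_size=5):
        self._func = function
        self._values = values
        self._cache_size = cache_size
        # Maps index -> computed value; caching is disabled when cache_size is 0.
        self._cache = {} if cache_size > 0 else None

    def __getitem__(self, index):
        if self._cache is not None and index in self._cache:
            return self._cache[index]
        value = self._func(self._values[index])
        if self._cache is not None:
            if len(self._cache) >= self._cache_size:
                # Assumed policy: evict the oldest inserted entry.
                oldest = next(iter(self._cache))
                del self._cache[oldest]
            self._cache[index] = value
        return value


calls = []


def expensive(x):
    # Records each real computation so that cache hits are visible.
    calls.append(x)
    return x * 10


lazy = CachedLazyMap(expensive, [1, 2, 3], cache_size=2)
lazy[0]
lazy[0]        # served from the cache, no new call
print(calls)   # [1]
lazy[1]
lazy[2]        # cache is full, so index 0 is evicted
lazy[0]        # recomputed after eviction
print(calls)   # [1, 2, 3, 1]
```

A larger cache_size trades memory for fewer repeated calls to the function, which matters most when the function is expensive and the same positions are read repeatedly.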
def __len__(self): (source)

Return the number of tokens in the corpus file underlying this corpus view.

def iterate_from(self, index): (source)

Return an iterator that generates the tokens in the corpus file underlying this corpus view, starting at token number index. If index >= len(self), then this iterator will generate no tokens.
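The contract described above (start at a given position, and yield nothing when that position is at or past the end) can be sketched as a plain generator. The function iterate_from_sketch and its arguments are hypothetical and only mirror the description; this is not the library's code, and it assumes a non-negative start position.

```python
def iterate_from_sketch(function, values, start):
    """Yield function(values[i]) for i = start, start + 1, ... (illustration only)."""
    index = start  # assumed to be non-negative
    while index < len(values):
        # Each element is transformed only when the consumer asks for it.
        yield function(values[index])
        index += 1


words = ["a", "b", "c", "d"]
print(list(iterate_from_sketch(str.upper, words, 2)))  # ['C', 'D']
print(list(iterate_from_sketch(str.upper, words, 4)))  # []
```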

_all_lazy = (source)

Undocumented

_cache = (source)

Undocumented

_cache_size = (source)

Undocumented

_func = (source)

Undocumented

_lists = (source)

Undocumented