class LazyZip(LazyMap):
Known subclasses: nltk.collections.LazyEnumerate
Constructor: LazyZip(*lists)
A lazy sequence whose elements are tuples, each containing the i-th element from each of the argument sequences. The resulting sequence is truncated to the length of the shortest argument sequence. The tuples are constructed lazily -- i.e., when you read a value from the sequence, LazyZip calculates that value by forming a tuple from the i-th element of each of the argument sequences.
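The idea of forming each tuple only at read time can be sketched in a few lines of plain Python. The class name SketchLazyZip is hypothetical and exists only for illustration; it is not NLTK's implementation, and it supports integer indices only.

```python
class SketchLazyZip:
    """Hypothetical stand-in for illustration; not NLTK's implementation."""

    def __init__(self, *lists):
        # Keep references to the argument sequences; no tuples are built here.
        self._lists = lists

    def __len__(self):
        # Truncate to the shortest argument sequence.
        return min(len(lst) for lst in self._lists)

    def __getitem__(self, i):
        if not isinstance(i, int):
            raise TypeError("this sketch supports integer indices only")
        if i < 0:
            i += len(self)
        if not 0 <= i < len(self):
            raise IndexError("index out of range")
        # The tuple is formed only now, when the value is read.
        return tuple(lst[i] for lst in self._lists)


pairs = SketchLazyZip([1, 2, 3], ["a", "b", "c", "d"])
first = pairs[0]   # (1, 'a')
last = pairs[-1]   # (3, 'c'); the extra 'd' is never reached
```

The sketch stores only references to its arguments, so its own memory cost does not grow with the length of the sequences.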
LazyZip is essentially a lazy, indexable version of Python's built-in zip function. In particular, an evaluated LazyZip is equivalent to an evaluated zip:
>>> from nltk.collections import LazyZip
>>> sequence1, sequence2 = [1, 2, 3], ['a', 'b', 'c']
>>> zip(sequence1, sequence2) # doctest: +SKIP
[(1, 'a'), (2, 'b'), (3, 'c')]
>>> list(LazyZip(sequence1, sequence2))
[(1, 'a'), (2, 'b'), (3, 'c')]
>>> sequences = [sequence1, sequence2, [6,7,8,9]]
>>> list(zip(*sequences)) == list(LazyZip(*sequences))
True
Lazy zips can be useful for conserving memory in cases where the argument sequences are particularly long.
A typical use case for this class is combining long sequences of gold standard and predicted values in a classification or tagging task in order to calculate accuracy. Because the tuples are constructed lazily, no additional long sequence of pairs is created, which can significantly reduce memory usage.
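The accuracy use case might look like the following minimal sketch. The tag lists are made-up illustration data, and the variable names are assumptions rather than part of NLTK.

```python
from nltk.collections import LazyZip

# Made-up gold-standard and predicted tags, for illustration only.
gold = ["NN", "VB", "DT", "NN"]
predicted = ["NN", "NN", "DT", "NN"]

# No list of pairs is materialized; each tuple is formed when it is read.
pairs = LazyZip(gold, predicted)

correct = sum(1 for g, p in pairs if g == p)
accuracy = correct / len(pairs)   # 3 of the 4 tags match

# Individual pairs can still be read by index.
second_pair = pairs[1]            # ('VB', 'NN')
```

With sequences this short the saving is negligible; the benefit appears when gold and predicted are themselves long or lazily loaded sequences.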
Method | __init__ |
No summary |
Method | __len__ |
Return the number of tuples in this lazy sequence, i.e., the length of the shortest argument sequence. |
Method | iterate_from |
Return an iterator that generates the tuples in this lazy sequence, starting at index start. If start>=len(self), then this iterator will generate no tuples. |
Inherited from LazyMap:
Method | __getitem__ |
Return the i-th element of this lazy sequence. Negative indices and slices are both supported. |
Instance Variable | _all_lazy |
Undocumented |
Instance Variable | _cache |
Undocumented |
Instance Variable | _cache_size |
Undocumented |
Instance Variable | _func |
Undocumented |
Instance Variable | _lists |
Undocumented |
Inherited from AbstractLazySequence (via LazyMap):
Method | __add__ |
Return a list concatenating self with other. |
Method | __contains__ |
Return true if this list contains value. |
Method | __eq__ |
Undocumented |
Method | __hash__ |
No summary |
Method | __iter__ |
Return an iterator that generates the elements of this lazy sequence. |
Method | __lt__ |
Undocumented |
Method | __mul__ |
Return a list concatenating self with itself count times. |
Method | __ne__ |
Undocumented |
Method | __radd__ |
Return a list concatenating other with self. |
Method | __repr__ |
Return a string representation for this lazy sequence that is similar to a list's representation, but is truncated if it would be more than 60 characters long. |
Method | __rmul__ |
Return a list concatenating self with itself count times. |
Method | count |
Return the number of times this list contains value. |
Method | index |
Return the index of the first occurrence of value in this list that is greater than or equal to start and less than stop. Negative start and stop values are treated like negative slice bounds -- i.e., they count from the end of the list. |
Constant | _MAX |
Undocumented |
overrides nltk.collections.LazyMap.__init__
overridden in nltk.collections.LazyEnumerate
Parameters | |
*lists:list(list) | the underlying lists |
overrides nltk.collections.LazyMap.__len__
Return the number of tuples in this lazy sequence, i.e., the length of the shortest argument sequence.
overrides nltk.collections.LazyMap.iterate_from
Return an iterator that generates the tuples in this lazy sequence, starting at index start. If start>=len(self), then this iterator will generate no tuples.
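As a brief, hedged illustration of iterate_from, the sketch below resumes iteration partway through a LazyZip built from two sequences of unequal length. The word and tag lists are made-up illustration data.

```python
from nltk.collections import LazyZip

words = ["the", "cat", "sat", "down"]
tags = ["DT", "NN", "VBD"]  # one element shorter than words

pairs = LazyZip(words, tags)

# Resume iteration at position 1; the result stops with the shorter sequence.
tail = list(pairs.iterate_from(1))

# A start position at or past len(pairs) generates nothing.
empty = list(pairs.iterate_from(len(pairs)))
```

Here tail holds the pairs for "cat" and "sat" only, because "down" has no matching tag.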