class Downloader(object):
Constructor: Downloader(server_index_url, download_dir)
A class used to access the NLTK data server, which can be used to download corpora and other data packages.
Method | __init__ |
Undocumented |
Method | clear |
Undocumented |
Method | collections |
Undocumented |
Method | corpora |
Undocumented |
Method | default_download_dir |
Return the directory to which packages will be downloaded by default. This value can be overridden using the constructor, or on a case-by-case basis using the download_dir argument when calling download()... |
Method | download |
Undocumented |
Method | incr |
Undocumented |
Method | index |
Return the XML index describing the packages available from the data server. If necessary, this index will be downloaded from the data server. |
Method | info |
Return the Package or Collection record for the given item. |
Method | is |
Undocumented |
Method | is |
Undocumented |
Method | list |
Undocumented |
Method | models |
Undocumented |
Method | packages |
Undocumented |
Method | status |
Return a constant describing the status of the given package or collection. Status can be one of INSTALLED, NOT_INSTALLED, STALE, or PARTIAL. |
Method | update |
Re-download any packages whose status is STALE. |
Method | xmlinfo |
Return the XML info record for the given item. |
Constant | DEFAULT |
The default URL for the NLTK data server's index. An alternative URL can be specified when creating a new Downloader object. |
Constant | INDEX_TIMEOUT |
The time, in seconds, after which the cached copy of the data server index is considered stale and is re-downloaded. |
Constant | INSTALLED |
A status string indicating that a package or collection is installed and up-to-date. |
Constant | NOT_INSTALLED |
A status string indicating that a package or collection is not installed. |
Constant | PARTIAL |
A status string indicating that a collection is partially installed (i.e., only some of its packages are installed). |
Constant | STALE |
A status string indicating that a package or collection is corrupt or out-of-date. |
Class Variable | download |
Undocumented |
Class Variable | url |
Undocumented |
Method | _download |
Undocumented |
Method | _download |
Undocumented |
Method | _get |
The default directory to which packages will be downloaded. This defaults to the value returned by default_download_dir(). To override this default on a case-by-case basis, use the download_dir argument when calling ... |
Method | _get |
The URL for the data server's index file. |
Method | _info |
Undocumented |
Method | _interactive |
Undocumented |
Method | _num |
Undocumented |
Method | _pkg |
Undocumented |
Method | _set |
Undocumented |
Method | _set |
Set a new URL for the data server. If the given URL cannot be contacted, the original URL is kept. |
Method | _update |
A helper function that ensures that self._index is up-to-date. If the index is older than self.INDEX_TIMEOUT, then download it again. |
Instance Variable | _collections |
Dictionary from collection identifier to Collection |
Instance Variable | _download |
The default directory to which packages will be downloaded. |
Instance Variable | _errors |
Flag indicating whether all packages were downloaded successfully. |
Instance Variable | _index |
The XML index file downloaded from the data server |
Instance Variable | _index |
Time at which self._index was downloaded. If it is more than INDEX_TIMEOUT seconds old, it will be re-downloaded. |
Instance Variable | _packages |
Dictionary from package identifier to Package |
Instance Variable | _status |
Dictionary from package/collection identifier to status string (INSTALLED, NOT_INSTALLED, STALE, or PARTIAL). Cache is used for packages only, not collections. |
Instance Variable | _url |
The URL for the data server's index file. |
Return the directory to which packages will be downloaded by default. This value can be overridden using the constructor, or on a case-by-case basis using the download_dir argument when calling download().
On Windows, the default download directory is PYTHONHOME/lib/nltk, where PYTHONHOME is the directory containing Python, e.g. C:\Python25.
On all other platforms, the default directory is the first of the following which exists or which can be created with write permission: /usr/share/nltk_data, /usr/local/share/nltk_data, /usr/lib/nltk_data, /usr/local/lib/nltk_data, ~/nltk_data.
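The search order described above can be sketched as a small helper. This is a standard-library illustration only, not NLTK's actual implementation: the name `first_usable_dir` is hypothetical, and "can be created with write permission" is approximated by checking that the nearest existing ancestor is a writable directory, without creating anything on disk.

```python
import os


def first_usable_dir(candidates):
    """Return the first candidate that is a writable directory or looks creatable.

    Hypothetical helper: "creatable" is approximated by checking the nearest
    existing ancestor; nothing is actually created.
    """
    for raw in candidates:
        path = os.path.abspath(os.path.expanduser(raw))
        if os.path.exists(path):
            # An existing candidate qualifies only if it is a writable directory.
            if os.path.isdir(path) and os.access(path, os.W_OK):
                return path
            continue
        # Walk up to the nearest ancestor that already exists.
        ancestor = os.path.dirname(path)
        while not os.path.exists(ancestor):
            parent = os.path.dirname(ancestor)
            if parent == ancestor:  # reached the filesystem root
                break
            ancestor = parent
        if os.path.isdir(ancestor) and os.access(ancestor, os.W_OK | os.X_OK):
            return path
    return None


# Candidate list taken from the description above.
POSIX_CANDIDATES = [
    "/usr/share/nltk_data",
    "/usr/local/share/nltk_data",
    "/usr/lib/nltk_data",
    "/usr/local/lib/nltk_data",
    "~/nltk_data",
]
```

Calling `first_usable_dir(POSIX_CANDIDATES)` as an unprivileged user would typically fall through to the home-directory entry, since the system locations are usually not writable.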
Undocumented
Return the XML index describing the packages available from the data server. If necessary, this index will be downloaded from the data server.
Undocumented
Return a constant describing the status of the given package or collection. Status can be one of INSTALLED, NOT_INSTALLED, STALE, or PARTIAL.
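As a rough illustration of how the status constants might be consumed, the sketch below groups item identifiers by the value that `status()` returns. It assumes only what is described above: an object with a `status()` method and the four constants. `group_by_status` and `StubDownloader` are hypothetical names, and the stub's constant values are placeholders rather than the real status strings.

```python
from collections import defaultdict


def group_by_status(downloader, item_ids):
    """Map each status constant to the list of item ids that currently have it."""
    groups = defaultdict(list)
    for item_id in item_ids:
        groups[downloader.status(item_id)].append(item_id)
    return dict(groups)


class StubDownloader:
    """Hypothetical stand-in so the sketch runs without NLTK or a network.

    The constant values are placeholders, not the real status strings.
    """

    INSTALLED = "INSTALLED"
    NOT_INSTALLED = "NOT_INSTALLED"
    STALE = "STALE"
    PARTIAL = "PARTIAL"

    def __init__(self, statuses):
        self._statuses = statuses

    def status(self, item_id):
        # Unknown ids are treated as not installed in this stub.
        return self._statuses.get(item_id, self.NOT_INSTALLED)


stub = StubDownloader({"pkg_a": StubDownloader.INSTALLED, "pkg_b": StubDownloader.STALE})
groups = group_by_status(stub, ["pkg_a", "pkg_b", "pkg_c"])
needs_update = groups.get(stub.STALE, [])
```

Because `update()` re-downloads any package whose status is STALE, a non-empty `needs_update` list indicates that a call to `update()` would have work to do.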
The default URL for the NLTK data server's index. An alternative URL can be specified when creating a new Downloader object.
The time, in seconds, after which the cached copy of the data server index is considered stale and is re-downloaded.
A status string indicating that a package or collection is installed and up-to-date.
A status string indicating that a collection is partially installed (i.e., only some of its packages are installed).
A status string indicating that a package or collection is corrupt or out-of-date.
The default directory to which packages will be downloaded. This defaults to the value returned by default_download_dir(). To override this default on a case-by-case basis, use the download_dir argument when calling download().
Set a new URL for the data server. If the given URL cannot be contacted, the original URL is kept.
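One way to get the "keep the original on failure" behaviour is to contact the new URL before storing it. The sketch below is an assumption-laden illustration, not the library's code: `IndexLocation` and the injected `fetch_index` callable are hypothetical, and letting the error propagate to the caller is a design choice made here for clarity.

```python
class IndexLocation:
    """Hypothetical holder for an index URL that only accepts reachable URLs."""

    def __init__(self, url, fetch_index):
        self._url = url
        # fetch_index(url) is an assumed callable that raises OSError on failure.
        self._fetch_index = fetch_index

    @property
    def url(self):
        return self._url

    @url.setter
    def url(self, new_url):
        # Contact the new URL first; if this raises, self._url is never touched.
        self._fetch_index(new_url)
        self._url = new_url
```

Validating before assigning avoids an explicit rollback step: a failed fetch raises before the stored URL changes.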
A helper function that ensures that self._index is up-to-date. If the index is older than self.INDEX_TIMEOUT, then download it again.
Time at which self._index was downloaded. If it is more than INDEX_TIMEOUT seconds old, it will be re-downloaded.
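The timeout check described above amounts to a time-based cache. A minimal sketch follows, under stated assumptions: `CachedIndex`, the injected `fetch` callable, and the `clock` parameter are hypothetical names introduced for illustration, and the real helper may differ in detail.

```python
import time


class CachedIndex:
    """Hypothetical time-based cache: re-fetch once the copy exceeds a timeout."""

    def __init__(self, fetch, timeout_seconds, clock=time.time):
        self._fetch = fetch              # assumed callable returning a fresh index
        self._timeout = timeout_seconds  # plays the role of INDEX_TIMEOUT
        self._clock = clock              # injectable so the logic can be tested
        self._index = None
        self._index_timestamp = None

    def get(self):
        now = self._clock()
        is_stale = (
            self._index_timestamp is None
            or now - self._index_timestamp > self._timeout
        )
        if is_stale:
            # Download again and record when this copy was obtained.
            self._index = self._fetch()
            self._index_timestamp = now
        return self._index
```

Passing the clock in as a parameter keeps the staleness rule testable without sleeping; in ordinary use the default `time.time` applies.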