Utility functions for the :module:`twitterclient` module which do not require the twython library to have been installed.
Function | extract |
Extract field values from a full tweet and return them as a list |
Function | get |
Undocumented |
Function | json2csv |
Extract selected fields from a file of line-separated JSON tweets and write to a file in CSV format. |
Function | json2csv_entities |
Extract selected fields from a file of line-separated JSON tweets and write to a file in CSV format. |
Function | outf_writer_compat |
Get a CSV writer with optional compression. |
Constant | HIER |
Undocumented |
Function | _add |
Undocumented |
Function | _get |
Undocumented |
Function | _get |
Undocumented |
Function | _is |
Undocumented |
Function | _outf |
Undocumented |
Function | _write |
Undocumented |
Extract field values from a full tweet and return them as a list
Parameters | |
tweet | json: The tweet in JSON format |
fields | list: The fields to be extracted from the tweet |
Returns | |
list(str) | Undocumented |
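The field extraction described above can be sketched with plain dictionaries. This is a minimal illustration, not the module's own code: `extract_values` is a hypothetical helper, the dotted-path handling (e.g. 'user.id') is an assumption based on the field examples given for json2csv, and the sketch returns values as found rather than converting them to strings.

```python
def extract_values(tweet, fields):
    """Return the values of `fields` from a tweet dict, as a list.

    Hypothetical helper for illustration only.
    """
    values = []
    for field in fields:
        node = tweet
        # Walk dotted paths such as 'user.id' one key at a time.
        for key in field.split("."):
            node = node[key]
        values.append(node)
    return values


# A hand-written stand-in for a full tweet (not real data).
tweet = {"id_str": "1", "text": "hello", "user": {"id": 42, "followers_count": 7}}
row = extract_values(tweet, ["id_str", "text", "user.id"])
```

A missing key raises `KeyError` in this sketch; a production version would need to decide whether to skip the field or emit an empty value.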
Extract selected fields from a file of line-separated JSON tweets and write to a file in CSV format.
This utility function allows a file of full tweets to be easily converted to a CSV file for easier processing. For example, just TweetIDs or just the text content of the Tweets can be extracted.
Additionally, the function allows combinations of fields of other Twitter objects (mainly the users, see below).
For Twitter entities (e.g. hashtags of a Tweet), and for geolocation, see json2csv_entities.
Parameters | |
fp | Undocumented |
outfile | str: The name of the text file where results should be written |
fields | list: The list of fields to be extracted. Useful examples are 'id_str' for the tweetID and 'text' for the text of the tweet. See <https://dev.twitter.com/overview/api/tweets> for a full list of fields, e.g. ['id_str'] or ['id', 'text', 'favorite_count', 'retweet_count']. Fields of other Twitter objects can also be given, e.g. ['id', 'text', 'user.id', 'user.followers_count', 'user.friends_count'] |
encoding | Undocumented |
errors | Behaviour for encoding errors, see https://docs.python.org/3/library/codecs.html#codec-base-classes |
gzip | if True, output files are compressed with gzip |
str infile | The name of the file containing full tweets |
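The conversion from line-separated JSON to CSV can be approximated with the standard library alone. The sketch below is an assumption-laden illustration, not the module's implementation: `tweets_to_csv` and `lookup` are hypothetical names, the header row is an assumption, and in-memory streams stand in for real files.

```python
import csv
import io
import json


def lookup(obj, dotted):
    """Resolve a dotted field name such as 'user.followers_count'."""
    for key in dotted.split("."):
        obj = obj[key]
    return obj


def tweets_to_csv(lines, out, fields):
    """Write one CSV row per JSON tweet found in `lines` (hypothetical helper)."""
    writer = csv.writer(out)
    writer.writerow(fields)  # header row: an assumption of this sketch
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between tweets
        tweet = json.loads(line)
        writer.writerow([lookup(tweet, f) for f in fields])


# Two hand-written tweets, one JSON object per line (not real data).
source = io.StringIO(
    '{"id": 1, "text": "first", "user": {"id": 10, "followers_count": 5}}\n'
    '{"id": 2, "text": "second", "user": {"id": 20, "followers_count": 8}}\n'
)
out = io.StringIO()
tweets_to_csv(source, out, ["id", "text", "user.followers_count"])
csv_text = out.getvalue()
```

Passing any iterable of lines keeps the sketch independent of how the input file is opened, which mirrors the file-like `fp` parameter listed above.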
Extract selected fields from a file of line-separated JSON tweets and write to a file in CSV format.
This utility function allows a file of full Tweets to be easily converted to a CSV file for easier processing of Twitter entities. For example, the hashtags or media elements of a tweet can be extracted.
It returns one line per entity of a Tweet, e.g. if a tweet has two hashtags there will be two lines in the output file, one per hashtag.
Examples of main_fields: ['id_str'], ['id', 'text', 'favorite_count', 'retweet_count'].
If entity_type is expressed with a hierarchy, then main_fields is the list of fields of the object that corresponds to the key of the entity_type (e.g., for entity_type='user.urls', the fields in the main_fields list belong to the user object; for entity_type='place.bounding_box', the fields in the main_fields list belong to the place object of the tweet).
Parameters | |
tweets | the file-like object containing full Tweets |
outfile | str: The path of the text file where results should be written |
main | list: The list of fields to be extracted from the main object, usually the tweet. Useful examples: 'id_str' for the tweetID. See <https://dev.twitter.com/overview/api/tweets> for a full list of fields. |
entity | list: The name of the entity: 'hashtags', 'media', 'urls' and 'user_mentions' for the tweet object. For a user object, this needs to be expressed with a hierarchy: 'user.urls'. For the bounding box of the Tweet location, use 'place.bounding_box'. |
entity | list: The list of fields to be extracted from the entity. E.g. ['text'] (of the Tweet) |
encoding | Undocumented |
errors | Behaviour for encoding errors, see https://docs.python.org/3/library/codecs.html#codec-base-classes |
gzip | if True, output files are compressed with gzip |
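The "one line per entity" behaviour can be illustrated for the simple, non-hierarchical case. This is a sketch under stated assumptions: `entity_rows` is a hypothetical helper, it only handles entities stored under the tweet's `entities` key (such as 'hashtags'), and hierarchical types like 'user.urls' or 'place.bounding_box' are deliberately left out.

```python
def entity_rows(tweet, main_fields, entity_type, entity_fields):
    """Return one row per entity: main-object values followed by entity values.

    Hypothetical helper; covers only entity types found in tweet['entities'].
    """
    main_values = [tweet[f] for f in main_fields]
    rows = []
    for entity in tweet.get("entities", {}).get(entity_type, []):
        rows.append(main_values + [entity[f] for f in entity_fields])
    return rows


# A hand-written tweet with two hashtags (not real data).
tweet = {
    "id_str": "99",
    "text": "#nlp with #python",
    "entities": {"hashtags": [{"text": "nlp"}, {"text": "python"}]},
}
rows = entity_rows(tweet, ["id_str"], "hashtags", ["text"])
```

A tweet with no entities of the requested type yields no rows in this sketch, so it would not appear in the output at all.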
def outf_writer_compat(outfile, encoding, errors, gzip_compress=False):
Get a CSV writer with optional compression.
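One way to obtain a CSV writer with optional gzip compression, using only the standard library, is sketched below. It is not the body of outf_writer_compat: `open_csv_writer` is a hypothetical name, returning the file handle alongside the writer is a design choice of this sketch, and appending '.gz' to the path is an assumption.

```python
import csv
import gzip
import os
import tempfile


def open_csv_writer(path, encoding="utf-8", errors="replace", compress=False):
    """Return (writer, handle); the caller is responsible for closing handle."""
    if compress:
        # Text-mode gzip stream; the '.gz' suffix is an assumption of this sketch.
        handle = gzip.open(path + ".gz", "wt", encoding=encoding,
                           errors=errors, newline="")
    else:
        handle = open(path, "w", encoding=encoding, errors=errors, newline="")
    return csv.writer(handle), handle


# Write two rows to a compressed file in a temporary directory, then read them back.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "tweets.csv")
writer, handle = open_csv_writer(path, compress=True)
writer.writerow(["id_str", "text"])
writer.writerow(["1", "hello"])
handle.close()

with gzip.open(path + ".gz", "rt", encoding="utf-8", newline="") as f:
    rows = list(csv.reader(f))
```

Returning the handle lets the caller close it explicitly, which matters for gzip output because the compressed stream is only complete once it has been closed.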