Twitter data is available for academics and researchers via the Twitter API. The process to gain access is quite easy, but programming skills are required.
First, you will need to gain access credentials from Twitter through their developer website:
1. Apply for a developer account and identify yourself as an "academic or researcher".
You will be directed to a login page if you are not already logged in. Create a Twitter account if you don’t have one.
2. Select “doing academic research”. This means that you are “doing research to advance human understanding of a topic through Twitter data.”
3. Confirm the basic information and select "Next".
4. Complete the application form describing how you will use the Twitter API or Twitter data. Click “Next” and confirm that everything looks good on the following page. Click “Looks Good” to submit your application.
5. Twitter will send you an email (typically within a day or so) confirming your access. Once you receive this email, sign in to your developer account to get started.
6. Hover over your name, click “Apps”, and select "Create an app".
7. Complete the form with information about your app. Your Callback URL should be http://localhost:1410. When you're done, select "Create".
8. Once you have created your app, return to the app page and click on your app.
9. Click “Keys and tokens” to find your Consumer API keys, access token, and access token secret. You will need this information to connect to the API through your program of choice.
10. Save this information in a secure place. For security reasons, Twitter will not show your access token and access token secret again after you first generate them. You can regenerate them from this same webpage, but doing so will invalidate your current access token and secret.
11. Use your programming language of choice to pull Tweets using the credentials gained in steps 1-10.
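One way to keep the credentials from step 10 out of your source code is to read them from environment variables. The following is a minimal Python sketch of that idea, not a required setup: the variable names and the `load_credentials` helper are hypothetical, and you would pass the resulting values to whichever Twitter client library you choose.

```python
import os

# Hypothetical environment variable names; pick whatever names suit your setup.
REQUIRED_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_credentials(env=None):
    """Return the four credentials as a dict, or raise if any is missing."""
    env = os.environ if env is None else env
    # Treat unset and empty variables the same way: both count as missing.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing credentials: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Failing early with the names of the missing variables makes it obvious when a token has been regenerated on the developer site but not updated locally.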
Note that under this free access for academics and researchers, data is only available for the past 7 days! To build a longer record, you will need to pull new Tweets regularly (for example, every day). Learn more from Twitter about exactly what is available in the "standard" tier.
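Because consecutive daily pulls over a 7-day window overlap, it helps to deduplicate Tweets as you store them. Here is a minimal sketch, assuming each Tweet is a Python dict with a `status_id` key (as in the field list in this article); the `merge_pull` and `newest_status_id` helpers are hypothetical names used only for illustration.

```python
def merge_pull(archive, new_tweets):
    """Add newly pulled Tweets to the archive, skipping ones already stored.

    archive: dict mapping status_id (str) to a Tweet record (dict)
    new_tweets: iterable of Tweet records, each with a "status_id" key
    Returns the number of Tweets that were actually new.
    """
    added = 0
    for tweet in new_tweets:
        status_id = str(tweet["status_id"])
        if status_id not in archive:
            archive[status_id] = tweet
            added += 1
    return added


def newest_status_id(archive):
    """Return the largest stored status_id as a string, or None if empty."""
    if not archive:
        return None
    # IDs are compared as integers so that "9" sorts before "10".
    return str(max(int(status_id) for status_id in archive))
```

The newest stored ID can be useful as a lower bound for the next pull if your client supports such a parameter (the standard search API calls it `since_id`), so that each request mostly returns Tweets you have not seen yet.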
You can search for Tweets with a query (a keyword or account name), optionally narrowed by filters such as location and language, and Twitter will return the most recent or most popular matches. The search favors relevance over completeness, so some matching Tweets may be missing from the results.
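To make the query options concrete, the sketch below builds a request URL for the standard search endpoint using only the Python standard library. It assumes the v1.1 `search/tweets.json` endpoint and its `q`, `lang`, `geocode`, `result_type`, and `count` parameters; check Twitter's documentation for what your tier currently supports. The `build_search_url` helper is hypothetical, and the sketch does not send the request, which would also need to be signed with your credentials.

```python
from urllib.parse import urlencode

# Assumed standard search endpoint; verify against Twitter's documentation.
SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_url(query, lang=None, geocode=None, result_type="recent", count=100):
    """Build a search URL for a keyword or account query with optional filters."""
    if result_type not in ("recent", "popular", "mixed"):
        raise ValueError("result_type must be 'recent', 'popular', or 'mixed'")
    params = {"q": query, "result_type": result_type, "count": count}
    if lang:
        params["lang"] = lang        # language code, e.g. "en"
    if geocode:
        params["geocode"] = geocode  # "latitude,longitude,radius", e.g. "37.78,-122.40,10km"
    # urlencode percent-encodes special characters such as "#" and ":".
    return SEARCH_URL + "?" + urlencode(params)
```

For example, `build_search_url("#rstats", lang="en")` requests recent English-language Tweets containing that hashtag, while `build_search_url("from:twitter", result_type="popular")` asks for popular Tweets from a single account.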
For each Tweet, you can request the following data:
user_id | status_id | created_at |
screen_name | text | source | display_text_width |
reply_to_status_id | reply_to_user_id | reply_to_screen_name |
is_quote | is_retweet | favorite_count |
retweet_count | hashtags | symbols | urls_url |
urls_t.co | urls_expanded_url | media_url |
media_type | profile_banner_url |
ext_media_url | ext_media_t.co | ext_media_expanded_url |
ext_media_type | mentions_user_id | mentions_screen_name |
lang | quoted_status_id | quoted_text |
quoted_created_at | quoted_source | quoted_favorite_count |
quoted_retweet_count | quoted_user_id | quoted_screen_name |
quoted_name | quoted_followers_count | quoted_friends_count |
quoted_statuses_count | quoted_location | profile_background_url |
quoted_description | quoted_verified | retweet_status_id |
retweet_text | retweet_created_at | retweet_source |
retweet_favorite_count | retweet_retweet_count | retweet_user_id |
retweet_screen_name | retweet_name | retweet_followers_count |
retweet_friends_count | retweet_statuses_count | retweet_location |
retweet_description | retweet_verified | place_url |
place_name | place_full_name | place_type |
profile_image_url | country | country_code |
geo_coords | coords_coords | bbox_coords |
status_url | name | location |
description | url | protected |
followers_count | friends_count | listed_count |
statuses_count | favourites_count | account_created_at |
verified | profile_url | profile_expanded_url |
account_lang |
(list courtesy of Rania Wazir)
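Most analyses need only a handful of these fields. As an illustration, the following Python sketch writes a chosen subset to CSV text, assuming each Tweet is a dict keyed by the field names listed above. The `KEEP` selection and the `tweets_to_csv` helper are hypothetical; adjust the field list to your research question.

```python
import csv
import io

# Hypothetical selection of fields; any names from the list above can be used.
KEEP = ["status_id", "created_at", "screen_name", "text", "retweet_count"]


def tweets_to_csv(tweets, fields=KEEP):
    """Return CSV text with one row per Tweet, keeping only the chosen fields."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=fields,
        extrasaction="ignore",  # silently drop fields that are not in `fields`
        restval="",             # leave a blank cell when a Tweet lacks a field
    )
    writer.writeheader()
    for tweet in tweets:
        writer.writerow(tweet)
    return buffer.getvalue()
```

Keeping `status_id` in every export is a reasonable default, since it lets you join the file back to the full records later.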
Twitter’s Developer Terms describe how to use data ethically: