Measuring underrepresentation of women in media outlets' online posts

Possible for:

Twitter

Election Monitoring

The purpose of this analysis is to quantify how much more or less mainstream media outlets (e.g. @BBCNews) post on social media about female figures (e.g. politicians and activists) than about their male counterparts.

1. Research question:

Do mainstream media sources give equal coverage to male and female figures on social media platforms?
Do mainstream media sources focus on female leaders’ clothing choices, personal life, social role (e.g. being a mother) and supposedly excessively or insufficiently feminine demeanour more than on those of their male counterparts?

2. Sample selection

Generate a list of the top 5-20 news media outlets, based on your own expertise and research in your country context.
Try to balance this list with regard to confounding factors (e.g. ideological leaning), and label each outlet accordingly.
Limit your sample to a particular topic (e.g. politicians) and/or political event (e.g. an election).
Select a list of keywords to filter by, so that you capture all posts related to your topic or political event (e.g. US Election 2020, Elizabeth Warren).
Select a time period wide enough to cover your topic or event (e.g. the full campaign period for an election).
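
As an illustration of the keyword and time-period filter described above, here is a minimal Python sketch. It assumes the posts have already been exported into a list of records; the field names (outlet, date, text), the placeholder post texts, the keyword list and the dates are all hypothetical and should be replaced with your own.

```python
from datetime import date

# Hypothetical placeholder records; in practice these come from your own data export.
posts = [
    {"outlet": "@BBCNews", "date": date(2020, 10, 5),
     "text": "Placeholder post about Elizabeth Warren and the US Election 2020"},
    {"outlet": "@BBCNews", "date": date(2019, 1, 2),
     "text": "Placeholder post about Elizabeth Warren, outside the study period"},
    {"outlet": "@BBCNews", "date": date(2020, 10, 6),
     "text": "Placeholder post on an unrelated topic"},
]

# Assumed study design: keywords are matched case-insensitively as substrings.
KEYWORDS = ["us election 2020", "elizabeth warren"]
START, END = date(2020, 9, 1), date(2020, 11, 30)


def in_sample(post, keywords=KEYWORDS, start=START, end=END):
    """Return True if the post falls in the study period and mentions a keyword."""
    text = post["text"].lower()
    in_period = start <= post["date"] <= end
    return in_period and any(keyword in text for keyword in keywords)


sample = [post for post in posts if in_sample(post)]
print(f"{len(sample)} of {len(posts)} posts kept")
```

Substring matching is deliberately simple and may miss spelling variants or nicknames, so it is worth reading a handful of excluded posts to check that the keyword list is not too narrow.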

3. Gather your data

Filter your data set down to the selected sample and gather each post’s text and interactions (e.g. likes, shares, replies).

4. Classify your data

a. Identify and label posts according to whether the key subject of the post is (0) male, (1) female, (2) both or (3) neither.

i. If studying elections, label posts with the candidate’s name and other relevant information:

- Topic of post (e.g. politician’s outfit)
- Sentiment of post (e.g. criticizing or praising the individual)

b. Classify whether and how each post portrays leaders in a traditionally gendered way: (0) non-gendered, (1) fashion, (2) personal life, (3) social role, (4) excessively feminine/masculine, (5) insufficiently feminine/masculine.
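
If the hand-coded labels are stored in a spreadsheet or CSV, the two coding schemes above can be written down as a small codebook and used to catch typing errors. The following Python sketch is one possible way to do this; the column names subject and framing and the function validate_label are hypothetical, not part of the methodology.

```python
# Codebook taken from steps 4a and 4b above.
SUBJECT_CODES = {0: "male", 1: "female", 2: "both", 3: "neither"}
FRAMING_CODES = {
    0: "non-gendered",
    1: "fashion",
    2: "personal life",
    3: "social role",
    4: "excessively feminine/masculine",
    5: "insufficiently feminine/masculine",
}


def validate_label(row):
    """Return a list of problems found in one hand-coded row (empty if none)."""
    problems = []
    if row.get("subject") not in SUBJECT_CODES:
        problems.append(f"unknown subject code: {row.get('subject')!r}")
    if row.get("framing") not in FRAMING_CODES:
        problems.append(f"unknown framing code: {row.get('framing')!r}")
    return problems


# Hypothetical rows: the second one contains a code outside the codebook.
rows = [{"subject": 1, "framing": 1}, {"subject": 7, "framing": 0}]
for number, row in enumerate(rows, start=1):
    for problem in validate_label(row):
        print(f"row {number}: {problem}")
```

Keeping the codes in one place also makes it easier to share the scheme between several coders and to compare their labels later.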

5. Analyse your data

No coding required
- Calculate the total number of posts classified into each of the four subject categories (male, female, both, neither). This can be visualized well with a bar graph, with the categories on the X axis and post counts on the Y axis.
- For Facebook, investigate the type of interactions male v. female figures receive. Do any patterns emerge regarding who receives more angry interactions versus hearts? This might provide some insights into how users treat female versus male figures.
- Based on the additional coded features (e.g. candidate’s name, sentiment or topic of post), do any interesting patterns emerge?
- Analyse post counts over time to see if any unique patterns emerge regarding the type of coverage.
- Which topics are most associated with posts about female individuals versus male individuals?
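
These tallies can be done by hand or in a spreadsheet. For readers who prefer a short script, the sketch below shows one way to count posts per subject category and the share of posts with a gendered framing. The labelled rows are hypothetical, and the field names subject and framing are assumptions that mirror the codes from step 4.

```python
from collections import Counter

SUBJECT_NAMES = {0: "male", 1: "female", 2: "both", 3: "neither"}

# Hypothetical hand-coded rows; framing 0 means non-gendered (see step 4b).
labelled = [
    {"subject": 1, "framing": 1},
    {"subject": 1, "framing": 0},
    {"subject": 0, "framing": 0},
    {"subject": 0, "framing": 0},
    {"subject": 0, "framing": 2},
    {"subject": 2, "framing": 0},
]

# Total posts per subject category.
subject_counts = Counter(row["subject"] for row in labelled)
# Posts per subject category that carry any gendered framing.
gendered_counts = Counter(row["subject"] for row in labelled if row["framing"] != 0)

for code, name in SUBJECT_NAMES.items():
    total = subject_counts[code]
    share = gendered_counts[code] / total if total else 0.0
    print(f"{name:8s} posts={total} gendered_share={share:.2f}")
```

The per-category totals are the values that would go on the Y axis of the bar graph described above.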

Further Ideas for Researchers with Coding Capabilities:

Researchers with programming capabilities may be able to dive deeper into the type of language used in male v. female posts.

- What are the most frequent words used in posts about males v. females? Try a wordcloud visual.
- You can try using a structural topic model (STM) approach to generate general topics.
  - With R you can use the stm package.
  - With Python, you can run related topic models such as LDA with the NLTK and Gensim libraries.
- Run a sentiment analysis on the male v. female posts
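
As a starting point for the word-frequency comparison, here is a minimal sketch that uses only the Python standard library. The example texts and the very short stop-word list are hypothetical; a real analysis would use the classified posts from step 4 and a fuller stop-word list (for instance the one shipped with NLTK).

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; replace with a fuller one for real data.
STOPWORDS = {"the", "a", "an", "in", "on", "of", "and", "to", "her", "his", "for", "is"}


def top_words(texts, n=5):
    """Return the n most frequent non-stop-words across a list of post texts."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(token for token in tokens if token not in STOPWORDS)
    return counts.most_common(n)


# Hypothetical placeholder texts, grouped by the subject label from step 4a.
female_posts = ["Placeholder text about her outfit", "Placeholder text about the outfit"]
male_posts = ["Placeholder text about his policy", "Placeholder text about the policy"]

print("female:", top_words(female_posts, n=3))
print("male:  ", top_words(male_posts, n=3))
```

The resulting frequency lists can be fed into a wordcloud tool or compared side by side to see which words appear in only one group.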

Additional Resources

See our other methodology to measure types of gender-based harassment on Twitter. It's also possible to use this methodology to look at user comments on other platforms.
You might also be interested in looking beyond bias and comparing harassment against male versus female figures.
See DRI's Guide on Gender and Social Media for more information:
