Measuring underrepresentation of women in media outlets' online posts

Possible for:
Election Monitoring
The purpose of this analysis is to quantify how much more or less mainstream media outlets (e.g. @BBCNews) post on social media about female figures (e.g. politicians, activists) than about their male counterparts.

1. Research question:

  • Do mainstream media sources give equal coverage to male and female figures on social media platforms?
  • Do mainstream media sources focus on clothing choices, personal life, social role (e.g. being a mother) and a supposedly excessively or insufficiently feminine demeanour more when covering female leaders than when covering their male counterparts?

2. Sample selection

  • Generate a list of the top 5-20 news media outlets, based on your own expertise or research into your country context
  • Try to balance and label this list with regard to confounding factors (e.g. ideological leaning)
  • Limit your sample to a particular topic (e.g. politicians) and/or political event (e.g. an election)
  • Select a list of keywords to filter by, so that you capture all posts related to your particular topic or political event (e.g. US Election 2020, Elizabeth Warren).
  • Select an appropriately wide time period for your given study.

3. Gather your data

  • Based on your selected sample, filter down your data set to gather the posts’ text and interactions
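
The filtering step could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the field names `text` and `interactions`, the sample posts, and the helper names are all hypothetical and do not reflect the export format of any particular tool.

```python
import re

# Hypothetical keyword list, taken from the examples in step 2.
KEYWORDS = ["US Election 2020", "Elizabeth Warren"]


def build_keyword_pattern(keywords):
    """Compile one case-insensitive pattern that matches any keyword as a whole phrase."""
    escaped = (re.escape(k) for k in keywords)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def filter_posts(posts, keywords):
    """Keep the text and interaction counts of posts that mention any keyword."""
    pattern = build_keyword_pattern(keywords)
    return [
        {"text": p["text"], "interactions": p.get("interactions", {})}
        for p in posts
        if pattern.search(p["text"])
    ]


# Invented sample records (assumed structure: one dict per post).
sample_posts = [
    {"text": "Elizabeth Warren unveils new policy plan",
     "interactions": {"like": 120, "angry": 4}},
    {"text": "Weather update for the weekend",
     "interactions": {"like": 15}},
    {"text": "What to watch in the us election 2020 debates",
     "interactions": {"like": 80, "love": 9}},
]

filtered = filter_posts(sample_posts, KEYWORDS)
for post in filtered:
    print(post["text"], post["interactions"])
```

Matching is case-insensitive and on whole phrases, so a keyword list that is too narrow will silently drop relevant posts; it is worth spot-checking the excluded posts before moving on.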

4. Classify your data

  • a. Identify and label posts where the key subject of the post is (0) male, (1) female, (2) both or (3) neither.
    • i. If studying elections, label posts with the candidate’s name and other relevant information:
      • Topic of post (e.g. politician’s outfit)
      • Sentiment of post (e.g. criticizing or praising the individual)
  • b. Classify posts that portray leaders in a traditionally gendered way: (0) non-gendered, (1) fashion, (2) personal life, (3) social role, (4) excessively feminine/masculine, (5) insufficiently feminine/masculine
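
If the labels are recorded in a script or spreadsheet export, one way to keep the numeric codes consistent is to write the scheme down once and validate every coded post against it. The sketch below is an assumption about how that might look: the class names, the `CodedPost` fields and the example post are hypothetical; only the numeric codes come from the scheme above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Subject(IntEnum):
    """Key subject of a post (step 4a)."""
    MALE = 0
    FEMALE = 1
    BOTH = 2
    NEITHER = 3


class Portrayal(IntEnum):
    """Traditionally gendered portrayal of a leader (step 4b)."""
    NON_GENDERED = 0
    FASHION = 1
    PERSONAL_LIFE = 2
    SOCIAL_ROLE = 3
    EXCESSIVELY_GENDERED = 4      # excessively feminine/masculine
    INSUFFICIENTLY_GENDERED = 5   # insufficiently feminine/masculine


@dataclass
class CodedPost:
    """One manually coded post; the optional fields are free-text labels."""
    text: str
    subject: Subject
    portrayal: Portrayal
    topic: str = ""
    sentiment: str = ""
    candidate: str = ""


def code_post(text, subject, portrayal, **extra):
    """Build a CodedPost; raises ValueError if a numeric code is outside the scheme."""
    return CodedPost(text=text, subject=Subject(subject),
                     portrayal=Portrayal(portrayal), **extra)


# Invented example: a post about a female politician's outfit.
example = code_post("Senator criticised over choice of jacket", 1, 1,
                    topic="outfit", sentiment="criticizing")
print(example.subject.name, example.portrayal.name)
```

Rejecting out-of-range codes at entry time catches typos (say, a 7 entered for the subject) before they distort the counts in step 5.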

5. Analyse your data

  1. No coding required
    • Calculate the total number of posts classified into each of the four subject categories (male, female, both, neither). A bar chart with the categories on the X axis and post counts on the Y axis works well for this.
    • For Facebook, investigate the type of interactions male v. female figures receive. Do any patterns emerge regarding who receives more angry interactions versus hearts? This might provide some insights into how users treat female versus male figures.
    • Based on the additional coded features (e.g. candidate’s name, sentiment or topic of post), do any interesting patterns emerge?
    • Analyse post count over time to see if any unique patterns emerge regarding the type of coverage
    • Which topics are most associated with posts about female individuals versus male individuals?
  2. Further Ideas for Researchers with Coding Capabilities:

Researchers with programming capabilities may be able to dive deeper into the type of language used in posts about male v. female figures.

    • What are the most frequent words used in posts about men v. women? Try a word cloud visual.
    • You can try a structural topic model (STM) approach to generate general topics.
      • With R you can use the stm package.
      • With Python you can run topic models with the NLTK and Gensim libraries.
    • Run a sentiment analysis on the posts about male v. female figures.
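
As a starting point for the word-frequency comparison, the sketch below counts the most common words per subject label using only the Python standard library. The stopword list, the sample posts and the function names are illustrative assumptions; a real study would use a fuller stopword list and the posts coded in step 4.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption). Pronouns are dropped here,
# but you may prefer to keep them when studying gendered language.
STOPWORDS = {"the", "a", "an", "of", "to", "in", "on", "and", "for",
             "her", "his", "is", "at"}


def tokenize(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def top_words_by_subject(coded_posts, n=10):
    """Return the n most frequent words for each subject label."""
    counters = {}
    for text, subject in coded_posts:
        counters.setdefault(subject, Counter()).update(tokenize(text))
    return {subject: counter.most_common(n) for subject, counter in counters.items()}


# Invented (text, subject label) pairs standing in for coded posts.
coded_posts = [
    ("Minister praised for her elegant dress at the summit", "female"),
    ("Minister defends her budget plan", "female"),
    ("Minister unveils budget plan at the summit", "male"),
    ("Senator attacks budget plan", "male"),
]

top = top_words_by_subject(coded_posts, n=3)
for subject, words in top.items():
    print(subject, words)
```

The resulting word counts can be passed to a word cloud or bar chart tool of your choice, and the same per-subject grouping can be reused as input to a topic model or sentiment analysis.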
