Who is Missing from our Data?

January 3, 2022

Public health professionals are dedicated to making health accessible for everyone. So how can we ensure we’re including everyone in our data and sharing results in a meaningful way? In this post, learn more about who is often missing from our data and what we can do about it.

Epidemiological analyses include investigating risk factors for health outcomes. A risk factor is anything that puts a person at a higher risk of a health outcome.

In epidemiology, we often talk about determinants of health, such as demographics, and how these factors might affect disease rates.

Who is missing from Table 1, the table of baseline demographics for the participants in our study?

What can we do to make sure we’re including the health experience of all people in our data?

Missing Data on Gender Identities

Folx with different gender identities can have significantly different experiences when seeking health care. Discussions are ongoing about how to accurately describe the health outcomes and experiences of people with different gender identities when reporting gender in a Table 1.

Collapsing gender into “Female”, “Male”, and “Other” because of small cell sizes does little to solve this problem, and it folds the distinct health experiences of many folx into a single “Other” category. So how do we solve this problem?

When conducting a study, it’s important to collect gender identity data from your participants, and to acknowledge that a participant’s gender identity might change over the course of your study.

Gender identity should be asked at multiple time points over the course of a longitudinal study. If understanding gender identity is a key outcome for your research, then people with different gender identities should be included in the design and recruitment efforts for your study.
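As a rough illustration of why asking at multiple time points matters, here is a minimal Python sketch. It assumes a hypothetical long-format dataset with one record per participant per study wave; the field names (`participant_id`, `wave`, `gender_identity`), the example categories, and both helper functions are made up for this example and are not from any particular study or tool.

```python
from collections import defaultdict

# Hypothetical long-format records: one row per participant per study wave.
# None means the question went unanswered at that wave.
responses = [
    {"participant_id": "P01", "wave": 1, "gender_identity": "woman"},
    {"participant_id": "P01", "wave": 2, "gender_identity": "woman"},
    {"participant_id": "P02", "wave": 1, "gender_identity": "man"},
    {"participant_id": "P02", "wave": 2, "gender_identity": "nonbinary"},
    {"participant_id": "P03", "wave": 1, "gender_identity": None},
    {"participant_id": "P03", "wave": 2, "gender_identity": "woman"},
]


def identity_by_wave(rows):
    """Group self-reported gender identity by participant, ordered by wave."""
    history = defaultdict(dict)
    for row in rows:
        history[row["participant_id"]][row["wave"]] = row["gender_identity"]
    return {pid: [waves[w] for w in sorted(waves)] for pid, waves in history.items()}


def participants_with_change(history):
    """Return IDs whose reported identity differs across answered waves."""
    changed = []
    for pid, answers in history.items():
        reported = {answer for answer in answers if answer is not None}
        if len(reported) > 1:
            changed.append(pid)
    return sorted(changed)


history = identity_by_wave(responses)
changed = participants_with_change(history)
print(history)
print("Reported identity changed during the study:", changed)
```

If only the first wave had been recorded, participant P02 would be described by an identity they no longer report, and P03 would be missing altogether.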

Community-based participatory research (CBPR) is one approach to designing a study. CBPR accounts for the needs and wants of the members of a community.

Finally, and most important, listen to the community. We should all follow their lead, and take their feedback into account in our research.

Race and Ethnicity Data

Health research has systematically excluded participants from many racial and ethnic groups for decades. In some cases this was due to blatant ethical disregard, and in others to systematic oppression. A key issue is data sovereignty: the idea that data is subject to the laws and governance of the geographic location where it was collected.

In terms of race and ethnicity, we must be cognizant of data sovereignty with tribes and acknowledge that data collected on tribal members is sacred. In general, Indigenous data is often missing from race and ethnicity variables due to the unique and sacred nature of working with this data.

To solve this issue, we need to work with tribal nations before a study or data collection event has begun. Tribal nations should be involved in all stages of the research from inception to sharing of results.

Data Governance in Public Health

Data governance covers the standards and metrics for how data is collected and used, so that it is used effectively and efficiently. Public health data governance ensures that the data we collect is used intentionally to serve and support the communities we care about most. Our aim as public health professionals should be to do no harm with the data we collect.

The next time you create a Table 1, take a look at who’s missing and why. What can you do to ensure they aren’t missing in the future?
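One simple way to start looking is to count missing values as their own row in Table 1 instead of silently dropping them. The Python sketch below assumes a small, made-up list of participant records; the variable names, the categories, and the `table1_counts` helper are hypothetical and only meant to show the idea.

```python
from collections import Counter

# Hypothetical participant records. None means the value was not answered
# or was never collected.
participants = [
    {"gender_identity": "woman", "race_ethnicity": "Black"},
    {"gender_identity": "woman", "race_ethnicity": None},
    {"gender_identity": "man", "race_ethnicity": "White"},
    {"gender_identity": "nonbinary", "race_ethnicity": "White"},
    {"gender_identity": None, "race_ethnicity": "Hispanic or Latino"},
    {"gender_identity": "man", "race_ethnicity": None},
    {"gender_identity": "woman", "race_ethnicity": None},
    {"gender_identity": None, "race_ethnicity": "Black"},
]


def table1_counts(rows, variable):
    """Return {category: (n, percent)} for one variable, keeping 'Missing' visible."""
    labels = []
    for row in rows:
        value = row.get(variable)
        labels.append("Missing" if value is None else value)
    counts = Counter(labels)
    total = len(rows)
    return {category: (n, round(100 * n / total, 1)) for category, n in counts.items()}


gender_table = table1_counts(participants, "gender_identity")
race_table = table1_counts(participants, "race_ethnicity")

for name, table in [("Gender identity", gender_table), ("Race and ethnicity", race_table)]:
    print(name)
    for category, (n, pct) in table.items():
        print(f"  {category}: {n} ({pct}%)")
```

Reporting the “Missing” row alongside the other categories shows how many people a table leaves out, which is the first step toward asking why.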

About the Founder

Hi! I’m Erika, founder and CEO of Aengle. I hold a BS in Microbiology and an MPH in Epidemiology, and I am currently a PhD student in Epidemiology with a minor in Management Information Systems. I hold numerous certificates in teaching and in data science tools like Stata and Python. I’ve been working in the public health sphere for 7+ years, managing grant deliverables of over $1 million. I’m on a mission to revolutionize public health by supporting heart-driven changemakers dedicated to creating a lasting legacy.