We're Reporting Census Data All Wrong
By Luc Schuster
December 13, 2021
Census data on race and ethnicity are invaluable for understanding who we are as a region and how we’re changing over time. Invaluable, yes. But also imperfect. Headlines during the census count last year focused on challenges facing Census Bureau workers during a pandemic and on the Trump administration’s efforts to depress the count in certain areas. But the physical count isn’t the only problem. While back-end reporting changes for the 2020 Census in some ways help us see more clearly who we are as a multiracial, multiethnic nation, other changes have led to misleading findings about actual demographic change. These challenges are compounded by traditional reporting approaches used by researchers like us that have tended to not include all people who select a given race on their census form.
Fortunately, alternatives exist for painting a more accurate picture. These judgment calls make an especially large difference for Boston’s White, Black and Native American populations, as shown in the graph below, but the traditional reporting approach skews other race totals as well (read further for those differences). Traditional reporting of 2020 census data understates Boston’s Black population by almost 43,000 residents.
Race totals appear much lower when subtracting Latino ethnicity and not including people who select more than one race.
Census data limitations, paired with the broader complexities of racial and ethnic identification (like the fact that race itself is a social construct), mean there will always be downsides to any approach. But in this brief, we make the case for alternative reporting of race and ethnicity estimates that have real-world implications for our public conversations around race and demographic change in Greater Boston.
So first, let’s describe some of the problems.
The census’s race and ethnicity questions don’t reflect how some people think about identity, and the ethnicity question doesn’t allow people to select both Hispanic and non-Hispanic lineage.
Since 1980, the census has included two separate questions on: 1) Hispanic or Latino ethnicity; and 2) race. People are first asked to give a binary yes/no answer about whether they have “Hispanic, Latino or Spanish origin” (or “Latino” for the rest of this brief). They are then asked to identify with one or more racial categories: White; Black or African American; American Indian or Alaska Native; one of several Asian countries that the Bureau aggregates up to “Asian”; Other Pacific Islander; and Some Other Race.
Problems with the two-question approach include:
- Most people treat being Latino as roughly equivalent to a racial category. This includes the vast majority of Latinos themselves—according to a 2015 Pew survey, 67 percent of Latino adults view being Latino as part of their racial background. This makes it uncomfortable for many Latino respondents to then pick a separate racial category that doesn’t reflect how they see themselves. It’s no surprise that more than 75 percent of people who select Some Other Race are people of Latino ethnicity.
- The race question allows respondents to select more than one option, reflecting the growing number of people who have mixed racial backgrounds, but the ethnicity question does not. Respondents can only identify as Latino or not. This means that if someone’s father, say, is non-Latino and their mother is Latino, they are forced to choose—ultimately communicating only half of their parental lineage.
- Some people don’t fit well within either the race categories or the Latino ethnicity option. Many people of Middle Eastern or North African descent, for instance, feel that the current options offered do not reflect their cultural identity. For now, the Census Bureau officially considers people from countries like Lebanon or Egypt to be racially White, even though that’s not how many people from these countries view themselves.
In 2015, the Census Bureau tested a revised approach that would have merged the race and ethnicity questions into one, allowed for multiple responses, and added a new option for selecting Middle Eastern or North African. The Bureau found that this approach led to better information on people’s true identities and that it reduced the level of non-responses. But the Trump administration’s Office of Management and Budget, which has final say over how the questions are asked, did not allow the Census Bureau to adopt this improved approach for 2020. So, for the time being, we are left with the current flawed approach.
Researchers too often subtract the racial identity of people with Latino ethnicity.
Stuck with the two-question approach, researchers are forced to make tough judgment calls around how to present data on race and ethnicity in a way that matches the fact that most people view these categories side-by-side. People often want to compare, for instance, Boston’s Black population with its Latino population. This is especially challenging if you want to present population shares that add up to 100 percent.
So, many researchers have adopted the convention of giving Latino ethnicity a primary distinction. Put another way, if you check the Latino box, you often don’t get reported as White, Black or Asian. This way, population shares by race (e.g., Asian, Black, White) are compared side-by-side with Latino population shares, and these all then sum up to 100 percent of the city’s population.
But here’s an example of how this plays out: Often researchers report Boston’s Black population in 2020 as 129,264, or 19.1 percent (see graph above). But you only get this number if you’ve first subtracted 9,606 Black respondents who also selected Latino ethnicity (and if you subtract those who select Black in combination with another race, which we explain next). Many of these people identify as Afro-Latino, having descended from people originally from the continent of Africa. They are racially Black like other non-Latino African American residents of Boston, but a key difference is that they came here by way of nations colonized by the Spanish empire, like the Dominican Republic. So even though many Afro-Latino residents think of themselves as racially Black (having selected this on their census form), this research convention ignores their Black identity and only includes them in Hispanic or Latino totals.
Researchers too often report race totals only for people who select that race alone, not those who select multiple racial categories.
Compounding the problem above is the convention of only reporting people who select a given race by itself, rather than also including those who select that race in combination with at least one other. This is so common that many researchers (including us at Boston Indicators) often forget to note for readers that we’re only including those who select one race alone.
We think of President Barack Obama as our nation’s first Black president, for instance, but because his mother was White, this conventional “single race alone” reporting approach doesn’t even include him in our nation’s Black population totals (assuming he selected Black and White when filling out the census).
As a result, what can seem like a decline in one racial group is sometimes instead the result of people intermarrying and forming families across racial lines. In fact, there’s been a rapid increase in the share of people, especially young children, who have mixed racial backgrounds, making this “single race alone” reporting approach increasingly unsatisfying.
Focusing on Boston’s Black population again, we find that more than 33,000 Bostonians are multiracial Black (the difference between the “Total Selecting Black” and “Black alone” bars above). These are people who selected Black and at least one other race. But they are rarely included in reported Black population shares for the city of Boston. After taking out Latino respondents and those who select Black in combination with another racial group, Boston’s Black population is often reported as 19.1 percent (or 129,264). But if you more simply look at the percentage of people who identified themselves as Black, the number jumps by a third—to 172,039, or 25.5 percent of the city’s population.
This is a bigger issue for 2020 because, for reasons described below, the census is now also removing many Afro-Latino respondents from the single race Black category even though they themselves haven’t selected multiple races.
These gaps in numbers are especially large when reporting Native American totals. Reporting “Native American alone, minus Latino” gives the perception that our region’s Native American population continues to shrink—from 1,227 in 2010 to 989 residents in 2020 (-19.4 percent). But using the more inclusive approach, we saw an increase from 6,529 in 2010 to 9,122 in 2020 (+39.7 percent).
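As a quick check on the arithmetic in this section, here is a minimal Python sketch that recomputes the gaps and percent changes from the figures cited above. The helper name `pct_change` and the variable names are illustrative choices, not part of any census tool.

```python
def pct_change(old: int, new: int) -> float:
    """Percent change from old to new, rounded to one decimal place."""
    return round((new - old) / old * 100, 1)


# Boston's Black population in 2020, as cited in this brief.
black_alone_minus_latino = 129_264
total_selecting_black = 172_039

# Residents left out by the traditional "alone, minus Latino" approach.
gap = total_selecting_black - black_alone_minus_latino
print(gap)  # 42775

# How much larger the inclusive total is, relative to the traditional one.
print(pct_change(black_alone_minus_latino, total_selecting_black))  # 33.1

# Native American totals, 2010 to 2020, as cited in this brief.
print(pct_change(1_227, 989))    # -19.4 ("alone, minus Latino")
print(pct_change(6_529, 9_122))  # 39.7 (more inclusive approach)
```

The same underlying responses produce a decline under one convention and an increase under the other, which is the point of the comparison.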
In fact, these differences have led to some confusion about what the national trends really are. Much of the early narrative, for instance, was of a White population decline over the last decade, with people citing either a decline of 2.6 percent (White alone, minus Latino) or a decline of 8.6 percent (White alone, including Latino). But including all people who identify in some way as White instead generates a national increase of 1.9 percent (or over 4 million people). This difference is driven in large part by a 316 percent jump since 2010 in the number of respondents identified as being White and at least one other race. The end result is a mix of narratives driven by the same numbers. On the one hand, the data tell a uniquely American story that celebrates the country’s growth as an inclusive, multiracial and multiethnic nation. But the same numbers are also used by far-right voices to stoke racist fears of a White population decline that somehow weakens America.
Judgment calls by the Census Bureau in 2020 are leading to artificially low single-race-alone totals (and, therefore, artificially high multiracial totals).
Bear with me, since this gets wonky!
The Census Bureau has adopted a back-end coding practice of assigning additional race selections to individuals whose write-in responses suggest different racial categories than the boxes they actually select. As mentioned above, for instance, the Census Bureau officially considers people from North African countries like Egypt to be White. So, if a person selects “Some Other Race” and then below that writes in “Egypt,” the Census Bureau actually counts this person as both Some Other Race and White, even though the person never selected White themselves.
This became a much larger issue in 2020 because that is the first year for which there was a write-in box under the White and Black/African American race checkboxes, instructing people to “(m)ark one or more (race) boxes AND print origins” in the write-in area. The write-in boxes had already been in place for American Indian, Other Asian, Other Pacific Islander, Some Other Race, and for the Latino ethnicity question.
The logic for this is that it helps correct mistaken race category selections. But the downside is that it often simply overrides the self-identification of the respondent.
Where this crops up most often is for people of Latino ethnicity because the Census Bureau has a practice of assigning “Some Other Race” to everyone who writes in a Latino-seeming country of origin. Here’s an example to make this concrete: If someone is Afro-Latino from Cuba, they might logically select “Black or African American” for their race and write in “Cuba” under that response (in addition to selecting Latino on the ethnicity question). The Census Bureau’s back-end coding practice then also adds a “Some Other Race” selection for this person in addition to “Black,” thereby counting this person as multiracial, even though the respondent may not have mixed parental lineage and may identify solely as Black. The Census Bureau says it is doing this to reflect the fact that many Latinos think of their Latino background as similar to a race.
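To make the mechanics concrete, here is a simplified Python sketch of the back-end coding step as described above. It is an illustration only, not the Census Bureau's actual procedure: the lookup table `WRITE_IN_TO_RACE` is a hypothetical two-entry stand-in for the Bureau's much larger code lists, and the function names are made up for this example.

```python
# Hypothetical lookup built from the two examples in this brief.
# The Census Bureau's real code lists are far more extensive.
WRITE_IN_TO_RACE = {
    "egypt": "White",
    "cuba": "Some Other Race",
}


def coded_races(checked: set[str], write_in: str) -> set[str]:
    """Return the race codes after adding any race implied by the write-in."""
    races = set(checked)  # copy, so the respondent's own selections stay intact
    implied = WRITE_IN_TO_RACE.get(write_in.strip().lower())
    if implied is not None:
        races.add(implied)
    return races


def is_multiracial(races: set[str]) -> bool:
    """A respondent counts as multiracial once more than one race is coded."""
    return len(races) > 1


# An Afro-Latino respondent checks only "Black" and writes in "Cuba".
afro_latino = coded_races({"Black"}, "Cuba")
print(sorted(afro_latino))          # ['Black', 'Some Other Race']
print(is_multiracial(afro_latino))  # True

# A respondent checks only "Some Other Race" and writes in "Egypt".
egyptian = coded_races({"Some Other Race"}, "Egypt")
print(sorted(egyptian))             # ['Some Other Race', 'White']
```

In both cases the respondent checked a single box, yet ends up outside the "single race alone" totals.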
It sounds “in the weeds,” and it is. But the impact is very real, and it shows up in how you report the numbers. The table below shows what a difference this makes when choosing how to present 2020 Census data. There were meaningful increases in 2020, for instance, in the number of Bostonians selecting the White and Black race options, but you only notice these increases if you count everyone who selected that racial category—meaning that you include people who also selected another race (or were assigned Some Other Race, as described above) and that you didn’t subtract people of Latino ethnicity. This makes the largest difference in reporting Boston’s Black population; you see a 6.4 percent decline in Black residents if you report “Black alone (minus Latino),” but you see a 5.1 percent increase if you look at “Total Selecting Black.”
The flip side of this issue can be seen above for the Some Other Race and Multiracial categories. The number of people in Boston either selecting or being assigned Some Other Race almost doubled over the last decade. And Boston’s multiracial population almost tripled. Boston’s multiracial population really is growing quickly, but the 10-year increase drops down to 119 percent when one sets aside people who have Latino ethnicity. To be clear, many Latinos are indeed of mixed racial backgrounds, so ideally researchers would not subtract them when reporting multiracial totals. But the Census Bureau’s practice of adding Some Other Race for many Latinos in 2020 complicates our ability to know how many of these folks really identify themselves as mixed-race.
A better way to report data on race and ethnicity
What these issues and practices all add up to is reporting of census data on race and ethnicity that commonly looks like this graph below. Race totals only reflect people who select that race alone. People of Latino ethnicity have been subtracted from race totals. And the totals conveniently add up to 100 percent.
But it doesn’t have to be this way. Below is one alternative that, while imperfect in its own ways, may better reflect the multidimensional realities of racial and ethnic identity in Boston.
These judgment calls can lead to very different storytelling about a given place. People mostly already consider Boston a “majority-minority” city, for instance, due to the traditional reporting approach shown above—where White residents make up 44.6 percent of the city. Complicating that story, however, is the fact that 53.9 percent of residents select the White box (as shown in the alternate version). It’s just that some are also Latino and/or selected other racial identities as well. In a way, Boston is both majority White and majority non-White (because 57 percent of residents also select a non-White racial option).
One obvious downside to this alternative approach is that totals add up to more than 100 percent, so it can feel messy or unsatisfying. But maybe that’s not such a downside. People really do exist across these categories, so why pretend that they don’t? Racial and ethnic identities are complex and ever-changing, and we shouldn’t expect them to fit neatly into discrete separate boxes. Allowing someone who is, say, Latino and White to show up in both places does feel like a better expression of these people’s full identities. To be clear, this isn’t to say there aren’t ever times for looking just at people who identify with one race, or when you might want to isolate the size of the non-Latino White population. But experimenting with alternatives like the one above might in places lead to a better understanding of our city’s true demographic complexity.
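For readers who work with the data directly, here is a minimal Python sketch contrasting the two reporting approaches on a made-up set of ten respondents. The records are invented for illustration and do not reflect Boston's actual distribution; the category labels and function names are likewise assumptions for this example.

```python
from collections import Counter

# Ten invented respondents: the race boxes selected, plus the Latino answer.
people = [
    {"races": {"White"}, "latino": False},
    {"races": {"White"}, "latino": False},
    {"races": {"White"}, "latino": False},
    {"races": {"White"}, "latino": True},
    {"races": {"Black"}, "latino": False},
    {"races": {"Black"}, "latino": True},
    {"races": {"Black", "White"}, "latino": False},
    {"races": {"Asian"}, "latino": False},
    {"races": {"Some Other Race"}, "latino": True},
    {"races": {"Asian", "White"}, "latino": False},
]


def traditional_category(person: dict) -> str:
    """Traditional approach: Latino first, then single race alone."""
    if person["latino"]:
        return "Latino"
    if len(person["races"]) > 1:
        return "Multiracial"
    return next(iter(person["races"]))


# Traditional: each person lands in exactly one category.
traditional = Counter(traditional_category(p) for p in people)

# Alternative: each person counts toward every identity they selected.
inclusive = Counter()
for p in people:
    inclusive.update(p["races"])
    if p["latino"]:
        inclusive["Latino"] += 1

n = len(people)
traditional_shares = {k: 100 * v / n for k, v in traditional.items()}
inclusive_shares = {k: 100 * v / n for k, v in inclusive.items()}

print(sum(traditional_shares.values()))  # 100.0
print(sum(inclusive_shares.values()))    # 150.0
print(traditional_shares["Black"], inclusive_shares["Black"])  # 10.0 30.0
```

The traditional shares sum to 100 percent because each person is counted once, while the alternative shares sum to more than 100 percent because people who hold several identities appear under each of them.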