Exploring U.S. Births Data by Race and State (2007 – 2019)

While reading news about the recent tragic shooting in Buffalo, New York, United States, we came across a mention of a specific dataset: births data that was claimed to show a decline in the number of births of a certain race. This decline was reported to be one of the reasons that instigated the violence.

We were curious to know what the data really looks like. Although we didn’t seek out the specific births data that was linked to the event, we knew that the Centers for Disease Control and Prevention (CDC) publishes open data on births in the U.S., so we had a look at that. Specifically, we looked at the Natality Records 2007-2020 from the CDC WONDER online database.

We made interactive visualizations from the CDC data so that anyone can explore it and extract insights from it. We’re fully aware that the data has the potential to lead to highly divisive conclusions. Therefore, we provide only a very brief analysis in this article. Readers can explore the data and find insights themselves through the interactive dashboard below.

CAVEAT

Before moving on to the interactive dashboard, the discussion of the data, and the analysis, there are several important remarks that need to be taken into account when exploring the data. Any insights, patterns or trends in the CDC births data cannot in any way be used to justify violence or hate. We see the changes in the number of births of the various races in the data as the natural dynamics of a population. We strongly believe that American people should perceive every race in the U.S. as a part of one whole American society, and that they should make sure that every race contributes to the good of the whole society. Racial identity should not be forgotten, but the good of the whole American society must be put first.

Here is the interactive dashboard:

The dashboard is also available on Tableau Public.

Here is the full image of the dashboard:

Source: NadiData.com

THE DATA

The data we analyzed here were the Natality Records 2007-2020, pulled from the CDC WONDER online database. The data consisted of 2701 rows and 8 columns. The variables we used were State, Mother’s Bridged Race, Year and Births. In other words, we analyzed the number of births per race in the 50 states of the United States plus Washington D.C. over the years 2007 to 2019. The year 2020 was omitted because the race data were empty for that year. The term ‘bridged’ in ‘bridged race’ refers to certain practices used to make the race categories consistent across multiple datasets. We kept the term ‘bridged’ here to differentiate the data from other kinds of births data.

The categories in mother’s bridged race are:
• American Indian or Alaska Native
• Asian or Pacific Islander
• Black or African American
• White
• Not Reported

THE DATA PROCESSING

We visualized the births data along the dimensions of race, state and year. We added one calculated variable: normalized births. The normalized births variable makes the number of births of different races easier to compare visually. It is computed separately for each combination of race and state, using the minimum and maximum number of births of that race in that state across all years:

normalized births = (births - minimum births) / (maximum births - minimum births)

Here, births is the number of births of a specific race in a specific state and year. The result always lies between 0 and 1. The concept of normalization we used is described here. Please be mindful that the normalized births graphs can give the impression that the change in the number of births is more extreme than it actually is, because every series is stretched to the full range from 0 to 1 no matter how small the actual change was.
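To make the normalization concrete, here is a minimal Python sketch of the formula described above. It is not the script from our repository; the function name `normalize_births`, the dictionary keys, and the sample numbers are all assumptions made up for illustration, and the numbers are not CDC figures. Mapping a completely flat series to 0.0 is also a choice made only for this sketch.

```python
from collections import defaultdict


def normalize_births(records):
    """Min-max normalize births within each (state, race) group.

    records: list of dicts with keys "state", "race", "year", "births"
    (hypothetical key names). Returns new dicts with an added
    "normalized_births" key.
    """
    # Collect all yearly birth counts for each (state, race) pair.
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["state"], rec["race"])].append(rec["births"])

    # Minimum and maximum births for each (state, race) pair.
    bounds = {key: (min(vals), max(vals)) for key, vals in groups.items()}

    result = []
    for rec in records:
        lo, hi = bounds[(rec["state"], rec["race"])]
        span = hi - lo
        # A flat series has no range; map it to 0.0 instead of dividing by zero.
        norm = (rec["births"] - lo) / span if span else 0.0
        result.append({**rec, "normalized_births": norm})
    return result


# Made-up numbers for illustration only (not CDC figures).
sample = [
    {"state": "A", "race": "X", "year": 2007, "births": 1000},
    {"state": "A", "race": "X", "year": 2008, "births": 950},
    {"state": "A", "race": "X", "year": 2009, "births": 900},
    {"state": "A", "race": "Y", "year": 2007, "births": 100},
    {"state": "A", "race": "Y", "year": 2008, "births": 102},
    {"state": "A", "race": "Y", "year": 2009, "births": 98},
]
normalized = normalize_births(sample)
for row in normalized:
    print(row["race"], row["year"], row["births"], row["normalized_births"])
```

In this made-up sample, race Y only moves between 98 and 102 births, yet its normalized values still span the full range from 0 to 1. This is the exaggeration effect to keep in mind when reading the normalized charts.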

The visualizations are four line charts clustered in two groups. The first group shows births and normalized births of a single race for all states. The second group shows births and normalized births of a single region (be it a single state or all states combined) for all races.

We used Python and Tableau to clean and visualize the data. The raw data, the cleaned data and the script for data processing and cleaning can be found on our GitHub (the files with prefix 1656).

ANALYSIS

As mentioned above, we refrain from looking too deeply into the trends and patterns in the births data due to the nature of the data. We can say that the overall trend of births for all races combined is declining. There is a 7% decline in the overall number of births for the whole United States from 2007 to 2019.
The trends for a single race or a single region vary, especially when they are observed as normalized births. In the normalized births graphs, some races show a rising overall trend, while others show a declining trend. The normalized births in the single region graphs are so diverse that we don’t think there is a generalizable pattern there.
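
For readers who want to check an overall decline like the one quoted above, the relative change between two years can be computed as sketched below. The function name `percent_change` and the totals passed to it are hypothetical; the totals are round numbers picked so that the result comes out near -7%, and they are not the actual CDC totals.

```python
def percent_change(start, end):
    """Relative change from start to end, in percent.

    A negative result means a decline.
    """
    if start == 0:
        raise ValueError("start must be non-zero")
    return (end - start) / start * 100.0


# Hypothetical totals for illustration only (not CDC figures).
births_first_year = 1000
births_last_year = 930
change = percent_change(births_first_year, births_last_year)
print(f"{change:.1f}%")
```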

CONCLUSION

Our takeaways are that the dynamics of the number of births for each state and each race vary significantly and cannot be generalized. Also, there are visible overall trends in the number of births for each race. Lastly, we would like to remind readers that Americans should view themselves as one whole society even though they are of different races, and that these births data cannot be used as justification for hate and violence.

REFERENCES

Centers for Disease Control and Prevention, National Center for Health Statistics. National Vital Statistics System, Natality on CDC WONDER Online Database. Data are from the Natality Records 2007-2020, as compiled from data provided by the 57 vital statistics jurisdictions through the Vital Statistics Cooperative Program. Accessed at http://wonder.cdc.gov/natality-current.html on May 15, 2022 9:56:33 AM.