Online media is ablaze with criticism, even denunciation, of the polling industry after Trump performed better in the 2020 election than most pollsters predicted. By some accounts, it is the second consecutive presidential election flubbed by forecasters relying on polling data, which has sown frustration and distrust among the general public, journalists, and politicians alike.
In many respects, the limits of polling are shining through. But what is often overlooked is why. Without taking an honest look at those limits, the reasons behind them, and ways to address them, critics risk delegitimizing an entire field that provides immense value to our country and the world at large.
Survey research across the board has been plagued by declining response rates over the past two decades. Researchers trying to collect accurate and important data on health, economic conditions, and, yes, political views too often encounter important groups of individuals unwilling to participate in the survey process, a problem that cannot always be fixed with statistical modeling.
One explanation for polls underestimating support for Donald Trump in 2016 is that state-level polls did not include enough white voters without college degrees, who overwhelmingly supported Trump. Some survey experts suggest that state-level polls in 2020 might have suffered from similar nonresponse bias, again leading to forecasts that underestimated the likelihood of a Republican victory.
This can become a vicious cycle of inaccuracy and distrust. Nonresponse bias makes it harder for survey researchers to get accurate data, and when researchers—or pollsters—fail, the trust in the process dwindles, exacerbating levels of nonresponse.
When trust in survey research wavers, it can have detrimental impacts on society that go far beyond election forecasts.
When most people think of surveys, they think of political polls that ask respondents which candidate they will vote for. But survey research spans a wide range of topics and generates useful applications that can’t afford to lose public confidence. The General Social Survey, ongoing since 1972, has helped generate over 27,000 scholarly publications. The monthly Current Population Survey, conducted by the Bureau of Labor Statistics, has measured unemployment and earnings since 1940. Policy-makers use these statistics when they decide on stimulus packages to help people and businesses during a recession and interest rates that affect how much interest we pay on our credit cards and mortgages.
The U.S. Census could be viewed as a large-scale survey of every person in the country to determine their basic demographic information. Census data is used to apportion congressional seats and allocate government spending on schools, hospitals, roads, and other public works and programs.
Unfortunately, these surveys are also facing declining response rates. Survey data quality suffers when people are systematically undercounted.
The Pew Research Center found that typical telephone survey response rates fell to 7% and 6% in 2017 and 2018, respectively. Low response rates are not necessarily a problem if nonresponse is uncorrelated with the data that surveys are trying to measure. But in many cases, such as the 2016 election polls, nonresponse matters. Weighting surveys to be representative of the target population can improve accuracy, but choosing those weights rests on assumptions that don't always hold.
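To illustrate why weighting rests on assumptions, here is a minimal sketch of post-stratification weighting on a single variable (education), using made-up numbers: each group of respondents is weighted by the ratio of its share of the population to its share of the sample. The correction works only if respondents within each group resemble the nonrespondents in that group, which is exactly the assumption that nonresponse bias can violate.

```python
# Hypothetical numbers: suppose 40% of the electorate lacks a college
# degree, but only 25% of poll respondents do (differential nonresponse).
population_share = {"no_college": 0.40, "college": 0.60}
sample_share = {"no_college": 0.25, "college": 0.75}

# Hypothetical candidate support within each responding group.
support_in_sample = {"no_college": 0.64, "college": 0.44}

# Unweighted estimate: average support over respondents as sampled.
unweighted = sum(sample_share[g] * support_in_sample[g]
                 for g in sample_share)

# Post-stratification weight = population share / sample share.
weights = {g: population_share[g] / sample_share[g] for g in sample_share}

# Weighted estimate: reweight each group to its population share.
weighted = sum(sample_share[g] * weights[g] * support_in_sample[g]
               for g in sample_share)

print(f"unweighted: {unweighted:.2f}")  # 0.49
print(f"weighted:   {weighted:.2f}")    # 0.52
```

The two-point gap between the estimates comes entirely from reweighting; if the non-college respondents who did answer the poll differ from those who refused, even the weighted estimate remains biased, and no choice of weights on observed variables can fix that.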
If it is not properly addressed, systematic nonresponse could have harmful policy implications. Consider, for instance, how undercounting in the 2020 U.S. Census can disproportionately impact marginalized communities. Underfunding of the Census Bureau, the controversy around a proposed citizenship question, and the early end to the count—combined with the difficulty of conducting a census during a pandemic—will likely result in undercounting of Black, Latinx, and Asian people. That would mean less political representation and fewer government resources going to communities of color.
Collapsing trust in survey research and researchers certainly does not improve response rates. The polling industry, election forecasters, and the media should reflect on their role in driving this distrust. Research has shown that probabilistic election forecasts, which report a candidate's chance of winning, increase voters' certainty about an election's outcome, confuse many of them, and can depress turnout. When these forecasts turn out to be inaccurate, many turn against survey researchers, as we have recently seen.
Given the harms that election forecasts can cause, the media should stop emphasizing them in election coverage and giving outsize influence to the data scientists making these predictions. Meanwhile, those measuring outcomes that guide policy-making should communicate how their work benefits the general public. At the same time, pollsters should acknowledge the limitations of their methods and the usefulness of other research approaches.
In the 1948 U.S. presidential election, the Chicago Daily Tribune printed the incorrect headline “Dewey Defeats Truman” thanks to a nonrepresentative poll. Indeed, the failure of election forecasts in 2020 may seem like another “Dewey Defeats Truman” moment.
But polling did not die off after that spectacular failure in 1948. Instead, researchers improved polling methods by introducing random sampling. Likewise, in 2020, survey research shouldn't be "canceled," given its importance in guiding evidence-based decision-making. Instead, researchers should work to rebuild public trust and improve response rates.
Baobao Zhang is a political scientist at Cornell University.