The future of survey research and data access

The Associated Press released a story yesterday about the rise in households without a landline telephone. The story mentions one consequence of growing reliance on cell phones that is often neglected in this kind of story: its effect on polling and survey research, which APLIC’s members rely on in their work.

Growing numbers of surveys now include calls to people on their cells, which is more expensive partly because federal laws forbid pollsters from using computers to place calls to wireless phones.

This makes it particularly difficult to reach low-income and young adult populations, which are more likely to have only a cell phone.

Perry Building, home to ICPSR and the University of Michigan's Survey Research Center

Those who attended the tour of the University of Michigan’s Survey Research Center during the APLIC conference last week learned about this and other challenges in survey research, including declining response rates, the expense of in-person interviews, and the new challenges of collecting biological samples in conjunction with surveys.

We then moved to the Inter-University Consortium for Political and Social Research (ICPSR), where we learned about the challenges of data archiving and access.

The following day, ICPSR director Myron Gutmann gave a keynote about spatial data and confidentiality. “You cannot have spatially explicit information without identification,” he said. But, he added, “spatial information adds tremendous value to research.”

Gutmann’s talk was based on a book (Putting People on the Map) and an article (Providing Spatial Data for Secondary Analysis) he wrote.

The challenge, Gutmann said, is to figure out how to responsibly share data where there are significant confidentiality risks. Data collection is very expensive, as we’d learned the day before at the Survey Research Center, so it makes sense to get as much use out of it as possible.

While Gutmann noted that there have been no significant breaches of confidentiality from research data, the risk is great. Even if a data archive removes identifying and spatial data, spatial data published elsewhere (such as a map in an article) could present a disclosure risk when combined with the data.

Gutmann encouraged researchers to think about data dissemination “early and often,” and to avoid publishing potentially identifying information, for example by not acknowledging sample units such as schools and hospitals by name in articles.

As for the future, Gutmann believes that data will move to distributed online systems that will combine data on the fly, recognize confidentiality issues automatically, and build user communities based on dynamic data use. In the meantime, we can continue to expect institutional solutions to data security, ranging from the least restrictive (web access) to the most (data enclaves).

After the APLIC conference, I headed to Detroit for the Population Association of America annual meeting. There a panel of experts echoed some of the same concerns about survey research, and called for more work in survey methodology.

Keith Hall from the Bureau of Labor Statistics explained how technology is changing the way they conduct surveys. While technology can increase capabilities (BLS does a lot of internet data collection), it doesn’t necessarily decrease costs, contrary to popular belief. When funding decreases, they continue to produce data, but it is of lower quality because of reduced sample sizes and training.

Howard Hogan from the U.S. Census Bureau began his remarks by talking about the initiative to change the Survey of Income and Program Participation from using surveys to using administrative data. Hogan was in favor of the change initially, but has since been won over to the advantages of surveys, which include:

  • Flexibility. Questions can be added relatively easily.
  • Quicker results.
  • Sub-annual data.
  • Consistency. Questions are not subject to the whims of administrators.
  • Greater potential for public use. Respondents cannot be identified by administrators.

Even with all these advantages, Hogan noted, surveys are in the end not much more expensive than using administrative data. He added that there are many ways to combine survey and administrative data to find useful information.

Barbara Entwisle from the University of North Carolina talked about the new National Children’s Study, an example of a nationally representative survey collecting many types of data, including biological, psychological, chemical exposure, and medical.

I came away from listening to all these experts feeling that it is an exciting time to be involved in population research. There are major changes that, while they present risks and challenges, greatly increase the amount and kind of data we can collect to better our understanding of ourselves.
