What happened to Trump's war on data?
One major anxiety about the new administration hasn't come to pass—but the White House is learning to use numbers to its advantage.
By Danny Vinik
On May 16, three months' worth of data vanished quietly from the website of the U.S. Census Bureau. The figures included age, sex and employment data, crucial for calculating key statistics like the monthly unemployment rate. Their disappearance set off alarm bells among researchers: President Donald Trump had repeatedly cast doubt on the accuracy of the unemployment rate during his campaign. Was some official now actually interfering with the basic measurements of the economy?
That wasn’t the case, it turned out. For all the worry, the data had disappeared because of a technical issue with the online portal, according to a Census spokeswoman. The files were back online before the end of the day.
The scare about the Census data was just one episode in a long and anxious drama that has played out in 2017 about the fate of government data under Trump. The government collects and posts vast amounts of information online, much of it concerning the climate, jobs, immigration and other politically sensitive areas. During the presidential transition, researchers scrambled to download datasets across the government, setting up “guerrilla archiving” events to preserve federal data that the new administration might suddenly find inconvenient. News organizations ran numerous articles about academics panicking about the potential loss of data; the University of North Texas and University of Pennsylvania spearheaded efforts to copy government webpages and monitor them after Inauguration Day. The Trump administration, many worried, at worst could represent something of an existential threat to science itself, willing to wipe out decades of meticulously collected information in the pursuit of its political goals.
Six months into his presidency, however, this “War on Data” hasn’t materialized. In fact, according to the Sunlight Foundation, which has been tracking the issue closely, just one main dataset has gone down since Inauguration Day—a database on violators of two animal protection laws, whose removal was already under consideration before Trump took office. All climate data remain online. The Environmental Protection Agency even posted an exact, fully clickable replica of its website as it stood on January 19, the last day of Obama's presidency—a step toward transparency that surprised many observers.
To Trump supporters, the false alarm proves that researchers dramatically overreacted during the transition, suspecting the worst and wasting countless hours backing up data that was never going away. The administration, they argue, is imposing a businesslike approach to government—and businesses rely on data for even small decisions.
This does not mollify all of those skeptical researchers, however: relieved as they are that their worst-case fears haven’t become a reality, they still argue that the Trump administration is chipping away at data-driven policy in other ways, such as removing references to climate change from government sites. They worry that the administration could still underfund statistical agencies or eliminate programs that collect data. And, they fear, the data could still disappear at any minute.
“We were really concerned with datasets being removed and we continue to think there’s reason to be concerned,” said Gretchen Goldman, a research director at the Union of Concerned Scientists. “No one would say the data is safe.”
THAT RESEARCHERS HAD so much data to protect is a testament to the breadth of government recordkeeping, tracking everything from the country’s population, which it has been doing since 1790, to student debt levels. But it is also a testament to the Obama administration’s efforts to put huge amounts of that data online. Just a few months into office, Obama launched data.gov, a central repository of information that has around 195,000 datasets today, all available for download to any citizen. He hired the government’s first chief information officer, chief technology officer and chief data scientist and implored agencies to make data easily available, most notably in his 2013 “Open Data” order (known as M-13-13), which gave agencies specific instructions for publishing data, such as making it machine-readable and imposing no restrictions on its use. The administration also created new open-source tools for the public to access data.
“It was a very clear principle in the Obama administration that open access to government data made sense and that we would do everything we could to facilitate that,” said John Holdren, who was Obama’s top science adviser. Holdren pointed to initiatives across government to improve public access to data, including the “blue button” initiative at the Department of Veterans Affairs, which gave veterans access to their health records and the “green button” initiative, a collaboration with private sector companies to give utility consumers access to their energy-use data.
The Trump administration has never professed a similar commitment to open data, and it sent a number of worrisome signals at the start. For instance, all past presidential memos, including the M-13-13 memo, were initially no longer available at whitehouse.gov, although they remained available at the archived version of Obama’s White House website. M-13-13 was returned to the site, along with other memos, months later, but good-government experts said the long absence was troubling. “That was pretty alarming,” said Josh New, a policy analyst at the Center for Data Innovation. More concerning was the White House’s decision to take down its open data portal, open.whitehouse.gov. The White House said the site was redundant, as the data were available on other government websites (at a cost of $70,000), and most data experts seem to agree. But New said the decision sent another signal to researchers about the Trump administration’s priorities.
In May, the worst-case fears of researchers seemed to be confirmed when the Washington Post reported that the number of datasets available on data.gov had fallen from around 195,000 on Inauguration Day to just over 150,000—meaning nearly a quarter of public data had vanished from the site, a huge drop in just a few months and a clear sign the Trump administration was actively taking data out of the public domain. Except it wasn’t true. A technical issue with how the site itself counted data caused the mix-up. Today, data.gov still has around 195,000 datasets and the number of datasets added to the site each month has remained consistent.
“I was concerned about Trump undermining things like the unemployment numbers, or his claim there were 3 million fraudulent votes,” said John Wonderlich, executive director of the Sunlight Foundation. “So far, we have to put that in the category of presidential rhetoric. We are so far not facing a crisis of trust in American statistical data.”
In fact, experts can point to just one main dataset that has disappeared since Trump took office—and the circumstances around that one are a bit murky. In early February, a dataset containing information on individuals and businesses that inspectors at the Animal and Plant Health Inspection Service (APHIS) in the Department of Agriculture determined were breaking animal welfare laws went offline, setting off outrage from the Humane Society and other animal welfare groups.
“We’re a transparent society where information is supposed to be available on the Internet,” said John Goodwin, senior director of The Humane Society’s Stop Puppy Mills campaign. “These aren’t nuclear secrets.”
Was the Trump administration burying the data? It’s complicated: APHIS said it had been conducting a review of information on its website and “in 2016, well before the new administration, APHIS decided to make adjustments to the posting of regulatory records.” The agency was also sued in February 2016 by two brothers who compete in horse competitions and said the publication of the data violated their privacy, possibly precipitating the review. But Goodwin and other animal welfare experts remain suspicious, especially after former Agriculture Secretary Tom Vilsack said he had previously declined a similar recommendation from APHIS staff.
In the months since, the agency has put some of the data and reports back online, but much remains unavailable. Tanya Espinosa, an APHIS spokeswoman, said that two-thirds of inspection reports involve individual or homestead businesses and haven’t been reposted because they contain personal information. “APHIS is diligently exploring options for promoting transparency while working to protect personal information, which is of greatest concern for these types of licensees,” she said.
Beyond the animal welfare dataset, changes have been more subtle. Many climate experts have pointed out that their data and reports are less visible on the EPA’s website, with almost all references to climate change expunged from the site itself. The White House has also proposed eliminating funding for climate data collection efforts, including the EPA’s Climate Change Research program and a NASA program that collects data on the distribution of carbon dioxide. In May, White House Budget Director Mick Mulvaney said climate change research programs are “a waste of your money.”
But the EPA and other agencies haven’t gone as far as to remove data altogether, and they appear to be going to some lengths to prove it. Since mid-February, the EPA has included a link at the bottom of its site that takes visitors to a replica of its website from January 19, effectively freezing the site as it was before Trump took office. Almost every link remains working and climate change remains prominently displayed. The move represented such an act of transparency that multiple experts I spoke with wondered whether the replica site had been created by career employees or was legally required after the EPA was flooded with Freedom of Information Act requests for data and reports. “You have to wonder how much independent action there has been in this space,” said Holdren about whether open data moves like the replica site came from the Trump administration’s political figures or career employees. The EPA did not respond to a request for comment.
IF RESEARCHERS’ WORST fears haven’t materialized, what has happened to data under Trump?
The truth is that not that much has changed—so far. Based on such limited changes to government data policies, the administration appears to be just beginning to understand the scope of data collected by the federal government and how it can use it.
Straightforward as “data collection” may sound, in practice there’s often a strong political component to government data. What information to collect, about whom and how it’s collected are critical questions that don’t always have one objective answer. “Data is inherently political,” said Wonderlich. “And how it’s used depends on who’s collecting it and what they’re representing about the world.”
Sometimes, those questions fall on Congress, such as when lawmakers created the unemployment rate (as unbiased a statistic as exists today) in the 1930s. According to a history of the U.S. Census, the unemployment rate was the subject of a fierce political fight upon its creation, including how often to collect data on unemployment and who would be counted as unemployed. President Herbert Hoover and his allies thought that the crude unemployment figures, which came from limited Bureau of Labor Statistics surveys and business reports at the time, were adequate measures during the Great Depression. Democrats and many labor economists disagreed and called for additional surveys to determine the true extent of unemployment, and the federal response necessary to alleviate it.
Often, though, the White House has wide latitude to implement surveys and collect data. Every administration uses that power accordingly, accumulating information that can then be used to support a certain policy agenda. This isn’t nefarious, as long as the data aren’t manipulated. But it still can provide evidence to drive a specific agenda.
The Trump administration is just beginning to harness this power—not by erasing data, but by collecting it. For instance, Trump issued an executive order in January that directed the secretary of homeland security and attorney general to collect data on the “immigration status of all [incarcerated] aliens.” Another executive order directed the same Cabinet members to publicly release data on the number of foreign nationals in the U.S. who have been charged with a terrorism-related crime or who were radicalized after entry. The agencies have not released any data yet, but selectively highlighting crimes by immigrants—as opposed to crimes by any other group—could be used to support an immigration crackdown, giving a scientific-looking basis for an agenda that immigration advocates have labeled racist.
Many people have objected to this new data collection, saying it would be used to misleadingly villainize immigrants. Holdren, on the other hand, said he was fine with more information, as long as politicians use it honestly as they develop and lobby for different policies. “What I have seen so far is that data show that the crime rate among U.S. citizens is higher than the crime rate among undocumented immigrants,” he said. “How could I object to making those data known? If the data turn out to show a different thing, we ought to know that too.”
In other cases, Trump is rolling back Obama-era changes to data collection efforts. For instance, an agency in the Department of Health and Human Services, the Administration for Community Living (ACL), proposed removing an LGBT-related question from two surveys. First, on an annual survey used to identify service gaps under the Older Americans Act, the Obama administration added a question about sexual orientation and gender identity in 2014. ACL removed it in March, but after a swift backlash—nearly 14,000 comments in total—it added it back to the survey. The second survey provided information on programs that help disabled Americans stay in their homes. The Obama administration added an LGBT-related question to the redesigned survey in January; in March, the ACL under Trump removed it. The agency is currently reviewing comments for that proposal and will release the final survey later this year.
An ACL spokesperson said the agency is required by law to minimize paperwork for survey respondents and only include questions that help the agency conduct proper oversight. “Given that the questions weren’t generating reliable data, they did not meet that standard,” she said.
Trump also eliminated the White House visitor log, a high-profile move that drew a swift backlash from good governance groups. The visitor log wasn’t a traditional piece of government data, since it wasn’t collected through a survey or scientific research, and it didn’t contain the names of every person who visited the Obama White House—there were some exceptions. But it still provided journalists and the public with a powerful resource to hold the Obama administration accountable for who got access to the president. That the Trump administration has discontinued it appears to be a sign about its commitment to transparency, if not data.
Still, more broadly, the administration hasn’t made many other changes to surveys or data collection efforts. “I’ve seen less of that than we anticipated,” said Goldman, who pointed out that other administrations have sidelined inconvenient data. That brings up another concerning possibility for open data advocates: The White House simply doesn’t care much about data one way or the other. In other words, the Trump administration doesn't oppose open-data policies, but doesn't support them either.
There’s some evidence to support this position. For instance, the White House budget proposes small cuts to the Bureau of Economic Analysis, the Economic Research Service and the Energy Information Administration, all of which gather and publish copious amounts of data. (Although in a budget with huge cuts to domestic programs, these cuts were relatively small.) Notably, the budget significantly underfunds the Census Bureau, which needs an influx of resources as it ramps up for the 2020 census, the government’s largest data-gathering effort and a key to accurately tracking the nation's changing demographics. The administration has also not named a permanent replacement for former Census Bureau Director John Thompson, who stepped down at the end of June; it’s a sign to many census-watchers that the administration is not invested in the decennial count, a responsibility enshrined in the Constitution.
“The efforts across departments and agencies to gather and analyze data are going to suffer if the budget is approved by Congress,” said Holdren, Obama’s top science adviser.
The White House did not respond to a request for comment.
It’s not just underfunding agencies that could undermine the federal government’s data. Experts warn that inattention could prove equally problematic, leading to missing data points or dysfunctional web portals. Wonderlich says the concern so far isn't "data being removed en masse from public websites," but something more subtle. “The question of politics and funding collection is the biggest threat to American data infrastructure,” he says.
Open data efforts require ongoing maintenance and a commitment to quickly providing data to the public. Obama made such a commitment and expected the rest of his administration to act accordingly. On Trump, the jury is still out: Almost nothing has gone offline yet, but prolonged neglect could result in technical problems, as obsolescing databases become less usable and less up to date. That could lead agencies to make consequential policy decisions without the necessary information, as officials follow Trump’s lead in ignoring data and setting policy based more on instinct, or politics, than evidence. Holdren, for instance, said he can’t understand how Trump officials repeatedly say they don’t know how much climate change is a result of human behavior, yet are proposing huge cuts to research programs designed to answer that exact question.
“It is inconsistent to claim on the one hand we don’t know how much humans have to do with it and on the other hand, to slash the data gathering and research efforts that add clarity to that question,” he said. “I find it quite mind-blowing.”