Mitch Daniels is a numbers guy, a cost-cutter. In the early 2000s, he tried and failed to rein in congressional spending under then-US president George W. Bush. So when he took office as Indiana governor in 2005, Daniels was ready to argue once again for fiscal discipline. He wanted to straighten out Indiana’s state government, which he deemed rife with dysfunction. And he started with its welfare system. “That department had been rocked by a series of criminal indictments, with cheats and caseworkers colluding to steal money meant for poor people,” he later said.

Daniels’ solution took the form of a $1.3 billion, 10-year contract with IBM. He had lofty ambitions for the project, which started in 2006, claiming it would improve the benefits service for Indiana residents while cracking down on fraud, ultimately saving taxpayers billions of dollars.

But the contract was a disaster. It was canceled after three years, and IBM and Indiana spent a decade locked in a legal battle about who was to blame. Daniels described IBM’s sweeping redesign and automation of the system—responsible for deciding who was eligible for everything from food stamps to medical cover—as deficient. He was adamant, though, that outsourcing a technical project to a company with expertise was the right call. “It was over-designed,” he said. “Great on paper but too complicated to work in practice.” IBM declined a request for comment. 

In July 2012, Judge David Dryer of the Marion County Superior Court ruled that Indiana had failed to prove IBM had breached its contract. But he also delivered a damning verdict on the system itself, describing it as an untested experiment that replaced caseworkers with computers and phone calls. “Neither party deserves to win this case,” he said. “This story represents a ‘perfect storm’ of misguided government policy and overzealous corporate ambition.” 

That might have been an early death knell for the burgeoning business of welfare state automation. Instead, the industry exploded. Today, fraud-detection systems form a significant part of the nebulous “govtech” industry, which revolves around companies selling governments new technologies with the promise that new IT will make public administration easier to use and more efficient. In 2021, that market was estimated to be worth €116 billion ($120 billion) in Europe and $440 billion globally. And it’s not only companies that expect to profit from this wave of tech. Governments also believe modernizing IT systems can deliver big savings. Back in 2014, the consultancy firm McKinsey estimated that if government digitization reached its “full potential,” it could free up $1 trillion every year.

Contractors around the world are selling governments on the promise that fraud-hunting algorithms can help them recoup public funds. But researchers who track the spread of these systems argue that these companies are often overpaid and under-supervised. The key issue, researchers say, is accountability. When complex machine learning models or simpler algorithms are developed by the private sector, the computer code that gets to define who is and isn’t accused of fraud is often classed as intellectual property. As a result, the way such systems make decisions is opaque and shielded from interrogation. And even when these algorithmic black holes are embroiled in high-stakes legal battles over alleged bias, the people demanding answers struggle to get them. 

In the UK, a community group called the Greater Manchester Coalition of Disabled People is trying to determine whether a pattern of disabled people being investigated for fraud is linked to government automation projects. In France, the digital rights group La Quadrature du Net has been trying for four months to find out whether a fraud system is discriminating against people born in other countries. And in Serbia, lawyers want to understand why the introduction of a new system has resulted in hundreds of Roma families losing their benefits. “The models are always secret,” says Victoria Adelmant, director of New York University’s digital welfare state project. “If you don’t have transparency, it’s very difficult to even challenge and assess these systems.” 

The rollout of automated bureaucracy has happened quickly and quietly, but it has left a trail of scandals in its wake. In Michigan, a computer system used between 2013 and 2015 falsely accused 34,000 people of welfare fraud. A similar thing happened in Australia between 2015 and 2019, but on a larger scale: The government accused 400,000 people of welfare fraud or error after its social security department started using a so-called robodebt algorithm to automatically issue fines.

Another scandal emerged in the Netherlands in 2019 when tens of thousands of families—many of them from the country’s Ghanaian community—were falsely accused of defrauding the child benefits system. These systems didn’t just contribute to agencies accusing innocent people of welfare fraud; benefits recipients were ordered to repay the money they had supposedly stolen. As a result, many of the accused were left with spiraling debt, destroyed credit ratings, and even bankruptcy. 

Not all government fraud systems linked to scandals were developed with consultancies or technology companies. But civil servants are increasingly turning to the private sector to plug knowledge and personnel gaps. Companies involved in fraud detection systems range from giant consultancies such as Accenture, Capgemini, and PwC to small tech firms like Totta Data Lab in the Netherlands and Saga in Serbia.

Experts in automation and AI are expensive to hire and less likely to be wooed by public sector salaries. When the UK surveyed its civil servants last year, confidence in the government’s ability to use technology was low, with around half of respondents blaming an inability to hire top talent. More than a third said they had few or no skills in artificial intelligence, machine learning, or automation. But it’s not just industry experience that makes the private sector so alluring to government officials. For welfare departments squeezed by budget cuts, “efficiency” has become a familiar buzzword. “Quite often, a public sector entity will say it is more efficient for us to go and bring in a group of consultants,” says Dan Sheils, head of European public service at Accenture.

The public sector lacks the expertise to create these systems and also to oversee them, says Matthias Spielkamp, cofounder of German nonprofit Algorithm Watch, which has been tracking automated decision-making in social welfare programs across Europe since 2017. In an ideal world, civil servants would be able to develop these systems themselves and have an in-depth understanding of how they work, he says. “That would be a huge difference to working with private companies, because they will sell you black-box systems—black boxes to everyone, including the public sector.” 

In February 2020, a crisis broke out in the Dutch region of Walcheren as officials realized they were in the dark about how their own fraud detection system worked. At the time, a Dutch court had halted the use of another algorithm used to detect welfare fraud, known as SyRI, after finding it violated people’s right to privacy. Officials in Walcheren were not using SyRI, but in emails obtained by Lighthouse Reports and WIRED through freedom-of-information requests, government employees had raised concerns that their algorithm bore striking similarities to the one just condemned by the court.

Walcheren’s system was developed by Totta Data Lab. After signing a contract in March 2017, the Dutch startup developed an algorithm to sort through pseudonymous information, according to details obtained through a freedom-of-information request. The system analyzed details of local people claiming welfare benefits and then sent human investigators a list of those it classified as most likely to be fraudsters. 

The redacted emails show local officials agonizing over whether their algorithm would be dragged into the SyRI scandal. “I don’t think it is possible to explain why our algorithm should be allowed while everyone is reading about SyRI,” one official wrote the week after the court ruling. Another wrote back with similar concerns. “We also do not get insight from Totta Data Lab into what exactly the algorithm does, and we do not have the expertise to check this.” Neither Totta nor officials in Walcheren replied to requests for comment. 

When the Netherlands’ Organization for Applied Scientific Research, an independent research institute, later carried out an audit of a Totta algorithm used in South Holland, the auditors struggled to understand it. “The results of the algorithm do not appear to be reproducible,” their 2021 report reads, referring to attempts to re-create the algorithm’s risk scores. “The risks indicated by the AI algorithm are largely randomly determined,” the researchers found. 
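What a reproducibility check involves can be shown with a minimal sketch. It is not the auditors' method or Totta's code, and the quoted report does not say why the scores could not be re-created. The scoring functions, the feature values, and the use of unseeded randomness as the cause are all assumptions made purely for illustration.

```python
import random

def score_unseeded(features):
    """Hypothetical scorer: average of the inputs plus uncontrolled random noise."""
    return sum(features) / len(features) + random.random()

def score_seeded(features, seed=42):
    """Same hypothetical scorer, but the noise comes from a fixed seed."""
    rng = random.Random(seed)
    return sum(features) / len(features) + rng.random()

def is_reproducible(score_fn, features, runs=5):
    """Score the same input several times and check that every result matches."""
    results = [score_fn(features) for _ in range(runs)]
    return all(result == results[0] for result in results)

case = [0.2, 0.4, 0.9]  # made-up feature values, not real data
print(is_reproducible(score_seeded, case))
print(is_reproducible(score_unseeded, case))
```

The seeded scorer returns the same number on every run, so it passes the check. The unseeded one will almost certainly fail it, which is the kind of result that leaves an auditor unable to re-create a risk score.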

With little transparency, it often takes years—and thousands of victims—to expose technical shortcomings. But a case in Serbia provides a notable exception. In March 2022, a new law came into force which gave the government the green light to use data processing to assess individuals’ financial status and automate parts of its social protection programs. The new socijalna karta, or social card system, would help the government detect fraud while making sure welfare payments were reaching society’s most marginalized, claimed Zoran Đorđević, Serbia’s minister of social affairs in 2020. 

But within months of the system’s introduction, lawyers in the capital Belgrade had started documenting how it was discriminating against the country’s Roma community, an already disenfranchised ethnic minority group. 

Mr. Ahmetović, a welfare recipient who declined to share his first name out of concern that his statement could affect his ability to claim benefits in the future, says he hadn’t heard of the social card system until November 2022, when his wife and four children were turned away from a soup kitchen on the outskirts of the Serbian capital. It wasn’t unusual for the Roma family to be there, as their welfare payments entitled them to a daily meal provided by the government. But on that day, a social worker told them their welfare status had changed and that they would no longer be getting a daily meal.

The family was in shock, and Ahmetović rushed to the nearest welfare office to find out what had happened. He says he was told the new social card system had flagged him after detecting income amounting to 110,000 Serbian dinars ($1,000) in his bank account, which meant he was no longer eligible for a large chunk of the welfare he had been receiving. Ahmetović was confused. He didn’t know anything about this payment. He didn’t even have his own bank account—his wife received the family’s welfare payments into hers. 

With no warning, their welfare payments were slashed by more than 40 percent, from around 70,000 dinars ($630) per month to 40,000 dinars ($360). The family had been claiming a range of benefits since 2012, including financial social assistance, as their son’s epilepsy and unilateral paralysis mean neither parent is able to work. The drop in support meant the Ahmetovićs had to cut back on groceries and couldn’t afford to pay all their bills. Their debt ballooned to over 1 million dinars ($9,000).

The algorithm’s impact on Serbia’s Roma community has been dramatic. Ahmetović says his sister has also had her welfare payments cut since the system was introduced, as have several of his neighbors. “Almost all people living in Roma settlements in some municipalities lost their benefits,” says Danilo Ćurčić, program coordinator of A11, a Serbian nonprofit that provides legal aid. A11 is trying to help the Ahmetovićs and more than 100 other Roma families reclaim their benefits.

But first, Ćurčić needs to know how the system works. So far, the government has denied his requests to share the source code on intellectual property grounds, claiming it would violate the contract it signed with the company that actually built the system, he says. According to Ćurčić and a government contract, a Serbian company called Saga, which specializes in automation, was involved in building the social card system. Neither Saga nor Serbia’s Ministry of Social Affairs responded to WIRED’s requests for comment.

As the govtech sector has grown, so has the number of companies selling systems to detect fraud. And not all of them are local startups like Saga. Accenture—Ireland’s biggest public company, which employs more than half a million people worldwide—has worked on fraud systems across Europe. In 2017, Accenture helped the Dutch city of Rotterdam develop a system that calculates risk scores for every welfare recipient. A company document describing the original project, obtained by Lighthouse Reports and WIRED, references an Accenture-built machine learning system that combed through data on thousands of people to judge how likely each of them was to commit welfare fraud. “The city could then sort welfare recipients in order of risk of illegitimacy, so that highest risk individuals can be investigated first,” the document says. 
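In practice, "sorting by risk" amounts to ordering a list by a model's output and investigating from the top. The sketch below is illustrative only and is not Accenture's or Rotterdam's code. The `Recipient` type, the `rank_for_investigation` function, the identifiers, and the scores are all hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Recipient:
    recipient_id: str   # hypothetical pseudonymous identifier
    risk_score: float   # hypothetical model output between 0 and 1

def rank_for_investigation(recipients, limit):
    """Return the `limit` recipients with the highest risk scores, highest first."""
    ranked = sorted(recipients, key=lambda r: r.risk_score, reverse=True)
    return ranked[:limit]

# Made-up scores for illustration only; not real data.
sample = [
    Recipient("A", 0.12),
    Recipient("B", 0.87),
    Recipient("C", 0.45),
    Recipient("D", 0.91),
]
top_two = rank_for_investigation(sample, limit=2)
print([r.recipient_id for r in top_two])  # ['D', 'B']
```

The sorting step is trivial. Everything that matters happens earlier, in how `risk_score` is calculated, and that is the part typically withheld from the people being ranked.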

Officials in Rotterdam have said Accenture’s system was used until 2018, when a team at Rotterdam’s Research and Business Intelligence Department took over the algorithm’s development. When Lighthouse Reports and WIRED analyzed a 2021 version of Rotterdam’s fraud algorithm, it became clear that the system discriminates on the basis of race and gender. And around 70 percent of the variables in the 2021 system—information categories such as gender, spoken language, and mental health history that the algorithm used to calculate how likely a person was to commit welfare fraud—appeared to be the same as those in Accenture’s version.

When asked about the similarities, Accenture spokesperson Chinedu Udezue said the company’s “start-up model” was transferred to the city in 2018 when the contract ended. Rotterdam stopped using the algorithm in 2021, after auditors found that the data it used risked creating biased results.

Consultancies generally implement predictive analytics models and then leave after six or eight months, says Sheils, Accenture’s European head of public service. He says his team helps governments avoid what he describes as the industry’s curse: “false positives,” Sheils’ term for life-ruining occurrences of an algorithm incorrectly flagging an innocent person for investigation. “That may seem like a very clinical way of looking at it, but technically speaking, that’s all they are.” Sheils claims that Accenture mitigates this by encouraging clients to use AI or machine learning to improve, rather than replace, decision-making humans. “That means ensuring that citizens don’t experience significantly adverse consequences purely on the basis of an AI decision.” 

However, social workers who are asked to investigate people flagged by these systems before making a final decision aren’t necessarily exercising independent judgment, says Eva Blum-Dumontet, a tech policy consultant who researched algorithms in the UK welfare system for campaign group Privacy International. “This human is still going to be influenced by the decision of the AI,” she says. “Having a human in the loop doesn’t mean that the human has the time, the training, or the capacity to question the decision.” 

Despite the scandals and repeated allegations of bias, the industry building these systems shows no sign of slowing. And neither does government appetite for buying or building such systems. Last summer, Italy’s Ministry of Economy and Finance adopted a decree authorizing the launch of an algorithm that searches for discrepancies in tax filings, earnings, property records, and bank accounts to identify people at risk of not paying their taxes. 

But as more governments adopt these systems, the number of people erroneously flagged for fraud is growing. And once someone is caught up in the tangle of data, it can take years to break free. In the Netherlands’ child benefits scandal, people lost their cars and homes, and couples described how the stress drove them to divorce. “The financial misery is huge,” says Orlando Kadir, a lawyer representing more than 1,000 affected families. After a public inquiry, the Dutch government agreed in 2020 to pay the families around €30,000 ($32,000) in compensation. But debt balloons over time. And that amount is not enough, says Kadir, who claims some families are now €250,000 in debt. 

In Belgrade, Ahmetović is still fighting to get his family’s full benefits reinstated. “I don’t understand what happened or why,” he says. “It’s hard to compete against the computer and prove this was a mistake.” But he says he’s also wondering whether he’ll ever be compensated for the financial damage the social card system has caused him. He’s yet another person caught up in an opaque system whose inner workings are guarded by the companies and governments that make and operate it. Ćurčić, though, is clear on what needs to change. “We don’t care who made the algorithm,” he says. “The algorithm just has to be made public.”

Additional reporting by Gabriel Geiger and Justin-Casimir Braun.