After more than two months of fighting following Hamas’ horrific Oct. 7 attacks on Israel, the Israel-Hamas War appears to be entering its most deadly phase. Israel is ratcheting up its aerial bombing campaign and expanding ground operations across Gaza, which is on the brink of humanitarian collapse. Peacemaking efforts have failed again: on Dec. 8, the United States vetoed a U.N. Security Council resolution calling for a ceasefire, even as the vast majority of the U.N. General Assembly supports one. As the Palestinian death toll surpasses a staggering 18,000, civilian harm has become the focal point of growing international pressure for a ceasefire.
Israel maintains it has taken extraordinary steps to protect civilians, setting the “gold standard” for urban warfare. Officials have touted their ability to wage war with surgical precision, relying on artificial intelligence (AI) to develop more accurate collateral damage estimates, reduce targeting errors, and improve military decision-making overall.
But mounting civilian casualties in Gaza undermine this narrative. In fact, Israel’s expanded use of AI in airstrikes may partially explain the results. Rather than limiting harm to civilians, AI has bolstered Israel’s ability to identify and locate targets and to expand its target sets, which likely are not fully vetted, in order to inflict maximum damage.
Failure to Protect
Israel’s staunchest ally has increasingly urged restraint. On Tuesday, U.S. President Joe Biden told Israeli Prime Minister Benjamin Netanyahu that Israel was losing international support due to its “indiscriminate bombing.” Since early December, senior Biden administration officials—including Vice President Kamala Harris, Secretary of State Antony Blinken, and Defense Secretary Lloyd Austin—have publicly called on Israel to put a “premium on civilian protection.” Secretary Austin will travel to Israel next week to press Netanyahu’s government to do more to protect Palestinian civilians caught in the crossfire.
Even a conservative reading of the casualty figures suggests the level of civilian harm in Gaza is unprecedented. Operation Swords of Iron—the Israeli military response to the Oct. 7 attacks—marks a dramatic increase in the pace and lethality of airstrikes compared to previous Israeli campaigns in Gaza. According to Professor Yagil Levy’s calculations, published in Haaretz, the civilian death rate in previous Israeli military campaigns from 2012 to 2023 was roughly 30-40 percent, in contrast to more than 60 percent in Operation Swords of Iron. The United Nations has said two-thirds of the total number of casualties are women and children.
Israel assesses it kills two Palestinian civilians for every Hamas militant, a ratio that an Israel Defense Forces (IDF) spokesperson described as “tremendously positive.” But even that ratio is worse than in comparable contemporary conflicts. Civilian harm in the current Israel-Hamas War surpasses that of the bloody U.S. counterinsurgency campaigns in Syria and Iraq, which were waged under similarly challenging conditions of urban warfare. In two months, more civilians have been killed in Gaza than in nearly two decades of war in Afghanistan. And Israel dropped more bombs during the first six days of the current war with Hamas than were dropped in any single month of the U.S.-led coalition war against the Islamic State.
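To put that ratio in perspective, a simple back-of-the-envelope conversion shows how it maps onto the casualty shares cited above. The snippet below is purely illustrative arithmetic, not an analysis of any casualty dataset.

```python
# Purely illustrative arithmetic, using the ratio cited above.
def civilian_share(civilians_per_militant: float) -> float:
    """Share of total deaths that are civilian, given a civilians-per-militant ratio."""
    return civilians_per_militant / (civilians_per_militant + 1)

print(f"2:1 ratio -> {civilian_share(2):.0%} of deaths are civilian")  # ~67%
print(f"1:1 ratio -> {civilian_share(1):.0%} of deaths are civilian")  # 50%
```

A 2:1 civilian-to-militant ratio, in other words, corresponds to roughly two-thirds of all deaths being civilians, consistent with the figures cited above.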
There are many possible explanations for the carnage. Hamas notoriously hides among civilians, using them as human shields and relying on a network of tunnels that runs beneath private homes and civilian infrastructure such as schools and hospitals. Israeli targeting rules have also apparently changed. Israeli officials, speaking anonymously, acknowledge that “targets that would have not been considered valuable enough to justify the risk to civilians in less serious skirmishes are being hit now,” according to the New York Times. Israel has also used ordnance with larger blast radii than is typical in counterinsurgency campaigns, along with a high percentage of unguided munitions, both of which pose greater risks to civilians. The U.S. Intelligence Community assesses, for example, that nearly half of all Israeli munitions dropped in Gaza since the start of the conflict have been imprecise “dumb bombs.” Perhaps, as an IDF spokesperson bluntly put it, the explanation is straightforward: “the emphasis is on damage, not precision.”
Another possible explanation for the mounting civilian death toll, however, has received surprisingly little attention (although it is not mutually exclusive of those mentioned above). Israel’s reported use of AI has allowed it to generate extensive target lists quickly, enabling a rapid tempo of operations, likely without full human vetting of each target.
The Myth of Precision Targeting
This is not the first time Israel has used AI in war, but it is likely the most extensive use to date. During Operation Guardian of the Walls in 2021, the IDF said it relied on AI systems as a “force multiplier” for targeting Hamas and Palestinian Islamic Jihad militants. Lt. Gen. Aviv Kochavi, who headed the IDF from 2019 to January 2023, said in an interview this summer, “In Operation Guardian of the Walls, once this machine was activated, it generated 100 new targets every day. To put it in perspective, in the past, we would produce 50 targets in Gaza in a year. Now, this machine created 100 targets in a single day, with 50% of them being attacked.”
“Revolutionary changes” to how the IDF uses AI in military targeting date back to at least 2019. According to one report, Israeli officers in Unit 8200—an elite intelligence unit focused on cyber and signals intelligence—developed algorithms for a program known as “Habsora” (“Gospel”) that analyzed data from signals and human intelligence, geographical intelligence, satellite imagery, drone feeds, facial recognition databases, and other sources to identify targets for elimination. Reservists from that same unit previously published an open letter in 2014 alleging that Unit 8200 was engaged in mass surveillance of Palestinians who were “unconnected to any military activity.”
Although these programs are classified, Israel for years has been collecting large amounts of Palestinian data that could be fed into AI algorithms for targeting purposes. Since 2018, Israel has relied on an AI-powered facial recognition apparatus built around “Wolf Pack,” an extensive database containing information on virtually all Palestinians, including photographs, age, gender, place of residence, family histories, level of education, close associates, and a security rating for each individual. Wolf Pack receives data from two military-run programs: “Red Wolf,” a facial recognition system deployed at military checkpoints, and a smartphone app known as “Blue Wolf,” which Israeli soldiers have described as the “Facebook for Palestinians.” These systems collect extensive data on Palestinians, including children, without their consent.
Israel is expanding its reliance on AI-based systems such as Gospel in the current Gaza war. In a short statement on its website, the IDF confirmed that Gospel was used to “produce targets at a fast pace” through the “rapid and automatic extraction of intelligence.” The statement was made in the context of boasting that “the IDF’s target factory has been operating around the clock” with more than 12,000 targets completed in the first 27 days. The rapid tempo of Israeli strikes, enabled by AI, has likely contributed to mounting civilian deaths.
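The scale of that shift is apparent from figures Israeli officials themselves have given. A rough comparison, using the 12,000 targets in 27 days from the IDF statement and the roughly 50 targets per year Kochavi described for the pre-AI era, is sketched below; the inputs are the figures reported above, and the comparison is approximate.

```python
# Rough comparison of target-generation tempo, using the figures quoted above.
targets_recent, days_recent = 12_000, 27       # IDF: first 27 days of the war
per_day_recent = targets_recent / days_recent  # ~444 targets per day

per_year_past = 50                             # Kochavi: ~50 targets per year, pre-AI
per_day_past = per_year_past / 365             # ~0.14 targets per day

print(f"Current tempo: {per_day_recent:,.0f} targets/day")
print(f"Pre-AI tempo:  {per_day_past:.2f} targets/day")
print(f"Increase:      roughly {per_day_recent / per_day_past:,.0f}x")
```

Even treated as rough approximations, these figures imply a tempo of target generation roughly three orders of magnitude higher than before.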
As the tempo of operations increases further, AI is also generating a larger set of targets than ever before. In just over two months of the current conflict, Israel has attacked roughly 25,000 targets, more than four times as many as in previous wars in Gaza. In the past, Israel would run out of targets—that is, known combatants or military objectives at which to aim strikes. Now, with AI more fully integrated into military operations, that is no longer as much of a barrier to killing (see JINSA’s Gemunder Center 2021 report, pp. 31-32). As a force multiplier, AI removes the resource constraints that once made it impractical to target more junior operatives, whose deaths have minimal impact on military objectives.
In short, AI is increasing the tempo of operations and expanding the pool of potential targets. That makes target verification and other precautionary obligations much harder to fulfill and increases the risk that civilians will be misidentified and mistakenly targeted.
Compounding these risks, AI systems may be producing inaccurate results that further exacerbate civilian harm. Research suggests that facial recognition software is less accurate when applied to people of color, especially darker-skinned women. AI systems may “hallucinate” and make up false information or the systems may become corrupted. Human error, such as misidentification and confirmation bias—the tendency to interpret information in a way that confirms previously held beliefs—could also be introduced in the data. For example, civilians of a certain age, gender, or status may be inappropriately categorized as militants, while patterns of behavior may be misinterpreted based on outdated or inaccurate information. As AI is used for complex tasks involving multiple levels of analysis, the risk of compounding errors increases. Intelligence services often get it wrong, and Israel’s massive intelligence failure on Oct. 7 is not reassuring on this front.
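The compounding dynamic is easy to illustrate. Even if each stage of a multi-stage analysis pipeline is individually reliable, end-to-end reliability decays as stages are chained together. The 95 percent per-stage figure below is a hypothetical assumption for illustration, not a claim about any particular system.

```python
# Hypothetical illustration of how errors compound across chained analysis stages.
per_stage_accuracy = 0.95  # assumed reliability of each stage, for illustration only

for stages in (1, 3, 5, 8):
    end_to_end = per_stage_accuracy ** stages
    print(f"{stages} chained stage(s) -> {end_to_end:.0%} end-to-end accuracy")
# 1 -> 95%, 3 -> 86%, 5 -> 77%, 8 -> 66%
```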
One limitation the IDF encountered in the past is that it lacked the data to train AI systems on what was not considered to be a military target. Algorithms were trained to recognize patterns of militant activity based on “billions of pieces of signals and other intelligence on Hamas’ and PIJ [Palestinian Islamic Jihad]’s orders of battle, military infrastructure, and daily routines.” But there was no information to train algorithms to reject targets because “historical records of rejected targets—that is, intelligence that was examined and then deemed not to constitute a target by human analysts—were not preserved” (JINSA’s Gemunder Center 2021 report, p. 31).
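The consequence of that gap is worth spelling out: a model trained only on examples labeled as targets has no statistical basis for learning when to reject one. The sketch below is a deliberately simplified illustration using made-up data, not a description of Gospel or any real system.

```python
# Simplified illustration with made-up data: if historical rejections are not
# preserved, the only decision rule the training data supports is "everything
# is a target."
from collections import Counter

training_labels = ["target"] * 1_000  # no "rejected" examples were kept

majority_label, _ = Counter(training_labels).most_common(1)[0]

def classify(case: dict) -> str:
    # Absent a negative class, a naive learner collapses to the majority label.
    return majority_label

print(classify({"pattern_of_life": "ambiguous civilian routine"}))  # -> "target"
```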
Ultimately, AI-based systems are only as accurate as the algorithms and data that power them. While there is no way for the public to evaluate the accuracy of classified programs such as Gospel or Wolf Pack, one thing is clear: AI has, by Israel’s own admission, generated significantly more targets, more rapidly, than in previous conflicts. Of that enormous set of potential targets, more appear to have been “actioned” quickly for elimination. And one result has been widespread civilian harm.
Human review of targets, absent robust institutional and policy guardrails, will not sufficiently mitigate these risks. Even with human review, the pace and scale of Israeli strikes make it difficult to fully vet targeting outputs, setting aside well-known explainability challenges with AI. As Tal Mimran, a lecturer at Hebrew University who has previously worked for the Israeli government on targeting, told NPR, “In the face of this kind of acceleration, those reviews become more and more constrained in terms of what kind of judgment people can actually exercise.” The pace at which AI generates targets may simply outstrip the ability of intelligence officials, commanders, and military lawyers to keep up, although deliberately slowing the operational tempo could allow for a course correction. Add to this the potential for automation bias—the propensity not to challenge a system’s output or search for contradictory information—which is rampant in crisis situations such as war, and human review begins to look increasingly perfunctory.
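How constrained those reviews become can be seen with another back-of-the-envelope sketch, again using the roughly 444 targets per day implied by the IDF’s own figures. The review-cell size and working hours below are hypothetical assumptions, chosen only to show how quickly per-target review time shrinks.

```python
# Hypothetical estimate of human review time available per AI-generated target.
targets_per_day = 12_000 / 27       # ~444, from the IDF figure cited above
hours_per_reviewer_per_day = 12     # assumed

for reviewers in (5, 10, 20):       # assumed size of a target-review cell
    minutes_available = reviewers * hours_per_reviewer_per_day * 60
    per_target = minutes_available / targets_per_day
    print(f"{reviewers} reviewers -> ~{per_target:.0f} minutes per target")
```

Under even these generous assumptions, each target receives minutes, not hours, of human attention.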
The doomsday scenario of killer algorithms is already unfolding in Gaza—and it is happening with humans fully “in the loop.”
Never Again
Israel’s use of AI in Gaza underscores the risks of allowing states to set their own policies for military AI, a domain that remains unregulated despite progress on other aspects of AI governance. Two days before the outbreak of the current Israel-Hamas War, U.N. Secretary-General António Guterres called on world leaders to negotiate a new legally binding treaty on autonomous weapons systems by 2026. That treaty likely would not apply to military use of programs such as Gospel, which are neither fully autonomous nor weapons systems, highlighting an important gap in efforts to expand protections.
The international community has a responsibility to ensure standards are in place for the integration of AI into human operations to reduce, not exacerbate, civilian harm. Rather than increasing the tempo of military operations, policymakers should leverage the speed at which AI systems operate to introduce tactical pauses, more carefully review targets, ensure other precautionary measures are prioritized (particularly given the significant potential for flawed targeting lists for the reasons described above), and consider alternative options.
If AI speeds up killing in war, decision-makers must slow it down.
Greater transparency and accountability are also urgently needed. To that end, policymakers, legislators, and the public should ask the following questions—and demand answers—about Israel’s development and deployment of AI tools in war:
- What AI-based systems does Israel use in military targeting and how were these systems developed?
- What safeguards has Israel put in place to prevent errors, hallucination, misuse, and corruption of AI-based targeting systems?
- Can Israeli officials explain how AI targeting outputs are generated? What level of confidence exists in the traceability and explainability of results?
- How many targets do AI-based systems such as Gospel generate per day and what percentage of those targets are approved for use by the military?
- How much time do human operators spend verifying and checking AI-generated targets before those targets are approved?
- What does the human verification and approval process look like and at what level of seniority is it conducted?
- What is the error rate of these AI-based systems, and are there certain conditions or types of targets that are associated with higher or lower error rates? Has the error rate increased or decreased over time in this conflict?
- Does Israel feed U.S. or allied-provided intelligence data into AI-based systems such as Gospel? If so, has Israel informed its partners that this data is being used in AI targeting systems?
- How has Israel incorporated lessons learned about civilian harm in war into AI algorithms?
- In addition to selecting targets, are AI systems used to identify likely civilian harm? How new are those systems, and how have they been tested?
- Are any after-action review processes in place to identify, understand, and learn from mistakes? If so, what do these processes look like?
- What accountability mechanisms exist to prevent mistakes from reoccurring?
Answering these questions will not end the devastating violence in Gaza. But it is a critical step toward preventing a new era of AI-assisted warfare where “precise” mass killings are the norm. “I’m sure innocents have been killed, and it’s the price of waging a war,” President Biden proclaimed more than 10,000 casualties ago. The question remains: when is that price too high?