Hackers can take advantage of this flaw to obtain AI data left on the GPU.

While there are significant advantages to running AI workloads locally, a newly discovered vulnerability could allow attackers to obtain residual data from vulnerable Apple, AMD, Qualcomm, and Imagination Technologies GPUs.

As reported by BleepingComputer, this new security flaw (tracked as CVE-2023-4969) was discovered by security researchers Tyler Sorensen and Heidy Khlaaf of Trail of Bits and is referred to as LeftoverLocals.

Essentially, this flaw allows data to be recovered from an affected GPU running a large language model (LLM) or other machine learning process locally. While hackers would need local access to a system with a vulnerable GPU running an AI workload in order to exploit this flaw, this new attack technique remains a concern.

Whether you are running AI models locally or are concerned about the dangers posed by AI, here is everything you need to know about LeftoverLocals.

According to a blog post from Trail of Bits, this security flaw stems from the fact that some GPU frameworks do not fully isolate memory. As a result, one kernel running on a vulnerable machine can read values that another kernel has written to local memory.

The Trail of Bits researchers also explained that an attacker can read data another user left behind in the GPU's local memory simply by running a GPU compute application built on a framework such as OpenCL, Vulkan, or Metal. According to the researchers, this is done by "writing a GPU kernel that dumps uninitialized local memory."

This recovered data can reveal all kinds of sensitive information about the victim's locally run AI model, including its inputs, outputs, weights, and intermediate computations.

Trail of Bits' security researchers went a step further by creating a proof of concept (available on GitHub) showing that roughly 5.5 MB of data can be recovered per GPU invocation by exploiting the LeftoverLocals vulnerability. Since a single LLM query triggers many such invocations, this adds up quickly: on an AMD Radeon RX 7900 XT GPU running the open source llama.cpp LLM, an attacker can recover around 181 MB of residual AI data per query. That is enough to reconstruct the LLM's response with a high degree of accuracy, allowing the attacker to know exactly what you were discussing with the AI in question.

Trail of Bits contacted Apple, AMD, Qualcomm, and Imagination Technologies in September 2023, and several of these companies have since released patches to address the flaw or are currently working on them.

It is also worth noting that while Apple's M2 MacBooks and the iPhone 12 Pro are vulnerable, the iPhone 15 line and M3-based MacBooks and desktops are not affected.

According to AMD's security bulletin, some of its GPU models are still vulnerable, but its engineers are working on a fix. Similarly, Qualcomm has released a firmware patch (v2.0.7) that addresses LeftoverLocals on some chips but not others. Meanwhile, Imagination Technologies released a fix last December in DDK v23.3, though Google warned this month that some Imagination GPUs remain vulnerable. Fortunately, GPUs from Intel, Nvidia, and ARM are not affected by LeftoverLocals at all.

For those GPUs that are still vulnerable, Trail of Bits suggests that GPU makers implement a mechanism that automatically clears local memory between kernel calls. This could hurt performance, but given the severity of the LeftoverLocals flaw, the trade-off may well be worthwhile.

As GPU manufacturers work to stamp out this flaw once and for all, we will likely learn more about LeftoverLocals.
