Homomorphic encryption could revolutionize privacy—so what is it?
A startup named Ravel claims breakthroughs in fully homomorphic encryption, a hotly pursued method for analyzing encrypted data without ever decrypting it.
You want to take a DNA test to get a genomic health report, but there’s a problem: you don’t feel comfortable just handing over all your health and genomic data to that big biotech company. As an analogy, you might think of your data as gold, and the company as a jeweler. The jeweler will make you a gold necklace—produce your genomic health report—but you might not notice if the jeweler also takes a bit of gold for themselves, or lets their even less scrupulous friends have some, or loses it to thieves.
Now imagine another approach: instead of letting the jeweler have your pile of raw gold, the gold stays inside a locked box. Only you have the key. By reaching into the box, the jeweler can still make the necklace, but the gold itself remains secure; the jeweler can’t even see it.
This is the promise of fully homomorphic encryption (FHE), a technique that could let tech firms analyze your personal data without being granted access to the original data itself. Among cryptographers, and for anyone who cares about privacy, the technology is considered a “holy grail.” First described in the late ’70s, homomorphic encryption has only in the past decade begun attracting millions of dollars from venture capital and agencies like the NSA. And it’s still not easy to do. Given the math involved, the process is often slow and storage-intensive, impractical for wide-scale, real-world use.
But homomorphic encryption received a jolt of fresh attention last week when a French startup said it had achieved major breakthroughs in FHE to make the process “scalable.”
“We have successfully overcome FHE’s biggest challenges,” says Mehdi Sabeg, the CEO and founder of Paris-based Ravel Technologies. He declined to share details about the advances; amid a fierce race in research funding, the company is seeking patents for its technology. But he says tests showed that, for certain operations, the system runs four orders of magnitude faster than current state-of-the-art approaches, with a data requirement only 33 times larger than the plaintext.
Sabeg founded Ravel in 2018, and bootstrapped its funding while building a team of scientists. This year the 15-person company received an undisclosed seed investment from Airbus Ventures, an investment arm of the French aerospace manufacturer, and tested a proof of concept with BNP Paribas, France’s biggest bank.
A breakthrough in FHE could be a boon for personal privacy. Banks could use the technology to perform fraud detection and Know Your Customer screenings without the need to see or store a customer’s raw data, or to process data in foreign jurisdictions that have stricter data rules. Big tech firms racing to train their machine learning models could feed them more sensitive and detailed learning material in a way that doesn’t also involve turning over your photos or your face. Or those same tech giants could personalize ads and make predictions about their users without decrypting their communications or removing personal data from users’ devices. FHE could help eliminate the exploitation of plaintext data in bidding markets, a challenge for banks starting to rely on more transparent blockchain infrastructure, or allow researchers to more closely explore valuable genetic databases—for instance, to trawl for associations between specific gene mutations and certain diseases—without accessing patients’ actual DNA data or medical histories.
Typical encryption technologies scramble data with a secret key, so that, ideally, it can be read only by a key-holder. The messages can be intercepted, but without the secret key, they are gibberish ciphertexts. This is how the now-ubiquitous HTTPS protocol secures many of the web’s transactions, including your activity on websites like this one, or how apps like WhatsApp and Signal protect messages.
The catch, of course, is that if someone wants to provide services to you based on your data—to determine a loan, calculate a credit score, analyze your health, or personalize your feed and the ads you see—they first need to decrypt your data. This is why thousands of companies have massive piles of our data sitting unencrypted on their servers (we clicked Agree, after all), and no one can really say what they’re doing with it. One consequence is that a lot of that data—mine and yours, probably—is now on others’ servers, too: hackers, scammers, spies, propagandists, and whoever else is waiting for the right moment to sell it or use it for reasons we may never know.
Homomorphic encryption, by contrast, would make it practical for a data processor to do valuable analysis on your encrypted data, without anyone having to see or possess the original plaintext data in the process.
To be practical, however, homomorphic encryption also has to be fast and efficient enough to power database and machine learning applications over the internet. Like other homomorphic encryption startups, Ravel’s approach involves using an arsenal of algorithms to interact with a specially made database that Sabeg says is designed to “answer industry imperatives.”

Ravel adviser Cedric Villani, the Fields Medal-winning “dandy mathematician” who was also recently a Paris mayoral candidate, said in a statement that the company’s breakthroughs were “impressive.”
“With the continual increase of personal and industrial data being processed globally, privacy and confidentiality protection are of paramount importance,” Villani said. “Ravel’s breakthroughs bring an efficient and scalable answer to critical data privacy and security challenges.”
WHAT THE $#!+ IS HOMOMORPHIC ENCRYPTION?
In 1978, Ron Rivest, Leonard Adleman, and Michael Dertouzos at MIT first described the problem of designing a fully homomorphic encryption scheme. (The first two would also lend their initials to the popular encryption system RSA, which they helped invent the previous year.) But it wasn’t until 2009 that the problem was first solved, by a lawyer-turned-mathematician named Craig Gentry. “This is a bigger deal than might appear at first glance,” security expert Bruce Schneier wrote at the time. As he explained:
Any computation can be expressed as a Boolean circuit: a series of additions and multiplications. Your computer consists of a zillion Boolean circuits, and you can run programs to do anything on your computer. This algorithm means you can perform arbitrary computations on homomorphically encrypted data. More concretely: if you encrypt data in a fully homomorphic cryptosystem, you can ship that encrypted data to an untrusted person and that person can perform arbitrary computations on that data without being able to decrypt the data itself. Imagine what that would mean for cloud computing, or any outsourcing infrastructure: you no longer have to trust the outsourcer with the data.
A clue to how FHE works is in the name. Morphic comes from the Greek word for shape; homo is from the word for same, or similar. In this case, the emphasis is on similar. A similar shape.
Imagine, as an analogy, that you wanted to disguise your mobile phone in public. So you turn it into a new shape—a banana, say—while preserving some of its key features, like redial and doom-scroll. It can still do most phone things, except no one but you would know it was actually a phone.
In more mathematical terms, according to Britannica, a homomorphism is:
a special correspondence between the members (elements) of two algebraic systems, such as two groups, two rings, or two fields. Two homomorphic systems have the same basic structure, and, while their elements and operations may appear entirely different, results on one system often apply as well to the other system.
Ronen Mukamel, a mathematician and geneticist who works with sensitive data at Harvard Medical School and Brigham and Women’s Hospital, went into slightly more math in an email. He explains:
A homomorphism is a correspondence or function between algebraic objects (e.g. numbers) that preserves the algebraic operation (e.g., addition or multiplication).
In symbols, if the operation is “+”, and the function is “f”, then we require: f(a+b)=f(a)+f(b).
For instance, in most familiar number systems (integers, real numbers), the function f(x) = 2*x is a homomorphism since f(a + b) = 2*(a+b) = 2*a + 2*b = f(a) + f(b)
Whereas f(x)=x+1 is not a homomorphism since f(a+b)=a+b+1, a quantity different from f(a)+f(b)=a+1+b+1=a+b+2.
In the case of homomorphic encryption, a homomorphism translates between the plain, unencrypted data and the encrypted data. Because that correspondence preserves the algebraic operations, operations performed on the encrypted data carry over to the same operations on the unencrypted data, allowing computations to be performed on encrypted data without having to decrypt any of the intermediate results.
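As a sanity check on the definition, here is a minimal Python snippet (an illustrative sketch, not taken from Mukamel’s email) verifying that the doubling map preserves addition while “add one” does not:

```python
# Quick numeric check of the two examples above: f(x) = 2*x preserves addition
# (a homomorphism), while g(x) = x + 1 does not.
def f(x):
    return 2 * x

def g(x):
    return x + 1

for a in range(-5, 6):
    for b in range(-5, 6):
        assert f(a + b) == f(a) + f(b)   # holds for every pair tested
        assert g(a + b) != g(a) + g(b)   # always off by exactly one

print("f respects addition; g never does")
```

The same preservation-of-structure idea, applied to an encryption function rather than a doubling map, is what makes homomorphic encryption tick.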
Saroja Erabelli, a researcher at the encryption startup Duality, describes what this means for a basic homomorphic encryption scheme, which must meet these criteria (a toy example follows the list):
- If you add the encryption of a and the encryption of b, you obtain the encryption of a + b.
- If you multiply the encryption of a and the encryption of b, you obtain the encryption of a • b.
- You can perform a limited number of additions and multiplications on a given ciphertext.
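To ground the first criterion, here is a minimal sketch (an illustration, not anything from Erabelli or Ravel) of the Paillier cryptosystem, a classic scheme that is homomorphic for addition only: multiplying two ciphertexts modulo n² yields an encryption of the sum of the plaintexts. The tiny primes below are purely for readability and offer no security.

```python
# Toy Paillier cryptosystem: additively homomorphic, NOT fully homomorphic,
# and not secure with primes this small. Needs Python 3.8+ for pow(x, -1, n).
import random
from math import gcd

# Key generation with tiny primes (real keys use primes of 1024+ bits).
p, q = 1789, 1931
n = p * q
n_sq = n * n
g = n + 1                                       # standard choice of generator
lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)    # lcm(p-1, q-1), the private key
mu = pow(lam, -1, n)                            # modular inverse of lambda mod n

def encrypt(m):
    r = random.randrange(1, n)
    while gcd(r, n) != 1:                       # r must be invertible mod n
        r = random.randrange(1, n)
    return (pow(g, m, n_sq) * pow(r, n, n_sq)) % n_sq

def decrypt(c):
    # L(x) = (x - 1) // n applied to c^lambda mod n^2, then scaled by mu
    return ((pow(c, lam, n_sq) - 1) // n) * mu % n

a, b = 123, 456
ca, cb = encrypt(a), encrypt(b)

# Homomorphic addition: multiplying the ciphertexts decrypts to a + b.
assert decrypt((ca * cb) % n_sq) == a + b
print("Enc(a) * Enc(b) decrypts to", decrypt((ca * cb) % n_sq))
```

Supporting multiplication as well, and an unbounded number of both operations, is the hard part; that is where Gentry’s work came in.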
Gentry’s breakthrough was proving you could actually perform an unlimited number of computations. (This is what makes his technique fully homomorphic.) Using an encryption scheme based on the difficulty of solving lattice problems, Gentry introduced bootstrapping: periodically “refreshing” a ciphertext, by homomorphically evaluating the scheme’s own decryption function, to knock back the noise that the operations necessarily generate.
But this is an incredibly inefficient process. As your datasets grow larger, your ciphertexts grow larger still, and your operations grow more complex (moving from addition to multiplication, for instance), there’s more noise to deal with, dramatically increasing your computational and data storage requirements. Gentry estimated in 2009 that simply performing a Google search with encrypted keywords would multiply the computing time by about a trillion.
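To get a feel for the noise problem, here is a toy “somewhat homomorphic” scheme over the integers, in the spirit of the DGHV construction rather than Gentry’s lattice scheme. It is an insecure teaching sketch with tiny, made-up parameters and no bootstrapping step, and it is neither Gentry’s system nor Ravel’s:

```python
# Toy "somewhat homomorphic" scheme over the integers (DGHV-style).
# Insecure and illustrative only.
import random

p = 2**20 + 1                          # secret odd key (real keys are vastly larger)

def encrypt(bit):
    q = random.randrange(2**30, 2**31) # a random multiple of the key hides the bit
    r = random.randrange(1, 16)        # small random noise
    return q * p + 2 * r + bit

def decrypt(c):
    # correct only while the accumulated noise term (2*r + bit) stays below p
    return (c % p) % 2

a, b = encrypt(1), encrypt(0)
print(decrypt(a + b))                  # 1: adding ciphertexts XORs the hidden bits
print(decrypt(a * b))                  # 0: multiplying ciphertexts ANDs the hidden bits

# Each multiplication roughly squares the noise. ANDing with encryptions of 1
# should leave the bit at 1, but once the noise outgrows p the output is garbage.
c = encrypt(1)
for i in range(1, 14):
    c = c * encrypt(1)
    print(i, decrypt(c))
```

Bootstrapping is what rescues a ciphertext like the one in that loop before its noise crosses the threshold, at the cost of a great deal of extra computation.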
A succession of FHE schemes has since improved upon Gentry’s approach, but cryptographers continue to struggle with the noise and inefficiencies in FHE schemes. They’ve also made impressive progress. Consider IBM’s HElib C++ library for homomorphic encryption, one of a dozen or so freely available, open-source implementations of FHE. The 2018 version is 25-75 times faster than the previous version, which was two million times faster than the original version, released in 2015. That acceleration—by a factor of roughly 100 million in the span of three years—ain’t bad.
And yet: that first version performed algebra on encrypted data about 100 trillion times slower than the same computer could perform the same operations on the corresponding plaintexts. Even with the 2018 version of HElib, a calculation that takes one second on plaintexts would thus take roughly a million seconds, or about 11.5 days (100 trillion divided by that roughly 100-million-fold speedup). More sophisticated operations like multiplication are particularly costly. Even for IBM’s newest version of HElib, released in 2021, allowing ciphertexts to be added or multiplied ad infinitum without too much noise or too many errors remains a challenge.
The last lines of a 2020 paper on HElib by Shai Halevi, a researcher at the Algorand Foundation, and Victor Shoup at IBM, capture the noisy, clunky approach that cryptographers continue to rely on as they pursue the holy grail. Their algorithms will “choose a generator that yields a good dimension, if that is possible,” they warn. “Nevertheless, it may produce bad dimensions.” Still, progress: “However, it will never produce a very bad dimension.”
“QUANTUM-LEAP BREAKTHROUGHS”
After more than a decade of research and millions of dollars in funding for software and hardware, how did a small startup address the efficiency challenges of FHE?
The breakthroughs are “purely algorithmic,” says Sabeg. “It’s not just one but actually a succession of breakthroughs.”
Sabeg declined to share more specific details or to discuss the proof of concept with BNP Paribas, citing a nondisclosure agreement, and stuck to a broad sketch of the process and progress.
According to Sabeg, Ravel’s algorithms are “powerful enough to power the only fully encrypted SQL database over large volumes of data,” which he says is the first such database “equipped to query billions of encrypted data points, as well as machine learning algorithms to process the data.”
In its proof of concept with BNP, Sabeg says, the database was used over a very large homomorphically encrypted dataset, with query response times of less than a second. “As an example, we can efficiently run computations over large integers (over 32 bits) with an extremely low level of error probability,” something he says “none of our competitors can achieve.”
The advances were all the more impressive because they didn’t require advanced hardware. “All this on a laptop,” he added.
“The trend is that most resources [in homomorphic encryption] are heading toward hardware improvements and accelerations, which is, of course, beneficial for Ravel.”
Noise remains one of the biggest challenges, Sabeg says, but “we are dealing with it very efficiently,” using a patent-pending process.
“At the moment, none of the people working on FHE are working on the same verticals,” he adds. “Our commercial approach is as unique as our technology.” In a press release, the company even gave its flavor of FHE a new name. It’s RHE—Ravel homomorphic encryption.
ANOTHER ARMS RACE
Indeed, while science tends to thrive on open, collaborative research, much ambitious cryptographic science has long looked more like an arms race. Literally: cryptography, including homomorphic encryption, has long benefited from the big budgets of defense entities like DARPA and IARPA, or the black budgets of more secretive three-letter agencies. But growing numbers of academic researchers, tech companies, and investors are joining the race.
IBM has been working on FHE since 2009, when Gentry discovered his bootstrapping step during a summer internship there. (He is now chief technology officer at an AI startup called TripleBlind.) Apple, Google, and Huawei have been investing in FHE as part of their privacy initiatives. Kristin Lauter, a pioneer in the related field of elliptic curve cryptography, spent two decades at Microsoft Research, where she helped build SEAL, a free and open-source software library that implements various forms of homomorphic encryption. Last year, prior to a hiring freeze, Meta hired a number of specialists in FHE, including Lauter, who is now head of Meta’s Seattle-based AI research division.
Ravel also faces competition from a handful of startups, including Enveil, which has raised $40 million from VCs including the CIA’s investment arm; Paris-based Zama; and Duality, which has attracted $50 million from investors including Microsoft, Wal-Mart, Airbus, and AT&T. Fewer than 1 percent of companies fund FHE projects now, according to Gartner, but the research firm predicts that by 2025 that figure will be at least 20 percent.
Despite its promises for personal data, FHE wouldn’t be able to prevent other violations of our privacy, our right to be let alone. The technology could let a social media app access your data without endangering the data itself, but that wouldn’t prevent the company from continuing to find new ways to hijack your attention, nudge your emotions, and track you everywhere. It’s plausible that under the banner of better encryption—but without smarter rules for an under-policed data industry—technologies like homomorphic encryption could make these kinds of violations of our freedom even worse. A paper coauthored by Meta’s Lauter and published this month in Nature describes a way to apply FHE in neural networks in order to bring ostensible privacy to another emerging technology: the automated detection systems used to track people in surveillance video.
As Ravel continues to refine its system, Sabeg says it is also making progress on enabling neural networks to train on encrypted data.
The other big challenge now is selling its system to advertisers, tech companies, banks, and anyone else interested in enhancing user privacy. Apart from BNP Paribas, Sabeg declined to name other potential customers Ravel is talking to. “We are engaged in active discussions,” he says. “We already have the capabilities to have the average person benefit from the power of FHE.”