Without Code for DeepMind’s Protein AI, One Lab Wrote Its Own

Unbeknownst to them, an extensive scientific paper detailing DeepMind’s system was already under review at Nature, according to John Jumper, who leads the AlphaFold project. DeepMind had submitted its manuscript to Nature on May 11.

At that point, the scientific community knew little about DeepMind’s timeline. That changed on June 18, three days after Baker’s preprint became available, when DeepMind CEO Demis Hassabis took to Twitter. “We’ve been heads down working flat out on our full methods paper (currently under review) with accompanying open source code and on providing broad free access to AlphaFold for the scientific community,” he wrote. “More very soon!”

On July 15, the very same day that Baker’s RoseTTAFold paper was published, Nature released DeepMind’s unedited but peer-reviewed AlphaFold2 manuscript. Simultaneously, DeepMind made the code for AlphaFold2 freely available on GitHub. And a week later, the team released an enormous database of 350,000 protein structures that had been predicted by its method. The revolutionary protein prediction tool, and a vast volume of its predictions, were at last in the hands of the scientific community.

According to Jumper, there’s a banal reason why DeepMind’s paper and code weren’t released until more than seven months after the CASP presentation: “We weren’t ready to open source or put out this extremely detailed paper that day,” he says. Once the paper was submitted in May, and the team was working through the peer review process, Jumper says they tried to get the paper out as soon as possible. “We had honestly been pushing as fast as we could,” he says.

The DeepMind team’s manuscript was published through Nature’s Accelerated Article Preview workflow, which the journal most frequently uses for Covid-19 papers. In a statement to WIRED, a spokesperson for Nature wrote that this process is intended “as a service to our authors and readers, in the interests of making particularly noteworthy and time-sensitive peer reviewed research available as quickly as possible.”

Jumper and Pushmeet Kohli, lead of DeepMind’s science team, demurred about whether Baker’s paper factored into the timing of their Nature publication. “From our perspective, we contributed and submitted the paper in May, and so it was out of our hands, in some sense,” Kohli says.

But CASP organizer Moult believes that the University of Washington team’s work may have helped DeepMind scientists convince their parent company to make their research freely available on a shorter timescale. “My sense from knowing them—they are really outstanding scientists—is that they would like to be as open as possible,” Moult says. “There is some tension there, in that it’s a commercial enterprise, and in the end it’s got to make money somehow.” The company that owns DeepMind, Alphabet, has the fourth-highest market cap in the world.

Hassabis characterizes the release of AlphaFold2 as a benefit to both the scientific community and Alphabet. “This is all open science, and we’re giving this to humanity, no strings attached—the system, the code, and the database,” he said in an interview with WIRED. Asked whether there was any discussion about keeping the code private for commercial reasons, he said, “It’s a good question how we deliver value. Value can be delivered in a lot of different ways, right? One is obviously commercial, but there’s also prestige.”

Baker is quick to praise the DeepMind team for the thoroughness of their paper and code release. In a sense, he says, RoseTTAFold was a hedge against the possibility that DeepMind would not act in the spirit of scientific collaboration. “If they had been less enlightened and decided not to release the code, then there at least would have been a starting point for the world to build on,” he says.
