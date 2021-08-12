For biologists who study the structure of proteins, the recent history of their field divides into two epochs: before CASP14, the 14th biennial round of the Critical Assessment of Structure Prediction conference, and after. For decades, scientists had worked to determine a protein’s structure from its amino acid sequence. After CASP14, which took place in December 2020, the problem had effectively been solved by researchers at the Google subsidiary DeepMind.

DeepMind is a research firm that focuses on “deep learning”, a type of artificial intelligence. It was previously best known for creating an AI system that beat the world champion at Go. Its success in protein structure prediction came from a neural network called AlphaFold2, and it marked the first time the company had solved an actual scientific problem. Scientists can use the ability to predict protein structures to aid research and drug discovery. DeepMind released its code to the public on July 15th, when Nature published an unedited manuscript detailing the model.

In the months between CASP14 and that publication, however, another team had stepped in. A group headed by David Baker, director of the Institute for Protein Design at the University of Washington, released its own protein structure prediction model, RoseTTAFold, in June, a month before DeepMind’s manuscript was published. For that month, RoseTTAFold was the best-performing protein structure prediction algorithm available to scientists. Though it did not reach the same peaks of performance as AlphaFold2, the team made the model accessible to even the least computationally inclined scientists by building a tool that lets researchers submit an amino acid sequence and get back a prediction, without getting their hands dirty with computer code. The Baker Lab paper describing RoseTTAFold was published by Science a month after the model’s release.

RoseTTAFold and AlphaFold2 are both complex, multilayered neural networks that predict the 3D structure of a protein from its amino acid sequence. They also share some interesting similarities, such as a multi-track structure, which allows them to examine different aspects of protein structure separately.

This is no accident. The University of Washington team built RoseTTAFold using ideas from DeepMind’s short presentation at CASP, in which the company described the unique elements of AlphaFold2. The team was also spurred on by the lack of information from DeepMind about when scientists would have access to the technology. Researchers were concerned that a private firm might keep the code secret, in contravention of academic practice. “Everyone was stunned, there was a lot of press, and then it became radio silence, basically,” Baker says. “You’re stuck in a strange situation, where there’s been a significant advance in your field but you can’t continue to build upon it.”

Baker and Minkyung Baek saw an opportunity. Although they didn’t have the DeepMind code, they knew that the problem of protein structure prediction could be solved, and they had a general understanding of how DeepMind had done it. Baker said, “This is proof of existence.” “DeepMind demonstrated that these types of methods work,” says John Moult of the University of Maryland, College Park, who is also the organizer of CASP. He said that was sufficient.