RNA and NKS

Erik A. Schultes

Hedgehog Research

RNA, the versatile chemical relative of DNA, is a linear polymer composed of four physicochemically distinct nucleotide bases: A, C, G, and U. Like proteins, single-stranded RNAs fold into complex, sequence-specific configurations having precisely defined biochemical functions. Hence, RNA molecules can be considered a kind of computational hardware, their nucleotide sequences programs, and their folding dynamics computations. For example, a sequence (program) can be said to halt when the molecule achieves a stable conformation. RNA sequence space is therefore a space of programs in which one can not only ask a range of NKS questions but also interpret the answers with respect to the origin and evolution of biochemical function: How do arbitrary RNA sequences behave? Do evolved and unevolved RNA sequences behave differently? Can evolutionarily derived adaptations in RNA be resolved from those features that arise intrinsically among random sequences?

Inspired by research with cellular automata (CA), I have developed bioinformatic, theoretical, and experimental tools to answer NKS questions about RNA. Direct comparison of evolved and arbitrary sequences suggests that arbitrary RNAs frequently acquire compact, sequence-specific folds comparable to those of evolved, functional RNAs, but fail to reach those conformations uniquely. Hence, natural selection is not required to explain compact folded states, but is essential for the discovery of relatively rare folding mechanisms that avoid alternative conformations. Furthermore, I have shown that sequence space is structurally and functionally redundant and that sequences having a particular function are not randomly distributed in sequence space but belong to large networks interconnected by one- or two-step mutations. The ever-present dynamics of mutation and selection compel populations of RNAs to explore these networks of neutral variants. Because neutral networks for different functions are interwoven throughout sequence space, arbitrary sequences are guaranteed to be near in sequence space to many (all?) functional sequences, and one- or two-step mutations can sometimes transform a functional RNA from one network into a radically new fold and function as part of another neutral network. Many important experiments remain to be done, including a truly systematic and comprehensive sampling of RNA sequence space that would require the synthesis and analysis of 35,000 arbitrary RNA sequences and cost $100M.
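
To make the notion of one- and two-step mutational neighbors concrete, the following minimal Python sketch enumerates them for a short sequence. It is an illustration only: the function names and the example sequence are hypothetical, and the sketch says nothing about which neighbors are neutral, since that would require a folding or functional assay that is not modeled here.

```python
from itertools import combinations, product

BASES = "ACGU"  # the four RNA nucleotides


def one_step_mutants(seq):
    """Yield every sequence differing from seq at exactly one position."""
    for i, old in enumerate(seq):
        for new in BASES:
            if new != old:
                yield seq[:i] + new + seq[i + 1:]


def two_step_mutants(seq):
    """Yield every sequence differing from seq at exactly two positions."""
    for i, j in combinations(range(len(seq)), 2):  # i < j
        for a, b in product(BASES, repeat=2):
            if a != seq[i] and b != seq[j]:
                yield seq[:i] + a + seq[i + 1:j] + b + seq[j + 1:]


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


if __name__ == "__main__":
    seq = "GCAUG"  # hypothetical example sequence of length 5
    ones = list(one_step_mutants(seq))
    twos = list(two_step_mutants(seq))
    # A sequence of length L has 3L one-step and 9 * C(L, 2) two-step neighbors.
    print(len(ones), len(twos))
```

A sequence of length L has 3L one-step neighbors and 9·L(L−1)/2 two-step neighbors, so the neighborhood that mutation can reach in one or two steps grows quadratically with length while sequence space itself grows as 4^L.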

Coming full circle, these experimental results from molecular biology prompt analogous questions in the context of CA: Do evolved and unevolved CA rules behave differently? Are there neutral networks of CA rules with respect to behavior or function? Along with CA, the theoretical and experimental tractability of RNA helps to answer, as well as define, questions of NKS.
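
One way to picture a mutational neighborhood in rule space is sketched below in Python. It assumes elementary (two-state, nearest-neighbor) CA with the standard 0 to 255 rule numbering and treats a one-step mutation as flipping a single entry of the eight-entry rule table; both choices are assumptions made for illustration, since the text does not specify a class of CA or a definition of rule mutation. The function names are hypothetical.

```python
def rule_neighbors(rule):
    """Return the 8 elementary rules that differ from `rule` in one table entry."""
    if not 0 <= rule <= 255:
        raise ValueError("elementary CA rules are numbered 0..255")
    return [rule ^ (1 << bit) for bit in range(8)]


def step(rule, cells):
    """Apply one update of an elementary CA to a list of 0/1 cells (cyclic boundary)."""
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = left * 4 + center * 2 + right  # neighborhood read as a 3-bit number
        out.append((rule >> index) & 1)        # look up that entry of the rule table
    return out


if __name__ == "__main__":
    print(sorted(rule_neighbors(110)))
    print(step(110, [0, 0, 0, 1, 0]))
```

Whether two neighboring rules count as neutral variants would depend on a chosen measure of behavior or function, which this sketch deliberately leaves open.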

[presentation materials]


Created by Mathematica  (May 11, 2006)