How difficult is it to go from a random point on a PES to the global minimum? The answer, of course, depends upon the system. Many proteins are able to fold reversibly to the physiologically active native structure from unfolded states. However, a random amino acid sequence is very unlikely to be able to fold to a unique structure. Similarly, in condensed matter physics there are substances for which it is hard to prevent crystallization as the liquid is cooled, and others which instead are likely to form glasses.
The seminal analysis of the global optimization problem in the context of protein folding was made by Levinthal[226]. This author estimated the number of configurations a typical protein could adopt (3N where N is the number of amino acids) and noted that even if protein configurations could be sampled at a rate of, say, 1013 per second, it would take longer than the age of the universe to find the native structure if this sampling was simply random. The result of this simple calculation markedly contradicts the actual properties of proteins, which can generally find the native structure from an unfolded state on time scales of the order of seconds or less. This well-known contradiction has come to be known as Levinthal's paradox. A similar paradox can be formulated for a cluster. For example, the number of geometrically distinct minima on the PES of a 55-atom cluster interacting via the Lennard-Jones potential has been estimated to be of the order of 1021, yet the cluster can rapidly find the Mackay icosahedral global minimum from the liquid-like state[110,111]. The existence of `magic number' peaks in the mass spectra of rare gas clusters[47,48] indicates that these systems can also locate certain global minima very efficiently.
A more rigorous approach to defining the effort involved in finding the global minimum can be obtained using computational complexity theory[227], which categorizes problems into classes of similar difficulty. The most relevant class is that of NP-hard problems, for which there is no known algorithm that is guaranteed to find the solution within polynomial time, effectively rendering NP-hard problems intractable for large sizes. It has been shown that finding the global minimum of a protein[228,229] or a cluster[93] is NP-hard. Again, there seems to be a paradox between the NP-hardness of protein folding and the actual behaviour of proteins. However, it should be noted that computational complexity theory provides a worst case analysis. First, the proofs apply to general categories of problems and not necessarily to every instance of a problem. Although there may be no efficient method for finding the global minimum of a general heteropolymer, biological proteins may represent a subset whose properties have evolved so that they can fold. Secondly, the criterion for a successful algorithm is very stringent; one must not only find the global minimum, but prove that this it is actually the global minimum. In contrast, proteins do not necessarily have to fold to the global potential energy minimum, but instead must have a high probability of folding to the native structure which probably corresponds to one of the lowest energy minima on the PES. Similarly, the heuristic global optimization methods, such as simulated annealing[107] and genetic algorithms[230,231] are not rigorously guaranteed to find the global minimum. For example, these methods can find the Mackay icosahedron for the 55-atom Lennard-Jones cluster, which is widely accepted to be the global minimum even though this has never been rigorously proved.
Some have tried to resolve Levinthal's paradox by hypothesizing a reduction in the number of states that the protein has to search through. Typically it is argued that the protein first collapses rapidly to a compact conformation and that the subsequent search is through the greatly reduced space of these compact states[232,233,234]. Although in some cases this might so reduce the number of states that Levinthal's paradox no longer applies, it cannot in general provide a complete answer. For instance, the number of minima corresponding to the liquid-like state of a 55-atom Lennard-Jones cluster is of the order of 1012 (§3.6). This is much less than the total number of minima on the PES but the ease with which the global minimum can be found is still incompatible with a random search through this reduced configuration space. In fact, when the cluster has an energy or temperature in the melting region, it can switch back and forth between the Mackay icosahedral global minimum and the liquid-like states on a time scale of the order of nanoseconds (using parameters appropriate for argon)[110,111].
Seeking to resolve Levinthal's paradox by reducing the search space may be unproductive in a more fundamental sense, for it ignores a basic fallacy in Levinthal's paradox: namely the assumption that each point in configuration space is equally likely. Levinthal was effectively assuming that the PES is totally flat. However, the topography of a protein PES is far from flat, involving many `mountain ranges' and `valleys', thus making some configurations more likely than others. (In the canonical ensemble the low potential energy configurations have larger Boltzmann weights, and in the microcanonical ensemble they have a larger momentum density of states.) To illustrate this point Zwanzig et al. produced a simple model of a protein with an energetic bias towards the native structure which could find the global minimum on physically reasonable time scales[235]. However, this particular model problem belongs[229] to the class of problems P, which are tractable in polynomial time, not to the class NP.
In lattice model studies of proteins, sequences that fold are often compared with those that do not. Some studies have suggested that one of the distinctive characteristics of the folding sequences is a significant energy gap between the global minimum and the next lowest energy structure[233,234,236] and this criterion has been used to design sequences that fold rapidly[237,238]. One particularly comprehensive study of a 27-unit chain led Sali et al. to conclude that the necessary and sufficient condition for a folding sequence is that the native state be a pronounced global minimum[233,234]. However, this conclusion has been criticized both because the correlation between the energy gap and folding ability is rather weak and because the criterion concentrates on only a very small part of the PES, whereas folding ability surely depends on the entire energy landscape[239]. Interestingly, Sali et al.'s results actually show a stronger correlation between the folding ability and the temperature, Tx, at which the global minimum has an equilibrium probability of 0.8, if only compact states are considered.
There are further problems when trying to apply the energy gap criterion to real proteins, since for a real protein there are many minima on the PES which involve only small perturbations to the native structure and have very similar potential energies[240]. In contrast, for the lattice model considered by Sali et al. any change to the native structure involves a much larger perturbation and can give rise to a large increase in the energy. Therefore, when applied to more complex models and real proteins, it is probably best to interpret the energy gap criterion as the energy gap between the native state and the lowest energy structurally distinct non-native state. The higher the energy of the non-native states relative to the global minimum the less likely they are to act as kinetic traps. Support for this reinterpretation has been found from an annealing study of simple off-lattice proteins[241].
Other studies of lattice models have come to very different conclusions from Sali et al.. In particular, Klimov and Thirumalai report no correlation between the folding time and the energy gap between the two lowest energy states[242]. Instead, they found that the folding time decreased as the ratio of the folding temperature, Tf, (at which the native structure becomes the thermodynamically most stable state) to the collapse transition temperature, , (at which the protein transforms from an extended to a compact state) increased. The source of the differences between this study and that of Sali et al. is not clear since they involved very similar systems.
An alternative approach to understanding protein folding has been put forward by Wolynes and coworkers[243,244,245,246,247]. It is based upon a statistical characterization of the free energy landscape, and has its origin in spin-glass theory. This approach was combined with the idea of a `folding funnel', a set of convergent pathways which lead to the global minimum, making it kinetically accessible from the ensemble of misfolded states[248]. A protein that can fold would have a single folding funnel, whereas a random heteropolymer is likely to have numerous funnels leading to structurally different low energy states. The theory describes the global properties of the free energy surface in terms of a few parameters: the ruggedness of the surface, the extent of the funnel (the size of the configurational space), and the potential energy gradient towards the native state. A rugged PES has many minima with large barriers between them. For a protein to be able to fold the funnel would need a sufficiently large potential energy gradient to direct the folding protein to the native structure.
In the above model, the slope of the funnel can also be related to the energy gap between the native state and the disordered collapsed structures, (a different energy gap to that used by Sali et al.). By maximizing this energy gap the ratio of Tf (defined above) to the glass transition temperature, Tg, (at which the dynamics dramatically slow down because of kinetic barriers to escape local minima on the PES) is maximized. A large ratio Tf/Tg ensures that the native state is the most stable state at temperatures where the dynamics are still fast, allowing the protein to reach the native state rather than becoming trapped in a local minimum. This criterion can also explain the performance of global optimization methods, such as simulated annealing, which attempt to follow the global free energy minimum as a system is cooled. Such methods are likely to fail when the global minimum of the free energy changes at a temperature below Tg. This approach can also help us to understand some aspects of the results for lattice model proteins that we noted earlier. The Tx that was used by Sali et al. has a very similar definition to Tf; both are measures of the stability of the native state of the protein. It is, therefore, unsurprising that increasing Tx (and hence Tf/Tg, if Tg is fairly independent of sequence) increases the ability of the protein to reach the native state. Also, Thirumalai has emphasized[249] the relationship between the values of the ratios and Tf/Tg.
Another equivalent formulation of the above ideas is in the `principle of minimal frustration'. Proteins that have a single folding funnel are likely to have a native state in which all the interactions between residues are favourable. In contrast, in compact configurations of a random heteropolymer there are likely to be residues brought together with conflicting interactions--a random heteropolymer is `frustrated'. Minimizing the frustration leads to a low energy native structure and consequently to a large value for Tf. These ideas not only provide an elegant theoretical understanding of protein folding, but by using experimental information to estimate the characteristic parameters of the energy landscape for real proteins, they allow a connection to be made between real proteins and lattice models[250].
These theories developed in the context of protein folding can also be useful in understanding the dynamics of systems in the realm of condensed matter physics. As we noted in §4.4, simple liquids have been described as frustrated systems: they have significant polytetrahedral character, yet all space cannot be packed with regular tetrahedra. Therefore, close-packed structures, which are packings of both tetrahedra and octahedra, are the densest and lowest energy structures. Frank was the first to suggest that this structural difference between the solid and the liquid was responsible for the large degree of supercooling that could be achieved for liquid metals[188]. The widest funnels on the PES in such systems lead to glassy minima rather than to close-packed structures.
The packing constraints, and hence the degree of frustration, can be changed by introducing positive curvature into space. In fact for a space of appropriate positive curvature, the lowest energy, highest density structure is a perfect polytetrahedral packing[191] called polytope . Straley performed a series of simulations which showed that crystallization occurs much more rapidly in this curved space than in Euclidean space, elegantly demonstrating the effect of frustration on the dynamics[200]. For the curved space there is a single funnel on the PES which leads down through the liquid states to the global minimum.
Similarly, for small Lennard-Jones clusters global optimization strategies, such as genetic algorithms, are able to find the global minimum based on icosahedra from the mass of liquid-like structures fairly easily[73] because icosahedral structures have significant polytetrahedral character. However, there are a number of sizes at which small Lennard-Jones clusters do not have an icosahedral global minimum. For example, the global minimum of a 38-atom cluster is the face-centred-cubic truncated octahedron[70,71] and the lowest energy structures for clusters with 75, 76, 77, 102, 103 and 104 atoms are probably decahedral[71,72]. These cases are much harder for a global optimization algorithm to find since the structures have a much greater close-packed character. In fact the decahedral clusters have never been found yet by an unbiased global optimization method. It is much harder for the cluster to find the funnel associated with the decahedral minima, compared to the wide funnel that leads down to the icosahedral minima.
Interestingly, in a study of small Lennard-Jones clusters (19 atoms and less) which used global optimization methods based upon the simulated annealing of the classical density distribution, the efficiency was found to be correlated with the energy gap between the global minimum and the next lowest energy state[251]. However, it may be dangerous to draw general conclusions from this result. First, the difference in dynamic behaviour between proteins that can and cannot fold is much greater than that between clusters of different sizes; all the clusters considered in the above study can reach the global minimum relatively easy. Secondly, for these very small systems, much of the thermodynamics is determined by the low energy minima on the PES and the energy gap between the two lowest energy structures is likely to correlate with the melting temperature. For example, the `melting' transition of the 13-atom cluster is associated with transitions between the icosahedron and the set of the defective icosahedral minima which are next lowest in energy[128]. However, for larger clusters melting is associated with a transition between the low energy minima and the numerous high energy amorphous minima, and the melting temperature is now likely to be correlated with the energy gap between the global minimum and this band of `liquid-like' minima.
The complexity of proteins means that, inevitably, one has to be satisfied with a coarse-grained picture of the dynamics involved in protein folding. Studies of small clusters, therefore, could offer particularly important insights since their size allows a much more detailed analysis of the PES[252] and the time scale of the transition from the disordered liquid-like states to a unique solid-like structure is short enough to be probed by computer simulation. Initial studies exploring the relationship between the dynamic behaviour of clusters and the PES have already produced interesting results[252,253,254]. For example, a comparison of potassium chloride and argon clusters has shown that the greater ability of potassium chloride clusters to crystallize into solid-like structures from the liquid is due to the lower energy barriers and the greater potential energy gradient along series of reaction paths which lead to rock-salt type structures[252]. A more quantitative study sought to understand the behaviour of a 19-atom Lennard-Jones cluster based upon large samples of minima and transition states. The dynamical behaviour was found by calculating transition rates between minima, and then constructing and solving a master equation to describe the flow of probability through configuration space[253,254]. The latter results, however, did not include corrections for the fact that the samples represent only a small fraction of the total number of stationary points on the PES.
In this chapter we follow a similar approach, proceeding from a complete description of a model PES in terms of its minima and transition states to the thermodynamic and dynamic behaviour. We concentrate on the ability of the system to find the global minimum and how this is related to the thermodynamics through Landau entropy profiles. This approach allows us to systematically investigate the effects of various topographic features. In §5.2 we describe the basic PES and the methods for calculating the thermodynamic and dynamic behaviour. In §5.3 we present the results for the PES with a single folding funnel, and in §5.4 we consider the results for the modified surfaces. Finally in §5.5, we summarize the main conclusions from this work. Throughout the chapter we attempt to show how our approach can illuminate and provide a framework for understanding the results already known for specific systems, such as proteins, clusters and glasses.