To test our refinement protocol, we considered a benchmark set of 12 N-linked glycan-containing protein structures determined by X-ray crystallography at resolutions between 1.9 Å and 3.5 Å, comprising a total of 133 glycan units. Using the Privateer software (Agirre et al., 2015b), we identified four incorrect anomeric configurations and 23 high-energy ring conformations, and visual inspection showed that one structure was missing a glycosidic bond (Table 1). We compared our Rosetta-based refinement with Phenix refinement alone and, for models with high-energy ring conformations in the input, with Phenix refinement using constraints generated by Privateer (Agirre, 2017b, Gristick et al., 2017). Refining these 12 structures with our Phenix-Rosetta protein structure refinement pipeline, which alternates real- and reciprocal-space refinement, markedly improved the carbohydrate geometry, as assessed by Privateer. All errors detected in the input coordinates were corrected in the output models. In particular, the four incorrect anomeric carbon configurations were adjusted to the correct anomeric state, and the 23 high-energy ring conformations were refined into corresponding low-energy conformations with only a slight decline in agreement with the experimental data. This is consistent with the idea that these glycans had been forced into poor geometry in order to over-fit the density (Figure S1). The best alternative method tested, Phenix refinement with constraints from Privateer, resolved only two of the four incorrect anomers and 12 of the 23 sugars in high-energy conformations (Table 1).
A comparison of the real-space correlations of the refined and initial models, computed against both 2mFo-DFc density maps and polder omit maps (Liebschner et al., 2017), is also shown in Figure S1. While the geometry consistently improves following Rosetta refinement, the real-space correlations show mixed results: some sugars fit the data better, whereas others fit slightly worse. This might be because our comparatively more restrained model is less able to explain density arising from heterogeneous conformations.
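As a point of reference, a real-space correlation of this kind can be thought of as a Pearson correlation between map values and model-calculated density values sampled at the same grid points around a residue. The sketch below is illustrative only: the function name `real_space_cc` and the flat lists of sampled densities are assumptions made for this example, and it is not the implementation used by Phenix or Rosetta.

```python
import math


def real_space_cc(observed, calculated):
    """Pearson correlation between two equally sized lists of density values.

    `observed` holds map values and `calculated` holds model density values,
    both sampled at the same grid points (hypothetical inputs for this sketch).
    """
    if len(observed) != len(calculated) or len(observed) < 2:
        raise ValueError("need two equally sized samples with at least 2 points")
    n = len(observed)
    mean_o = sum(observed) / n
    mean_c = sum(calculated) / n
    # Covariance and variances about the respective means.
    cov = sum((o - mean_o) * (c - mean_c) for o, c in zip(observed, calculated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_c = sum((c - mean_c) ** 2 for c in calculated)
    if var_o == 0 or var_c == 0:
        raise ValueError("correlation is undefined for flat density")
    return cov / math.sqrt(var_o * var_c)
```

Because the correlation is insensitive to the overall scale and offset of the map, a value close to 1 indicates that the shape of the model density follows the map, not that their absolute values agree.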
To illustrate the improvements resulting from our protocol, we examined the structures of human IL-17AF (PDB: 5N92) and IgG1-Fc (PDB: 5K65). In the IL-17AF structure, fucose 507 was modeled with an incorrect beta configuration, which has been resolved to the correct alpha connection in the refined model (Figure 1C). In IgG1-Fc, the fucose 507 is also problematic: it is in a high-energy boat conformation, which was automatically detected and corrected in the Rosetta-refined model to the expected low-energy 1C4 chair conformation (Figure 1D). Also in the IgG1-Fc structure, the most proximal N-acetylglucosamine of the N-linked glycan is not bonded to residue Asn 297 and does not properly fit the density. After refinement and rephasing, the carbohydrate moiety agrees better with the electron density map and its covalent linkage to Asn 297 is properly formed (Figure 1E). These residues are also shown in the polder omit map (Figure S2) (Liebschner et al., 2017).
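For readers unfamiliar with how a boat can be distinguished from a chair, ring conformations of six-membered sugars are commonly described with Cremer-Pople puckering parameters, in which the polar angle θ is near 0° or 180° for chairs and near 90° for boats and skew-boats. The following is a minimal sketch under stated assumptions: the function names, the 30° tolerance, and the idealized test rings are hypothetical, and the code is not the procedure used by Privateer or Rosetta. It also does not distinguish 4C1 from 1C4, which depends on the atom ordering convention.

```python
import math


def cremer_pople(coords):
    """Return (Q, theta_degrees) for six (x, y, z) ring atoms given in ring order."""
    n = len(coords)
    if n != 6:
        raise ValueError("this sketch handles six-membered rings only")
    # Shift the ring so that its geometric centre is at the origin.
    centre = [sum(p[k] for p in coords) / n for k in range(3)]
    ring = [[p[k] - centre[k] for k in range(3)] for p in coords]
    # Two in-plane reference vectors define the mean ring plane.
    r1 = [sum(p[k] * math.sin(2 * math.pi * j / n) for j, p in enumerate(ring))
          for k in range(3)]
    r2 = [sum(p[k] * math.cos(2 * math.pi * j / n) for j, p in enumerate(ring))
          for k in range(3)]
    normal = [r1[1] * r2[2] - r1[2] * r2[1],
              r1[2] * r2[0] - r1[0] * r2[2],
              r1[0] * r2[1] - r1[1] * r2[0]]
    length = math.sqrt(sum(c * c for c in normal))
    normal = [c / length for c in normal]
    # Out-of-plane displacement of each ring atom.
    z = [sum(p[k] * normal[k] for k in range(3)) for p in ring]
    q2_cos = math.sqrt(2 / n) * sum(zj * math.cos(4 * math.pi * j / n)
                                    for j, zj in enumerate(z))
    q2_sin = -math.sqrt(2 / n) * sum(zj * math.sin(4 * math.pi * j / n)
                                     for j, zj in enumerate(z))
    q2 = math.hypot(q2_cos, q2_sin)
    q3 = sum((-1) ** j * zj for j, zj in enumerate(z)) / math.sqrt(n)
    return math.hypot(q2, q3), math.degrees(math.atan2(q2, q3))


def classify_ring(coords, tolerance=30.0):
    """Label a ring 'chair' or 'non-chair'; the tolerance is an illustrative choice."""
    _, theta = cremer_pople(coords)
    return "chair" if theta <= tolerance or theta >= 180.0 - tolerance else "non-chair"


def ideal_ring(height_of):
    """Build an idealized hexagon whose atom j is displaced by height_of(j)."""
    return [(1.5 * math.cos(math.pi * j / 3), 1.5 * math.sin(math.pi * j / 3),
             height_of(j)) for j in range(6)]


chair = ideal_ring(lambda j: 0.25 * (-1) ** j)                     # alternating up/down
boat = ideal_ring(lambda j: 0.30 * math.cos(2 * math.pi * j / 3))  # bow and stern raised
```

With these idealized inputs, `classify_ring(chair)` evaluates to `"chair"` and `classify_ring(boat)` to `"non-chair"`. Real coordinates would additionally need a consistent atom ordering around the ring.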
Discussion
Here we describe a method for refining glycan atomic coordinates against cryoEM and X-ray crystallography data using Rosetta. Because Rosetta uses a physically realistic all-atom force field, it is well suited for modeling into near-atomic resolution density maps, which is the resolution regime achieved for most cryoEM structures. This Rosetta glycan refinement protocol expands upon previous iterations (Labonte et al., 2017) by avoiding fitting stereochemically unfavorable glycan structures into sparse experimental data and instead yielding physically realistic geometries based on prior knowledge of saccharide chemical properties. Using a benchmark set of 12 deposited crystal structures, we demonstrated that our algorithm is capable of correcting models containing significant errors in glycan geometry (Table 1), above and beyond previous methods (Agirre, 2017b, Gristick et al., 2017). This is likely due to the increased radius of convergence of our refinement, as well as its ability to flip the anomeric state. This functionality should prove beneficial for large-scale model validation efforts such as “PDB redo” (Joosten et al., 2011, Terwilliger et al., 2012). We further demonstrated the strength and versatility of this algorithm for improving glycan stereochemistry in cryoEM models determined at near-atomic resolution, with two examples of glycoprotein structures recently obtained and refined using Rosetta.