# Automated Refinement in Servalcat Refine

Here we will use Servalcat to automatically refine the model, improving the fit to the experimental map while applying stereochemical restraints.

In the `NEW JOB` panel in Doppio, search for “servalcat” to find the `Servalcat Refine` job (or look in the Atomic Model Refine section), create a new job and enter the following:

```params
Input model:: Moorhen/jobXXX/docked_model1_moorhen.pdb
Resolution:: 2.9
NOTE: Use the resolution of your final 3D refinement.
Use half maps (recommended):: Yes
NOTE: We always recommend using half maps as inputs if available. This ensures the input maps are unsharpened and unfiltered. Servalcat can generate automatically sharpened and weighted maps as per Yamashita et al (2021).
Input map 1 (half map 1 or full map):: Refine3D/job029/run_half1_class001_unfil.mrc
NOTE: Use the half map from your final 3D refinement. If half maps are not available, you can use a full map (ideally the raw map before postprocessing), but that is not the optimal approach.
Input map 2 (half map 2):: Refine3D/job029/run_half2_class001_unfil.mrc
Mask for Fo-Fc map calculation:: MaskCreate/job020/mask.mrc
NOTE: This mask is used for calculation of normalised maps, including difference maps.
Refinement cycles:: 10
NOTE: For the sake of speed, we will only use 10 cycles here.
Point group:: D2
NOTE: Servalcat will enforce symmetry in the refinement, i.e. it will refine against one copy of the protein and apply symmetry constraints. This is more efficient and ensures that the model’s symmetry strictly matches the symmetry that was applied in the 3D refinement of the map.
NOTE: When using symmetry, ensure that the input model contains only one copy of the protein, not the whole complex.
Jelly body restraints:: Yes
NOTE: Jelly body restraints maintain the current interatomic distances, so they are most effective when starting from good geometry. Keep the default values for Jelly body sigma and distance.
NOTE: …and at the very bottom:
Comprehensive Molprobity Run?:: Yes
NOTE: Full Molprobity geometry evaluation will be run on input and refined models. This can be turned off to make the refinement job faster.
```

Leave the rest of the parameters as the defaults and click `RUN`. It will take about 5 minutes to complete the job.

Note that if `Masked refinement` is `Yes`, the map will be masked around the model (within the radius set by `Mask radius`). This speeds up refinement, and the reported fit-to-map statistics then relate to the masked sub-volume rather than the whole map. This masking does not use the mask input above, which is only used for the Fo-Fc map calculation.

When the job has completed, look at the table and plots in the `RESULTS` tab.

### Refinement statistics:

Use the table to track the change in model geometry and fit-to-map statistics after refinement. The average Fourier Shell Correlation (FSC) quantifies agreement of the model with the map up to a given resolution. The average model-map FSC should increase as the fit to the experimental data improves. The `Actual weight` used in the refinement for the experimental data is also reported in the table.

Compare the geometry statistics to check whether they improve after refinement. Use the recommended range / percentile as a reference to assess the scope for further improvement. For statistics where percentile values are available (with respect to a distribution from structures solved at similar resolution), we expect the percentile to increase after refinement and to be around or above 50. Where expected ranges are given as reference, we usually expect the score to fall within this range. Deviations from expected ranges may occur in genuine cases where the map justifies the presence of energetically less favourable outliers, or where other experimental evidence supports them.

Root mean square deviations from expected bond lengths and angles are reported (and plotted per refinement cycle).
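
To make the root mean square deviation concrete, here is a minimal sketch of how such a value can be computed for bond lengths. It is an illustration only, not Servalcat's implementation (which may weight or select restraints differently). The coordinates, the function names and the target bond lengths are hypothetical values chosen for the example, not taken from the monomer library.

```python
import math


def bond_length(a, b):
    """Euclidean distance between two (x, y, z) points, in Å."""
    return math.dist(a, b)


def rms_deviation(observed, ideal):
    """Root mean square deviation between observed and ideal values."""
    if len(observed) != len(ideal) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(o - i) ** 2 for o, i in zip(observed, ideal)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical atom positions (Å) for a small backbone fragment.
coords = {
    "N": (0.0, 0.0, 0.0),
    "CA": (1.47, 0.0, 0.0),
    "C": (1.47, 1.55, 0.0),
}

# (atom 1, atom 2, illustrative target length in Å)
bonds = [("N", "CA", 1.458), ("CA", "C", 1.525)]

observed = [bond_length(coords[a], coords[b]) for a, b, _ in bonds]
ideal = [target for _, _, target in bonds]
rmsd_bonds = rms_deviation(observed, ideal)
print(f"RMSD from ideal bond lengths: {rmsd_bonds:.4f} Å")
```

A value that grows from cycle to cycle suggests that the model is being pulled away from ideal geometry, which is the situation the weight and restraint settings are meant to control.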
If some or most of the geometric statistics get worse after refinement, the refinement weight may need to be optimised (e.g. use a lower starting `Weight`) and/or appropriate restraints may be needed (Jelly body, external restraints e.g. from ProSMART, LIBG etc). Weights of specific geometric features can also be adjusted to improve the associated statistic/score. For example, if the Clashscore gets significantly worse, the `Van der Waals restraint weight` can be increased to 1.5 or 2 in another round of refinement in Servalcat to reduce severe clashes.

### Map-Model FSC:

The map-model Fourier Shell Correlation (FSC) plot shows the correlation between the map and model in resolution shells. For a well-fitted model, the model-map FSC curve is expected to cross FSC 0.5 at the resolution corresponding to half-map FSC 0.143 (Rosenthal & Henderson, 2003). In this case, the resolution was about 2.9 Å.

```{image} ../_static/images/ModelBuilding/4_Map_Model_FSC.png
:align: centre
:scale: 100%
```

### Average FSC vs Refinement cycle:

The Average FSC plot shows how the average FSC has changed over the course of the refinement. Typically this will increase rapidly in the first few cycles and then more slowly, eventually reaching convergence when the FSC is stable and no longer changing after each cycle. If the score is still improving, refinement has not converged and more cycles should be run. As mentioned above, in this case we ran only a few cycles to make the job finish quickly, and since we’re going to go on and edit the model further, it’s not crucial for the refinement to converge completely at this stage.

```{image} ../_static/images/ModelBuilding/4_FSC_average.png
:align: centre
:scale: 100%
```

### B-value distribution:

```{image} ../_static/images/ModelBuilding/4_B_factor_distributions.png
:align: centre
:scale: 100%
```

*Typical B-factor distribution showing positive skew without excessive low values.*

Let’s check the B-factor distribution plot.
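
As a numerical companion to the plot, the shape of the distribution can be summarised by its sample skewness. The sketch below is not part of Servalcat or Doppio: it reads isotropic B-factors from PDB-format `ATOM`/`HETATM` records (columns 61-66) and computes the skewness with the standard library only. The function names and the B-factor values are hypothetical, and the example lines are synthetic records generated purely for illustration.

```python
def b_factors_from_pdb_lines(lines):
    """Collect isotropic B-factors from ATOM/HETATM records (PDB columns 61-66)."""
    values = []
    for line in lines:
        if line.startswith(("ATOM", "HETATM")):
            values.append(float(line[60:66]))
    return values


def skewness(values):
    """Sample skewness m3 / m2**1.5; positive means a long tail of high values."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        # All values identical, e.g. a spike of unrefined B-factors.
        return 0.0
    return m3 / m2 ** 1.5


def synthetic_atom_line(serial, b_iso):
    """Build a fixed-column ATOM record for testing (made-up CA atom at the origin)."""
    return (
        f"ATOM  {serial:5d}  CA  ALA A{serial:4d}    "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b_iso:6.2f}"
    )


# Made-up B-factors: mostly low values with a few large outliers.
example_b = [20.0, 22.0, 25.0, 25.0, 30.0, 35.0, 60.0, 90.0]
pdb_lines = [synthetic_atom_line(i + 1, b) for i, b in enumerate(example_b)]

b_values = b_factors_from_pdb_lines(pdb_lines)
print(f"{len(b_values)} atoms, skewness = {skewness(b_values):.2f}")
```

To try this on a real model, the lines of a PDB-format file (for example, those read with `open(path).readlines()`) could be passed to `b_factors_from_pdb_lines` instead of the synthetic records.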
We expect a positively skewed distribution (i.e. a greater number of low B-factors) with some large B-factor outliers, which represent either poorly modelled residues or highly mobile regions. A large number of very low B-factors and/or B-factors close to 0 Å² indicates an over-sharpened map. Conversely, a non-skewed (Gaussian or normal) distribution or a negatively skewed distribution indicates an over-blurred map. Some models (from public databases or other modelling software) might have unrefined B-factors, typically indicated by a sharp spike in the B-factor distribution.

### Geometry outliers

Servalcat reports atom distances, angles, torsion angles, etc., with unexpected values. It is worth checking these geometry outliers in Moorhen and Coot. A more detailed analysis of outliers will be carried out in section 5 on model validation.

## Inspecting Servalcat Refine outputs in Moorhen

From a completed Servalcat Refine job, go to the I/O tab and look at the outputs. Here are some of the important ones to be aware of:

● `refined.mmcif (and refined.pdb)` *This is the refined model (single chain).*

● `refined_expanded.mmcif (or refined_expanded.pdb)` *This is the symmetry expanded refined model (four chains in this case).*

● `refined_normalized_fo.mrc` *This is the sharpened and weighted map created by Servalcat Refine using the variance estimated from the two half maps, as described in Yamashita et al (2021).*

● `refined_normalized_fofc.mrc` *The Fo-Fc map is a difference map between the experimental map and a map calculated from the refined model.
It is useful for finding areas of the model that do not agree with the experimental map.*

● `refined_maps.mtz` *Fourier coefficients for both maps described above.*

Select the following output nodes and launch a new Moorhen job:

● `refined.mmcif (or refined.pdb)` *This is the refined model (single chain) we will work on.*

● `refined_maps.mtz` *Using the mtz file allows Moorhen to automatically open both the Fo-Fc difference map, with positive (green) and negative (red) density, and the normalised Fo map.*

When making changes to the model below, make sure you work on the single monomer copy (`refined.mmcif or refined.pdb`). Running Servalcat with the edited model will apply changes to all symmetry related chains.

## Modelling a ligand

AlphaFold 2 predictions do not contain ligands, so we will now add a missing ligand to the structure. To identify the position of the ligand and verify that there is unmodelled density, go to `Validation -> Difference map peaks...`

Click through the four largest peaks (tall green bars in the plot) until you find the peak near the model (the other three peaks come from the symmetry copies).

```{image} ../_static/images/ModelBuilding/4_Servalcat_difference_map.png
:align: centre
:scale: 100%
```

The map contains a large unconnected blob of density between Phe602 and Trp569. (Depending on the contour level, this might appear as one big curved blob or two separate blobs.) AlphaFold 2 predictions do not contain waters or ligands, so this blob is currently unmodelled. The experimental sample contained the inhibitor 2-phenylethyl 1-thio-beta-D-galactopyranoside (PETG).

Before adding the ligand, we have to fix a residue in the binding pocket. The Phe602 side chain has the wrong rotamer: in the image above you can see that it doesn’t fit the Fo density map (blue map) and is surrounded by positive (green) and negative (red) density, all of which supports the need to move the side chain. Right click on an atom in Phe602 and click `Auto-Fit Rotamer`.
```{image} ../_static/images/ModelBuilding/4_AutoFit_Rotamer_icon.png
:align: centre
:scale: 100%
```

```{image} ../_static/images/ModelBuilding/4_Servalcat_difference_map2.png
:align: centre
:scale: 100%
```

This will improve the fit to the map and create space in the binding pocket to fit the ligand.

The PDB ligand code for the ligand molecule is PTQ. Let’s model this ligand in Moorhen. Ensure the central cross hair is in the centre of the blob of unmodelled density (or in this case, since the density is curved, put the cross hair where you estimate the blob’s centre of mass is). Rotate the map to make sure the cross hair is centred. You can also find the ligand blob in `Validation -> Unmodelled blobs…`, where blob 2 is the ligand.

Insert the ligand using `Ligand -> Get monomer...` with `Monomer identifier: PTQ`

```{image} ../_static/images/ModelBuilding/4_Insert_ligand1.png
:align: centre
:scale: 100%
```

```{image} ../_static/images/ModelBuilding/4_Insert_ligand2.png
:align: centre
:scale: 100%
```

Now fit the ligand into the density. Ensure that you have a clear view of the ligand density to be able to fit (rotate/translate) the ligand. Right click on an atom in the ligand and select `Rotate/Translate zone`

```{image} ../_static/images/ModelBuilding/4_Rotate_Translate_zone_icon.png
:align: centre
:scale: 100%
```

Rotate (left mouse button) and translate (middle mouse button, or shift+option on a Mac) the ligand until it roughly fits the density. Make sure to click `accept changes` when finished.

```{image} ../_static/images/ModelBuilding/4_Accept_changes.png
:align: centre
:scale: 100%
```

You can use Auto-fit Rotamer (`Right click menu -> black residue-green map icon`) to improve the fit.
If the fit of any part of the ligand needs to be improved, right click on an atom in the ligand and select `Drag atoms`

```{image} ../_static/images/ModelBuilding/4_Drag_atoms_icon.png
:align: centre
:scale: 100%
```

Manipulate the ligand until you are happy with its fit. Again, remember to click `accept changes`.

```{image} ../_static/images/ModelBuilding/4_fit_ligand.png
:align: centre
:scale: 100%
```

In many cases (including this one!) it’s easy to get the pose of the ligand wrong. Here, it takes some close examination to decide which way round the ligand should be (i.e. which end is the phenyl group and which is the sugar) and it’s very hard to be sure which way up the sugar goes. In these situations it’s necessary to draw on other sources, rather than trying to interpret the shape of the map density alone.

One option is to examine the stereochemistry: do the interactions between the ligand and the binding site (e.g. hydrophobic contacts, hydrogen bonds, salt bridges) indicate that a particular ligand conformation fits better? But remember that many of the protein details are also uncertain. For example, side chains can often be flipped, which can completely change a putative hydrogen bonding network.

If available, you can use a higher resolution structure as a reference. But beware! Remember that published structures are never perfect and ligands can easily be mis-fitted. Always check the details, and preferably draw on multiple independent sources to reduce the chance of copying someone else’s mistake.

In this case we know that PDB 6TTE has the ligand correctly fitted. Click `File --> Fetch from online services` and use the PDB code 6TTE.

```{image} ../_static/images/ModelBuilding/4_6TTE.png
:align: centre
:scale: 100%
```

You now need to align 6TTE to the structure we are modelling. Click `Calculate --> Superpose Structures`, set the Reference structure to your refined model and the Moving structure to 6TTE, and click Superpose to align.
```{image} ../_static/images/ModelBuilding/4_Superpose_structures.png
:align: centre
:scale: 100%
```

Use the model menu to toggle the models on and off and check if your ligand pose is similar to that found in 6TTE.

```{image} ../_static/images/ModelBuilding/4_PTQ_Pose.png
:align: centre
:scale: 100%
```

If not, repeat the steps above to move the PTQ in your model to the correct pose. (Be careful not to move the ligand atoms in the 6TTE reference!)

Once the ligand is in the correct pose, we need to ‘merge’ the ligand atoms into the main model. To merge, go to `Edit -> Merge molecules...` and merge PTQ into your model. (`From molecule` is PTQ and `Into molecule` is your refined model.)

```{image} ../_static/images/ModelBuilding/4_Merge_molecules.png
:align: centre
:scale: 100%
```

The colour of the bonds of the ligand will change to match the protein. Now that they’re in the same model, Moorhen will consider clashes between the atoms when refining, so we can do a quick refinement of the ligand and all the nearby residues to tidy up the whole binding pocket.

First we need to select Sphere Refinement mode. In the menu, go to `Preferences -> Refinement settings...` and change `Default refinement selection` to Sphere.

```{image} ../_static/images/ModelBuilding/4_Sphere_refinement.png
:align: centre
:scale: 100%
```

Then right-click on one of the ligand atoms and select `Refine Residues` (the red bull’s eye). The ligand and nearby residues should all settle more nicely into the density.

Let’s analyse the ligand a bit more. Select the `Models` menu and click on the `Ligands` drop-down menu. On the right, you can select radio buttons to visualise:

● Contact dots - can indicate interatomic clashes.
```{image} ../_static/images/ModelBuilding/4_Contact_dots.png
:align: centre
:scale: 100%
```

● Chemical features - in the case of PTQ, you can see the aromatic character of the phenyl moiety (yellow circle), and that all the oxygen atoms here can act as acceptors (pink marker) while the ones in hydroxyl groups can also act as donors (blue marker). Green markers indicate hydrophobicity.

```{image} ../_static/images/ModelBuilding/4_Chemical_features.png
:align: centre
:scale: 100%
```

● Environmental distances - useful for checking chemical contacts.

● Geometry validation - colour coded validation of bond lengths and angles.

Below, you can also find a 2D environment view. Make sure to add hydrogen atoms to the model to see the interactions: `Edit -> Add/Remove hydrogen atoms…`

```{image} ../_static/images/ModelBuilding/4_2D_environment_view.png
:align: centre
:scale: 100%
```

Finally, delete all the other models (the 6TTE reference and the PTQ ligand molecule), delete the difference map (to make saving speedier) and save the Moorhen job.

If you are working on a new or unusual ligand that is not in the monomer library, you will need to create a new structure and corresponding set of restraints using `AceDRG` in Doppio. See the “Modelling a New Ligand” section at the end of this tutorial for further details.

## Refining modified model with Servalcat

With the improvements to the model, you should now re-run Servalcat. This will generate and refine the B-factors of the ligand, as well as refining all atom positions and B-factors using the modified model.

Click `SAVE INTO NEW MOORHEN JOB`. In the Doppio jobs menu, find your previous `ServalcatRefine` job, click the triple dot menu and click `Clone`.
```{image} ../_static/images/ModelBuilding/4_Clone_job.png
:align: centre
:scale: 100%
```

Keep all the inputs and parameters the same, except replace the input model with your latest Moorhen model with the improved loop and added ligand:

```params
Input model:: Moorhen/jobXXX/refined_moorhen.pdb
NOTE: The file name might be slightly different.
Refinement cycles:: 20
NOTE: Now the model is closer to completion so it is more important to get closer to convergence.
```

Click `RUN` and wait for the job to be completed.

When the job is finished, look at the results again and see if the structure has improved. It’s also useful to click between this job and the previous Servalcat job to see how the final statistics have changed. (Opening a second Doppio tab can be helpful for comparing jobs like this.)

Model refinement and validation can be a lengthy process, often involving multiple rounds of iteration between manual refinement in Coot / Moorhen and automated refinement in Servalcat. Building models correctly is time-consuming, but it is important to give you and any others who may use the model in the future the best possible structure to work with.
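
When comparing the two Servalcat jobs, one simple way to judge whether the average FSC has levelled off is to look at the change between consecutive cycles. The sketch below is an illustration only: the function name, the tolerance and the window size are hypothetical choices (Servalcat does not define them), and the FSC values are made up rather than taken from this dataset.

```python
def has_converged(values, tolerance=1e-3, window=3):
    """True if the last `window` cycle-to-cycle changes are all below `tolerance`."""
    if len(values) < window + 1:
        return False
    recent = values[-(window + 1):]
    return all(abs(b - a) < tolerance for a, b in zip(recent, recent[1:]))


# Made-up average FSC per cycle for a short (10 cycle) run: still climbing.
fsc_short = [0.600, 0.660, 0.700, 0.725, 0.740,
             0.750, 0.757, 0.762, 0.766, 0.769]

# Made-up values for a longer (20 cycle) run that flattens out at the end.
fsc_long = fsc_short + [0.771, 0.773, 0.775, 0.776, 0.778,
                        0.779, 0.7800, 0.7803, 0.7805, 0.7806]

print("10 cycles converged:", has_converged(fsc_short))
print("20 cycles converged:", has_converged(fsc_long))
```

The per-cycle values would need to be copied from the Average FSC plot or the job output. A stricter tolerance or a longer window makes the check more conservative.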