# Detailed model validation in Doppio

Further validation of models is available from the `Atomic model validation` job in Doppio (search “model validation”). This job runs multiple validation tools, including MolProbity, Servalcat, TEMPy and CheckMySequence, to validate your model extensively.

## Run the job

In the `Atomic model validation` job, enter your current model and the following options, then click `Run`:

```params
Input model:: ServalcatRefine/jobXXX/refined.pdb
Servalcat FSC (Model-map FSC):: Yes
TEMPy global (Real space map fit):: Yes
CheckMySequence (Sequence agreement):: Yes
Input map:: PostProcess/job030/postprocess.mrc
NOTE: It is best to use the unmasked version as metrics are calculated with the supplied mask.
Resolution:: 2.9
Input halfmap 1:: Refine3D/job029/run_half1_class001_unfil.mrc
Input halfmap 2:: Refine3D/job029/run_half2_class001_unfil.mrc
Input map mask:: MaskCreate/job020/mask.mrc
Input sample sequence:: Import/job004/P00722.fasta
```

## Analysing the results

When the job is complete, click the `RESULTS` tab.

### Global map-fit scores:

```{image} ../_static/images/ModelBuilding/5_Global_map_fit_scores.png
:align: centre
:scale: 100%
```

These metrics indicate the global agreement between the model and the map. Use the percentile values to judge how good each score is relative to the distribution of scores for models built with data in this resolution range.

Average Model-Map FSC is the weighted average of the Fourier shell correlation over resolution shells (up to Nyquist). Average Model-Map FSC (FSC 0.5) is the same metric calculated only up to the resolution at which the FSC falls to 0.5.

The model overlap fraction (the fraction of the masked map covered by the model) is very sensitive to the extent of the mask used.

CCC overlap mask is the CCC calculated within the volume of overlap between the model and the masked map. CCC contoured map is the CCC calculated between the masked map and the model, including volume not occupied by the model.

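
To make the difference between the two CCC variants concrete, here is a minimal sketch using toy, flattened "maps". It assumes a mean-subtracted cross-correlation and a simple density threshold to define the contoured volumes; the function `ccc`, the threshold and all voxel values are illustrative assumptions, and the exact definitions used by TEMPy in Doppio may differ.

```python
from math import sqrt


def ccc(exp_map, model_map, selection):
    """Mean-subtracted cross-correlation over the voxels flagged in `selection`."""
    a = [e for e, s in zip(exp_map, selection) if s]
    b = [m for m, s in zip(model_map, selection) if s]
    if len(a) < 2:
        raise ValueError("need at least two selected voxels")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        raise ValueError("zero variance inside the selection")
    return cov / sqrt(var_a * var_b)


# Toy flattened maps (hypothetical values). The fifth voxel has map density
# but no model density, mimicking an unmodelled region.
exp_map = [0.0, 0.9, 1.1, 0.8, 0.7, 0.0]
model_map = [0.0, 1.0, 1.2, 0.9, 0.0, 0.0]
threshold = 0.5  # assumed contour level for both maps

map_contour = [v > threshold for v in exp_map]
model_contour = [v > threshold for v in model_map]
overlap = [m and n for m, n in zip(map_contour, model_contour)]

ccc_overlap = ccc(exp_map, model_map, overlap)      # only where both have density
ccc_contour = ccc(exp_map, model_map, map_contour)  # whole contoured map
overlap_fraction = sum(overlap) / sum(map_contour)  # contoured map covered by model

print(f"CCC (overlap): {ccc_overlap:.3f}")
print(f"CCC (contoured map): {ccc_contour:.3f}")
print(f"Overlap fraction: {overlap_fraction:.2f}")
```

In this toy case the unmodelled voxel lowers both the overlap fraction and the contoured-map CCC while leaving the overlap CCC untouched, which is why a partial model can score well on one metric and poorly on the other.
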
### Model-map FSC:

```{image} ../_static/images/ModelBuilding/5_Model_Map_FSC_Plot.png
:align: centre
:scale: 100%
```

As discussed in previous sections, for a well-fitted model the resolution at which the model-map FSC falls to 0.5 should correspond to the half-map resolution at FSC 0.143. In this plot, both curves are shown along with horizontal guides at the two FSC thresholds. We can see that the crossing points occur at roughly the same spatial frequency, and we can also check the general shape of the two curves.

It is useful to compare the Model-Map FSC curve (solid green line) against the Cref curve (dash-dotted black line) to check for under- or over-fitting. The correlation is calculated within the volume covered by the input mask. The Cref curve is derived from the half-map FSC curve and shows the fall-off expected for a perfectly fitted model (Rosenthal & Henderson 2003).

Differences between the molecular/solvent boundary of the experimental map and that of the model-based synthetic map often give rise to differences between the Model-Map FSC and Cref at low resolutions. These differences are more pronounced when the molecular boundaries differ significantly, for example when the map has unmodelled regions (partial models, or unmodelled detergent micelles in the case of membrane proteins).

In this case, the Model-Map FSC curve agrees well with Cref at resolutions better than 5 Å. The resolution corresponding to Model-Map FSC 0.5 is slightly worse than that of Cref 0.5 (dotted orange line), indicating scope for a slight improvement in fit.

Note that the Cref curve is implemented in Doppio versions later than 1.3.0. If the Cref curve is absent, compare the resolution corresponding to Model-Map FSC 0.5 with that of half-map FSC 0.143. For a well-fitted model, these should match.

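
The link between the two thresholds can be checked numerically. The sketch below assumes the Rosenthal & Henderson (2003) relation Cref = sqrt(2·FSC / (1 + FSC)), where FSC is the half-map value, and uses hypothetical toy curves; the helper names `cref` and `resolution_at` are illustrative and are not part of Doppio or Servalcat.

```python
from math import sqrt


def cref(fsc_half):
    """Expected map-vs-perfect-model correlation for a given half-map FSC value."""
    if fsc_half <= 0.0:
        return 0.0  # no reliable signal left in this shell
    return sqrt(2.0 * fsc_half / (1.0 + fsc_half))


def resolution_at(resolutions, curve, threshold):
    """Resolution (in Å) at which `curve` first drops below `threshold`.

    Interpolates linearly in spatial frequency (1/resolution) between shells.
    Returns None if the curve never crosses the threshold.
    """
    for i in range(1, len(curve)):
        prev_val, next_val = curve[i - 1], curve[i]
        if prev_val >= threshold > next_val:
            f_prev = 1.0 / resolutions[i - 1]
            f_next = 1.0 / resolutions[i]
            t = (prev_val - threshold) / (prev_val - next_val)
            return 1.0 / (f_prev + t * (f_next - f_prev))
    return None


# Hypothetical toy curves, one value per resolution shell (Å).
shell_resolution = [10.0, 5.0, 4.0, 3.3, 2.9, 2.6]
half_map_fsc = [0.99, 0.95, 0.85, 0.50, 0.143, 0.02]
model_map_fsc = [0.98, 0.93, 0.82, 0.55, 0.40, 0.10]

cref_curve = [cref(v) for v in half_map_fsc]

res_half_0143 = resolution_at(shell_resolution, half_map_fsc, 0.143)
res_cref_05 = resolution_at(shell_resolution, cref_curve, 0.5)
res_model_05 = resolution_at(shell_resolution, model_map_fsc, 0.5)

print(f"Cref at half-map FSC 0.143: {cref(0.143):.3f}")
print(f"Half-map FSC 0.143: {res_half_0143:.2f} Å")
print(f"Cref 0.5:           {res_cref_05:.2f} Å")
print(f"Model-Map FSC 0.5:  {res_model_05:.2f} Å")
```

Because `cref(0.143)` is approximately 0.5, the half-map FSC 0.143 and Cref 0.5 crossings land at essentially the same resolution. A Model-Map FSC 0.5 resolution that is numerically larger (worse) than the Cref 0.5 resolution, as in the toy data, corresponds to the "scope for improvement" situation described above.
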
### Global geometry scores:

```{image} ../_static/images/ModelBuilding/5_Global_geometry_scores.png
:align: centre
:scale: 100%
```

The geometry scores table provides statistics from MolProbity. Check for any geometry score that is outside the expected range or associated with a low percentile value. If you find one, try to fix the associated outliers.

### Outlier clusters:

To help identify areas that need manual inspection, outliers from the various metrics are spatially clustered and grouped together in the outlier cluster table.

```{image} ../_static/images/ModelBuilding/5_Outlier_clusters.png
:align: centre
:scale: 100%
```

The table is ordered by the size of the cluster (number of residues). We find that the most effective way of inspecting and fixing these errors is to download the outlier_clusters.csv file by clicking the Open in new tab icon in the I/O panel. You can view this file in a spreadsheet or text editor while running Moorhen or Coot to inspect these regions.

In this case, cluster 1 is associated with bad clashes (e.g. residues 461, 462, 487) and bond angle outliers. Note that the clusters may differ depending on how the model was processed in the previous steps.

Go to Asn461 (press Shift+G and enter 461). We can visualise clashes in Moorhen, but to do this we first need to add hydrogen atoms to the model. From the main Moorhen menu, go to `Edit` and `Add/Remove hydrogen atoms`, then select the model to which hydrogen atoms should be added. To view clashes, click `Models` (from the main Moorhen menu) and click Cont. Dots.

```{image} ../_static/images/ModelBuilding/5_cont_dots.png
:align: centre
:scale: 100%
```

Red dots indicate bad clashes/overlaps, with pink cylindrical lines reflecting the extent of overlap. Right-click on a residue associated with a bad clash and try `Refine residue` to check whether it relieves the clashes with neighbouring atoms.
Alternatively, the `Van der Waals restraint weight` can be increased to 1.5 or 2 in another round of Servalcat refinement to reduce severe clashes.

### Ramachandran outliers

In this case, cluster 2 contains residues associated with Ramachandran outliers (e.g. 600, 602). Go to residue 600. To view the probability of backbone Ramachandran angles, click `Models` (from the main Moorhen menu) and click `Rama balls`. Orange and red balls are associated with disfavoured and outlier Ramachandran angles, respectively.

```{image} ../_static/images/ModelBuilding/5_Ramachandran_outliers.png
:align: centre
:scale: 100%
```

To view the Ramachandran plot with all outliers, go to `Validation -> Ramachandran plot`. Outliers are shown as red dots. You can click through these outliers and fix them as well.

Try `Refine residues` to see whether it fixes the Ramachandran outliers. You may also need to use Drag backbone atoms, followed by Refine Region, if the map features show that the backbone is misfit.

### Omega angle outliers

If any of the top clusters include residues associated with omega angle outliers, this is due to a deviation from the expected torsion angle around the peptide bond. You can view the peptide omega angles along the chain in the pept. omega plot under `Validation -> Validation plot`.

If you find an incorrectly modelled cis-peptide (often marked by a red flag):

```{image} ../_static/images/ModelBuilding/5_Omega_angle_outliers1.png
:align: centre
:scale: 100%
```

Use `Flip Peptide` and `Refine Residues` to fix this area.

```{image} ../_static/images/ModelBuilding/5_Omega_angle_outliers2.png
:align: centre
:scale: 100%
```

You may also need to use Drag atoms, followed by Refine Region, to place the backbone oxygen into density.

Work through the other clusters to make your model as good as possible, and then re-run `Servalcat Refine`. Remove the hydrogens (if added above) before saving the modified model.
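
Hydrogens are normally removed within Moorhen itself. If you want to double-check a saved PDB-format file outside Moorhen, the following is a minimal sketch that drops hydrogen (and deuterium) atom records. It assumes the element symbol is present in columns 77-78 of each `ATOM`/`HETATM` record, as in the standard PDB format; the helpers `strip_hydrogens` and `atom_line` are hypothetical and not part of any CCP-EM tool, and atom serial numbers are not renumbered.

```python
def strip_hydrogens(pdb_lines):
    """Return the lines with hydrogen/deuterium ATOM and HETATM records removed."""
    kept = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            # Element symbol lives in columns 77-78 (0-based slice 76:78).
            element = line[76:78].strip().upper()
            if element in ("H", "D"):
                continue
        kept.append(line)
    return kept


def atom_line(serial, name, element):
    """Build a toy, fixed-column ATOM record for residue Asn461 (illustration only)."""
    return (
        f"ATOM  {serial:5d}  {name:<3s} ASN A 461    "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{20.0:6.2f}"
        f"          {element:>2s}"
    )


example = [
    atom_line(1, "N", "N"),
    atom_line(2, "CA", "C"),
    atom_line(3, "H", "H"),
    atom_line(4, "HA", "H"),
    "END",
]

cleaned = strip_hydrogens(example)
for line in cleaned:
    print(line)
```

Non-atom records such as `END` pass through unchanged. For real files, read the lines from the saved model, apply `strip_hydrogens`, and write the result to a new file rather than overwriting the original.
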