[Cod-bugs] Data registration

Toshiyuki Sasaki toshiyuki.sasaki at spring8.or.jp
Mon May 15 05:00:02 EEST 2023


Dear Dr. Saulius Gražulis,

I uploaded the CIF files via route (a).
Here are the IDs and the removed lines.

3000440, _refine_ls_R_factor_gt             0.1565, _refine_ls_wR_factor_ref           0.3685
3000442, _refine_ls_wR_factor_ref           0.4200

Thank you for your help.
Sincerely yours,

Toshiyuki Sasaki

-----Original Message-----
From: Takanori Nakane [mailto:tnakane.protein at osaka-u.ac.jp] 
Sent: Sunday, May 14, 2023 1:58 PM
To: grazulis at ibt.lt; Toshiyuki Sasaki <toshiyuki.sasaki at spring8.or.jp>
Cc: cod-bugs at ibt.lt; genji.kurisu.protein at osaka-u.ac.jp; kawamoto at protein.osaka-u.ac.jp; 'Ranjit Thakuria' <ranjit.thakuria at gmail.com>; 'Diptajyoti Gogoi' <dipta087 at gmail.com>; 北條裕信 <hojo at protein.osaka-u.ac.jp>
Subject: Re: [Cod-bugs] Data registration

Dear Dr. Saulius Gražulis,

I am in charge of MicroED data processing of the structure Dr. Toshiyuki Sasaki is trying to deposit.

I am writing you about data quality concerns you mentioned.
Toshiyuki will write to you separately about how to proceed with the deposition.

 > Moreover, I have noticed that the _exptl_absorpt_coefficient_mu is zero,
 > and the remaining absorption correction parameters are not specified. I
 > see that very few of the electron diffraction studies reported in the
 > COD use absorption correction, but if it is technically possible to
 > apply it, maybe the final refinement R-factors will get lower?

Unlikely. The high R factors (merging and refinement) are due to other reasons (see below).

In electron diffraction at 200 kV, absorption effects are negligible for light elements. Effects of inelastic scattering can be treated by absorption scaling, but this is a very crude, ad hoc approximation and does not have physical meaning. In dials.scale, which we used for scaling, the scaling factors are empirically modeled as a smooth function of resolution, rotation angle and position on the detector. This approach is different from physics-based modeling based on the crystal composition (mu).

 > I also see that there are large values reported for the symmetry
 > equivalent reflection agreement:
 >
 >     _diffrn_reflns_av_R_equivalents    0.8897
 >     _diffrn_reflns_av_unetI/netI       0.3546
 >
 > The COD min / avg / max values are 0.0661 / 0.218177 (sample σ = 0.099)
 > / 0.4322 for _diffrn_reflns_av_R_equivalents and 0.0007 / 0.141944
 > (sample σ = 0.120) / 0.5071 for _diffrn_reflns_av_unetI/netI; thus the
 > values in your file seem quite high compared to what we see in the COD
 > (for the 35 structures explicitly reported as electron diffraction
 > studies by setting 'radiation' column to 'electron'), the
 > _diffrn_reflns_av_R_equivalents is beyond 5*σ. Could it be that
 > applying absorption correction would decrease these statistics as well?
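
For reference, the comparison in the quoted paragraph can be reproduced with a short calculation. This is a minimal sketch that uses only the numbers quoted above; the function name is illustrative and not part of any COD tool.

```python
def sigmas_from_mean(value, mean, sample_sigma):
    """Return how many sample standard deviations `value` lies above `mean`."""
    return (value - mean) / sample_sigma

# (reported value, COD average, COD sample sigma), all taken from the quoted text
r_equiv = sigmas_from_mean(0.8897, 0.218177, 0.099)
unet_net = sigmas_from_mean(0.3546, 0.141944, 0.120)

print(f"_diffrn_reflns_av_R_equivalents: {r_equiv:.1f} sigma above the COD average")
print(f"_diffrn_reflns_av_unetI/netI:    {unet_net:.1f} sigma above the COD average")
```

Only the first statistic lies beyond 5σ; the second is within about 2σ of the COD average.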

Whereas traditional small-molecule crystallography collects a dataset from one or a few crystals, we take a very high-multiplicity approach. The OLN-SUCA dataset resulted from 34 crystals out of the 244 crystals measured.

The traditional R factor increases with the multiplicity of a dataset and is considered inadequate as a resolution metric.
This is pointed out in Diederichs & Karplus, Nat. Struct. Biol., 1997 for the case of a related metric R(merge) in macromolecular crystallography.
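
The multiplicity dependence can be illustrated with a small simulation. This is only a sketch, assuming Gaussian measurement errors and the same true intensity for every reflection (simplifications, not our actual data processing); the function name and the parameter values are hypothetical. It also evaluates the multiplicity-corrected R(meas) introduced in the cited paper, which rescales each reflection's deviations by sqrt(n/(n-1)).

```python
import math
import random

def merging_r_factors(true_i, sigma, multiplicity, n_reflections=2000, seed=0):
    """Simulate R(merge) and R(meas) for reflections measured `multiplicity` times.

    Assumption: every observation is true_i plus Gaussian noise of width sigma.
    """
    rng = random.Random(seed)
    correction = math.sqrt(multiplicity / (multiplicity - 1))
    sum_abs_dev = 0.0   # numerator of R(merge)
    sum_corrected = 0.0 # numerator of R(meas)
    sum_i = 0.0         # common denominator
    for _ in range(n_reflections):
        obs = [rng.gauss(true_i, sigma) for _ in range(multiplicity)]
        mean_i = sum(obs) / multiplicity
        abs_dev = sum(abs(o - mean_i) for o in obs)
        sum_abs_dev += abs_dev
        sum_corrected += correction * abs_dev
        sum_i += sum(obs)
    return sum_abs_dev / sum_i, sum_corrected / sum_i

for n in (2, 4, 34):
    r_merge, r_meas = merging_r_factors(100.0, 30.0, n)
    print(f"multiplicity {n:2d}: R(merge) = {r_merge:.3f}, R(meas) = {r_meas:.3f}")
```

With identical noise per observation, R(merge) grows with multiplicity while R(meas) stays roughly constant, although the merged data themselves become more precise.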

Similarly, unetI/netI is not a great metric, because it takes the absolute value of intensities. Weak reflections can have negative observed intensities (due to background subtraction), so taking absolute values is not adequate.
In advanced data processing methods (common in macromolecular crystallography), information in negative reflections can still be utilized via maximum-likelihood intensity-based targets or French-Wilson scaling. Thus, we did not remove such reflections, but this led to worse statistics.
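
A toy example of why negative observations should not simply be discarded is given below. It is a sketch assuming Gaussian noise on a weak reflection; it does not implement French-Wilson scaling or a maximum-likelihood target, and the function name and parameter values are hypothetical.

```python
import random

def mean_intensity_estimates(true_i=1.0, sigma=5.0, n_obs=10000, seed=0):
    """Average a weak reflection with and without its negative observations.

    Assumption: each observation is true_i plus Gaussian noise of width sigma.
    """
    rng = random.Random(seed)
    obs = [rng.gauss(true_i, sigma) for _ in range(n_obs)]
    positives = [o for o in obs if o > 0.0]
    mean_all = sum(obs) / len(obs)
    mean_positive_only = sum(positives) / len(positives)
    return mean_all, mean_positive_only

mean_all, mean_positive_only = mean_intensity_estimates()
print(f"true intensity:              1.00")
print(f"mean of all observations:    {mean_all:.2f}")
print(f"mean after dropping I < 0:   {mean_positive_only:.2f}")
```

Keeping the negative observations gives an estimate close to the true value, whereas dropping them biases the weak intensity upwards.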

 > The _refine_ls_R_factor_gt and _refine_ls_wR_factor_ref, however, are
 > somewhat high, even for electron diffraction studies.

We applied kinematical refinement, not dynamical refinement.
Many variants of electron diffraction exist (precession electron diffraction, convergent-beam electron diffraction, etc.) and they suffer from dynamical diffraction to varying extents.
Among MicroED studies, we do not consider our statistics to be significantly worse than others.

Another reason for seemingly worse metrics is that we included weaker reflections. Excluding noisy high resolution structure factors does not improve the structure accuracy, provided that each reflection is weighted properly. Removal can degrade the refined structure because valuable information, however noisy, is excluded.
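
The weighting argument can be sketched with the simplest possible case, an inverse-variance weighted mean of independent measurements. This is an analogy under that assumption, not the refinement procedure itself; the function name and the sigma values are hypothetical.

```python
def weighted_mean_variance(sigmas):
    """Variance of the inverse-variance weighted mean of independent measurements."""
    return 1.0 / sum(1.0 / s**2 for s in sigmas)

strong_only = [1.0, 1.0, 1.0]           # three precise measurements
with_weak = strong_only + [10.0, 10.0]  # the same, plus two very noisy ones

print(f"variance, strong data only:   {weighted_mean_variance(strong_only):.4f}")
print(f"variance, weak data included: {weighted_mean_variance(with_weak):.4f}")
```

When each measurement is weighted by 1/σ², adding noisy data can only lower the variance of the estimate, never raise it.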

This is a modern view of crystallographic structure refinement, initially introduced in macromolecular crystallography.
Please see Diederichs and Karplus "Better models by discarding data?"
Acta Crystallographica Section D 69.7 (2013): 1215-1222, which says:
"even though discarding the weaker data leads to improvements in the merging R values, the refined models based on these data are of lower quality."

Thus, we used higher resolution reflections for refinement than traditional small-molecule crystallographers would use.
In addition, the deposition includes high resolution reflections not used in refinement, in the hope that future algorithm developments might allow extraction of more information from them.

 > I did not use the CheckCIF [1] tool to avoid breaking confidentiality
 > of your structure, but it would be interesting if you could get a
 > CheckCIF / PLATON report on your structure. Are there any Level A
 > alerts?

The level A alert was about high R(int), which is caused by high multiplicity and dynamical effects.

Thank you very much for your feedback.
I hope this explanation helps.

Best regards,

Takanori Nakane




