[Cod-bugs] Cif files with incorrect set of group operators

Steef Boerrigter sxmboer at gmail.com
Thu Feb 23 20:07:18 EET 2023


Saulius,

For all 15 cases, I checked that the given space group symbols are
correct. I can send you the computer output from my program if you are
interested in seeing it.
I double-checked the structures visually in Mercury to confirm that the
fixed versions indeed look correct.

Speaking of gotchas... Something I learned recently is the issue of
the symmetry of ADPs.
ADPs can be given in different formats, and the different formats
basically project the ADP onto different sets of projection vectors.
The symmetry copies of the ADPs limit the degrees of freedom, and the
refined values reflect that. So, making changes to the space group,
such as changing the setting to the International Tables standard
setting, is not straightforward.
When changing crystallographic settings, all the ADPs will typically
change accordingly. Also, the given ADPs of the asymmetric unit are
not necessarily the same for the generated symmetry copies (only in
the case of inversion symmetry, I believe), so you also cannot
generally move the coordinates to more convenient-looking positions
(for instance, to have a single molecule show as the asymmetric unit
in JSmol, as opposed to atoms scattered around in neighboring unit
cells) without transforming the ADPs as well.
Lattice transformations are not straightforward and I know of
professional programs that do not do it correctly.
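
To make the point about symmetry copies concrete, here is a minimal
sketch of how an anisotropic ADP changes when a symmetry operation is
applied to an atom. It assumes the ADP is given as a CIF-style U tensor
(_atom_site_aniso_U_*), converts it to the dimensionless U* form
(U*_ij = a*_i a*_j U_ij), applies U*' = R U* R^T with R the rotation
part of the operator in fractional coordinates, and converts back. The
function names and the example reciprocal axis lengths are made up for
illustration, and the sketch only covers a symmetry operation within the
same cell, not a change of setting (where the reciprocal lengths
themselves change).

```python
def matmul(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m):
    """Transpose a 3x3 matrix given as nested lists."""
    return [[m[j][i] for j in range(3)] for i in range(3)]


def transform_u_cif(u_cif, rot, recip_lengths):
    """Return the U_cif tensor of the symmetry copy of an atom.

    u_cif         -- 3x3 symmetric tensor (U11, U12, ... as in the CIF)
    rot           -- 3x3 rotation part of the symmetry operator,
                     acting on fractional coordinates
    recip_lengths -- (a*, b*, c*), assumed to be precomputed by the caller
    """
    n = recip_lengths
    # U*_ij = a*_i a*_j U_ij
    u_star = [[n[i] * n[j] * u_cif[i][j] for j in range(3)] for i in range(3)]
    # U* transforms like a product of fractional coordinates: R U* R^T
    u_star_new = matmul(matmul(rot, u_star), transpose(rot))
    # Back to the CIF convention
    return [[u_star_new[i][j] / (n[i] * n[j]) for j in range(3)]
            for i in range(3)]


# Hypothetical example values, for illustration only.
u = [[0.020, 0.001, 0.003],
     [0.001, 0.030, -0.002],
     [0.003, -0.002, 0.040]]
recip = (0.10, 0.08, 0.12)

inversion = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]
two_fold_b = [[-1, 0, 0], [0, 1, 0], [0, 0, -1]]

u_inverted = transform_u_cif(u, inversion, recip)   # same tensor as u
u_rotated = transform_u_cif(u, two_fold_b, recip)   # U12 and U23 change sign
```

With the inversion the tensor comes back unchanged, while the two-fold
axis along b flips the signs of U12 and U23, so copying the ADP of the
original atom onto the symmetry copy would give the wrong ellipsoid.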

Making changes to a structure indeed causes a train of things to be
changed accordingly to keep the crystal structure consistent.
So, long story short, it makes perfect sense to try to keep changes to
a submitted CIF to a minimum.

Steef

P.S. Speaking of _geom entries, I usually don't take too much interest
in these numbers, because they are relatively easy to calculate. In a
way, it is derived data, so what is the point of including it in the
CIF anyway?
Well, it does offer a great mechanism to see if a CIF file is
internally consistent.
The _geom entries rely on a system of referencing by atomic labels.
ADPs work in a similar fashion, and that's where I found that in a
number of entries the atomic labels are not unique.
So it isn't clear which atom a given _geom or ADP entry actually refers
to. Like I said, the _geom data is not all that important to me,
because I just calculate the numbers on the spot for a label in my
structure viewer.
But, the ADPs are a different story. Assigning the wrong ADP to an
atom is typically not going to give a huge difference when comparing
the calculated powder pattern for an organic structure, but I found it
can make a significant difference for the relative intensities of
inorganics. Enough so, that the relative intensities of the most
intense peaks are affected. And that, in turn, affects how the
structure is indexed for database matching. ICDD only looks at the 3
most intense peaks and I assume that most matching algorithms will
have such a limitation.
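
As a rough illustration of why a misassigned ADP can shift relative
intensities, here is a small sketch of the isotropic Debye-Waller
factor, exp(-8 pi^2 U sin^2(theta) / lambda^2), which scales an atom's
scattering amplitude. It is simplified to an isotropic U and a single
atom; the function name and the example numbers are only for
illustration and are not taken from any of the flagged entries.

```python
import math


def debye_waller(u_iso, two_theta_deg, wavelength):
    """Isotropic Debye-Waller factor for one atom.

    u_iso         -- isotropic displacement parameter in Angstrom^2
    two_theta_deg -- diffraction angle 2-theta in degrees
    wavelength    -- radiation wavelength in Angstrom
    """
    s = math.sin(math.radians(two_theta_deg / 2.0)) / wavelength
    return math.exp(-8.0 * math.pi ** 2 * u_iso * s * s)


# Hypothetical comparison: the same reflection with two different U values.
cu_k_alpha = 1.5406  # Angstrom
small_u = debye_waller(0.01, 60.0, cu_k_alpha)
large_u = debye_waller(0.05, 60.0, cu_k_alpha)
ratio = large_u / small_u  # below 1: the larger U damps the amplitude more
```

The damping grows with the diffraction angle, so swapping ADPs between
atoms changes high-angle peaks more than low-angle ones.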
I flagged those cases, but the list is quite extensive and I need to
do some additional work to see how to handle those cases. I think the
pragmatic -- but formally incorrect -- approach is to assume that the
references follow the same order of appearance in the file. This can
be confirmed by recalculating the values and seeing which values match.
The correct solution is of course to rename the affected atoms to make
the labels unique, but given the diverse use of labeling systems, this
may not be feasible.
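
A minimal sketch of the pragmatic approach described above, assuming the
labels from the atom-site loop and from the ADP loop have already been
read into plain Python lists, in file order. The function names are
hypothetical; this is not the code from my program, just an illustration
of the idea.

```python
from collections import Counter, defaultdict


def duplicate_labels(site_labels):
    """Return the atom labels that occur more than once, sorted."""
    counts = Counter(site_labels)
    return sorted(label for label, n in counts.items() if n > 1)


def pair_by_order(site_labels, aniso_labels):
    """Pair each ADP row with an atom-site row by order of appearance.

    The k-th ADP row carrying a given label is assigned to the k-th
    atom-site row carrying that label. Returns (aniso_index, site_index)
    tuples; site_index is None when no matching atom is left.
    """
    positions = defaultdict(list)
    for idx, label in enumerate(site_labels):
        positions[label].append(idx)

    used = Counter()
    pairs = []
    for a_idx, label in enumerate(aniso_labels):
        candidates = positions.get(label, [])
        k = used[label]
        site_idx = candidates[k] if k < len(candidates) else None
        pairs.append((a_idx, site_idx))
        used[label] += 1
    return pairs


# Hypothetical labels, for illustration only.
sites = ["Fe1", "O1", "O1", "O2"]
adps = ["O1", "Fe1", "O1", "O3"]

dups = duplicate_labels(sites)     # labels needing attention
pairs = pair_by_order(sites, adps)  # tentative ADP-to-atom assignment
```

The pairing is only a guess; it still needs to be confirmed by
recalculating values that depend on it.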





On Thu, Feb 23, 2023 at 9:49 AM Saulius Gražulis <grazulis at ibt.lt> wrote:
>
> Hi, Steef,
>
> On 2023-02-22 10:07, Steef Boerrigter wrote:
> > I have added a diff file of (today's) database entry with the fixed
> > version of the cif. The diff file includes my analysis of what went
> > wrong with the operators.
> > I am also including the fixed cif files.
>
> There is one more gotcha with the change of symmetry operators. The
> operator indices are used in _geom_bond and _geom_angle_ tables. When we
> change the symmetry operators, these _geom_ references are no longer
> valid. This is one of the reasons why we try not to modify symops in
> general.
>
> But in the case of the 15 files that you have corrected, the initial
> symmetry operators were wrong, for one reason or another. Therefore we
> cannot rely on the references to these operators in the _geom_ tables
> since we do not know what the authors meant. For this reason I will
> replace the '_geom_bond_site_symmetry_...' values and other similar
> values with question marks ('?', without the quotes), to indicate that
> these references are not known.
>
> You do not need to do anything about this change (other than to make
> sure that your software tolerates CIFs with '?' values ;) ); I am just
> informing you that I will make additional changes to the files you have
> sent.
>
> Sincerely yours,
> Saulius
>
> --
> Dr. Saulius Gražulis
> Vilnius University, Life Science Center, Institute of Biotechnology
> Saulėtekio al. 7, LT-10257 Vilnius, Lietuva (Lithuania)
> phone (office): (+370-5)-2234353, mobile: (+370-684)-49802, (+370-614)-36366
>
>

