[Cod-bugs] Issues found in a systematic scanning of the COD
Saulius Gražulis
grazulis at ibt.lt
Tue Oct 14 18:34:56 EEST 2025
Dear Pierre,
thank you for your email and for the list of space group vs. symop
mismatches, and sorry for the delay in answering.
I've looked into the log, and below are my comments, based on my
interpretation of this log.
First of all, you are absolutely right to use symmetry operations
provided in the CIF if these are available and interpretable. This is
our standard procedure and recommendation to COD users. Symmetry
operations are the most versatile way to describe symmetry relations in
a computer- and human-readable form, so they can always be used. In
contrast, Hermann-Mauguin and even Hall symbols can give humans a
symbolic identification of the space group, but they lack a standard way
to express all varieties of settings, cell choices and origins used
nowadays by crystallographers.
The Hermann-Mauguin and Hall symbols /can/ be adapted to express
multiple (all?) non-standard settings and origins; for this, however, a
Change of Basis matrix must be used, or at least Shift of Origin which
might be sufficient in some cases. Unfortunately, there is no standard
way to encode these elements (I am seriously considering writing up a
standardisation proposal to IUCr...), but the current use in published
literature (even in the Tables!) attests the following uses:
-- Parse the Change of Basis (CoB) operator strings in the form
-- "(a-b,a+b,c)" or "(x-y,x+y,z)" and return a matrix that encodes
-- this operator. The CoB is described in [1,2].
--
-- The 'abc' form is the transposed inverse of the 'xyz' form.
The use cases and brief mention of these conventions can be found in
IUCr sources [1,3] or on the Web by the same authors [2].
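
As a rough illustration of the convention quoted above, here is a minimal Python sketch (not the Ada code from [4,5]; the names parse_cob, transpose and inverse are made up for this example). It reads each comma-separated expression as one matrix row and then checks, for the two symbols of COD 1001841 discussed in this message, that the 'xyz' matrix equals the inverse of the transposed 'abc' matrix. Only the matrix part is compared; how an origin shift maps between the two forms is not addressed here.

```python
import re
from fractions import Fraction

AXES = {"a": 0, "b": 1, "c": 2, "x": 0, "y": 1, "z": 2}
# One term: optional sign, optional number (2 or 1/2), optional '*', optional axis.
TERM = re.compile(r"([+-]?)(\d+(?:/\d+)?)?\*?([abcxyz]?)")

def parse_cob(text):
    """Parse '(c,2*a+c,b)' or '(-1/2*x+z,1/2*x,y)' row by row.

    Returns (rows, shift): rows[i][j] is the coefficient of axis j in the
    i-th expression; shift[i] is its constant term, if any.
    """
    parts = text.strip().strip("()").replace(" ", "").lower().split(",")
    if len(parts) != 3:
        raise ValueError(f"expected three expressions in {text!r}")
    rows, shift = [], []
    for part in parts:
        row, const, pos = [Fraction(0)] * 3, Fraction(0), 0
        while pos < len(part):
            m = TERM.match(part, pos)
            sign, number, axis = m.groups()
            if m.end() == pos or not (number or axis):
                raise ValueError(f"cannot parse {part!r}")
            value = Fraction(number) if number else Fraction(1)
            if sign == "-":
                value = -value
            if axis:
                row[AXES[axis]] += value
            else:
                const += value
            pos = m.end()
        rows.append(row)
        shift.append(const)
    return rows, shift

def transpose(m):
    return [list(col) for col in zip(*m)]

def inverse(m):
    """Inverse of a 3x3 matrix via the adjugate, in exact arithmetic."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if det == 0:
        raise ValueError("singular change-of-basis matrix")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]

abc_rows, _ = parse_cob("(c,2*a+c,b)")
xyz_rows, _ = parse_cob("(-1/2*x+z,1/2*x,y)")
print(inverse(transpose(abc_rows)) == xyz_rows)  # prints True
```
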
We in the COD adopted these conventions, and I wrote a small parser for
the space group symbol interpretation [4] to make these conventions
explicit. The code is in Ada which, although not in the top 5 popularity
list :), is very readable, standard and stable. To see how we interpret
the extended H-M and Hall space group symbol strings, you may want to
look into the Change_of_Basis Ada package [5].
I would thus like to ask you a question as to what you mean by saying
that "list of symmetries is obviously wrong or contradicts the Bravais
symbol in the group name"?
So, for example, in the COD 1001841 entry which is mentioned first in
your log, the space group symbol derived from the symmetry operations is
'P 2yb (-1/2*x+z,1/2*x,y)' (Hall), or 'P 1 21 1 (c,2*a+c,b)' (H-M). Both
symbols yield identical symmetry operations when decoded by my program
[4], and they coincide with the symmetry operation lists given in the
COD entry. The space group for this crystal is ITC No. 4, H-M symbol 'P
21'. However, for whatever reason, the authors chose to represent the
structure in a C-centered cell (probably to compare with a similar C2
structure or something like that). They identified the space group as
'C1121', which is of course a non-standard symbol and cell choice for
this space group, but that is what the authors reported.
When I calculate the crystal composition using the symmetry operations
and the atom list that the authors provided, I get the summary formula
"La3 O8 Re", exactly as the authors reported. Thus, I conclude that
this structure is reported the way it was intended by the authors.
To interpret the space group symbols, it is not enough to take just the
Table symmetries for P1211 (which you list in your log); you need to
analyse the change-of-basis as well, that is, to interpret the whole
string 'P 1 21 1 (c,2*a+c,b)' given in the file; this will yield four
symmetry operations, not two.
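
To show where the four operations come from, here is a small Python sketch under stated assumptions: the 'xyz' form is read as new coordinates x' = Q x, each standard operation (W, w) becomes (Q W P, Q w) with P the inverse of Q, and the old lattice translations that are no longer integer in the new cell become centring translations. Q and P are typed in by hand from the symbol above, and the function names are invented for this example; this is not how decode-Hall-symbol [4] is implemented.

```python
from fractions import Fraction as F
from itertools import product
from math import gcd

# Q is the 'xyz' form (-1/2*x+z,1/2*x,y); P is its inverse, i.e. the
# columns a'=c, b'=2a+c, c'=b of the 'abc' form (c,2*a+c,b).
Q = [[F(-1, 2), F(0), F(1)], [F(1, 2), F(0), F(0)], [F(0), F(1), F(0)]]
P = [[F(0), F(2), F(0)], [F(0), F(0), F(1)], [F(1), F(1), F(0)]]

# Standard-setting P 1 21 1: x,y,z and -x,y+1/2,-z.
STANDARD_P21 = [
    ([[1, 0, 0], [0, 1, 0], [0, 0, 1]], [F(0), F(0), F(0)]),
    ([[-1, 0, 0], [0, 1, 0], [0, 0, -1]], [F(0), F(1, 2), F(0)]),
]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def transform(ops, q, p):
    """Re-express ops in the new basis, adding the centring translations."""
    # Q*t modulo 1 depends only on t modulo n, where n*Q is an integer matrix.
    n = 1
    for row in q:
        for value in row:
            n = n * value.denominator // gcd(n, value.denominator)
    result = []
    for rot, trans in ops:
        new_rot = matmul(matmul(q, rot), p)
        for t in product(range(n), repeat=3):
            shifted = [trans[i] + t[i] for i in range(3)]
            op = (new_rot, [x % 1 for x in matvec(q, shifted)])
            if op not in result:
                result.append(op)
    return result

def to_xyz(rot, trans):
    """Format one operation as a CIF-style 'x,y,z' string."""
    parts = []
    for row, t in zip(rot, trans):
        s = ""
        for coeff, name in zip(row, "xyz"):
            if coeff == 0:
                continue
            sign = "-" if coeff < 0 else ("+" if s else "")
            mag = abs(coeff)
            s += sign + ("" if mag == 1 else f"{mag}*") + name
        parts.append(s + (f"+{t}" if t != 0 else ""))
    return ",".join(parts)

for rot, trans in transform(STANDARD_P21, Q, P):
    print(to_xyz(rot, trans))
```

Under these assumptions the sketch prints x,y,z; x+1/2,y+1/2,z; -x,-y,z+1/2; and -x+1/2,-y+1/2,z+1/2, that is, a 2_1 axis along the new c with C-centring, which is consistent with the authors' 'C1121' designation.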
It seems that many of your log entries are of a similar kind, representing
non-standard settings and cell choices.
In some cases, like for COD 1553126, you report duplicated symmetry
operations, while in fact they are not duplicated: in this entry, the
symmetry operations are present under two different data names (the
newer "_space_group_symop_operation_xyz" and the older
"_symmetry_equiv_pos_as_xyz"). This is not a bug; both data names are
permitted in the CIF, as long as the symmetry operation lists are
equivalent. Please make sure that your software correctly chooses one set
of symmetry operations or the other (we suggest using the newer CIF name
if it exists, and falling back to the old one if the new one is missing).
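
The suggested fallback could look roughly like the Python sketch below. It assumes the CIF data block has already been parsed into a plain dictionary that maps data names to lists of operation strings; that dictionary layout and the function name pick_symops are assumptions made for this example, not the API of any particular CIF library.

```python
NEW_NAME = "_space_group_symop_operation_xyz"
OLD_NAME = "_symmetry_equiv_pos_as_xyz"

def pick_symops(cif_block):
    """Return (data_name, operations) from exactly one of the two loops.

    The newer data name wins when it is present and non-empty; otherwise
    the older one is used. Returns (None, []) when neither is present.
    """
    # CIF data names are case-insensitive, so compare them in lower case.
    lowered = {name.lower(): values for name, values in cif_block.items()}
    for name in (NEW_NAME, OLD_NAME):
        operations = lowered.get(name)
        if operations:
            return name, list(operations)
    return None, []

block = {
    "_symmetry_equiv_pos_as_xyz": ["x,y,z", "-x,-y,z+1/2"],
    "_space_group_symop_operation_xyz": ["x,y,z", "-x,-y,z+1/2"],
}
name, operations = pick_symops(block)
print(name, len(operations))
```

Reading only one of the two loops avoids counting the same operations twice when an entry carries both.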
There are other cases where these same data names report different sets
of symmetry operations, like in COD 1564490. This is indeed wrong,
thanks for spotting such cases! We will have to look into them individually.
My plans are to update our space group symbol determination software
and to assign space group names with the change-of-basis operators to
all remaining COD entries that do not have these designations. Also, we
will check that the "_space_group_symop_operation_xyz" and
"_symmetry_equiv_pos_as_xyz", if they exist, always report correct
symmetry operations.
What you can do on your side is to make sure that your program picks only
one data item, "_space_group_symop_operation_xyz" or
"_symmetry_equiv_pos_as_xyz", the one that has a complete symmetry
operation list (i.e. contains the unity operation 'x,y,z' and the specified
symops form a group), and that the program either interprets the
change-of-basis operators or ignores space group symbols that have them.
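
One possible way to test the completeness criterion is sketched below in Python. It assumes operations are written with the axes x, y, z and rational translations such as 1/2 (decimal fractions are not handled), and it only checks that the identity is present and that the set is closed under composition modulo integer translations. It does not verify that each rotation part is a valid crystallographic rotation. The function names are invented for this example.

```python
import re
from fractions import Fraction

# One term: optional sign, optional number (2 or 1/2), optional '*', optional axis.
TERM = re.compile(r"([+-]?)(\d+(?:/\d+)?)?\*?([xyz]?)")

def parse_symop(text):
    """Parse e.g. '-x,-y,z+1/2' into (rotation rows, translation mod 1)."""
    parts = text.lower().replace(" ", "").split(",")
    if len(parts) != 3:
        raise ValueError(f"expected three components in {text!r}")
    rot, trans = [], []
    for part in parts:
        row, const, pos = [Fraction(0)] * 3, Fraction(0), 0
        while pos < len(part):
            m = TERM.match(part, pos)
            sign, number, axis = m.groups()
            if m.end() == pos or not (number or axis):
                raise ValueError(f"cannot parse {part!r}")
            value = Fraction(number) if number else Fraction(1)
            if sign == "-":
                value = -value
            if axis:
                row["xyz".index(axis)] += value
            else:
                const += value
            pos = m.end()
        rot.append(tuple(row))
        trans.append(const % 1)
    return tuple(rot), tuple(trans)

def compose(a, b):
    """Apply b first, then a; the translation is reduced modulo 1."""
    (ra, ta), (rb, tb) = a, b
    rot = tuple(tuple(sum(ra[i][k] * rb[k][j] for k in range(3))
                      for j in range(3)) for i in range(3))
    trans = tuple((sum(ra[i][k] * tb[k] for k in range(3)) + ta[i]) % 1
                  for i in range(3))
    return rot, trans

def is_complete_group(symops):
    """True if the list contains x,y,z and is closed under composition."""
    ops = {parse_symop(s) for s in symops}
    if parse_symop("x,y,z") not in ops:
        return False
    return all(compose(a, b) in ops for a in ops for b in ops)

print(is_complete_group(["x,y,z", "-x,-y,z+1/2",
                         "x+1/2,y+1/2,z", "-x+1/2,-y+1/2,z+1/2"]))
print(is_complete_group(["x,y,z", "x+1/2,y+1/2,z", "-x,-y,z+1/2"]))
```

The first call reports a complete list; the second reports an incomplete one, because the product of the centring translation and the screw axis is missing.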
HTH,
Saulius
Refs.:
[1] Zwart, P. H.; Grosse-Kunstleve, R. W.; Lebedev, A. A.;
Murshudov, G. N. & Adams, P. D. (2007) Surprises and pitfalls
arising from (pseudo)symmetry. Acta Crystallographica Section D
Biological Crystallography 64(1), 99-107. International Union
of Crystallography (IUCr). DOI:
https://doi.org/10.1107/s090744490705531x
[2] Sydney R. Hall, Ralf W. Grosse-Kunstleve (1996) "Concise
Space-Group Symbols". URL:
https://cci.lbl.gov/sginfo/hall_symbols.html [accessed:
2022-06-14T15:24+03:00]
[3] International Tables Volume B (2010), "Symmetry in
reciprocal space". Section 1.4., Appendix A1.4.2. Space-group
symbols for numeric and symbolic computations, URL:
https://onlinelibrary.wiley.com/iucr/itc/B/ [accessed:
2022-06-14T15:35+03:00]
[4] Gražulis, S. (2024) decode-Hall-symbol [computer software].
https://github.com/sauliusg/decode-Hall-symbol
[5] Gražulis, S. (2024) decode-Hall-symbol [computer software]. Ada
package 'Change_Of_Basis'.
https://github.com/sauliusg/decode-Hall-symbol/blob/master/src/change_of_basis.ads,
https://github.com/sauliusg/decode-Hall-symbol/blob/master/src/change_of_basis.adb
On 2025-09-02 11:03, Caussin, Pierre wrote:
>
> Hi there,
>
> I am working for Bruker and trying to better streamline the use of the
> COD in our search/match and semiquantitative software. One of the
> goals is being able to compute the peak positions and relative
> intensities and RIR of COD entries. We do this at CuKA1 wavelength for
> all the entries that have a structure and space group given (all
> entries except 1119 that have either no atomic coordinates and/or
> blank or unknown space group, which are currently ignored). This
> compilation is done using the symmetries stored in the CIF file.
>
> To be able to compute the selected entries at other wavelengths on the
> fly, we need the space group symmetries, without keeping the complete
> CIFs, which use over 26GB after ZIP compression. I have written code
> to create a table [space group HM name] => [table of symmetries]. This
> works well, but I have found 478 CIF files where the given HM name
> appears to contradict the list of symmetries. I enclose the diagnostic
> output of my program (plus my case-by-case comment ‘//’), which can be
> summarized in two cases:
>
> 1. The list of symmetries is obviously wrong or contradicts the
> Bravais symbol in the group name
> 2. The list of symmetries is plausible but either is erroneous or
> does not match the most common space group settings (axes and
> origin). This is not a problem if I rely on the CIF list but
> causes a conflict if I have other phases using the same space
> group in the ‘usual’ settings. I can handle this by adding a
> character to the group name when the phase originates in the COD.
>
> Thank you for your attention, best regards,
>
> Pierre Caussin
>
>
--
Dr. Saulius Gražulis
Vilnius University Institute of Biotechnology, Saulėtekio al. 7
LT-10257 Vilnius, Lietuva (Lithuania)
mobile: (+370-684)-49802, (+370-614)-36366