[Cod-bugs] Issues found in a systematic scanning of the COD

Saulius Gražulis grazulis at ibt.lt
Tue Oct 14 18:34:56 EEST 2025


Dear Pierre,

thank you for your email and for the list of space group vs. symop
mismatches, and sorry for the delay in answering.

I've looked into the log, and below are my comments, based on my
interpretation of this log.

First of all, you are absolutely right to use symmetry operations
provided in the CIF if these are available and interpretable. This is 
our standard procedure and recommendation to COD users. Symmetry 
operations are the most versatile way to describe symmetry relations in 
a computer- and human-readable form, so they can always be used. In
contrast, Hermann-Mauguin and even Hall symbols can give humans a 
symbolic identification of the space group, but they lack a standard way 
to express all varieties of settings, cell choices and origins used 
nowadays by crystallographers.

The Hermann-Mauguin and Hall symbols /can/ be adapted to express
multiple (all?) non-standard settings and origins; for this, however, a
Change of Basis matrix must be used, or at least a Shift of Origin, which
might be sufficient in some cases. Unfortunately, there is no standard
way to encode these elements (I am seriously considering writing up a
standardisation proposal to IUCr...), but the current use in published
literature (even in the Tables!) attests to the following usage:

    -- Parse the Change of Basis (CoB) operator strings in the form
    --  "(a-b,a+b,c)" or "(x-y,x+y,z)" and return a matrix that encodes
    --  this operator. The CoB is described in [1,2].
    --
    -- The 'abc' form is the transposed inverse of the 'xyz' form.

The use cases and brief mention of these conventions can be found in 
IUCr sources [1,3] or on the Web by the same authors [2].
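
To make the relation between the two forms concrete, here is a minimal
Python sketch (not taken from the Ada code in [4,5]). It parses a CoB
string into a 3x3 matrix whose rows hold the coefficients of each
comma-separated expression, and checks the transposed-inverse relation on
the pair of strings quoted further down for COD 1001841. The function
names parse_cob, transpose and inverse are hypothetical, and origin-shift
(translation) terms are deliberately ignored.

```python
import re
from fractions import Fraction

# One term: optional sign, optional coefficient (e.g. 2 or 1/2), optional '*', a letter.
TERM = re.compile(r"([+-]?)(\d+(?:/\d+)?)?\*?([a-cx-z])")

def parse_cob(text):
    """Parse '(c,2*a+c,b)' or '(-1/2*x+z,1/2*x,y)' into a 3x3 matrix.

    Row i holds the coefficients of the i-th expression.
    Translation terms (shift of origin) are not handled in this sketch.
    """
    body = text.strip().strip("()").replace(" ", "")
    letters = "abc" if re.search(r"[abc]", body) else "xyz"
    rows = []
    for expr in body.split(","):
        row = [Fraction(0)] * 3
        for sign, coef, letter in TERM.findall(expr):
            value = Fraction(coef) if coef else Fraction(1)
            row[letters.index(letter)] += -value if sign == "-" else value
        rows.append(row)
    return rows

def transpose(m):
    return [list(col) for col in zip(*m)]

def inverse(m):
    """Invert a 3x3 matrix of Fractions via the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[x / det for x in row] for row in adj]

abc = parse_cob("(c,2*a+c,b)")
xyz = parse_cob("(-1/2*x+z,1/2*x,y)")
print(transpose(inverse(xyz)) == abc)  # True
```

Exact rational arithmetic (fractions.Fraction) is used so that
coefficients such as 1/2 compare without rounding issues.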

We in the COD adopted these conventions, and I wrote a small parser for 
the space group symbol interpretation [4] to make these conventions 
explicit. The code is in Ada which, although not in the top 5 popularity 
list :), is very readable, standard and stable. To see how we interpret 
the extended H-M and Hall space group symbol strings, you may want to 
look into the Change_of_Basis Ada package [5].

I would thus like to ask you a question as to what you mean by saying 
that "list of symmetries is obviously wrong or contradicts the Bravais 
symbol in the group name"?

So, for example, in the COD 1001841 entry which is mentioned first in 
your log, the space group symbol derived from the symmetry operations is
'P 2yb (-1/2*x+z,1/2*x,y)' (Hall), or 'P 1 21 1 (c,2*a+c,b)' (H-M). Both 
symbols yield identical symmetry operations when decoded by my program 
[4], and they coincide with the symmetry operation lists given in the 
COD entry. The space group for this crystal is ITC No. 4, H-M symbol 'P 
21'. However, for whatever reason, the authors chose to represent the
structure in a C-centered cell (probably to compare with a similar C2 
structure or something like that). They identified the space group as 
'C1121', which is of course a non-standard symbol and cell choice for 
this space group, but that is what the authors reported.

When I calculate the crystal composition using the symmetry operations 
and the atom list that the authors provided, I get the summary formula 
"La3 O8 Re", exactly as the authors reported. Thus, I conclude that
this structure is reported the way it was intended by the authors.

To interpret the space group symbols, it is not enough to take just the
Table symmetries for P1211 (which you list in your log); you need to
analyse the change-of-basis operator as well, i.e. to interpret the whole
string 'P 1 21 1 (c,2*a+c,b)' given in the file; this will yield four
symmetry operations, not two.
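
The step from two operations to four can be reproduced with a short
Python sketch. It is an illustration under stated assumptions, not the
algorithm of [4]: '(c,2*a+c,b)' is read as a' = c, b' = 2a + c, c' = b,
the operations are transformed as W' = Q W P and w' = Q w with Q = P^-1,
and the unit translations of the original cell are added as generators
because they become centring translations in the doubled cell. All
function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

# P has the new basis vectors as columns; Q = P^-1 maps old coordinates to
# new ones and corresponds to the Hall-symbol form '(-1/2*x+z,1/2*x,y)'.
P = [[0, 2, 0], [0, 0, 1], [1, 1, 0]]
Q = [[F(-1, 2), 0, 1], [F(1, 2), 0, 0], [0, 1, 0]]
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def freeze(W, w):
    """Hashable form of an operation, with the translation reduced modulo 1."""
    return (tuple(tuple(F(x) for x in row) for row in W), tuple(F(x) % 1 for x in w))

def transform(op):
    """Re-express (W, w) in the new basis: W' = Q W P, w' = Q w."""
    W, w = op
    return freeze(matmul(matmul(Q, W), P), matvec(Q, w))

def compose(a, b):
    """Operation 'apply b, then a'."""
    (Wa, wa), (Wb, wb) = a, b
    return freeze(matmul(Wa, Wb), [x + y for x, y in zip(matvec(Wa, wb), wa)])

def fmt(op):
    """Render (W, w) as an 'x,y,z'-style string."""
    W, w = op
    parts = []
    for row, t in zip(W, w):
        s = ""
        for c, v in zip(row, "xyz"):
            if c:
                s += ("+" if c > 0 else "-") + ("" if abs(c) == 1 else f"{abs(c)}*") + v
        parts.append(s.lstrip("+") + (f"+{t}" if t else ""))
    return ",".join(parts)

generators = [
    (I, [0, 0, 0]),                                           # x,y,z
    ([[-1, 0, 0], [0, 1, 0], [0, 0, -1]], [0, F(1, 2), 0]),   # -x,y+1/2,-z
    (I, [1, 0, 0]), (I, [0, 1, 0]), (I, [0, 0, 1]),           # old lattice translations
]
ops = {transform(g) for g in generators}
while True:  # close the set under composition
    new = {compose(a, b) for a, b in product(ops, repeat=2)} - ops
    if not new:
        break
    ops |= new

for op in ops:
    print(fmt(op))
```

Under these assumptions the closure contains four operations: x,y,z;
x+1/2,y+1/2,z; -x,-y,z+1/2; and -x+1/2,-y+1/2,z+1/2, i.e. a C-centred
cell with the 21 axis along c.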

It seems that many of your log entries are of a similar kind, representing
non-standard settings and cell choices.

In some cases, like for COD 1553126, you report duplicated symmetry
operations, while in fact they are not duplicated: in this entry, the
symmetry operations are present under two different data names
("_space_group_symop_operation_xyz" and "_symmetry_equiv_pos_as_xyz"),
the former new and the latter old. This is not a bug; both data names are
permitted in the CIF, as long as the symmetry operation lists are
equivalent. Please make sure that your software correctly chooses one set
of symmetry operations or the other (we suggest using the newer CIF name
if it exists, and falling back to the old one if the new one is missing).
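
The suggested selection rule might be sketched in Python as follows. The
cif_block dictionary (data name mapped to a list of operation strings)
and the function name pick_symops are assumptions made for illustration;
a real CIF parser will expose the loops in its own way.

```python
NEW_TAG = "_space_group_symop_operation_xyz"
OLD_TAG = "_symmetry_equiv_pos_as_xyz"

def pick_symops(cif_block):
    """Return (data name, operations), preferring the newer data name.

    cif_block is a hypothetical dict mapping CIF data names to lists of
    symmetry operation strings; exactly one list is returned, never both.
    """
    for tag in (NEW_TAG, OLD_TAG):
        ops = cif_block.get(tag)
        if ops:
            return tag, list(ops)
    return None, []

block = {
    NEW_TAG: ["x,y,z", "-x,y+1/2,-z"],
    OLD_TAG: ["x,y,z", "-x,y+1/2,-z"],
}
tag, ops = pick_symops(block)
print(tag, len(ops))  # _space_group_symop_operation_xyz 2
```

Returning the chosen data name together with the list makes it easy to
log which of the two loops was actually used for a given entry.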

There are other cases where the same data names report different 
symmetry operation sets, like in COD 1564490. This is indeed wrong;
thanks for spotting such cases! We will have to look into them individually.

My plan is to update our space group symbol determination software
and to assign space group names with the change-of-basis operators to
all remaining COD entries that do not have these designations. Also, we
will check that the "_space_group_symop_operation_xyz" and
"_symmetry_equiv_pos_as_xyz" loops, if they exist, always report correct
symmetry operations.

What you can do on your side is to make sure that your program picks only
one data item, "_space_group_symop_operation_xyz" or
"_symmetry_equiv_pos_as_xyz", the one that has a complete symmetry
operation list (i.e. contains the identity operation 'x,y,z' and the
specified symops form a group), and that the program either interprets
the change-of-basis operators or ignores space group symbols that have them.
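
A completeness check of this kind could look like the Python sketch
below. It assumes that operations are compared modulo lattice
translations (translations reduced modulo 1) and relies on the fact that
a finite set of invertible operations closed under composition is a
group. The names parse_symop, compose and is_complete_group are
hypothetical, and the parser covers only plain 'x,y,z'-style strings.

```python
import re
from fractions import Fraction
from itertools import product

# One term: optional sign, optional number (2, 1/2 or 0.5), optional '*', optional x/y/z.
TERM = re.compile(r"([+-]?)(\d+(?:/\d+|\.\d+)?)?\*?([xyz]?)")

def parse_symop(text):
    """Parse 'x,-y+1/2,z' into (rotation rows, translation mod 1)."""
    rows, shift = [], []
    for expr in text.lower().replace(" ", "").split(","):
        row, t = [Fraction(0)] * 3, Fraction(0)
        for sign, coef, var in TERM.findall(expr):
            if not coef and not var:
                continue  # empty match produced by the all-optional pattern
            value = Fraction(coef or 1) * (-1 if sign == "-" else 1)
            if var:
                row["xyz".index(var)] += value
            else:
                t += value
        rows.append(tuple(row))
        shift.append(t % 1)
    return tuple(rows), tuple(shift)

def compose(a, b):
    """Operation 'apply b, then a', with the translation reduced modulo 1."""
    (Wa, wa), (Wb, wb) = a, b
    W = tuple(tuple(sum(Wa[i][k] * Wb[k][j] for k in range(3)) for j in range(3))
              for i in range(3))
    w = tuple((sum(Wa[i][k] * wb[k] for k in range(3)) + wa[i]) % 1 for i in range(3))
    return W, w

def is_complete_group(symops):
    """True if the list contains 'x,y,z' and is closed under composition."""
    ops = {parse_symop(s) for s in symops}
    closed = all(compose(a, b) in ops for a, b in product(ops, repeat=2))
    return parse_symop("x,y,z") in ops and closed

print(is_complete_group(["x,y,z", "-x,y+1/2,-z"]))                     # True
print(is_complete_group(["x,y,z", "x+1/2,y+1/2,z", "-x,-y,z+1/2"]))    # False
```

The second list fails because the product of the centring translation and
the screw operation, -x+1/2,-y+1/2,z+1/2, is missing from it.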

HTH,

Saulius

Refs.:

[1] Zwart, P. H.; Grosse-Kunstleve, R. W.; Lebedev, A. A.;
Murshudov, G. N. & Adams, P. D. (2007) Surprises and pitfalls
arising from (pseudo)symmetry. Acta Crystallographica Section D
Biological Crystallography 64(1), 99-107. International Union
of Crystallography (IUCr). DOI:
https://doi.org/10.1107/s090744490705531x

[2] Sydney R. Hall, Ralf W. Grosse-Kunstleve (1996) "Concise
Space-Group Symbols". URL:
https://cci.lbl.gov/sginfo/hall_symbols.html [accessed:
2022-06-14T15:24+03:00]

[3] International Tables Volume B (2010), "Symmetry in
reciprocal space". Section 1.4., Appendix A1.4.2. Space-group
symbols for numeric and symbolic computations, URL:
https://onlinelibrary.wiley.com/iucr/itc/B/ [accessed:
2022-06-14T15:35+03:00]

[4] Gražulis, S. (2024) decode-Hall-symbol [computer software]. 
https://github.com/sauliusg/decode-Hall-symbol

[5] Gražulis, S. (2024) decode-Hall-symbol [computer software]. Ada 
package 'Change_Of_Basis'. 
https://github.com/sauliusg/decode-Hall-symbol/blob/master/src/change_of_basis.ads, 
https://github.com/sauliusg/decode-Hall-symbol/blob/master/src/change_of_basis.adb

On 2025-09-02 11:03, Caussin, Pierre wrote:
>
> Hi there,
>
> I am working for Bruker and trying to better streamline the use of the 
> COD in our search/match and semiquantitative software. One of the 
> goals is being able to compute the peak positions and relative 
> intensities and RIR of COD entries. We do this at CuKA1 wavelength for 
> all the entries that have a structure and space group given (all 
> entries except 1119 that have either no atomic coordinates and/or 
> blank or unknown space group, which are currently ignored). This 
> compilation is done using the symmetries stored in the CIF file.
>
> To be able to compute the selected entries at other wavelengths on the 
> fly, we need the space group symmetries, without keeping the complete 
> CIFs, which use over 26GB after ZIP compression. I have written code 
> to create a table [space group HM name] => [table of symmetries]. This 
> works well, but I have found 478 CIF files where the given HM name 
> appears to contradict the list of symmetries. I enclose the diagnostic 
> output of my program (plus my case-by-case comment ‘//’), which can be 
> summarized in two cases:
>
>  1. The list of symmetries is obviously wrong or contradicts the
>     Bravais symbol in the group name
>  2. The list of symmetries is plausible but either is erroneous or
>     does not match the most common space group settings (axes and
>     origin). This is not a problem if I rely on the CIF list but
>     causes a conflict if I have other phases using the same space
>     group in the ‘usual’ settings. I can handle this by adding a
>     character to the group name when the phase originates in the COD.
>
> Thank you for your attention, best regards,
>
> Pierre Caussin
>
>
> -- 
> This message has been scanned for viruses and
> dangerous content by *MailScanner* <http://www.mailscanner.info/>, and is
> believed to be clean.
>
> _______________________________________________
> Cod-bugs mailing list
> Cod-bugs at lists.crystallography.net
> http://lists.crystallography.net/cgi-bin/mailman/listinfo/cod-bugs


-- 
Dr. Saulius Gražulis
Vilnius University Institute of Biotechnology, Saulėtekio al. 7
LT-10257 Vilnius, Lietuva (Lithuania)
mobile: (+370-684)-49802, (+370-614)-36366

