[Cod-bugs] 2997 invalid files in C.O.D.

David Palmer david at crystalmaker.com
Wed Jul 5 12:59:41 EEST 2023


Dear Colleagues,

I sent you a message a few weeks ago about my plans to provide easy phase ID via C.O.D.-hosted structures. I haven’t heard back from you, so I assume you have no objections.

In the meantime, we have used our automated tools to analyse all current structure files. I am attaching a summary, listing file IDs and errors for 2,997 out of your 0.5M or so files: a relatively small figure (ca. 0.6%). However, these files are invalid and cannot be used for structural work, so I would recommend getting them fixed.

The most common errors are:

- missing fractional coordinates
- ambiguous site labelling
- invalid element symbols
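
For illustration only, here is a minimal Python sketch of how two of these checks (missing fractional coordinates and invalid element symbols) might be expressed. It is not the tool used to produce the attached summary. The function name `check_sites` and the dictionary keys are hypothetical, and the sketch assumes the site records have already been read from a CIF by some parser, with values kept as strings. Treating `?` and `.` as missing follows the usual CIF placeholders for unknown and inapplicable values.

```python
# Standard symbols for elements 1-118. Some CIFs also use "D" for deuterium;
# whether to accept it is a policy choice and it is not included here.
ELEMENTS = set("""
H He Li Be B C N O F Ne Na Mg Al Si P S Cl Ar K Ca Sc Ti V Cr Mn Fe Co Ni Cu Zn
Ga Ge As Se Br Kr Rb Sr Y Zr Nb Mo Tc Ru Rh Pd Ag Cd In Sn Sb Te I Xe Cs Ba La
Ce Pr Nd Pm Sm Eu Gd Tb Dy Ho Er Tm Yb Lu Hf Ta W Re Os Ir Pt Au Hg Tl Pb Bi Po
At Rn Fr Ra Ac Th Pa U Np Pu Am Cm Bk Cf Es Fm Md No Lr Rf Db Sg Bh Hs Mt Ds Rg
Cn Nh Fl Mc Lv Ts Og
""".split())

# CIF placeholders treated as "no value given" (assumption for this sketch).
MISSING = (None, "", "?", ".")


def check_sites(sites):
    """Return a list of error strings for the given site records.

    Each record is a dict with the hypothetical keys
    'label', 'symbol', 'x', 'y' and 'z'.
    """
    errors = []
    for site in sites:
        label = site.get("label", "?")
        # Every site needs all three fractional coordinates.
        for axis in ("x", "y", "z"):
            if site.get(axis) in MISSING:
                errors.append(f"{label}: missing fractional coordinate {axis}")
        # The element symbol must be a recognised chemical symbol.
        symbol = site.get("symbol")
        if symbol not in ELEMENTS:
            errors.append(f"{label}: invalid element symbol {symbol!r}")
    return errors


if __name__ == "__main__":
    example = [
        {"label": "Si1", "symbol": "Si", "x": "0.1", "y": "0.2", "z": "0.3"},
        {"label": "X1", "symbol": "Xx", "x": "0.5", "y": "?", "z": "0.0"},
    ]
    for message in check_sites(example):
        print(message)
```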

A common issue is a mismatch between site labels in different loops of the same file (e.g., a table of anisotropic displacement parameters and a table of fractional coordinates). We found these errors in numerous files submitted via the American Mineralogist crystal structures database (clearly, substantial amounts of U.S. governmental funding failed to prevent basic transcription errors!).

Take the following file, 9003355, as an example:-

• Sites SiT1’, AlT1’ (etc.) are listed in the loop containing Uij
• The same sites are labelled differently (e.g., SiT1*, AlT1*, etc.) in the loop containing xyz

Whilst a human could infer how these labels should be related, a computer cannot make such a judgement, which renders these files useless.
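
To make the point concrete, here is a minimal Python sketch of an exact-match label check of the kind a program would apply. It is an illustration only, not the tool used for the attached summary. The function name `label_mismatches` is hypothetical, and the two label lists are assumed to have been extracted already from the coordinate loop and the Uij loop of a CIF.

```python
def label_mismatches(coord_labels, aniso_labels):
    """Return the Uij-loop labels that have no identical label in the xyz loop.

    The comparison is a strict string match: no attempt is made to guess
    that two differently written labels refer to the same site.
    """
    known = set(coord_labels)
    return [label for label in aniso_labels if label not in known]


if __name__ == "__main__":
    # Labels modelled on the 9003355 example above.
    coord_labels = ["SiT1*", "AlT1*"]   # loop containing xyz
    aniso_labels = ["SiT1’", "AlT1’"]   # loop containing Uij
    for label in label_mismatches(coord_labels, aniso_labels):
        print(f"Uij label {label!r} has no matching coordinate label")
```

With the labels above, both Uij entries are reported, because a prime and an asterisk are simply different characters to the program.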

I hope this helps, and do let me know if you have any questions.

With best wishes,
Yours faithfully,

David Palmer

David C Palmer, Ph.D. (Cantab), M.A. (Cantab),
Managing Director, CrystalMaker Software Ltd
Centre for Innovation & Enterprise |  Oxford University Begbroke Science Park
Woodstock Road, Begbroke, Oxfordshire, OX5 1PF, UK



-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: Error Files from COD (2023-07-04).txt
URL: <http://lists.crystallography.net/pipermail/cod-bugs/attachments/20230705/329f29da/attachment-0001.txt>