[Cod-bugs] [EXT] AMCSD duplicate atoms and missing aniso links?
Downs, Robert T - (rdowns)
rdowns at arizona.edu
Sat Aug 3 22:48:31 EEST 2024
Saulius,
I will replace every single quote ‘ in atom names in the entire database. You won’t see the changes until I rebuild it, however, and that won’t be for a month or so.
Thanks for pointing it out to me. If you see any issues with double quotes “, let me know.
Thanks,
Bob
From: Saulius Gražulis <grazulis at ibt.lt>
Sent: Saturday, August 3, 2024 3:49 AM
To: Bob Downs <rdowns at u.arizona.edu>
Cc: cod-bugs at lists.crystallography.net; kicis at lists.crystallography.net
Subject: [EXT] AMCSD duplicate atoms and missing aniso links?
Hi, Bob!
How are you?
I'm currently doing COD data validation, and I'm running into two issues that I would need your help to resolve.
1. Some AMCSD entries (e.g. COD ID 9015515 [1], AMCSD ID "0019710" [2]) have recently received the '_atom_site_aniso_[]' loop. This is great, but in some cases its labels do not match the atom site labels in the _atom_site_[] loop:
loop_
_atom_site_aniso_label
_atom_site_aniso_U_11
_atom_site_aniso_U_22
_atom_site_aniso_U_33
_atom_site_aniso_U_12
_atom_site_aniso_U_13
_atom_site_aniso_U_23
# ...
Pb' 0.03159 0.03245 0.01811 0.01856 -0.00932 -0.00819
# ...
vs.:
loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_occupancy
_atom_site_U_iso_or_equiv
# ...
Pb* 0.10313 0.33075 0.34924 0.19800 0.02774
# ...
Can I assume that the atoms with quotes and the atoms with stars are the same, e.g. that "Pb'" and "Pb*" are the same atom?
I have written a script to fix this under the above-mentioned assumption, and I'm about to commit the changes to the COD. It would be great to propagate these changes back to AMCSD; what do you think?
(Actually, I had the same issue with atom names containing hyphens, e.g. 'OH1' and 'O-H1', but I see that this issue is already fixed in AMCSD; I've updated the COD accordingly :).
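The label reconciliation described in item 1 could be sketched roughly as follows. This is not the actual COD script; the function name `reconcile_aniso_labels` and the plain-list inputs are assumptions made for illustration, and the code assumes (as asked above) that a trailing quote and a star denote the same atom.

```python
def reconcile_aniso_labels(site_labels, aniso_labels):
    """Rename aniso labels so that they match the atom site labels.

    Assumption: "Pb'" in the aniso loop and "Pb*" in the atom site loop
    name the same atom. Labels that still do not match are reported.
    """
    site_set = set(site_labels)
    fixed = []
    unresolved = []
    for label in aniso_labels:
        if label in site_set:
            # Already consistent, keep as is.
            fixed.append(label)
            continue
        candidate = label.replace("'", "*")
        if candidate in site_set:
            # The quote/star variant exists in the atom site loop.
            fixed.append(candidate)
        else:
            # Leave untouched and flag for manual inspection.
            fixed.append(label)
            unresolved.append(label)
    return fixed, unresolved


fixed, unresolved = reconcile_aniso_labels(["Pb", "Pb*", "O1"],
                                           ["Pb", "Pb'", "O2"])
print(fixed, unresolved)
```

Labels that cannot be matched either way are returned separately rather than guessed at, so they can be checked by hand.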
2. In some places (56 COD entries) there are duplicate atom labels. When these are in two loops, we cannot decide unambiguously which ANISO entry pertains to which atom. Example (from COD ID 9000543 [3], AMCSD ID "0000554" [4]):
loop_
_atom_site_aniso_label
_atom_site_aniso_U_11
_atom_site_aniso_U_22
_atom_site_aniso_U_33
_atom_site_aniso_U_12
_atom_site_aniso_U_13
_atom_site_aniso_U_23
# ...
Mg 0.00977 0.01092 0.00940 0.00000 0.00000 0.00000
Mg 0.01490 0.00863 0.00956 0.00008 -0.00278 0.00053
# ...
and
loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_occupancy
# ...
Mg 0.00720 0.39590 0.50000 0.21400
Mg 0.25000 0.50000 0.25000 1.00000
# ...
Can I assume, in general, that the atoms in both loops are in the same order, and number them 'Mg1' and 'Mg2' correspondingly? Like this:
loop_
_atom_site_aniso_label
_atom_site_aniso_U_11
_atom_site_aniso_U_22
_atom_site_aniso_U_33
_atom_site_aniso_U_12
_atom_site_aniso_U_13
_atom_site_aniso_U_23
# ...
Mg1 0.00977 0.01092 0.00940 0.00000 0.00000 0.00000
Mg2 0.01490 0.00863 0.00956 0.00008 -0.00278 0.00053
# ...
and
loop_
_atom_site_label
_atom_site_fract_x
_atom_site_fract_y
_atom_site_fract_z
_atom_site_occupancy
# ...
Mg1 0.00720 0.39590 0.50000 0.21400
Mg2 0.25000 0.50000 0.25000 1.00000
# ...
I would do this for the COD (probably manually). Again, it would be great to back-propagate these changes to the AMCSD collection. I can send you the list of changed files or the list of problematic entries if that would help.
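For illustration, the renumbering proposed in item 2 might look like the sketch below. It relies on the assumption stated above, that both loops list the duplicated atoms in the same order; the function names are hypothetical, and the sketch does not guard against a new label such as 'Mg1' colliding with a label already present in the entry.

```python
from collections import Counter


def number_duplicates(labels):
    """Append 1, 2, ... to every label that occurs more than once, in order."""
    totals = Counter(labels)
    seen = Counter()
    numbered = []
    for label in labels:
        if totals[label] > 1:
            seen[label] += 1
            numbered.append(f"{label}{seen[label]}")
        else:
            numbered.append(label)
    return numbered


def renumber_both_loops(site_labels, aniso_labels):
    """Renumber duplicates in both loops, refusing ambiguous cases."""
    site_counts = Counter(site_labels)
    aniso_counts = Counter(aniso_labels)
    for label in set(site_counts) | set(aniso_counts):
        s, a = site_counts[label], aniso_counts[label]
        # A duplicated label must have either no aniso rows at all or
        # exactly as many as atom site rows; otherwise order alone
        # cannot tell which atom each aniso row belongs to.
        if max(s, a) > 1 and a not in (0, s):
            raise ValueError(f"cannot pair duplicate label {label!r} by order")
    return number_duplicates(site_labels), number_duplicates(aniso_labels)


site, aniso = renumber_both_loops(["Mg", "Mg", "Si", "O"], ["Mg", "Mg", "Si"])
print(site, aniso)
```

Entries where the counts disagree raise an error instead of being renumbered, which leaves them for manual treatment.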
Cheers,
Saulius
Refs.:
[1] http://www.crystallography.net/cod/9015515.cif, http://www.crystallography.net/cod/9015515.html
[2] https://rruff.geo.arizona.edu/AMS/CIF_text_files/05517_cif.txt
[3] http://www.crystallography.net/cod/9000543.cif, http://www.crystallography.net/cod/9000543.html
[4] https://rruff.geo.arizona.edu/AMS/CIF_text_files/00635_cif.txt
--
Dr. Saulius Gražulis
Vilnius University Institute of Biotechnology, Saulėtekio al. 7
LT-10257 Vilnius, Lietuva (Lithuania)
fax: (+370-5)-2234367 / phone (office): (+370-5)-2234353
mobile: (+370-684)-49802, (+370-614)-36366