[Cod-bugs] theoretical calculation results in COD
Saulius Gražulis
grazulis at ibt.lt
Thu Sep 11 14:49:15 EEST 2025
Dear Michal,
thank you very much for your e-mail and for the offer of structure
depositions!
On 2025-09-10 11:44, Michal Hušák wrote:
> Hi
>
> I was asked somebody during ECM35 conference to store our
> theoretically recalculated structures in COD ....
That was probably me :). In any case, you sent your mail to the right
address :)
>
> Can I have related questions:
>
> 1) I see you have suggested a CIF DFT dictionary:
>
> https://wiki.crystallography.net/cif/dictionaries/ddl1/cif_dft/
>
> Can I have some sample CIF files using this dictionary to be sure I
> understand the fields ?
We recently discussed a similar example with our colleagues, so I
will share the same example with you:
> On 2025-06-18 12:09, Andrius Merkys wrote:
> Below I list TCOD IDs (or their ranges) which originate from VASP:
>
> * 20000173: mentions VASP usage
>
> * range 20000419 -- 20000423: from VASP, has bulk modulus
>
> * range 20001806 -- 20002020: a bunch of structures from
> doi:10.1103/physrevb.92.014106 converted from POSCAR format, has
> POSCAR embedded in CIF
>
> * 20006288 and 20006289: seemingly from VASP, but contains no
> calculation descriptions, just the coordinates.
>
> Most of them were manually curated (by me). 10 years ago I wrote a
> (lossy) tool converting VASP XML to CIF with TCOD data items [1], but
> I am not sure how well did it age - I have not used it for a while. I
> could in principle revive it if there was any interest.
The 8-digit IDs are IDs from the TCOD [1]. You can access structures
over the Web from our server using the standard resolving URLs, e.g.:
https://www.crystallography.net/tcod/20000173.html
https://www.crystallography.net/tcod/20000173.cif
Likewise for other examples:
https://www.crystallography.net/tcod/20001806.html
https://www.crystallography.net/tcod/20001806.cif
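The resolving URLs above follow a simple pattern, which can be sketched in Python as below. This is only an illustration: the helper names `tcod_urls` and `tcod_cif_urls_for_range` are made up for this example, and it is an assumption that every ID inside a listed range actually exists. The sketch only builds URLs and downloads nothing.

```python
TCOD_BASE = "https://www.crystallography.net/tcod"


def tcod_urls(tcod_id):
    """Return the HTML and CIF resolving URLs for one 8-digit TCOD ID."""
    text = str(tcod_id)
    if len(text) != 8 or not text.isdigit():
        raise ValueError(f"expected an 8-digit TCOD ID, got {tcod_id!r}")
    return {
        "html": f"{TCOD_BASE}/{text}.html",
        "cif": f"{TCOD_BASE}/{text}.cif",
    }


def tcod_cif_urls_for_range(first, last):
    """Return CIF URLs for an inclusive ID range (assumes no gaps)."""
    return [tcod_urls(i)["cif"] for i in range(first, last + 1)]


if __name__ == "__main__":
    print(tcod_urls(20000173)["html"])
    for url in tcod_cif_urls_for_range(20000419, 20000423):
        print(url)
```

For more than a handful of files, the Subversion checkout is the better route.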
You can also check out the full TCOD to your disk. The data are in a
public Subversion repository; on Linux, the command to check it out
is:
svn co svn://www.crystallography.net/tcod
NB: we were recently "attacked" by Web bots that would indiscriminately
download multiple CIFs, send tens of requests at once from different
locations and download all historical CIFs (old revisions), which caused
our servers to crash from overload. We therefore had to resort to
limiting downloads to 20-30 CIFs per day and blocking IP addresses that
download much more.
If you need more data, the better way (both for you and for us) is to
get the whole TCOD using the "svn co" method :). If, however, your IP
gets blocked for some reason, please e-mail me with your IP and we will
whitelist it. The same applies to COD data :)
>
> Results from CASTEP, Quantum Espresso are preferable (CP2K, CRYSTAL ,
> VASP results are OK as well).
The examples are mostly from VASP, but I think we have QE examples as
well. Please let me know if you are interested in these and
cannot find them in the TCOD data.
>
> It will be nice to see the output form the QM program and the CIF to
> validate the data correspondence ...
Some files contain VASP and QE inputs.
>
> 2) Can i search in COD only for calculated structures (and e.g.
> specify some of the DFT fields) ?
It has been decided that calculated structures should go to the TCOD, and
experimental ones (i.e. refined against diffraction or other measurement
data) will be stored in the COD. It happens, however, that sometimes
theoretical structures end up in the COD before we spot them (mostly
from publication supplementary materials). If this happens, since COD ID
is a stable identifier, we do not delete the COD structure but mark it
in the database column `method` with the value "theoretical". In the
CIFs themselves, the data are marked with the data item
"_cod_struct_determination_method theoretical".
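As a rough illustration, the data item can be detected in the text of a single CIF with a few lines of Python. This is a sketch, not a CIF parser: it assumes the data item and its value sit on one line outside any loop, and the function names are invented for this example.

```python
import re

# Matches the data item at the start of a line and captures its value.
_METHOD_RE = re.compile(
    r"^[ \t]*_cod_struct_determination_method[ \t]+(\S+)",
    re.MULTILINE,
)


def determination_method(cif_text):
    """Return the value of _cod_struct_determination_method, or None."""
    match = _METHOD_RE.search(cif_text)
    if match is None:
        return None
    # CIF values may be quoted; strip surrounding quotes if present.
    return match.group(1).strip("'\"")


def is_theoretical(cif_text):
    """True if the CIF text is marked as a theoretical structure."""
    return determination_method(cif_text) == "theoretical"


if __name__ == "__main__":
    sample = "data_example\n_cod_struct_determination_method theoretical\n"
    print(is_theoretical(sample))
```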
To search for theoretical structures in the COD, you have three
methods:
- check out the full COD CIF collection (caution: 190G checkout...)
using:
svn co svn://www.crystallography.net/cod/cif cod-cif
and then run something like this on the obtained files to get the list
of identified theoretical ones:
find cod-cif/ -name '*.cif' | xargs grep -l "_cod_struct_determination_method *theoretical"
- query the COD SQL directly:
mysql -h sql.crystallography.net -u cod_reader cod -e 'select file from data where method = "theoretical"'
- query structures on the Web and check the "include theoretical
structures" box on the Web form; this will give you both theoretical
*and* experimental structures, but if you then search for only
experimental structures and subtract that set from the previous
one, the remaining ones will be theoretical.
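The subtraction in the third method is a plain set difference. A minimal Python sketch, assuming you have saved the COD IDs from the two Web searches as lists; the function name and the IDs shown are made up for illustration:

```python
def theoretical_only(with_theoretical, experimental_only):
    """Return IDs found in the first search but not in the second.

    with_theoretical:   IDs from a search with the "include theoretical
                        structures" box checked
    experimental_only:  IDs from the same search without that box
    """
    return sorted(set(with_theoretical) - set(experimental_only))


if __name__ == "__main__":
    # Placeholder IDs, not real search results.
    all_ids = ["1000001", "1000002", "1000003"]
    experimental_ids = ["1000001", "1000003"]
    print(theoretical_only(all_ids, experimental_ids))
```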
In the TCOD, all structures are theoretical.
>
> 3) How can I deposit recalculated structures (corresponding to some
> CSD or COD structures) ? Will they do not mix with experimental
> structures ?
Good point – as said, in order to not mix theoretical and experimental
structures, we should deposit theoretical structures to the TCOD [1].
There is a "deposit your data" link on the Web, and for larger sets we
have programmatic deposition (please let us know if you would like to
use it). The procedure for deposition is the same as for the COD database.
HTH,
Yours,
Saulius
>
> Michal Husak
>
> UCT Prague
>
>
>
References:
[1] Theoretical Crystallography Open Database (2025)
https://www.crystallography.net/tcod
--
Dr. Saulius Gražulis
Vilnius University Institute of Biotechnology, Saulėtekio al. 7
LT-10257 Vilnius, Lietuva (Lithuania)
mobile: (+370-684)-49802, (+370-614)-36366