--- Torbjörn Björkman, PhD COMP, Aalto University School of Science Espoo, Finland
My point is: the task of comparing experiment and theory (i.e. comparing COD and TCOD) is a research task in its own right.
To my knowledge, theoreticians are aware of the low quality of the cell parameters derived from their DFT optimizations, to the point that they prefer to fix them to the experimental values when available.
I would agree with Stefaan that this is a problem we largely know how to deal with, if all you want is lattice constants for a single compound. But in practice there are often associated complications, such as the need to simultaneously describe two different structures at an interface, and then you often need to weigh a lot of different factors together to get the best approximation to the whole problem. It is fairly common that this balance in the end produces the solution "just fix the lattice constants to their experimental values", since that is typically good enough, well controlled and easily reproducible.
This triggered another thought: isn't the information whether only positions were optimized and/or whether the full cell shape was optimized as well level-0 information? If we see only the cif without this information, then there is not much that can be concluded about such an entry. And as a corollary: doesn't that imply that there is no place in TCOD for DFT calculations that start from the experimental cif without any subsequent optimization? What do others think?
Hmm... actually, it never even occurred to me that someone might want to submit an only partially optimized structure. I guess that means that I think that there is no place for such calculations in TCOD. It seems very difficult to estimate the practical value of such a calculation and I can't see myself being very interested in the outcome. But there may be folks out there with a different viewpoint. I can guess that there could be a use for calculations like that for things like molecular crystals, where the molecules are easy and their weak interaction (which will determine the cell shape) is hard, whereas experimentally the cell shape is easier (I guess, but correct me if I'm wrong). Anyway, my instinctive vote is No.
Cheers, Torbjörn
_______________________________________________ Tcod mailing list Tcod@lists.crystallography.net http://lists.crystallography.net/cgi-bin/mailman/listinfo/tcod
People who compute spectra from DFT often just take the experimental cif and do their computational spectroscopy on that one. Although these are legitimate DFT calculations, there is nothing in them that is useful for a structural database such as TCOD.
Everybody is biased by his/her background. So am I. I have often calculated hyperfine properties of solids. They can sensitively depend on the atomic positions, but usually do not depend very much on small variations of the cell shape. And as one needs an LAPW code for accurate hyperfine properties -- a type of method in which cell shape optimizations are very tedious due to the lack of a stress tensor -- the standard procedure in that field is to take the experimental cell shape but DFT-optimized atomic positions. Having such data included in TCOD looks meaningful to me, provided one is aware that the cell parameters are taken from experiment. Which is why this should probably be level-0 info.
Furthermore, as atomic positions are harder to get from experiment than the cell parameters, I can imagine that refinement people would also want to contribute calculations with experimental cell parameters and DFT-optimized positions.
Obviously, a fully DFT-optimized crystal (cell parameters and positions) is the most useful type of entry.
Stefaan
Hi, Stefaan,
On 2014-07-31 09:47, Stefaan Cottenier wrote:
Everybody is biased by his/her background. So am I. I have often calculated hyperfine properties of solids. They can sensitively depend on the atomic positions, but usually do not depend very much on small variations of the cell shape. And as one needs an LAPW code for accurate hyperfine properties -- a type of method in which cell shape optimizations are very tedious due to the lack of a stress tensor -- the standard procedure in that field is to take the experimental cell shape but DFT-optimized atomic positions. Having such data included in TCOD looks meaningful to me, provided one is aware that the cell parameters are taken from experiment.
Absolutely! We just need to mark carefully that atomic positions were optimised, and cell constants were not.
Something along the lines:
loop_
_tcod_optimisation_id
_tcod_optimisation_parameter
_tcod_optimisation_flag
1 cell_constants no
2 atomic_coordinates yes
3 thermal_displacements no # Most probably 'no' for current
                           # computations, but 'yes' may become common
                           # in the future
OK?
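To illustrate how a consumer of such an entry might use these flags, here is a minimal Python sketch. It assumes the three tag names proposed above (they are a suggestion in this thread, not an agreed dictionary entry) and it is a deliberately simplistic line-based reader, not a real CIF parser: it only handles unquoted values, one row per line.

```python
def parse_optimisation_flags(cif_text):
    """Return {parameter: bool} from a _tcod_optimisation_* loop.

    The tag names are the ones proposed in this thread (an assumption,
    not an established dictionary). Only unquoted, one-row-per-line
    loops are supported.
    """
    expected = [
        "_tcod_optimisation_id",
        "_tcod_optimisation_parameter",
        "_tcod_optimisation_flag",
    ]
    # Drop comments and blank lines; the values used here never contain '#'.
    lines = [ln.split("#", 1)[0].strip() for ln in cif_text.splitlines()]
    lines = [ln for ln in lines if ln]

    start = lines.index("loop_") + 1
    tags = lines[start:start + len(expected)]
    if tags != expected:
        raise ValueError("unexpected loop header: %r" % (tags,))

    flags = {}
    for row in lines[start + len(expected):]:
        # Stop when the next data item, loop or data block begins.
        if row.startswith("_") or row == "loop_" or row.startswith("data_"):
            break
        _row_id, parameter, flag = row.split()
        flags[parameter] = (flag.lower() == "yes")
    return flags


example = """\
loop_
_tcod_optimisation_id
_tcod_optimisation_parameter
_tcod_optimisation_flag
1 cell_constants no
2 atomic_coordinates yes
3 thermal_displacements no # most probably 'no' for current computations
"""

flags = parse_optimisation_flags(example)
print(flags)
```

With flags like these, a comparison against COD could, for instance, skip the cell constants of entries where `cell_constants` is not flagged as optimised, since those values are simply the experimental ones.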
Which is why this should probably be level-0 info.
Just for clarification: the levels, as I was thinking of them, were not meant to reflect "completeness" of computations or level of theory (there will be a separate tag for this), but a level of *presentation* in TCOD. That is, the computation you have just discussed could in principle be described as a level-0 TCOD entry (only coordinates and a publication reference, peer reviewed at the time and place of the publication), level-1 (coordinates + convergence parameters, can in principle be inspected and evaluated by experts in its own right) or level-2 (full description of the computation allowing automated replication at least at the time of deposition, when the software codes are still available).
Regards, Saulius
Hi,
More ideally, I think that the whole DFT instruction file should be provided inside of the CIF. As you know, IUCr is now asking for such refinement instruction details in the following way with a tag :
_iucr_refine_instructions_details
;
insert here the SHELX .ins or .res file or etc (Rietveld entry file for powders)
;
We could have some specific TCOD tag :
_tcod_optimization_instructions_details
;
insert here the VASP/WIEN2K/CASTEP/... entry file
;
Of course the software name and the version should be given. This would allow for some reproducibility test and for example test files. The fact that cell parameters are optimized or not (and other details) will appear there.
Best,
Armel
PS - Crystallography (apples) is >100 years old and now some are considering it a technique rather than a science. Quantum Mechanics (bananas) is almost as old but considerably younger in its applications for optimizing/predicting/... matter, so it may need 100 more years before being called a technique too ;-). TCOD may help accelerate progress by allowing access to the complete datafiles.
More ideally, I think that the whole DFT instruction file should be provided inside of the CIF.
(...)
Of course the software name and the version should be given. This would allow for some reproducibility test and for example test files. The fact that cell parameters are optimized or not (and other details) will appear there.
The entire instruction file is something which in Saulius' scheme is at level 2. It would definitely be ideal if every entry had all information up to level 2. But while for some codes that will be straightforward (one input file and go), for others it would require many input files and a sequence of commands (with varying options) to go from the initial structure to the optimized one in a reproducible way (and as codes evolve, the input might quickly become incompatible). Strictly requiring that all this information be present would scare away many entries that would nevertheless have been useful.
Providing different levels is an elegant solution. The question then is: what is the strictly necessary information that must be provided (= level 0)? Items that have been suggested so far for level 0 are:
* cif of the final structure
* publication reference
* level of theory (XC within DFT, or name of method if not DFT)
* full optimization or only positions
Part of this information will be repeated in the full input files if level 2 is included, but that is not a problem.
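As a rough sketch of what "strictly necessary" could mean in practice, the four items suggested above can be checked mechanically at deposition time. Everything in the Python snippet below is hypothetical: the key names, the allowed values for the optimisation scope and the dictionary representation of an entry are illustrative assumptions, not part of any TCOD dictionary or deposition tool.

```python
# Hypothetical key names for the four level-0 items suggested so far.
LEVEL0_ITEMS = (
    "final_structure_cif",    # cif of the final structure
    "publication_reference",  # publication reference
    "level_of_theory",        # XC within DFT, or name of method if not DFT
    "optimisation_scope",     # full optimization or only positions
)

# Hypothetical vocabulary for the last item.
ALLOWED_SCOPES = {"full", "positions_only"}


def missing_level0_items(entry):
    """Return the level-0 items that are absent, empty or invalid in `entry`."""
    missing = [key for key in LEVEL0_ITEMS if not entry.get(key)]
    scope = entry.get("optimisation_scope")
    if scope and scope not in ALLOWED_SCOPES:
        missing.append("optimisation_scope")
    return missing


# Placeholder values only; no real structure or reference is implied.
complete = {
    "final_structure_cif": "data_placeholder",
    "publication_reference": "placeholder reference",
    "level_of_theory": "placeholder XC functional",
    "optimisation_scope": "positions_only",
}
incomplete = dict(complete, optimisation_scope="")

print(missing_level0_items(complete))
print(missing_level0_items(incomplete))
```

An entry that passes such a check would qualify as level 0; the additional level-1 and level-2 information could then be treated as optional on top of it.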
Stefaan
On 2014-07-31 10:46, Armel le Bail wrote:
More ideally, I think that the whole DFT instruction file should be provided inside of the CIF. As you know, IUCr is now asking for such refinement instruction details in the following way with a tag :
_iucr_refine_instructions_details
;
insert here the SHELX .ins or .res file or etc (Rietveld entry file for powders)
;
Absolutely. This is suggested for the Level 2 description.
What the tag names and semantics should be is open for discussion, but once we decide on them here, we put them into the TCOD CIF dictionaries and can reuse them.
As was mentioned on this list several times by DFT experts, some codes might require a more complicated computation flow for complete reproducibility; e.g. relaxation might be done using a different program than was used for DFT optimisation. Such a workflow can still be accommodated in the same way, by providing a shell script or .bat file as an input to /bin/sh or cmd.exe, and providing the contents, or stable web references, of all additional input files as well.
We could have some specific TCOD tag :
_tcod_optimization_instructions_details
;
insert here the VASP/WIEN2K/CASTEP/... entry file
;
Since there may be more than one file needed for a DFT computation, we'll probably need a loop (see the attached CIF file).
This would be a level-2 description.
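For illustration only, a loop holding several input files could be generated along the following lines. This Python sketch is an assumption, not the layout of the attached CIF file: the tag names `_tcod_file_id`, `_tcod_file_name` and `_tcod_file_contents` are hypothetical, and the writer refuses anything it cannot represent safely in a semicolon-delimited text field instead of trying to escape it.

```python
def files_to_cif_loop(files):
    """Turn a list of (name, contents) pairs into a CIF loop.

    Tag names are hypothetical. Each file's contents go into a
    semicolon-delimited text field, one file per loop row.
    """
    out = ["loop_", "_tcod_file_id", "_tcod_file_name", "_tcod_file_contents"]
    for file_id, (name, contents) in enumerate(files, start=1):
        if not name or any(ch.isspace() for ch in name):
            raise ValueError("file name must be non-empty and without spaces")
        content_lines = contents.splitlines()
        # A line starting with ';' would terminate the text field early.
        if any(line.startswith(";") for line in content_lines):
            raise ValueError("contents of %r contain a line starting with ';'" % name)
        out.append("%d %s" % (file_id, name))
        out.append(";")
        out.extend(content_lines)
        out.append(";")
    return "\n".join(out) + "\n"


# Placeholder files: a driver script plus one input file.
loop_text = files_to_cif_loop([
    ("run.sh", "echo step1\necho step2"),
    ("input_1.txt", "placeholder input line"),
])
print(loop_text)
```

A driver script such as the `run.sh` placeholder would be the entry that records the order of the steps, while the remaining rows carry the input files it refers to.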
Of course the software name and the version should be given.
Tags are already here for this:
_tcod_software_package
_tcod_software_package_version
(http://www.crystallography.net/tcod/cif/dictionaries/cif_tcod.dic)
Maybe we should be more specific and mention both the *code* version (as versioned by the developers of the code) and the *package* version (e.g. the Debian package version used for installation and computation).
This would allow for some reproducibility test and for example test files. The fact that cell parameters are optimized or not (and other details) will appear there.
Indeed. There are concerns that a full description will be too large, or obsolete too soon, to be useful. I would say, however, that TCOD should provide uniform mechanisms to describe the necessary information; if people want, they use it; if they don't -- not a big deal, a lot of CIF tags are used only once in a while. But we definitely do not want a situation where people would like to provide information but TCOD has no means to capture it :)
Regards, Saulius