Hello Saulius,
I have a (rather long) list of questions regarding the TCOD dictionaries; I've tried to compile it here. I'd be grateful if you could comment on them as far as your time allows.
Let me comment on those questions that I can answer on the spot, without looking into the dictionary yet:
a) The units of energy that we started to use in the dictionaries are eV (electron-volts). For distances, we should probably use Angstroms, since then we can more easily compare computation results with crystallographic experimental data. This naturally suggests eV/A as the unit for forces. Is that OK, or should we rather use SI units of a similar scale (say aJ, atto-Joules)? The only problem I see with the eV is that it is measured, not derived from the base SI units (http://en.wikipedia.org/wiki/Electronvolt, http://physics.nist.gov/cuu/Units/outside.html). We should probably stay away from Hartrees, Rydbergs and Bohrs for archiving purposes, shouldn't we?
NB: for archival computer files, it is more convenient to always have all data in the same units, in all files, and not to allow different units -- at least with the current standard software libraries.
Although SI units would in principle be the right choice, nobody uses them in this context. At least eV and Angstrom have a special status ('tolerated units' within SI), hence allowing eV, Angstrom and therefore eV/A for forces is a fair compromise.
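For concreteness, a minimal Python sketch of what these choices imply numerically (rounded CODATA factors; the function names are ours, purely illustrative):

    # Rounded CODATA conversion factors; illustrative only.
    EV_TO_JOULE      = 1.602177e-19   # 1 eV in J (measured, not defined)
    HARTREE_TO_EV    = 27.211386      # 1 Hartree in eV
    RYDBERG_TO_EV    = 13.605693      # 1 Rydberg in eV (= Hartree / 2)
    BOHR_TO_ANGSTROM = 0.529177       # 1 Bohr in Angstrom

    def ev_to_attojoule(energy_ev):
        """Energy: eV -> aJ (1 aJ = 1e-18 J)."""
        return energy_ev * EV_TO_JOULE / 1e-18

    def force_ev_per_a_to_newton(force_ev_per_a):
        """Force: eV/A -> N (1 A = 1e-10 m)."""
        return force_ev_per_a * EV_TO_JOULE / 1e-10

    print(ev_to_attojoule(1.0))            # ~0.160 aJ: eV and aJ are of comparable scale
    print(force_ev_per_a_to_newton(0.05))  # ~8.0e-11 N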
b) Is the nuclear electric dipole moment used/necessary for DFT computations (_dft_atom_type_nuclear_dipole)?
c) If b) is "yes", what units should we use for the electric dipole (and higher moments) -- Debyes, e*A (unit charges times Angstroms), or something else?
There must be some kind of confusion here -- my old nuclear physics courses always emphasized that nuclei do not have an electric dipole moment. Do you perhaps mean the magnetic dipole moment or the nuclear quadrupole moment? Anyway, nuclear properties are never required for DFT calculations as such, but they can be used to convert DFT predictions into quantities that are experimentally accessible. I don't see the need to keep track of this in a computational database, however.
d) Is my definition of residual forces in cif_tcod.dic, "data_tcod_atom_site_residual_force", correct/acceptable? If not, how should we define them?
<skip>
e) If I understand correctly, DFT and most other QM methods operate under the Born-Oppenheimer approximation; under this approximation, the electron densities (electron wave-functions) are optimised to minimal energy at fixed nuclear positions and unit cell parameters, and when this converges, the nuclear positions and/or cell constants are changed slightly (e.g. along the gradients), and the electronic energy is minimised again. Is this a correct view? Is it universal across QM codes?
Yes, correct. It is pretty universal. There are some special-purpose applications that do not make the Born-Oppenheimer approximation, but that's really a minority.
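To make that picture concrete, here is a toy, runnable Python sketch of the two nested loops; the "electronic" problem is replaced by a trivial one-dimensional model, so only the loop structure (not the physics) is meaningful:

    def electronic_energy(x, tol=1e-10, mix=0.5):
        """Stand-in for the SCF "microcycles": a mixed fixed-point
        iteration that converges to the model energy E(x) = (x - 1)**2."""
        e = 0.0
        while True:
            e_new = (1.0 - mix) * e + mix * (x - 1.0) ** 2
            if abs(e_new - e) < tol:      # SCF convergence criterion
                return e_new
            e = e_new

    def relax(x, force_tol=1e-6, step=0.1):
        """The "macrocycles": move the single "nucleus" x along the force."""
        while True:
            e = electronic_energy(x)      # microcycles at fixed x
            force = -2.0 * (x - 1.0)      # -dE/dx for the model
            if abs(force) < force_tol:    # relaxed geometry
                return x, e
            x += step * force             # steepest-descent step

    print(relax(x=0.0))   # converges to x = 1, E = 0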
f) If e) is "yes", then we can talk about "microcycles" (electron w/f refinement) and "macrocycles" (nuclei/cell shifted in each "macrocycle"). We could also document the total energy changes in all these cycles, to monitor the convergence, as I suggest in the _tcod_computation_cycle_... data items. Is such a view acceptable? What is the terminology used in different codes? Would such a table be useful? Would it be easy to obtain from most codes?
I advise staying away from that. Your view is correct, but this information is often not mentioned even in papers. Documenting such changes would be arbitrary, to some extent. The final relaxed geometry is well-defined, but what do you take as the unrelaxed starting point...?
g) I have attempted to put all DFT-related data items into the cif_dft.dic dictionary, and the general computational items (suitable also for MM and other methods) into cif_tcod.dic. Are all the parameters in cif_dft.dic indeed DFT-specific? Are they named and commented properly?
<skip>
h) The CML CompChem dictionary mentions SCF as a method. I know HF and its modifications are SCF; is DFT technically also SCF? Are there more SCF methods that are not HF? Should we include "SCF" among the enumeration values of _tcod_model as a separate model?
'SCF' refers only to the fact that a particular iterative solving scheme is used. As such, I would consider that term less informative than HF or DFT (one could even imagine doing DFT without SCF, although in practice this is very rarely done).
i) "Model" is a very overloaded term. Maybe it would be better to rename _tcod_model to "_tcod_method", or "_tcod_theory_level"?
<skip>
j) I have taken the _dft_basisset_type list from http://www.xml-cml.org/dictionary/compchem/#basisSet, to preserve at least the possibility of a CIF->CML->CIF round trip. The two big classes of basis functions, as I have learned, are localised (Slater, Gaussian) and plane-wave. Should we introduce such a classification on top of the _dft_basisset_type enumerator?
I don't think so. It will be implicit in the name people use for the basis set.
Are localised bases relevant for DFT at all?
Yes, sure (SIESTA, for instance).
Is it enough for DFT to specify just an energy cut-off, assuming a plane-wave basis (for a given pseudopotential), or are there different possible bases also among plane waves (I guess there should not be, but maybe I'm missing something...)?
For plane waves the energy cut-off is the only quantity. But there are many other types of basis sets that are not plane waves. For these, more specification might be needed (although often it is contained in the published definition of the basis set).
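To illustrate why the cut-off alone fixes a plane-wave basis, a toy Python sketch counting the plane waves admitted by a given cut-off for a simple cubic cell (Gamma point only, free-electron dispersion; the cell edge and cut-off values are arbitrary):

    import math

    def count_plane_waves(a_angstrom, ecut_ev):
        """Count plane waves exp(iG.r) with (hbar^2/2m)|G|^2 <= E_cut
        for a simple cubic cell of edge a, at the Gamma point."""
        HBAR2_OVER_2M = 3.80998                    # hbar^2/(2 m_e), eV*A^2
        gmax = math.sqrt(ecut_ev / HBAR2_OVER_2M)  # largest |G|, in 1/A
        dg = 2.0 * math.pi / a_angstrom            # reciprocal-lattice step
        n = int(gmax / dg) + 1
        count = 0
        for i in range(-n, n + 1):
            for j in range(-n, n + 1):
                for k in range(-n, n + 1):
                    if HBAR2_OVER_2M * dg * dg * (i*i + j*j + k*k) <= ecut_ev:
                        count += 1
        return count

    print(count_plane_waves(a_angstrom=5.0, ecut_ev=400.0))   # basis size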
Or are localised basis sets relevant for computing pseudopotentials (cores)? Can I assume that dftxfit, dftorb and dftcfit are plane-wave bases? What about 'periodic'? (These terms are all from the Basis Set Exchange database, I guess.)
<skip>
k) What is the difference between _dft_atom_basisset and _dft_basisset? Can they be simultaneously used in one computation? If not, maybe we can merge the two definition sets into one?
<skip>
l) There are a lot of *_conv data items declared (e.g. _dft_basisset_energy_conv). Are they for convergence tests? Or for convolutions? What is their proposed definition?
These are the convergence criteria, for instance: stop the iterative (SCF) cycle once the total energy changes by less than _dft_basisset_energy_conv over the last few iterations.
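A minimal Python sketch of that stopping rule (the function and its defaults are illustrative, not any particular code's):

    def scf_converged(energies, conv=1e-8, nsteps=3):
        """True once the total energy has changed by less than `conv`
        (e.g. the value of _dft_basisset_energy_conv) between each of
        the last `nsteps` consecutive SCF iterations."""
        if len(energies) < nsteps + 1:
            return False
        recent = energies[-(nsteps + 1):]
        return all(abs(b - a) < conv for a, b in zip(recent, recent[1:]))

    print(scf_converged([-10.2, -10.4, -10.4, -10.4, -10.4]))   # True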
m) Is _dft_cell_energy the same as the "total energy" reported by some codes? Can we rename it to _dft_total_energy?
Probably yes.
n) Am I correct to assume that the "total energy" reported by codes will always be the sum of separate energy terms (Coulomb, exchange, 1-electron, 2-electron, etc.)? Is there interest in having them recorded separately in the result data files (CIFs)? If yes, what is the "Hartree energy" (is it the sum of all single-electron energies in the SCF?), the "Ewald energy" (is it the electrostatic lattice energy, obtained by Ewald summation?) and the rest of the values in the AbInit output file? Are these terms consistent across QM codes?
Also here I think this is asking for way too much detail. Most codes can indeed split up the total energy into many contributions, but papers usually do not report that (only in special cases, when there is useful information in the splitting). If papers don't do it, databases shouldn't either -- that feels like a sound criterion.
o) How does one check that a computation has converged with respect to k-points, E-cutoff, smearing and other parameters, and that the pseudopotential is chosen correctly? From the Abinit tutorial (http://flex.phys.tohoku.ac.jp/texi/abinit/Tutorial/lesson_4.html) I got the impression that one needs to run the computation with different values of these parameters and see that the total energy, or other gauge values, no longer change significantly when these parameters are increased. Is that right? If yes, are there codes that do this automatically? Should we require the dependence of Etotal (or of the coordinates) on the k-grid, E-cutoff and smearing, to check convergence when depositing to TCOD? Or should the TCOD side check this automatically when appropriate (say, for F/LOSS codes)?
"k-points, E-cuttof, smear and other parameters" are indeed tested as you describe. The pseudpotential can't be tested that way, what people usually do is to verify whether the numerically converged results when using a particular pseudo do agree with experiment.
Doing such tests is the responsibility of each user. In principle, journals should not publish ab initio results if such tests are missing. Journals are not that strict, unfortunately. And some researchers are not very careful in that respect.
It's a longstanding problem that is gradually being solved, because computers are so fast now that the default settings of most codes are sufficiently accurate for many cases, even if a researcher does not explicitly test them.
Also here, TCOD shouldn't try to do better than the journals do.
p) What are other obvious things that one could get wrong in QM/DFT computations, and that could be checked formally?
That's an interesting one... With no answer from my side. If there is anything that can go obviously wrong, the codes will have an internal test for it already.
Stefaan