Questions still left regarding TCOD dictionaries - Tcod

4 Nov 2014


      Hi, folks,
I have a (rather long) list of questions regarding the TCOD
dictionaries; I've tried to compile it here. I'd be grateful if you
comment on them as much as you have time.
Questions:
a) the units of energy that we started to use in the dictionaries are eV
(electron-Volts). For distances, we should probably use Angstroems,
since then we can easier compare computation results with
crystallographic experimental data. This naturally suggests units for
forces as eV/A. Is that OK, or should we better use SI units of similar
scale (say aJ -- atto-Joules, on the similar scale)? The only problem I
see with eV is that it is measured, not defined from the basic SI units
(http://en.wikipedia.org/wiki/Electronvolt,
http://physics.nist.gov/cuu/Units/outside.html). We should probably stay
of from Hartries, Rydbergs, Bohrs for archiving purposes, shouldn't we?
NB.: for archival computer files, it is more convenient to have all data
in always in the same units, in all files, and not to allow different
units -- at least with current standard software libraries.
b) Is nuclear electric dipole moment used/necessary for DFT computations
(_dft_atom_type_nuclear_dipole)?
c) If b) is "yes", what units we should use for electric dipole (and
higher moments) -- Debayes, e*A (amount of unit charges times
Angstroems), or something else?
d) Is my definition of residual forces in cif_tcod.dic,
"data_tcod_atom_site_residual_force" correct/acceptable? If not, how
should we define them?
e) If I understand correctly, DFT and most other QM methods operate
under Born-Oppenheimer approximation; under this approximation, electron
densities (electron wave-functions) are optimised to minimal energy at
fixed nuclei and unit cell parameters, and when this converges, the
nuclei parameters and/or cell constants are changed slightly (e.g. along
gradients), and the electron energy is minimised again. Is this a
correct view? Is it a universal situation across QM codes?
f) If e) is "yes", then we can talk about "microcycles" (electron w/f
refinement) and "macrocycles" (nuclei/cell shifted in each
"macrocycle"). We could also document total energy changes in all these
cycles, to monitor the convergence, as I suggest in
_tcod_computation_cycle_... data items. Is such view acceptable? What is
the terminology used in different codes? Will such table be useful? Will
it be easy to obtain from most codes?
g) I have attempted to put all DFT related data items into the
cif_dft.dic dictionary, and the general computational items (suitable
also for MM and other methods), and into the cif_tcod.dic. Are all the
parameters in cif_dft.dic indeed DFT specific? Are they named and
commented properly?
h) The CML CompChem dictionary mentions SCF as a method. I know HF and
its modifications are SCF; is DFT technically also SCF? Are there more
SCF methods that are not HF? Should we include "SCF" into the
enumeration values of the _tcod_model as a separate model?
i) "Model" is a very overloaded term. Maybe it would be better to rename
_tcod_model to "_tcod_method", or "_tcod_theory_level"?
j) I have taken the _dft_basisset_type list from
http://www.xml-cml.org/dictionary/compchem/#basisSet, to strive at least
potentially for a possibility of the CIF->CML->CIF roundtrip. The two
big classes of basis functions, as I have learned, are localised
(Slater, Gaussian) and plane wave. Should we introduce such
classification on top of the _dft_basisset_type enumerator? Are
localised bases relevant for DFT at all? Is it enough for DFT to specify
just an energy cut-off, assuming plane wave bases (for a given
pseudopotential), or are there different possible bases also among plane
waves (I guess there should not be, but maybe I'm missing something...)?
Or are localised bases sets relevant for computing pseudopotentials
(cores)? Can I assume that dftxfit, dftorb and dftcfit are plane wave
bases? What about 'periodic'? (these terms are all from the Basis Set
Exchange database, I guess).
k) What is the difference between _dft_atom_basisset and _dft_basisset?
Can they be simultaneously used in one computation? If not, maybe we can
merge the two definition sets into one?
l) There are a lot of *_conv data items declared (e.g.
_dft_basisset_energy_conv). Are they for convergence tests? Or for
convolutions? What is their proposed definition?
m) Is _dft_cell_energy the same as the "total energy" reported by some
codes? Can we rename it to _dft_total_energy?
n) Am I correct to assume that the "total energy" reported by codes will
always be the sum of separate energy terms (Coulomb, exchange,
1-electron, 2-electron, etc.)? Is there an interest to have them
recorded in the result data files (CIFs) separately? If yes, what is the
"Hartree energy" (is it a sum of all single electron energies in the SCF
for each of them?), "Ewald energy" (is it the electrostatic lattice
energy, obtained by Ewald summation?) and the rest in the values from
the AbInit output file? Are these terms consistent across QM codes?
o) How does one check that computation has converged on k-points,
E-cuttof, smear and other parameters, and that pseudopotential is
selected right? From the Abinit tutorial
(http://flex.phys.tohoku.ac.jp/texi/abinit/Tutorial/lesson_4.html) I got
impression that one needs to run computation with different values of
these parameters, and see that the total energy, or other gauge values,
no longer change significantly when these parameters are increased. Is
that right? If yes, are there codes that do this automatically? Should
we require Etotal (or coordinates') dependence on k-grid, E-cuttof,
smear, to check convergence when depositing to TCOD? Or should TCOD side
check this automatically when appropriate (say for F/LOSS codes)?
p) what are other obvious things that one could make wrong in QM/DFT
computations, that could be checked formally?
Sorry for a long list, but I would like to get the things possibly from
the very beginning...
Regards,
Saulius
PS. If you find obvious mistakes in the current dictionaries, please
feel free to correct them and commit the corrections back to the repository.
-- 
Saulius Gražulis

Biotechnologijos institutas
Graičiūno 8
02241 Vilnius-28
Lietuva

Tel.: vidaus BTI:    226
      Vilniaus BTI:  (8-5)-260-25-56
      mobilus TELE2: (8-684)-49-802
      mobilus OMNIT: (8-614)-36-366