Dear Saulius and everyone else,
I'll be happy to join.
(I realize that people have already replied; sorry for not taking that into account, but I need to send this quickly now.)
One question back regarding your second question: -- What are the existing quality criteria for COD? I know that I have to give Fabs, but that only applies if the data is not "old". However, data that is "old" (I forget the cutoff year) is still considered acceptable, so clearly it has been decided that the most important thing is not that a number is accurate but that an experienced person can see HOW accurate it is.
I think that we should calibrate our mindsets to this "intermediately liberal" attitude and would prefer a rather liberal attitude when it comes to the pure convergence parameters as long as the most important ones are given. That would be (in addition to which code was used...) a description of the basis set (e.g. something like "400 eV" for a plane wave pseudopotential code or perhaps "tier 1" if you use FHI-AIMS), k-point set and the force/cell convergence criterion. Then we could have a rough classification resulting in a little green, yellow or red label for well, intermediately and poorly converged calculations. A major problem is that the choice of functional may not be as easily tagged as good or bad. There are a number of pointers that all of us in the business know, and of course we can collect these into a sort of heuristic. The modern thing would of course be to crosslink structures in TCOD with corresponding structures in COD and let some kind of neural network (or similar) generate a new heuristic free from our biases, although of course biased by only accounting for the crystal structure.
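To make the "most important ones are given" part concrete, here is a minimal sketch of a record holding the four items listed above, with a check for which of them are missing. The class name, field names and helper are hypothetical, chosen only for illustration; they are not taken from the TCOD dictionaries.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class CalculationMetadata:
    """Hypothetical record of the parameters named in the text."""
    code: Optional[str] = None                   # which code was used
    basis_set: Optional[str] = None              # e.g. "400 eV" or "tier 1"
    k_point_set: Optional[str] = None            # description of the k-point set
    convergence_criterion: Optional[str] = None  # force/cell convergence criterion


def missing_fields(meta: CalculationMetadata) -> List[str]:
    """Return the names of fields that are absent or empty, in declaration order."""
    return [f.name for f in fields(meta) if getattr(meta, f.name) in (None, "")]


# Example: an entry that names the code and basis set but nothing else.
entry = CalculationMetadata(code="FHI-AIMS", basis_set="tier 1")
print(missing_fields(entry))
```

An entry with an empty list of missing fields would then be eligible for the green/yellow/red classification; how incomplete entries should be treated is left open here.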
Regards, Torbjörn
P.S. OK, so maybe someone has to go first and get complained at by everyone else for his sloppy standards... Here is a suggestion for marking up the two most important convergence parameters:
Max residual force on atoms: < 0.01 eV/Å = good, 0.1-0.01 eV/Å = intermediate, > 0.1 eV/Å = low
K-point resolution: < 0.2 Å^{-1} = good, 0.75-0.2 Å^{-1} = intermediate, > 0.75 Å^{-1} = low
There, I took the plunge, now to hide under my desk.... /T.B.
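As a rough sketch, the thresholds in the P.S. could be turned into labels as follows. The function and class names are hypothetical, and two things are assumptions rather than part of the suggestion: values exactly on a boundary (0.01, 0.1, 0.2, 0.75) are counted as intermediate, and the overall label is taken to be the worse of the two ratings.

```python
from enum import IntEnum


class Rating(IntEnum):
    GOOD = 0          # "green"
    INTERMEDIATE = 1  # "yellow"
    LOW = 2           # "red"


def rate(value: float, good_below: float, low_above: float) -> Rating:
    """Rate a quantity for which smaller values mean better convergence."""
    if value < good_below:
        return Rating.GOOD
    if value > low_above:
        return Rating.LOW
    return Rating.INTERMEDIATE  # boundary values land here (assumption)


def rate_force(max_force: float) -> Rating:
    """Max residual force on atoms, in eV/Å."""
    return rate(max_force, good_below=0.01, low_above=0.1)


def rate_kpoints(resolution: float) -> Rating:
    """K-point resolution, in Å^-1."""
    return rate(resolution, good_below=0.2, low_above=0.75)


def overall(max_force: float, resolution: float) -> Rating:
    """Assumption: the overall label is the worse of the two ratings."""
    return max(rate_force(max_force), rate_kpoints(resolution))


# Example: tightly relaxed forces but a coarse k-mesh.
print(overall(0.005, 0.8).name)
```

Taking the worse rating means one poorly converged parameter cannot be hidden by a well converged one; a different combination rule would be just as easy to plug in.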
---
Torbjörn Björkman, PhD
COMP, Aalto University School of Science
Espoo, Finland
________________________________________
From: tcod-bounces@lists.crystallography.net [tcod-bounces@lists.crystallography.net] on behalf of Saulius Gražulis [grazulis@ibt.lt]
Sent: 10 February 2014 15:14
To: tcod@lists.crystallography.net; blueobelisk-discuss@lists.sourceforge.net; Nicola Marzari; Peter Murray-Rust; Antanas Vaitkus; Andrius; Chateigner Daniel
Subject: [TCOD] Presenting TCOD at the IUCr 23rd Congress?
Dear colleagues,
as I am going to the 23rd Congress of IUCr this year, I thought that this would be a good opportunity to present TCOD to the crystallographic and chemical communities there.
Unfortunately, the deadline for abstract submission is tomorrow, 2014-02-11, so I need brief feedback from you ASAP... I apologize for such short notice.
At this stage, the TCOD poster/talk is not meant to present spectacular results, but rather to inform the community and to make sure that everyone is invited, so that nobody is excluded. Only in this case will TCOD have its value.
I attach a draft abstract and an author list. If you do not mind, I will present the abstract as the presenting author and thus put my name first; otherwise the list is alphabetical. If the NWChem people, or anyone else, would wish to participate and to provide their input about computational data representation and ontologies, I'd be glad to include them as co-authors. At the moment, I have included people on the tcod mailing list (except that I do not know the full name of , and Peter and Nicola with whom we discussed TCOD in detail; I hope you will participate :).
Since we are a new team, I'll do as follows:
a) those who e-mail me till tomorrow (2014-02-11) that they participate in the presentation I leave on the author list;
b) those people who do not agree or *do not reply* by 2014-02-11 I will leave out as not consenting with their authorship;
c) If no one from the theoretical community replies, I'll probably refrain from submitting the abstract.
d) If you join the team, I'll submit the abstract and the author list tomorrow, on 2014-02-11.
Andrius, Antanas and I will do all the technical editing of the poster or slides; any comments on the text from you are welcome (but please keep in mind that the abstract is limited to 2000 chars). The most important contribution from you would be the ideas:
-- how to describe the computation data so that it is useful and reusable (what parameters need to be specified for different methods)?
-- what quality criteria do we want to put on structures in TCOD (and in general on published DFT and other structures)? In other words, what computed structures will we be happy with?
After the abstract submission we will have some time to polish TCOD policy details, data dictionaries (initial version can be found here: http://www.crystallography.net/tcod/cif/dictionaries/) and software pipeline (we will take care of this last bit in Vilnius, but participants are welcome :).
Regards, Saulius
-- Dr. Saulius Gražulis Vilnius University Institute of Biotechnology, Graiciuno 8 LT-02241 Vilnius, Lietuva (Lithuania) fax: (+370-5)-2602116 / phone (office): (+370-5)-2602556 mobile: (+370-684)-49802, (+370-614)-36366
so clearly it has been decided that the most important thing is not that a number is accurate but that an experienced person can see HOW accurate it is.
That's a wise vision. The value of a database lies as much in its size as in the accuracy of its data: the ideal situation would be a very large database with very accurate data; sticking too much to accuracy would in reality lead to a very small database -- the practical compromise is a sufficiently large database with sufficiently accurate data.
I see no good reasons why the standards for TCOD should be more stringent than they are for COD ;-).
I think that we should calibrate our mindsets to this "intermediately liberal" attitude and would prefer a rather liberal attitude when it comes to the pure convergence parameters as long as the most important ones are given. That would be (in addition to which code was used...) a description of the basis set (e.g. something like "400 eV" for a plane wave pseudopotential code or perhaps "tier 1" if you use FHI-AIMS), k-point set and the force/cell convergence criterion. Then we could have a rough classification resulting in a little green, yellow or red label for well, intermediately and poorly converged calculations.
I support this. As the number of codes out there is still countable, it would be doable to make such a decision table for every individual code.
A major problem is that the choice of functional may not be as easily tagged as good or bad.
That's not necessarily a problem. Unlike the numerical parameters that you listed above, the functional is an uncontrolled approximation. The former can all be tested for convergence, the functional cannot. What matters most is to be reassured that a particular calculation is numerically converged. Such a database would then be useful to check against COD, in order to find out the performance of a particular functional over a wide class of materials (some work along these lines has been done by MaterialsProject.org, where materials for which GGA was overbinding could be flagged as interesting for further study: either the experiments were not good, or there was special physics at work).
P.S. OK, so maybe someone has to go first and get complained at by everyone else for his sloppy standards... Here a suggestion for marking up the two most important convergence parameters:
Max residual force on atoms < 0.01 eV/Å = good, 0.1-0.01 = intermediate, > 0.1 = low
K-point resolution < 0.2 Å^{-1} = good, 0.75-0.2 = intermediate, > 0.75 = low
In order to judge the k-mesh, one would preferably also know whether the material is a metal or not.
Stefaan