From erik.rakovsky at uniba.sk Mon Jun 21 19:47:28 2021
From: erik.rakovsky at uniba.sk (Rakovský Erik)
Date: Mon, 21 Jun 2021 16:47:28 +0000
Subject: [Cod-bugs] depositing "duplicates"

Hello Antanas & co.

so my problem in short is that I need to have deposited 4 structures of the same compound and the same polymorph but refined using several strategies. The results/differences will be thoroughly discussed in the article.

For example

* "routine" refinement with geometrically placed hydrogens, X-H distances fixed at default values, Uiso(H) riding (1.2 or 1.5 times Ueq of parent atom)
* the same strategy but X-H distances free to refine and even Uisos can be sometimes refined freely, too
* Hirshfeld atom refinement using several methods (HF/DFT, various basis sets or functionals) - in my case, I used two different basis sets for the final two refinements

originally I described the refinement using

_olex2_refinement_description
;
HAR refinement using NoSpherA2/ORCA
def2-TZVPP basis set
PBE0 functional
integration accuracy Max
No AFIX
SCF Threshold VeryTightSCF
SCF Strategy VerySlowConv
;

any idea how to do this without exploiting _chemical_compound_source or another tag?

Best regards,
Erik Rakovsky

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From antanas.vaitkus90 at gmail.com Mon Jun 21 22:05:44 2021
From: antanas.vaitkus90 at gmail.com (Antanas Vaitkus)
Date: Mon, 21 Jun 2021 22:05:44 +0300
Subject: [Cod-bugs] depositing "duplicates"

Dear Erik Rakovsky,

CIF files that are acquired from the same data using different refinement methods are normally handled by marking one of the structures (the best one in your opinion) as the optimal one and the rest as suboptimal ones.
This markup allows us to automatically exclude the suboptimal structures from database-wide calculations and in doing so not overrepresent a single structure (in some extreme cases the same structure may contain several tens of models).

For now, let's assume that the CIF file that you have already uploaded is the optimal one. This can easily be changed later. To successfully deposit the rest of the CIF files, you should add the following two lines to each of the other CIF files:

_cod_related_optimal_entry_code 3000306
_cod_suboptimal_structure yes

The first line states that the entry that you have already deposited (COD ID 3000306 [1]) is the optimal one and the second line simply states that the given structure is a suboptimal one (in general). The deposition website will complain about the '_cod_suboptimal_structure' data item not being recognised, but just ignore it -- it is a minor known bug in our system.

Does this seem reasonable? If entry 3000306 is not the optimal one, just let me know and I can change it manually after you deposit the rest of the structures.

[1] https://www.crystallography.net/cod/3000306.html

Please let us know if this works for you.

Sincerely,
Antanas Vaitkus

--
Antanas Vaitkus,
PhD student at Vilnius University Institute of Biotechnology,
room V325, Saulėtekio al. 7,
LT-10257 Vilnius, Lithuania
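The two-line markup described in the message above could also be added with a small script rather than by hand. The following is only a sketch under stated assumptions: the function name `mark_suboptimal` is hypothetical, the example block name and cell value are made up, and the lines are placed directly after each `data_` header so that they cannot land inside a loop or a semicolon-delimited text field.

```python
def mark_suboptimal(cif_text: str, optimal_id: str) -> str:
    """Insert the two COD markup lines after every data block header.

    Hypothetical helper, not part of any COD tool.
    """
    out = []
    in_text_field = False  # True while inside a ';'-delimited text value
    for line in cif_text.splitlines():
        out.append(line)
        if line.startswith(";"):
            # A ';' in the first column opens or closes a text field.
            in_text_field = not in_text_field
        elif not in_text_field and line.lower().startswith("data_"):
            out.append(f"_cod_related_optimal_entry_code {optimal_id}")
            out.append("_cod_suboptimal_structure yes")
    return "\n".join(out) + "\n"


# Illustrative input only; the block name and cell value are invented.
example = """data_har_tzvpp
_olex2_refinement_description
;
HAR refinement using NoSpherA2/ORCA
def2-TZVPP basis set
;
_cell_length_a 5.0
"""
print(mark_suboptimal(example, "3000306"))
```

Placing the lines right after the block header is a design choice of this sketch; the messages in this thread do not say where in the file the lines have to go.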
From grazulis at ibt.lt Tue Jun 22 10:40:25 2021
From: grazulis at ibt.lt (Saulius Gražulis)
Date: Tue, 22 Jun 2021 10:40:25 +0300
Subject: [Cod-bugs] depositing "duplicates"

Hello, Erik,
dear CODers,

As Antanas has mentioned, the '_cod_related_optimal_entry_code' and '_cod_suboptimal_structure' data items are essential for marking different versions of the structure refinement, in case you demonstrate that one structure solution or refinement method is better than all others.

On 2021-06-21 19:47, Rakovský Erik wrote:
> so my problem in short is that I need to have deposited 4 structures
> of the same compound and the same polymorph but refined using several
> strategies. The results/differences will be thoroughly discussed in
> the article.
> For example
>
> * "routine" refinement with geometrically placed hydrogens, X-H
>   distances fixed in default distances, Uiso(H) riding (1.2 or 1.5
>   times Ueq of parent atom)
> * the same strategy but X-H distances free to refine and even Uisos
>   can be sometimes refined freely, too
> * Hirshfeld atom refinement using several methods (HF/DFT, various
>   basis sets or functionals) - in my case, I used two different
>   basis sets for final two refinements

This is very valuable information.

The problem may be that if the unit cells of all structures are very similar or identical, and you deposit the structures one after another in separate depositions, the system will complain about duplicates.

Upon original deposition, however, you can concatenate all structures into one CIF file and submit the file. The file will be split on the server side, and all structures will be deposited even if they are very similar: duplicates are not searched for among the "fledglings from the same nest", i.e. structures originating from the same deposition.
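Concatenating the CIFs as suggested above can be done with any text tool; the sketch below shows one way in Python. It rests on assumptions: the function name `concatenate_cifs` is hypothetical, the example blocks are invented, and the uniqueness check reflects the general CIF rule that data block names must not repeat within one file (the thread does not say how the COD server treats repeated names). The simple pattern would also match a line starting with `data_` inside a text field, which is ignored here.

```python
import re


def concatenate_cifs(cif_texts):
    """Join several CIF texts into one multi-block CIF string.

    Hypothetical helper; raises ValueError if a data block name repeats.
    """
    seen = set()
    for text in cif_texts:
        for name in re.findall(r"(?im)^data_(\S*)", text):
            key = name.lower()  # block names are compared case-insensitively
            if key in seen:
                raise ValueError(f"duplicate data block name: data_{name}")
            seen.add(key)
    # Separate the blocks with one blank line and end with a newline.
    return "\n".join(text.rstrip("\n") + "\n" for text in cif_texts)


# Illustrative input only; block names and values are invented.
routine = "data_routine\n_cell_length_a 5.0\n"
har = "data_har_tzvpp\n_cell_length_a 5.0\n"
print(concatenate_cifs([routine, har]))
```

In practice the strings would be read from the individual CIF files and the result written to a single file for upload.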
In the worst case, if the deposition of similar structures does not work over the net, please e-mail the structures to me or Antanas, and we will insert them directly into the Subversion repo (under your name, of course :).

> originally I described the refinement using
>
> _olex2_refinement_description
> ;
> HAR refinement using NoSpherA2/ORCA
> def2-TZVPP basis set
> PBE0 functional
> integration accuracy Max
> No AFIX
> SCF Threshold VeryTightSCF
> SCF Strategy VerySlowConv
> ;

This data item is useful, for sure, and can/should be retained. Our system will most probably not recognise it, but that should not prevent the structure from being deposited.

Hope this helps.

Sincerely,
Saulius

--
Dr. Saulius Gražulis
Vilnius University, Life Science Center, Institute of Biotechnology
Saulėtekio al. 7, LT-10257 Vilnius, Lietuva (Lithuania)
phone (office): (+370-5)-2234353, mobile: (+370-684)-49802, (+370-614)-36366

From antanas.vaitkus90 at gmail.com Tue Jun 22 13:50:35 2021
From: antanas.vaitkus90 at gmail.com (Antanas Vaitkus)
Date: Tue, 22 Jun 2021 13:50:35 +0300
Subject: [Cod-bugs] depositing "duplicates"

Dear Erik,

I will provide a combined answer from Saulius and me to your last letter.

On 2021-06-22 10:50, Rakovský Erik wrote:
> Dear Saulius, dear Antanas,
>
> and what about using
> _cod_suboptimal_structure no
> as a switch for resetting status of older structures to suboptimal?

This would not work.
The current implementation of the system is only capable of recognising files that are being deposited as suboptimal and does not check the previously deposited files.

> (of course I started from the "routine" structure and that best one was
> last to deposit)

Makes perfect sense. Steps to resolve this situation:

1. Add the following two lines to the rest of the CIF files and deposit them:

   _cod_related_optimal_entry_code 3000306
   _cod_suboptimal_structure yes

   Upon successful deposition the CIF files will most likely be assigned sequential COD IDs 3000307, 3000308, 3000309. Let's assume that entry 3000309 is the optimal one. Steps 2-4 detail how to properly mark up the optimal-suboptimal relations. You can either do that by following the provided instructions or let us know which structure is the optimal one and we can change the markup on our end.

2. Using the COD website interface, update the optimal entry 3000309 by removing the _cod_related_optimal_entry_code data item and by changing the _cod_suboptimal_structure value from "yes" to "no". Alternatively, you can remove the _cod_suboptimal_structure data item altogether since "no" is the default value.

3. Using the COD website interface, update the suboptimal entry 3000306 by adding the following two lines:

   _cod_related_optimal_entry_code 3000309
   _cod_suboptimal_structure yes

4. Using the COD website interface, update the suboptimal entries 3000307 and 3000308 by changing the _cod_related_optimal_entry_code data item value from "3000306" to "3000309".

> However, thanks a lot, I am going to play with it.

We are now discussing a mechanism for how you could mark the structures as being solved by different techniques and thus not duplicates, but it will take us some time to roll out this feature.

I have tested the optimal-suboptimal markup solution on a test server and it seems to work. However, if it does fail for you for some reason, there are two other options:

1. You send us the CIFs and we use the "human override" feature to insert your structures into the COD;

2. At the moment, you can use the "_sample_thermal_history" or "_sample_pressure_history" tags to specify your refinement method. These data names are of course not meant for refinement peculiarities, but for describing samples that were quenched/annealed/pressurised; still, if you would prefer to deposit the structures into the COD yourself, this is currently a workaround. After the structures are deposited and the COD IDs are assigned, you or we can edit the files and rename the '_sample_{pressure|thermal}_history' to something more appropriate.

I see two data names in the CIF core dictionary that seem to be designed for your purposes:

- _computing.structure_refinement
- _computing.structure_solution

Please let us know if you were able to successfully deposit your structures.

Sincerely,
Antanas
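The relabelling in steps 2-4 of the message above amounts to the same rule applied to every entry: the optimal entry carries neither data item, and every other entry points at the optimal one. The sketch below illustrates that rule on in-memory CIF text. It is not how the COD website performs the update; the function name `retag` is hypothetical, the CIF contents are invented, and it assumes single-block files in which each of the two data items sits on one line with its value and no `data_` line occurs inside a text field.

```python
COD_TAGS = ("_cod_related_optimal_entry_code", "_cod_suboptimal_structure")


def retag(cif_text: str, own_id: str, optimal_id: str) -> str:
    """Rewrite the optimal/suboptimal markup of one single-block CIF.

    Hypothetical helper, not part of any COD tool.
    """
    out = []
    for line in cif_text.splitlines():
        words = line.split()
        if words and words[0] in COD_TAGS:
            continue  # drop the old markup line
        out.append(line)
        if line.lower().startswith("data_") and own_id != optimal_id:
            # Suboptimal entries point at the optimal one.
            out.append(f"{COD_TAGS[0]} {optimal_id}")
            out.append(f"{COD_TAGS[1]} yes")
    return "\n".join(out) + "\n"


# Illustrative entries only; the cell value is invented.
old_markup = "_cod_related_optimal_entry_code 3000306\n_cod_suboptimal_structure yes\n"
entries = {
    "3000306": "data_3000306\n_cell_length_a 5.0\n",
    "3000307": "data_3000307\n" + old_markup + "_cell_length_a 5.0\n",
    "3000308": "data_3000308\n" + old_markup + "_cell_length_a 5.0\n",
    "3000309": "data_3000309\n" + old_markup + "_cell_length_a 5.0\n",
}
for cod_id, text in entries.items():
    print(retag(text, cod_id, "3000309"))
```

Leaving the optimal entry without `_cod_suboptimal_structure` relies on the statement in step 2 that "no" is the default value.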