From t.munro at deakin.edu.au Wed Apr 7 21:09:02 2021
From: t.munro at deakin.edu.au (Thomas Munro)
Date: Wed, 7 Apr 2021 18:09:02 +0000
Subject: [Cod-bugs] Undercounting of structure factors (reposting)
Message-ID:

Hi,

I'm resending this because I didn't receive a reply from the moderator last time, so I'm not sure if it got posted:

I'm not a trained crystallographer, so I may have misunderstood this, but I notice that only a small fraction of COD entries are returned under the "has Fobs" filter. This seems to refer to having a separate hkl file. But checking recent structures without one, I find that about half of them have the hkl data embedded in the cif. So I thought it might be useful for users to flag these as well, so that the filter would return many more examples. Presumably it would be straightforward to detect them with a regular expression, or even just by their much larger file size, and update the indexing. Just a thought! Keep up the good work.

Cheers,
Thomas Munro

--
This message has been scanned for viruses and dangerous content by MailScanner, and is believed to be clean.

From andrius.merkys at gmail.com Thu Apr 8 07:39:47 2021
From: andrius.merkys at gmail.com (Andrius Merkys)
Date: Thu, 8 Apr 2021 07:39:47 +0300
Subject: [Cod-bugs] Undercounting of structure factors
In-Reply-To:
References:
Message-ID:

Hi Thomas,

Thank you for your message, and sorry for the delay in responding. We in the COD are aware of this undercounting.
The way we would like to deal with it is to extract the HKL data embedded in CIF files and place it in separate HKL files, so that the representation of HKL data in the COD is homogeneous. However, we currently lack the workforce to implement this. If you would be willing to contribute a program for HKL data extraction, we could include it in the data processing pipeline. Our requirements are that the program be free/libre open source software and usable unsupervised in a Linux command-line environment.

Best wishes,
Andrius Merkys
(on behalf of the Crystallography Open Database)

On 2021-03-23 18:30, Thomas Munro wrote:
> Hi,
> I'm a huge fan of your work. It's tragic that the CCDC is still so
> closed-minded.
> I'm not a trained crystallographer, so I may have misunderstood this,
> but I notice that only a small fraction of COD entries are returned
> under the "has Fobs" filter. This seems to refer to having a separate
> hkl file. But checking recent structures without one, I find that
> about half of them have the hkl data embedded in the cif. So I thought
> it might be useful for users to flag these as well, so that the filter
> would return many more examples. Presumably it would be
> straightforward to detect them with a regular expression, or even just
> by their much larger file size, and update the indexing. Just a
> thought! Keep up the good work.
> Cheers,
> Thomas

--
Dr. Andrius Merkys
Vilnius University Institute of Biotechnology, Saulėtekio al. 7
LT-10257 Vilnius, Lithuania
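
A minimal sketch of the regular-expression detection suggested in this thread, together with the extraction step requested in the reply. It rests on assumptions that the thread itself does not state: that embedded reflection data is marked either by the `_shelx_hkl_file` tag (the text field in which SHELXL stores the raw .hkl file) or by `_refln_index_h` (the Miller index column of a CIF reflection loop). The function names `has_embedded_hkl` and `extract_shelx_hkl` are hypothetical, and this is not the COD's own pipeline code.

```python
import re

# Assumed markers of embedded reflection data (not named in the thread):
#   _shelx_hkl_file  text field holding the raw .hkl file written by SHELXL
#   _refln_index_h   Miller index h column of a CIF reflection loop
EMBEDDED_HKL_TAG = re.compile(
    r"^\s*(_shelx_hkl_file|_refln_index_h)\b",
    re.MULTILINE | re.IGNORECASE,
)


def has_embedded_hkl(cif_text):
    """Return True if the CIF text contains one of the assumed HKL tags."""
    return EMBEDDED_HKL_TAG.search(cif_text) is not None


def extract_shelx_hkl(cif_text):
    """Return the contents of the first _shelx_hkl_file text field, or None.

    CIF text fields open with a ';' in the first column of a line and
    close with the next line that starts with ';'.
    """
    lines = cif_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip().lower() != "_shelx_hkl_file":
            continue
        # The value is expected as a text field starting on the next line.
        if i + 1 >= len(lines) or not lines[i + 1].startswith(";"):
            return None
        body = []
        first = lines[i + 1][1:]  # text on the opening line after the ';'
        if first.strip():
            body.append(first)
        for text_line in lines[i + 2:]:
            if text_line.startswith(";"):
                return "\n".join(body) + "\n"
            body.append(text_line)
        return None  # text field was never closed
    return None
```

A wrapper for unsupervised command-line use could read each CIF path given as an argument, call `has_embedded_hkl` to flag the entry, and write the result of `extract_shelx_hkl` to a separate .hkl file. Reflection loops built from `_refln_*` columns would need a proper CIF parser, which this sketch does not attempt.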