<div dir="ltr"><div dir="ltr">Dear Norwid Behrnd,<br><br></div><div>Thank you for your question.<br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, 26 Jul 2021 at 15:07, Norwid Behrnd <<a href="mailto:nbehrnd@yahoo.com">nbehrnd@yahoo.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Dear developers of COD,<br>
<br>
I became aware of the specification about the syntax for cif 2.0,<br>
which -- if used -- requires an early comment<br>
```<br>
#\#CIF_2.0<br>
```<br>
while Bernstein et al. remind / recommend (the not mandatory)<br>
comment<br>
```<br>
#\#CIF_1.1<br>
```<br>
in files of .cif 1.1.[1] In your description of the<br>
COD::CIF::Parser: parser[2] both mentions focus on the syntax of<br>
.cif (v1.1) as a target as well as preparation for .cif (v2.0).<br></blockquote><div><br></div><div>I am glad to inform you that the COD::CIF::Parser from the cod-tools [AV-1]<br>package is now capable of parsing CIF_2.0 files as well. However, I am yet<br></div><div>to encounter a CIF 2.0 data file in the wild.<br></div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
While I might misunderstand its processing of the data, I speculate<br>
a file recovery program like photorec[3] might work better if these<br>
ASCII-files would contain either one form of this type of identifier<br>
while re-assigning the extension .cif instead of a mere .txt to the<br>
file restored. The tentative addition of `#\#CIF_1.1` to a COD .cif<br>
(attached) retained the file's content fully accessible to e.g.,<br>
Mercury (2020.3), or Jmol. On the other hand, I'm unable to recall a<br>
recent instance where a .cif, downloaded as SI of a publication, or<br>
/via/ CCDC's conquest interface, explicitly contained such a label.<br></blockquote><div><br></div><div>A similar question has already been raised in our team internally [AV-2],<br></div><div>but has not gained a lot of traction. The change itself is indeed quite<br></div><div>simple; however, it would require updating all COD CIF files as well<br></div><div>as the related data curation and deposition pipelines. At that time it<br>was not deemed a priority since CIF_1.1 files seem to function well<br></div><div>without the explicit format comment. Having a few real-world examples<br></div><div>of where the comment actually proves useful would help move things<br></div><div>along in this regard.<br><br></div>As for the examples that you have already provided:<br>- photorec: the crystallographic CIF file is quite obscure to most people,<br> so I would be very surprised if 'photorec' already had the heuristics to<br> recognise CIF files as such. Nevertheless, the developers may be open<br> to including such enhancements in the future.<br></div><div class="gmail_quote">- Jmol: as far as I know, Jmol determines the format by actually parsing<br></div><div class="gmail_quote"> the file, so the file extension (cif, txt, etc.) should not really make a<br></div><div class="gmail_quote"> difference.<br>- Mercury (2020.3): I am unsure how Mercury processes files, but I imagine<br></div><div class="gmail_quote"> it should still recognise CIF files with the .txt extension as CIF files<br> regardless of whether the "#\#CIF_1.1" comment is present.</div><div class="gmail_quote"><br></div><div class="gmail_quote">Additional examples would indeed be useful.<br> </div><div class="gmail_quote"><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
What is your perspective on adding `#\#CIF_1.1` to the .cif?<br></blockquote><div><br></div><div>In general, we are not opposed to this idea; however, any large-scale<br></div><div>modifications to the COD data should be backed up by a specific need<br></div><div>(e.g. ensuring data quality or adherence to the FAIR principles).<br></div><div> <br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
[1] <a href="https://scripts.iucr.org/cgi-bin/paper?aj5269" rel="noreferrer" target="_blank">https://scripts.iucr.org/cgi-bin/paper?aj5269</a><br>
[2] <a href="https://journals.iucr.org/j/issues/2016/01/00/po5052/index.html" rel="noreferrer" target="_blank">https://journals.iucr.org/j/issues/2016/01/00/po5052/index.html</a><br>
[3] <a href="https://www.cgsecurity.org/wiki/PhotoRec#How_PhotoRec_works" rel="noreferrer" target="_blank">https://www.cgsecurity.org/wiki/PhotoRec#How_PhotoRec_works</a><br>
<br></blockquote><div><br></div><div>Sincerely,<br></div><div>Antanas Vaitkus<br></div><div><br></div><div>[AV-1] <a href="https://github.com/cod-developers/cod-tools">https://github.com/cod-developers/cod-tools</a><br>[AV-2] <a href="https://projects.ibt.lt/repositories/issues/100">https://projects.ibt.lt/repositories/issues/100</a></div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
-- <br>
This message has been scanned for viruses and<br>
dangerous content by MailScanner, and is<br>
believed to be clean.<br><br>
Cod-bugs mailing list<br>
<a href="mailto:Cod-bugs@lists.crystallography.net" target="_blank">Cod-bugs@lists.crystallography.net</a><br>
<a href="http://lists.crystallography.net/cgi-bin/mailman/listinfo/cod-bugs" rel="noreferrer" target="_blank">http://lists.crystallography.net/cgi-bin/mailman/listinfo/cod-bugs</a><br>
</blockquote></div><br clear="all"><br>-- <br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div>Antanas Vaitkus,<br></div>Vilnius University,<br>Life Sciences Center,<br>Institute of Biotechnology,<br><span><span><span>room C521, </span></span></span>Saulėtekio al. 7,<br>LT-10257 Vilnius, Lithuania<br></div><div><div><div><div><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><br><br></div></div></div></div></div></div></div></div></div></div></div></div>