[Cod-bugs] special characters (0x1b, 0x07) in CIF files

Antanas Vaitkus antanas.vaitkus90 at gmail.com
Wed Dec 11 14:37:21 EET 2019


Dear Marcin Wojdyr,

as of COD revision r245002 the issues you outlined are considered resolved.

I would also like to note that during the reparsing of the entire COD we
discovered several more COD entries with illegal ASCII characters that were
not picked up by your software.
A representative list of such structures:
https://www.crystallography.net/cod/4350338.cif@239844 -- contains the ACK
(0x06) character in the value of the '_refine_diff_density_rms' data item;
https://www.crystallography.net/cod/4089334.cif@243612 -- contains the SOH
(0x01) character in the value of the '_refine_diff_density_rms' data item.

The '@' suffix gives the specific SVN revision in which the file still
contained the error. I am pointing this out in case you find these
examples useful for testing your software.
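
A check for such characters could look roughly like the Python sketch below.
It is not the code used by the COD or by gemmi; the function name
find_illegal_bytes is hypothetical, and the allowed set rests on the
assumption that CIF 1.1 permits only HT, LF, CR and printable ASCII
(0x20-0x7E). Under that assumption any other byte, including non-ASCII
bytes, is reported.

```python
# Hypothetical helper, not part of the COD or gemmi tooling.
# Assumption: CIF 1.1 allows only HT, LF, CR and printable ASCII (0x20-0x7E).
ALLOWED = frozenset({0x09, 0x0A, 0x0D}) | frozenset(range(0x20, 0x7F))


def find_illegal_bytes(data: bytes):
    """Return (line, column, byte) triples for each disallowed byte.

    Lines and columns are 1-based; only LF is counted as a line break.
    """
    hits = []
    line, col = 1, 0
    for b in data:
        col += 1
        if b not in ALLOWED:
            hits.append((line, col, b))
        if b == 0x0A:
            # Start a new line after the line feed itself has been counted.
            line += 1
            col = 0
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python check_cif_bytes.py file1.cif file2.cif ...
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            for line, col, b in find_illegal_bytes(fh.read()):
                print(f"{path}:{line}:{col}: illegal byte 0x{b:02x}")
```

Reading the files in binary mode avoids any decoding step that might hide or
alter the offending bytes.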

Sincerely,
Antanas Vaitkus


On Wed, 11 Dec 2019 at 07:08, Antanas Vaitkus <antanas.vaitkus90 at gmail.com>
wrote:

> Dear Marcin Wojdyr,
>
> currently, the naming conventions of multi-block hkl files are a little
> inconsistent in the COD. However, I do agree that we should at least avoid
> duplicate data names. We will fix this issue as soon as possible.
>
> As for hkl entry 4115482, it seems to contain a CIF syntax error that our
> parser did not properly detect. We will definitely investigate that.
>
> Sincerely,
> Antanas Vaitkus
>
> On Tue, 10 Dec 2019 at 22:46, Marcin Wojdyr <wojdyr at gmail.com> wrote:
>
>>
>> and four hkl files with different syntax problems:
>>
>> $ time find ../cod/hkl/ -name \*.hkl | xargs -n1000 ./build/gemmi validate
>> ../cod/hkl/2/00/88/2008821.hkl: duplicate block name: 2008821_Fobs
>> ../cod/hkl/4/11/54/4115482.hkl:27:0(860): parse error
>> ../cod/hkl/4/11/75/4117532.hkl: duplicate block name: 4117532_diffractogram_1
>> ../cod/hkl/4/11/75/4117533.hkl: duplicate block name: 4117532_diffractogram_1
>>
>> real 2m27.263s
>> user 1m41.871s
>> sys 0m6.641s
>>
>> --
>> This message has been scanned for viruses and
>> dangerous content by *MailScanner* <http://www.mailscanner.info/>, and is
>> believed to be clean.
>> _______________________________________________
>> Cod-bugs mailing list
>> Cod-bugs at lists.crystallography.net
>> http://lists.crystallography.net/cgi-bin/mailman/listinfo/cod-bugs
>>
>
>
> --
> Antanas Vaitkus,
> PhD student at Vilnius University Institute of Biotechnology,
> room V325, Saulėtekio al. 7,
> LT-10257 Vilnius, Lithuania
>
>
>

-- 
Antanas Vaitkus,
PhD student at Vilnius University Institute of Biotechnology,
room V325, Saulėtekio al. 7,
LT-10257 Vilnius, Lithuania

