[Cod-bugs] Authors matching in Optimade API

Mehmet Giritli - STFC UKRI mehmet.giritli at stfc.ac.uk
Mon Nov 17 17:03:39 EET 2025


Dear Andrius,

Thank you so much for your quick response.

We are looking forward to having exact matches in COD. Thank you so much!

Regarding "HAS CONTAINS": this does not seem to work with the validator we have. Our system checks every query for valid OPTIMADE filters using the Lark-based parser that the optimade library uses, and that parser raises an error when this combination is used. Here is an example:

Unable to parse filter ( _psdi_authors HAS CONTAINS " Day " ). Lark traceback:
No terminal matches 'C' in the current parser context, at line 1 col 21

( _psdi_authors HAS CONTAINS " Day " )
                    ^
Expected one of:
        * SIGNED_INT
        * OPERATOR
        * FALSE
        * SIGNED_FLOAT
        * ESCAPED_STRING
        * TRUE
        * ALL
        * IDENTIFIER
        * ANY
        * ONLY

May I ask you, whenever you get around to it, to double-check against the OPTIMADE spec whether this is allowed? Unless it is a legal OPTIMADE query, we will not be able to use it. Moreover, Matthew Evans has built a private API for us, and there we use the CONTAINS operator in the way I showed above (e.g., author.names CONTAINS "..."). I might be wrong, but maybe "authors" is the list here and "authors.name" is actually a string? I think Matthew Evans is the best person to answer that; I haven't looked at the OPTIMADE spec for a while now 🙂

Once again, thank you for your quick response. I'll keep on checking to see if exact matches are available for us to use 🙂

Best,
Mehmet.
________________________________
From: Andrius Merkys <andrius.merkys at gmail.com>
Sent: 17 November 2025 14:21
To: Giritli, Mehmet (STFC,RAL,SC) <mehmet.giritli at stfc.ac.uk>
Cc: cod-bugs at ibt.lt <cod-bugs at ibt.lt>
Subject: Re: [Cod-bugs] Authors matching in Optimade API

Dear Mehmet,

On 2025-11-17 16:08, Mehmet Giritli - STFC UKRI wrote:
> I am trying to use Optimade API to search using author names. But I
> noticed a behaviour which is a bit unexpected.
>
> Consider this query:
>
> authors.name HAS "A. Day"
>
> The spec for HAS operator says that this should only match if there is
> an exact match, i.e., there is an author which is equal to "A. Day", in
> the record.
>
> Here is COD's output:
> https://www.crystallography.net/cod/optimade/references?filter=authors.name%20HAS%20%22A.%20Day%22
>
> The single record above includes an author "A. Dayalan". However, I
> think this should not be the case.  It seems like "HAS" implementation
> behaves like what I would expect from "CONTAINS" implementation. But,
> this means that there is no way to get only the records which contain
> values we are explicit about.

You are right, here COD's OPTIMADE implementation deviates slightly
from the OPTIMADE specification. Given our backend, implementing support
for such queries to the letter has proven too complicated, so at the
moment we have this deviation. I think it would be nice to at least
emit a warning from our side.

> That said, I also noticed that CONTAINS is not implemented for
> author.names at the moment:
>
> https://www.crystallography.net/cod/optimade/references?filter=authors.name%20CONTAINS%20%22A.%20Day%22

In this case the filter string should be:

authors.name HAS CONTAINS "A. Day"

(Notice the addition of 'HAS'). 'CONTAINS' alone is an operator acting
on strings, not on lists.
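
The difference between the two matching behaviours discussed here can be illustrated with a small Python sketch. It only models the semantics on a plain list of strings; the function names are invented for the illustration, and it says nothing about whether a particular filter parser accepts the 'HAS CONTAINS' spelling.

```python
def has_exact(values: list[str], wanted: str) -> bool:
    """Exact list membership: some element equals the wanted value."""
    return wanted in values


def has_substring(values: list[str], wanted: str) -> bool:
    """Substring test applied to each element of the list in turn."""
    return any(wanted in value for value in values)


authors = ["A. Dayalan"]  # the author name from the record discussed above

print(has_exact(authors, "A. Day"))      # False: no element equals "A. Day"
print(has_substring(authors, "A. Day"))  # True: "A. Day" occurs inside "A. Dayalan"
```

The record with "A. Dayalan" is returned today because the current behaviour corresponds to the second function rather than the first.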

> To sum up my request, do you think it is possible to:
>
>  1.
>     Change implementation of "author.names HAS" so that it allows only
>     exact matching.

This should be doable, it just needs additional work from our side.

>  2.
>     Implement "author.names CONTAINS" which is basically the current
>     implementation you have for HAS.

As explained, 'author.names CONTAINS' should be changed to 'author.names
HAS CONTAINS', as 'author.names CONTAINS' should not work on any
OPTIMADE implementation.

> I hope it made sense. Please let me know if there is a way to overcome
> this situation currently (e.g., to exact match an author) in your API.

Thanks for contacting us regarding this issue.
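
One possible interim workaround, not proposed by either correspondent in this thread, is to post-filter the results on the client side. The Python sketch below assumes the response follows the usual OPTIMADE JSON layout, with each entry under "data" carrying "attributes" and an "authors" list of objects that have a "name" key. The function name and the sample data are made up for illustration.

```python
def exact_author_matches(response: dict, name: str) -> list[dict]:
    """Keep only entries that list an author whose name equals `name` exactly."""
    kept = []
    for entry in response.get("data", []):
        # Tolerate entries with missing or null author lists.
        authors = entry.get("attributes", {}).get("authors") or []
        if any(author.get("name") == name for author in authors):
            kept.append(entry)
    return kept


# Made-up response fragment; the ids and the second author are invented.
sample = {
    "data": [
        {"id": "example-1", "attributes": {"authors": [{"name": "A. Dayalan"}]}},
        {"id": "example-2", "attributes": {"authors": [{"name": "A. Day"}]}},
    ]
}

print([entry["id"] for entry in exact_author_matches(sample, "A. Day")])
```

This costs an extra pass over the returned entries, but it does not depend on how the server interprets the filter.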

Best wishes,
Andrius

--
Andrius Merkys
Vilnius University Institute of Biotechnology, Saulėtekio al. 7
LT-10257 Vilnius, Lithuania
