From oarcelus at cicenergigune.com Mon Nov 29 17:40:07 2021
From: oarcelus at cicenergigune.com (Oier Arcelus)
Date: Mon, 29 Nov 2021 15:40:07 +0000
Subject: [Cod-bugs] 'next' link does not save queried response_items
Message-ID:

Hi,

I am trying to use the OPTIMADE API implementation of COD, through pymatgen's OptimadeRester class. I just realized that the 'next' links that appear in responses do not conserve the queried response_fields.

For example, from this query:

https://www.crystallography.net/cod/optimade/v1/structures?filter=(elements%20HAS%20ALL%20%22Co%22,%20%22Li%22,%20%22O%22)&response_fields=lattice_vectors,cartesian_site_positions,species,species_at_sites

The 'next' link is:

https://www.crystallography.net/cod/optimade/v1.0.0/structures?page_limit=10&page_offset=10&filter=%28elements%20HAS%20ALL%20%22Co%22%2C%20%22Li%22%2C%20%22O%22%29

And then when the OptimadeRester class tries to find some of the response_fields, it fails, as they are no longer present in the response from the COD.

Is this a bug? Or is it a known limitation of the implementation? For me it makes it much harder to keep the information consistent across pagination.

Best regards,
Oier.

--
This message has been scanned for viruses and
dangerous content by MailScanner, and is
believed to be clean.

From andrius.merkys at gmail.com Mon Nov 29 17:58:34 2021
From: andrius.merkys at gmail.com (Andrius Merkys)
Date: Mon, 29 Nov 2021 17:58:34 +0200
Subject: [Cod-bugs] 'next' link does not save queried response_items
In-Reply-To:
References:
Message-ID:

Hi Oier,

On 2021-11-29 17:40, Oier Arcelus wrote:
> I am trying to use the OPTIMADE API implementation of COD, through
> pymatgen's OptimadeRester class. I just realized that the 'next' links
> that appear in responses do not conserve the queried response_fields.
[trimmed for brevity]
> And then when the OptimadeRester class tries to find some of the
> response_fields, it fails, as they are no longer present in the response
> from the COD.
>
> Is this a bug? Or is it a known limitation of the implementation? For me it
> makes it much harder to keep the information consistent across pagination.

Yes, this is a bug. Thanks for reporting it, I will look into it ASAP. Response fields (and all other parameters) have to stay the same between pages.

Best wishes,
Andrius

From grazulis at ibt.lt Tue Nov 30 07:28:44 2021
From: grazulis at ibt.lt (Saulius Gražulis)
Date: Tue, 30 Nov 2021 07:28:44 +0200
Subject: [Cod-bugs] 'next' link does not save queried response_items
In-Reply-To:
References:
Message-ID:

On 2021-11-29 17:40, Oier Arcelus wrote:
> I am trying to use the OPTIMADE API implementation of COD, through
> pymatgen's OptimadeRester class. I just realized that the 'next' links
> that appear in responses do not conserve the queried response_fields.
>
> For example, from this query:
>
> https://www.crystallography.net/cod/optimade/v1/structures?filter=(elements%20HAS%20ALL%20%22Co%22,%20%22Li%22,%20%22O%22)&response_fields=lattice_vectors,cartesian_site_positions,species,species_at_sites
>
> The 'next' link is:
>
> https://www.crystallography.net/cod/optimade/v1.0.0/structures?page_limit=10&page_offset=10&filter=%28elements%20HAS%20ALL%20%22Co%22%2C%20%22Li%22%2C%20%22O%22%29
>
> And then when the OptimadeRester class tries to find some of the
> response_fields, it fails, as they are no longer present in the
> response from the COD.
>
> Is this a bug? Or is it a known limitation of the implementation? For me it
> makes it much harder to keep the information consistent across pagination.

It's a good question... I think the OPTIMADE spec
does not specify how the server should behave regarding the 'response_fields' (but we should double check).

If it is not in the OPTIMADE spec yet, maybe we should include it in the spec?

In any case, it seems that retaining 'response_fields' (and possibly other relevant QS parameters) is a good thing and should be easy to do. I am forwarding this to Andrius; maybe he can have a closer look at the implementation and enhance it.

For now, a possible workaround will probably be to attach the 'response_fields' to every 'next' URL; this should also work in the future if the 'next' URL gets a copy of its own 'response_fields', since, AFAIU, a duplicated QS parameter is not an error. Or you can check whether 'response_fields' is already present, to be 200% sure.

Regards,
Saulius

--
Dr. Saulius Gražulis
Vilnius University, Life Science Center, Institute of Biotechnology
Saulėtekio al. 7, LT-10257 Vilnius, Lietuva (Lithuania)
phone (office): (+370-5)-2234353, mobile: (+370-684)-49802, (+370-614)-36366

From andrius.merkys at gmail.com Tue Nov 30 08:21:45 2021
From: andrius.merkys at gmail.com (Andrius Merkys)
Date: Tue, 30 Nov 2021 08:21:45 +0200
Subject: [Cod-bugs] 'next' link does not save queried response_items
In-Reply-To:
References:
Message-ID: <8d89477c-a746-f382-e8e7-356cb63773e2@gmail.com>

Hi all,

On 2021-11-30 07:28, Saulius Gražulis wrote:
> It's a good question... I think the OPTIMADE spec does not specify how the
> server should behave regarding the 'response_fields' (but we should
> double check).
>
> If it is not in the OPTIMADE spec yet, maybe we should include it in
> the spec?
>
> In any case, it seems that retaining 'response_fields' (and possibly
> other relevant QS parameters) is a good thing and should be easy to do.
> I am forwarding this to Andrius; maybe he can have a closer look at the
> implementation and enhance it.

Matthew has opened an issue in the OPTIMADE specification repository [1]. I suggest continuing the discussion there so that all OPTIMADE specification developers can take part.

[1] https://github.com/Materials-Consortia/OPTIMADE/issues/391

Best,
Andrius

--
Andrius Merkys
Vilnius University Institute of Biotechnology, Saulėtekio al. 7
LT-10257 Vilnius, Lithuania

From oarcelus at cicenergigune.com Thu Dec 2 17:05:27 2021
From: oarcelus at cicenergigune.com (Oier Arcelus)
Date: Thu, 2 Dec 2021 15:05:27 +0000
Subject: [Cod-bugs] Slow queries through OPTIMADE API
Message-ID:

Hi,

As a continuation of my previous question about the URL parameters on the 'next' links, I followed the workaround of Saulius Grazulis and added the URL parameters manually to the 'next' URL. And it worked.

However, now I am finding that COD takes very long to generate a response to the queries I am making. I posted the question in https://matsci.org/t/connection-to-cod-optimade-api-is-very-slow/39556/11 and there was some discussion, but finally the suggestion was to contact this address directly. Basically, the question is whether including response_fields that require returning the 'P1'-d cell makes responses take longer to generate. I did some small tests with different queries, which I re-post here for reference.
Queries with response_fields and nelements

https://www.crystallography.net/cod/optimade/v1/structures?filter=(elements%20HAS%20%20ALL%20"Li")%20AND%20(nelements=3)&response_fields=lattice_vectors,cartesian_site_positions,species,species_at_sites (3.8 s)

http://www.crystallography.net/cod/optimade/v1/structures?filter=(elements%20HAS%20%20ALL%20"Li",%20"Co")%20AND%20(nelements=3)&response_fields=lattice_vectors,cartesian_site_positions,species,species_at_sites (1 min)

http://www.crystallography.net/cod/optimade/v1/structures?filter=(elements%20HAS%20%20ALL%20"Li",%20"Co",%20"O")%20AND%20(nelements=3)&response_fields=lattice_vectors,cartesian_site_positions,species,species_at_sites (1 min 5 s)

Queries with only response_fields

http://www.crystallography.net/cod/optimade/v1/structures?filter=(elements%20HAS%20%20ALL%20"Li")&response_fields=lattice_vectors,cartesian_site_positions,species,species_at_sites (7.2 s)

http://www.crystallography.net/cod/optimade/v1/structures?filter=(elements%20HAS%20%20ALL%20"Li","Co")&response_fields=lattice_vectors,cartesian_site_positions,species,species_at_sites (1 min 14 s)

http://www.crystallography.net/cod/optimade/v1/structures?filter=(elements%20HAS%20%20ALL%20"Li","Co","O")&response_fields=lattice_vectors,cartesian_site_positions,species,species_at_sites (1 min 11 s)

Queries without any of the former

http://www.crystallography.net/cod/optimade/v1/structures?filter=(elements%20HAS%20%20ALL%20"Li") (1 s)

http://www.crystallography.net/cod/optimade/v1/structures?filter=(elements%20HAS%20%20ALL%20"Li","Co") (1 s)

http://www.crystallography.net/cod/optimade/v1/structures?filter=(elements%20HAS%20%20ALL%20"Li","Co",%20"O") (1.1 s)

You can see that querying for more elements in the composition takes much longer when response_fields are requested, with little difference whether or not I restrict nelements. Is there a specific reason behind this?

Best regards,
Oier.
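A minimal sketch of the workaround referred to at the top of this message (re-attaching 'response_fields' to each 'next' URL, and first checking whether it is already present, as Saulius suggested). It uses only Python's standard urllib.parse; the function name with_response_fields is hypothetical and is not part of pymatgen or of the COD implementation.

```python
from urllib.parse import parse_qsl, quote, urlencode, urlsplit, urlunsplit


def with_response_fields(next_url, response_fields):
    """Return next_url with 'response_fields' added unless already present.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(next_url)
    # Decode the existing query string into (key, value) pairs.
    params = parse_qsl(parts.query, keep_blank_values=True)
    # Only add the parameter if the server did not already include it.
    if not any(key == "response_fields" for key, _ in params):
        params.append(("response_fields", ",".join(response_fields)))
    # Re-encode; spaces become %20 and commas are left unescaped.
    query = urlencode(params, quote_via=quote, safe=",")
    return urlunsplit(parts._replace(query=query))
```

Applied to each 'next' link before following it, this keeps the requested fields the same from page to page; calling it on a URL that already carries 'response_fields' leaves that parameter untouched.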
From andrius.merkys at gmail.com Fri Dec 3 11:20:44 2021
From: andrius.merkys at gmail.com (Andrius Merkys)
Date: Fri, 3 Dec 2021 11:20:44 +0200
Subject: [Cod-bugs] Slow queries through OPTIMADE API
In-Reply-To:
References:
Message-ID:

Hi Oier,

On 2021-12-02 17:05, Oier Arcelus wrote:
> However, now I am finding that COD takes very long to generate a response
> to the queries I am making. I posted the question in
> https://matsci.org/t/connection-to-cod-optimade-api-is-very-slow/39556/11
> and there was some discussion, but finally the suggestion was to contact
> this address directly. Basically, the question is whether including
> response_fields that require returning the 'P1'-d cell makes responses
> take longer to generate. I did some small tests with different queries,
> which I re-post here for reference.

You are right to think that requiring P1 cells causes the delays. In the (T)COD we currently do not cache the restored P1 cells, so they are reconstructed on the fly. Dropping 'cartesian_site_positions', 'species' and 'species_at_sites' from 'response_fields' will definitely speed up your queries. So if you do not need these properties, please exclude them from 'response_fields'.

As for the execution times you cite, I can only speculate that the first 10 structures with Li are not too complicated, but whenever you ask for both Li and Co, the first 10 hits include a couple of larger structures with more atoms. Therefore the response times increase considerably.

Hope this helps,
Andrius

--
Andrius Merkys
Vilnius University Institute of Biotechnology, Saulėtekio al. 7
LT-10257 Vilnius, Lithuania
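
A minimal sketch of the advice above: building a query URL that leaves out the three properties named in the reply as requiring the reconstructed P1 cell. It only constructs the URL and sends no request. The names build_query_url and P1_FIELDS are hypothetical; the base URL is the one used in the queries earlier in this thread.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://www.crystallography.net/cod/optimade/v1/structures"

# Properties named in the reply above as requiring the restored P1 cell.
P1_FIELDS = {"cartesian_site_positions", "species", "species_at_sites"}


def build_query_url(optimade_filter, response_fields, base_url=BASE_URL):
    """Build a structures query URL without the P1-dependent fields.

    Hypothetical helper for illustration only.
    """
    # Keep the caller's order, dropping only the expensive properties.
    kept = [field for field in response_fields if field not in P1_FIELDS]
    params = [("filter", optimade_filter)]
    # Omit the parameter entirely if nothing is left to request.
    if kept:
        params.append(("response_fields", ",".join(kept)))
    # Spaces become %20, quotes %22; commas are left unescaped.
    return base_url + "?" + urlencode(params, quote_via=quote, safe=",")
```

For example, build_query_url('elements HAS ALL "Li","Co"', ["lattice_vectors", "species"]) requests only lattice_vectors. Whether that is enough depends on whether the site positions are actually needed downstream.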