[Cod-bugs] Missing Journal Information

Robert McMeeking - STFC UKRI robert.mcmeeking at stfc.ac.uk
Sun Aug 10 18:52:42 EEST 2025


Hi Saulius

Here is a list of “recovered” article titles. It needs a bit of work. For instance, there appear to be problems rendering Unicode characters after transfer from the Linux server to my laptop.

I will get back to you with further details when I get to check things out on Monday.

Regards

Bob



From: Saulius Gražulis <grazulis at ibt.lt>
Sent: 07 August 2025 14:42
To: McMeeking, Robert (STFC,DL,SC) <robert.mcmeeking at stfc.ac.uk>; 'Antanas Vaitkus' <antanas.vaitkus90 at gmail.com>
Cc: cod-bugs at ibt.lt
Subject: Re: [Cod-bugs] Missing Journal Information

On 2025-08-07 11:33, Robert McMeeking - STFC UKRI wrote:
I have done some tests on some of the entries with article title issues. The ones I checked do appear to have titles. I will try to get the corrections to you quite soon.
We can find COD entries without titles in the COD SQL database. I can send you a list of such entries or an SQL query if you would like.
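
A minimal sketch of what such a query might look like, wrapped in Python so it can be tried locally. The table name `data` and the columns `file`, `doi` and `title` are assumptions about the COD SQL schema, the helper name `entries_without_title` is made up, and the rows below are invented test data run against an in-memory SQLite database rather than the real COD server.

```python
import sqlite3

# Assumed schema: one row per COD entry in a table called "data",
# with "file" (COD ID), "doi" and "title" columns. Adjust to the real schema.
QUERY = """
SELECT file, doi
FROM data
WHERE title IS NULL OR TRIM(title) = ''
ORDER BY file
"""


def entries_without_title(conn):
    """Return (file, doi) pairs for entries whose title is NULL or blank."""
    return conn.execute(QUERY).fetchall()


if __name__ == "__main__":
    # Invented rows, for illustration only.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE data (file INTEGER, doi TEXT, title TEXT)")
    conn.executemany(
        "INSERT INTO data VALUES (?, ?, ?)",
        [
            (1, "10.1000/a", "Some title"),
            (2, "10.1000/b", None),
            (3, None, "   "),
        ],
    )
    for file_id, doi in entries_without_title(conn):
        print(file_id, doi)
```

The same SELECT statement should carry over to a MySQL client largely unchanged, provided the column names match.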


I have noticed problems with a number of the DOIs. My scripts appear to have problems with DOIs containing any of the characters: <>()

Lol – I had exactly the same problem :).

I think we need to use 'urlencode' on them before sending them to doi.org as a DOI request; it worked for me. I can send you my shell hack for this if that would help.
I can see how these characters might give problems in a Unix environment. But I assume I should be able to fix the problems. Having said that, I am a bit surprised that these characters are allowed in valid DOIs!

Yes, DOIs are more permissive than even URLs. No idea why they did that, but that's what we have. Let me know if you would like to look at any of my hacks on this.
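
A minimal sketch of the 'urlencode' step discussed above, written in Python rather than shell. The function name `doi_to_url` and the example DOI are made up for illustration and are not taken from any of the scripts mentioned in this thread. It assumes the DOI is resolved through the public https://doi.org/ resolver.

```python
from urllib.parse import quote

DOI_RESOLVER = "https://doi.org/"


def doi_to_url(doi: str) -> str:
    """Percent-encode a DOI so it can be placed in a doi.org URL."""
    # safe="/" keeps the slash between DOI prefix and suffix readable.
    # Everything else outside letters, digits and "_.-~" is escaped,
    # including "<", ">", "(" and ")".
    return DOI_RESOLVER + quote(doi.strip(), safe="/")


if __name__ == "__main__":
    # Hypothetical DOI containing the troublesome characters.
    print(doi_to_url("10.1000/a(b)<c>"))
```

Encoding inside the script also keeps the raw `<>()` characters away from the shell, where they would otherwise be interpreted as redirections or subshells unless quoted.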

Regards,
Saulius

PS. The 40k+ bibliographies for COD entries with missing years or page numbers have been downloaded. I'll have to look at them and sort them out (some have failed since the DOIs are not from journals but from university repos, some journals no longer hand out page numbers or do not include them in DOI-derived bibliography files...). We'll see how it works, but a lot of COD entries can be fixed now.

S.G.

--

Dr. Saulius Gražulis

Vilnius University Institute of Biotechnology, Saulėtekio al. 7

LT-10257 Vilnius, Lietuva (Lithuania)

mobile: (+370-684)-49802, (+370-614)-36366

--
This message has been scanned for viruses and
dangerous content by MailScanner<http://www.mailscanner.info/>, and is
believed to be clean.


-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.crystallography.net/pipermail/cod-bugs/attachments/20250810/610a9615/attachment-0001.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: saulius_noart_tit.text
Type: application/octet-stream
Size: 84215 bytes
Desc: saulius_noart_tit.text
URL: <http://lists.crystallography.net/pipermail/cod-bugs/attachments/20250810/610a9615/attachment-0001.obj>

