Let me clear the way for a flow of words by first apologising to a critic of my blogging method. Thanks, AH, for your private communication. I am guilty as charged. I do indeed tend to sit down and start writing about whatever seems important to me. No, what I write is not meant to be funny, though I can see that in a laugh-starved world it could happen by accident. And, no, not all my readers drop out after two paragraphs: almost thirty per cent of the hardy souls are still there after 15 minutes, and surely not all of those left the machine on my page while making coffee or answering an urgent call of nature. But I am flattered that you chose to write, and please feel free to make additional constructive comments on where I might locate myself. And do remember that not all Russian programmers are trying to bring down the civilized world as we know it!

And thanks to the rest of you for bearing with me while I got that off my chest. Let me now take you back to 1993. I am working on the first internet-related project that we ever received, and part of the work involves interviewing university librarians about their future. Most were unimpressed by technology arguments, but one, I suspect the great and long-sighted Mel Collier, most recently at Leuven, said that in a properly networked world the researcher, the graduate student and the undergraduate could all take the university library home with them, though it might look slightly different in terms of access and facilitation according to who you were. And then, a week or so ago, I was talking to Jan Reichelt, co-founder of Mendeley and its former manager after the Elsevier acquisition. He and his colleagues are behind Kopernio (www.kopernio.com), the plug-in that allows researchers and others to record all of the permissions they have been granted by their libraries, take them home and use them as if they were sitting in the campus library building. And this is not unique – there are other systems, like Unpaywall.org, around the place. But if Kopernio gets the widespread adoption that I believe it will, then it is a game-changer – in the psychology of the researcher/student, in the sense of where research may properly be done, and in the personality of the library in the minds of its users.

Publishers should be queuing up to work with Kopernio. In the age of SciHub and downloadable papers on ResearchGate, students who can find everything they need online while drinking cocoa at home, or researchers who can use the library through the weekend to meet a project deadline, without infringing copyright but while increasing the satisfaction ratios of library contracts, are very valuable to librarian and publisher alike. And the fact that it has taken 25 years to get to this point underscores a very well-understood relationship: predictions of the extent of change in any networked domain are easy to make and, if we have enough imagination, the impacts can soon be appreciated and understood. Making the changes themselves takes two decades and more, and calibrating what those changes mean takes even longer. Truly the networked world is now very fast at identifying change, often very slow in adopting it fully, and hopeless at anticipating what happens next.

But Kopernio is important in another sense. It marks a further stage in the elision of roles in scholarly communication. The library that the researcher takes home is not only Science Direct and Springer Nature – it is PubMed and PLOS One, and of course the whole world of ab initio OA publishing. There will inevitably be a dominant technology in this space, and I can easily envisage it being Kopernio. At that point, one of the major publishers, migrating anxiously away from journals towards research data workflow and management, will want to buy it to automate the permissions behind discovery processes in researcher workflow. And then we are on the verge of a much larger change process. The new-style publisher will sell the research department the workflow software, the database technology, the means of connecting to stored knowledge and, increasingly, the tools for mining, sifting and analysing the data. Some of these will be implemented at university, library, research project or individual level. But the outcomes of research in the form of reports, findings, communication with funders and, eventually, research articles and publishable papers will all be outputs of the research process software, and there will be little point in taking the last-named elements apart to send them to a third party for remote publication. There is no mystery about technical editing that cannot be accomplished in a library or a research department, or by copying the publishers and using freelance specialists online. And most other so-called publishing functions, from copy-editing to proofreading, are semi-automated already and will be fully robotic very soon.
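To make the permissions mechanics concrete, here is a minimal Python sketch of the resolution step a Kopernio-style plug-in performs – my illustration, not Kopernio's actual code. The ENTITLEMENTS store and the proxy URLs are invented for the example; the Unpaywall endpoint is real, though the usage shown here is simplified.

    import requests
    from typing import Optional

    # Hypothetical record of the permissions a library has granted this user,
    # keyed by publisher host; a real plug-in would sync this from the
    # institution's proxy or link-resolver configuration.
    ENTITLEMENTS = {
        "sciencedirect.com": "https://www-sciencedirect-com.proxy.example.edu",
        "link.springer.com": "https://link-springer-com.proxy.example.edu",
    }

    def resolve_full_text(doi: str, publisher_host: str, email: str) -> Optional[str]:
        """Return a URL at which this user may legitimately read the paper."""
        # 1. Off campus but entitled: route the request through the library proxy.
        proxy = ENTITLEMENTS.get(publisher_host)
        if proxy:
            return f"{proxy}/doi/{doi}"
        # 2. No entitlement: fall back to a legal open-access copy via Unpaywall.
        resp = requests.get(f"https://api.unpaywall.org/v2/{doi}",
                            params={"email": email})
        if resp.ok:
            oa = resp.json().get("best_oa_location") or {}
            return oa.get("url_for_pdf") or oa.get("url")
        return None

The point of the sketch is that the user never needs to know which of the two routes was taken – the library simply appears to travel with them.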

There is a trail here which goes from Ubiquity Press (www.ubiquitypress.com) to science.ai. And it is happening, under the odd label of Robotic Process Automation (RPA), in every business market that I monitor. It is not really AI – it is machine learning and smart rule-based systems, which are far commoner than real AI. Back in 1993 we used to call internet publishing “musical chairs” – the game was to look at the workflows and decide, when the music of newly networked relationships stopped, who was sitting in whose chair. In those days we thought the Librarian was left standing up. But with the advantage of time I am no longer so sure. The Library seems on the verge of becoming the Publisher (note the huge growth of university presses in the US and UK), while the former journal publisher becomes the systems supplier and service operator. Simply making third-party content available may be too low-value a task for anyone to do commercially in an age of process automation.
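For readers who wonder what “rule-based” means in practice, here is a toy Python sketch – entirely my own invention, with made-up manuscript fields and actions – of the kind of deterministic routing step that RPA automates. No learning is involved; the rules simply encode decisions a person once made by hand.

    # Each rule pairs a deterministic test with the action a human once took.
    # The manuscript fields and actions are invented for illustration.
    RULES = [
        (lambda ms: ms["word_count"] > 8000, "route to long-form editor"),
        (lambda ms: not ms["orcid_verified"], "request ORCID verification"),
        (lambda ms: ms["figures"] and not ms["alt_text"], "flag for accessibility check"),
    ]

    def process(manuscript: dict) -> list:
        """Apply every rule and collect the resulting actions."""
        return [action for test, action in RULES if test(manuscript)]

    print(process({"word_count": 9200, "orcid_verified": True,
                   "figures": True, "alt_text": False}))
    # -> ['route to long-form editor', 'flag for accessibility check']

Scale that pattern across copy-editing, permissions and production and you have most of what currently travels under the RPA label.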

Now, Miss AH, was that wildly funny? If it was, please tell me the joke! But I do promise more RPA next time.

I am a passionate fan of the World of Open. Without a network ethic of Open we will strangle the world of connectivity with restriction and force ourselves into re-regulating just those things which we have so recently decongested. Yet not all the things we want to designate as Open will be open just because we put the word in front of the previous descriptor. In most instances being Open does not necessarily do the trick on its own – someone has to add another level of value to change Open meaning “available” to Open meaning “useful”. So, the argument that Open Access journals are an appropriate response to the needs of researchers and to a wider public who should be enjoying unfettered access to the fruits of taxpayer-funded research would seem to be a given. But creating real benefits beyond such general statements of Good seems very hard, and the fact that researchers cannot see tangible benefits beyond occupying the moral high ground probably connects with the grindingly slow advance of Open Access to around a quarter of the market in a decade.

I feel much the same about Open Citations. Reading the latest Initiative for Open Citations report at i4oc.org, I find really good things:

“The present scholarly communication system inadequately exposes the knowledge networks that already exist within our literature. Citation data are not usually freely available to access, they are often subject to inconsistent, hard-to-parse licenses, and they are usually not machine-readable.”
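That last complaint – machine-readability – is easy to make concrete. Here is a minimal Python sketch, mine and not I4OC's, of pulling an openly deposited reference list from Crossref's public REST API; for publishers who have not opened their references, the reference field is simply absent.

    import requests

    def open_references(doi: str) -> list:
        # Crossref's public works endpoint; I4OC participants deposit and
        # open their reference lists here. Field names are Crossref's own.
        msg = requests.get(f"https://api.crossref.org/works/{doi}").json()["message"]
        return [ref.get("DOI") or ref.get("unstructured", "?")
                for ref in msg.get("reference", [])]

When that call returns data under an open licence, the knowledge network the report describes stops being a proprietary asset and becomes raw material for anyone's analytics.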

And I see an impressive number of members, naturally not including either Clarivate or Elsevier. Yet using the argument above I would say that either of these is more likely to add real value to Open Citations than most of the members of the club. What we have here, all the time, is a constant effort to emulate a world which has now largely passed by, and we do it by trying to build value propositions from wholly owned artifacts or data elements, thus turning them into proprietary systems. This urge to monopoly is clearly being superseded: what has happened is that the valuation systems by which markets and investors measure value have not caught up with the way users acknowledge value.

Outside of the worlds of research and scholarly communication it seems clear that the most impressive thing you can say about the world of data is “Use it and lose it”. The commoditization of data as content is evident everywhere. The point about data is not Big Data – a once prominent slogan that has now diminished into extinction – but actionable data. The point is not collecting all the data into one place – it can stay wherever it rests as long as we can analyse and utilise it. The point is the level of analytical insight we can achieve from the data available, and this has much to do with our analytics, which is where the real value lies. Apply those proprietary analytics back into the workflow of a particular sector – the launch music around Artifacts in Healthcare in Cambridge MA was very noticeable last week – and value is achieved for an information system. And one day, outside of copyright and patents, and before we get to the P&L account, someone will work out how we index those values and measure the worth of a great deal of the start-up activity around us.

So from this viewpoint the press release of the week came from Clarivate Analytics, and did not concern Open at all directly. It concerned a very old-fashioned value indeed – Brand. If the world of scholarly communication is really about creating a reputation marketplace, then ISI, Eugene Garfield’s original vehicle for establishing citation indexing, from which he promulgated the mighty Impact Factor, is the historical definition point of the market in scholarly reputation. By refounding and relaunching it, Clarivate seem to me not just to be saying how much the market needs that sort of research right now, but to be aiming at the critical value-adding role: using the myriad data available to redefine the measurement of reputation in research. In many ways Open Citations will assist that, but the future will be multi-metric, the balance of elements in the analytics will be under greater scrutiny than ever before, and ISI will need to carry the whole marketplace with them to achieve a result. That is why you need a research institute, not just a scoring system. And even then the work will need to be kept in constant revision – unlike the Impact Factor, the next measure will have to be developed over time, and keep developing, so that it cannot be influenced or “gamed”. In the sense I have been using it here, ISI becomes the analytical engine sitting on top of all of the available but rapidly commoditising research data.
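What “multi-metric” might mean in the abstract is worth a moment. The sketch below is a deliberately naive Python illustration – every metric name and weight is invented – but it shows where the scrutiny will fall: not on the raw counts, which are commoditising, but on the weights, which carry the analytical judgement.

    # Toy composite reputation score. All metric names and weights are
    # invented; inputs are assumed pre-normalised to the 0..1 range.
    WEIGHTS = {"citations": 0.4, "data_reuse": 0.25,
               "peer_review_activity": 0.2, "altmetric_attention": 0.15}

    def reputation_score(metrics: dict) -> float:
        return sum(WEIGHTS[name] * metrics.get(name, 0.0) for name in WEIGHTS)

    print(reputation_score({"citations": 0.8, "data_reuse": 0.3,
                            "peer_review_activity": 0.6,
                            "altmetric_attention": 0.5}))
    # 0.8*0.4 + 0.3*0.25 + 0.6*0.2 + 0.5*0.15 = 0.59

Choosing and defending those weights, in public and under constant revision, is the research institute's job; computing the score is trivial.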

We have come very quickly from a world where owning copyrights in articles and defending that ownership was important, to this position of commoditized content and data as a precursor to analysis. But we must still be prepared for further shortening of the links and for cycle re-adjustments. Citations of evidential data, and citations of Findings-as-data without article publishing, will become a flood rather than the trickle they are now. Add in the vast swathes of second-tier data from article publishing in India, China or Brazil. Use analytics not just for reputational assessment, but also for trend analysis, repeat-experiment verification and clinical trials validation. We stand in a new place, and a re-engineered ISI is just what we need beside us.

 
