The sunny but sometimes chill air of Harrogate this week was a good metaphor for the scholarly communications marketplace. Once worshippers at the shrine of the Big Deal, the librarians and information managers who form the majority of the 950 or so attendees now march to a different tune. From the form of the article to the nature of collaboration, this was a confident organization talking about the future of the sector. And at no point was this a discussion about more of the same. Three sunny days, but for publishers present there was an occasional chill in the wind.

I started the week with a particular purpose in mind: the current state of collaboration. I was impressed by the Hypothes.is announcement with Highwire (www.highwire.org). Some 3,000 journals now use open source annotation platforms like the not-for-profit Hypothes.is to encourage discoverable (and private) annotation. Not since Copernicus, when scholars toured monasteries to read and record the annotations on observations of the galaxies in copies of his texts, have we been able to track scholarly commentary on recent work and work in progress so completely. And no sooner had I begun talking about collaboration as annotation than I met people willing to take the ideas further, into the basis of real community-building activity.
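
Annotation platforms make this tangible in code. As a minimal sketch, assuming the public search endpoint of the Hypothes.is API behaves as I understand it (and with a purely illustrative article URL), this is roughly what "discoverable annotation" looks like to a machine:

```python
import requests

# Public search endpoint of the Hypothes.is API; no token is needed for
# public annotations. Endpoint and parameter names are my assumptions
# about the current public API and should be checked against its docs.
API = "https://api.hypothes.is/api/search"

def public_annotations(article_url, limit=20):
    """Fetch the public annotation layer attached to an article URL."""
    resp = requests.get(API, params={"uri": article_url, "limit": limit})
    resp.raise_for_status()
    return resp.json().get("rows", [])

# The article URL below is illustrative, not a real citation.
for note in public_annotations("https://example-journal.org/article/123"):
    print(note.get("user"), "::", note.get("text", "")[:80])
```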

It seems to me that as soon as a journal publisher imports an annotation interface, it invites scholars and researchers into a new relationship with its publishing activity. And for anyone seeking a defence against the perceived threat of ResearchGate or Academia.edu, the answer must lie in building patterns of collaborative annotation into the articles themselves, and in becoming the intermediary in the creation of community dialogue around issues in the scholarly workflow. So it seemed natural that my next conversation was with the ever-inventive Kent Anderson of RedLink, who showed me Remarq, now in beta and due to be formally launched on 1 May. Here discoverable annotations lie at the base of layered service environments which enable any publisher to create community around annotated discussion and turn it into scholarly exchange and collaboration. We have talked for many years about the publishing role moving beyond selecting, editing, issuing and archiving – increasingly, I suspect, the roles of librarians – towards the active support of scholarly communication. And this, as Remarq makes clear, includes tweets, blogs, posters, theses, books and slide sets as well as articles. Services like Hypothes.is and Remarq are real harbingers of the future of publishing, when articles appear on preprint servers, in repositories or from funder Open Access outlets, and the subject classification of the research matters less than who put up the research investment.

And, of course, the other change factor here is the evolution of the article itself (often ignored – for some reason we like talking about change but are reluctant to grip the simple truth that when one thing changes – in this case the networked connectivity of researchers – all the forms around it change as well, including the print-heritage research article). Already challenged by digital inclusivity – does it have room for the lab video, the data, the analytics software, the adjustable graphs and the replayable modelling? – it now becomes the public and private annotation scratchpad. Can it be read efficiently by a computer and discussed between computers? We heard reports of good progress on machine readability using open science Jupyter Notebooks; but can we fork or copy papers and manipulate them at will while still preserving the trust and integrity in the system that comes from being able to identify what the original was and always being able to revert to it? We have to be able to use machine analysis to protect ourselves from the global flood of fresh research – if the huge agenda was light anywhere, it was on how we absorb what is happening in India, China, Brazil and Russia into the scholarly corpus effectively. But how good it was to hear from John Hammersley of Overleaf, now leading the charge in connecting up the disconnected and providing the vital enabling factor to some 600,000 users via F1000, and thus in future to the funder-publisher mills of Wellcome and Gates; and to see Martin Roelandse of Springer Nature demonstrate that publishers can potentially join up dots too, with his SciGraph application for relating snippets, video, animations, sources and data.
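
On machine readability, it is worth remembering that a Jupyter notebook is just structured JSON, so a notebook-formatted paper can be walked cell by cell by a program. A minimal sketch using the standard nbformat library (the filename is illustrative):

```python
import nbformat

# A notebook is JSON holding a list of typed cells, so a "paper"
# published this way is machine-readable by construction.
# The filename is illustrative.
nb = nbformat.read("reproducible-paper.ipynb", as_version=4)

for cell in nb.cells:
    if cell.cell_type == "markdown":
        # The narrative prose of the paper.
        first_line = cell.source.splitlines()[0] if cell.source else ""
        print("PROSE:", first_line)
    elif cell.cell_type == "code":
        # Executable analysis that a reader (or machine) can re-run,
        # fork, and still diff against the original for provenance.
        print("CODE :", len(cell.source.splitlines()), "lines")
```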

Of course, connectivity has to be based on common referencing, so at every moment we were reminded of the huge importance of CrossRef and ORCID. Incontrovertible identity is everything, and I was left hoping that ORCID can fully integrate with the new CrossRef Event Data service, which uses triples in classical mode to connect references, relationships and mentions. Here again, in tracking 2.7 million events since the service's inception last month, they are already demonstrating the efficacy of the New Publishing – the business of joining up the dots.
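
To make "triples in classical mode" concrete: each Event Data record reduces to a subject-relation-object statement about a DOI. A minimal sketch, with invented event values for illustration (the live service returns much richer JSON):

```python
from collections import namedtuple

# A classical triple: subject -> relation -> object. Event Data records
# reduce to this shape; all values below are invented for illustration.
Event = namedtuple("Event", ["subject", "relation", "object"])

events = [
    Event("twitter://status/99887", "mentions", "doi:10.1000/example.1"),
    Event("doi:10.2000/example.2", "references", "doi:10.1000/example.1"),
    Event("orcid:0000-0000-0000-0001", "authored", "doi:10.1000/example.1"),
]

# Joining up the dots: everything the network says about one article.
doi = "doi:10.1000/example.1"
for e in events:
    if e.object == doi:
        print(f"{e.subject} --{e.relation}--> {e.object}")
```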

So I wish UKSG a happy 40th birthday – they are obviously in rude health. And I thank Charlotte Roueché, the closing speaker, for reminding me of Robert Estienne, whom I have long revered as the first master of metadata. In 1551 he divided the Bible into verses – and, to better compare Greek with Latin, he numbered them. Always good to recall the revolutionaries of the past!

PS. In my last three blogs I have avoided, I hope, use of the word Platform. Since I no longer know what it means, I have decided to ignore it until usage clarifies it again!

Three years ago, industry commentators in B2B began what has become a parrot-cry – "Look at the workflow!" – and I admit to being more guilty than most. All of a sudden we were looking at compelling applications where publishers/information providers/content companies discovered that they had data which really did facilitate decision-making, or in other ways enabled corporate workflows to function more productively, more effectively, more cheaply – and more quickly. And so, from payroll to procurement, from risk management to assured compliance, we have seen a wonderful rash of data-rich applications, with more to come as machine learning and AI sharpen the cutting edge of what we can do, and the formula subtly alters from "data as content injected into workflow adds value to workflow systems software" to "workflow systems software selects and licenses third-party data as content to support software-driven solutions".

Time to take stock? I think so. I still see clients who will always believe that their content is a more valuable part of the mix than anyone's system software. I work with B2B players who passionately believe that they should be fully integrated as users of software and content, and who produce a good deal of software themselves, but I work with very few companies who combine long workflow systems software – the sort that goes from the beginning of a process to the end – with having all of the data content needed to fuel the system and satisfy all of the decision-point needs on the way. I recall with great pleasure the IP Manager system built by Thomson Reuters IP (now Clarivate Analytics) to support pharma patent lawyers in managing the workflow of new patent activity, where the huge resources of the company fired up the decision process on claims and infringement. And then, at LexisNexis Risk, the purchase of ChoicePoint, a company that built its own Hadoop-derived database systems, gave opportunities to roll out decision-making systems for US domestic insurers, using its own data alongside federal and state data readily available under public licensing schemes.

But notice that both of these examples involve very large enterprises, and both are a few years old. Today I find fewer dramatic examples of people doing both, and more and more examples of the systems software and the data coming together independently. I had this in mind when looking at the deal which Thomson Reuters and SAP announced two weeks ago. It is undoubtedly a great deal for both parties: using the Thomson Reuters World-Check database to put some real teeth into the SAP Business Partner Screening service plugs the market-leading source of data on PEPs (politically exposed persons) and the other folks we must not trade with into the filtering systems of one of the enterprise software majors. I am sure that the royalties will be a high-margin delight, and the customers very happy – but those customers are SAP customers, I presume, and Thomson Reuters here is driving one whole element of the SAP service business. So who "owns" the end customer? SAP, because the service can only be used by an SAP-licensed user? Or do Thomson Reuters have an implicit "ownership" – once SAP's clients are into this service it will be very hard to change data sources, especially where no better one is available? But SAP's client is still only indirectly a Thomson Reuters customer, and would only become one if Thomson Reuters decided to build a service that emulated the SAP workflow, or the SAP client decided to go back to a less sophisticated Thomson-driven enquiry service.

And then my worries were exacerbated by a really interesting conversation with Aravo Solutions (Aravo.com). I would describe this company as a "lurker" – a seventeen-year-old start-up only now coming into its own time. Being so far in front of the game usually results in extinction, but in this case it has produced an exciting player writing custom and modular workflow software, applying it mostly in fintech markets, and at every stage licensing in best-of-breed content to supply its functionality with the data from which solutions can be derived. In that light its licensing partners are unsurprising: amongst them are Accuity (RBI), Dow Jones, D&B, Kompany, LexisNexis, Arachnys and Thomson Reuters. Powerful and valuable companies all, but none of them owning the end-user relationship with Aravo's clients, who include GE, Unilever and others.

I would not argue for a moment that you cannot run a rapid-growth, high-margin business on data licensing. And look at the rapidly failing B2B magazine markets, once the heartland of the sector. They owned the customer, in that they had a direct subscription relationship with him, but they did not really know him: they did not see that he was changing his nature until it was too late, and they continued to send formatted print and online products to him long after the point of relevance was lost. My point is simply that if your new business model is based on being a third party in a licensing relationship, how do you know what is working and what is not? Is your ability to innovate thus limited by your software partner's understanding of what is happening? And as the complexity of Big Data subject to AI and machine learning grows greater, can your partners control your margins as well, while you still have to re-invest in more data enrichment to keep your place in the market?

Being the data content partner is not a bed of licensing roses. Things are changing really fast now. Some bigger players can migrate to full-service offerings. Others will buy Aravo's peers and seek niche dominance. But for the very many smaller B2B players who have firmly implanted themselves as data suppliers, a very uncomfortable situation is developing: they may be able to own neither the customer relationship nor data-access pricing. The new position is called Powerlessness.
