Dear reader, I am aware that I have been a poor correspondent in recent weeks, but in truth I have been doing something I should have done long ago: gaining some experience of AI companies, talking to their potential customers and reading a book. Let’s start at the end and work backwards.

The book that has eaten the last week of my life is Edward Wilson-Lee’s fine new publication, The Catalogue of Shipwrecked Books, which describes the eventful life of Christopher Columbus’ illegitimate son, Hernando, and his attempts to build a universal library of human knowledge. Hernando collected printed works, including pamphlets and short works, in an age when many scholars still regarded all print as meretricious rubbish. He built a catalogue of his collection, and then realised that he could not search it effectively unless he knew what was in the books, so started compiling summaries – epitomes – and then subject indexing, as well as inventing hieroglyphs to describe the physical properties. In other words, in the 1520s in Seville he built an elaborate metadata environment, but was eventually defeated by the avalanche of new books pouring out of the presses of Venice and Nuremberg and Paris. Wilson-Lee very properly draws many parallels with the early days of the Internet and the Web.

As I closed this wonderful book, my mind went back to an MIT Media Lab talk in 1985 given by Marvin Minsky. We need reminding how long the central ideas of AI have been with us. At the end of his talk, the Father of AI kindly took questions, and a tame librarian in the front row asked “Professor, if you were looking back from some inconceivably distant date, like, say, 2020, what would surprise you that you have in 2020 but which we do not have now?”. After a thoughtful moment, the great man replied “Well, I guess that I would praise your wonderful libraries, but still be surprised that none of the books spoke to each other”. At that he left the room, but from then the idea of books interrogating books, updating each other and creating fresh metadata and then fresh knowledge in the process of interaction has been part of my own Turing test. So I find it easy to say that we do not have much AI in what we call the information industry. We have a meaningless PR AI, a sort of magic dust we sprinkle liberally (AI-enhanced, AI-driven, AI-enabled etc) but few things pass the “books speaking to books and realising things not known before” test.

And yet we can and we will. The key questions are, however: will current knowledge ownership permit this without a struggle, and will there be a dispute over the ownership of the results of these interactions? This battle is already shaping up in academic and commercial research, so it was dispiriting to find when talking to AI companies that it seems there is really no business model in place yet enabling co-operation. Partly this is a problem of perception. Owners and publishers see the AI players as technicians adding another tier of value under contract – and then going away again. The AI software developers see themselves as partners, developing an entirely new generation of knowledge engine. And neither of them will really get anywhere until we all begin to accept the implications of the fact that no one, not even Elsevier, has enough stuff in one place to make it work at scale. And while one can imagine real AI in broad niches – Life Sciences – the same still applies. And if we try it in narrow niches, how do we know that we have fully covered the crossovers into other disciplines which have been so illuminating for researchers in this generation? In our agriscience intelligent system how much do we include on food packaging, or consumer market research, or plant diseases, or pricing data?

So what happens next? In the short term it is easy to envisage branded AI – Elsevier AI, Springer Nature AI? I am not sure where this gets us. In the medium term I certainly hope to see some data sharing efforts to invest in AI partnerships and license data across the face of the industry. It is true that there are some players – Clarivate Analytics, for example, and in some ways Digital Science – who are neutral to the knowledge production cycle and have hugely valuable metadata collections. They could be a vital building block in joint ventures with AI players, but their coverage is still narrow, and in the course of the last month I even heard a publisher say “I don’t know why we let Clarivate use our data – we don’t get anything for it!”.

Of course, unless we share our data we are not going to get anywhere. And given the EU Parliament’s rejection of data metering and enhanced copyright protection last week, all these markets are wide open for massive external problem solving – who remembers Google Scholar? The solution is clear – we need a collaborative model for data licensing and joint ownership of AI initiatives. We have to ensure that data software entrepreneurs get a payback and that investment and data licensing show proper returns, just as Hernando rewarded the booksellers who collected his volumes all across Europe. In a networked world collaboration is often said to be the natural way of working. It is probably the only way that AI can be fully implemented by the scholarly communications world. Hernando died knowing his great scheme had failed. AI will succeed if it shows real benefits to research and those who fund it. As it succeeds it will find other ways of sourcing knowledge if those who commercially control access today are not able to find a way of leading the charge, rather than being dragged along in its wake.

Since I last wrote a piece here I am older by three conferences and an exhibition. And no wiser for having spoken twice on cyber-security, a subject that baffles me every time I stand up to talk about it. The simple truth is that the world is changing in the networks at a pace that bewilders, yet the visions we have of where we are going hang before us, tantalising but currently unattainable. Thus, if you ask me about the future of education, I can spin you a glowing tale of individuals learning individually, at their own pace, yet guided by the learning journey laid out by their teachers, who have now become their mentors. The journey is self-diagnostic and self-assessing, examinations have become redundant and we know what everyone knows and where their primary skills lie. Or in academic or industrial research, projects are driven by results, research teams recruited on that basis, and their reputation is scored in terms of the value their peers set on their accomplishments. The results of research are logged and cited in ways that make them accessible to fellow researchers in aligned fields – by loading and pointing to evidential data, or noting results and referencing them on specialized or community sites, or by conventional research reporting. Peer review is continual, as research remains valid until it is invalidated and may rise and fall in popularity more than once. And so on through business domains, medicine and healthcare, agriculture and the whole range of human activity…

But at this point, when I talk about the growing commonality of vision, the role of workflow analysis, RPA, what happens next with machine learning, the eventual promise of AI, a hand shoots up and I find myself answering questions from the ex-CFO/now CEO about next year’s budget, and when will the existing IT investment pay back, and can this all be outsourced and surely we don’t need to do any more than buy the future when it arrives? And of course these questions are all very pertinent. We all need to assure revenues and margins next year if we are to see any part of this future. And next year’s revenues will come from products and services which will look more like last year’s than they do like the things we shall be doing in 2025, even if we had an idea of what those might be. It is one thing knowing something about the horizons, quite another to design a map to get there. So at every point we seek every way we can to buttress future-proofing, and at the moment I am seeing a spate of that in acquisitions. Just as last year putting the word “Analytics” at the end of your name (Clarivate, Trevino) added a billion to the exit valuation, so this year the .ai suffix has proved to be a real M&A draw.

But those big Analytics sales were made, and will be onsold to people who want to expand their data and services holdings. The .ai sales are transplants from the seedbed, and far earlier stages of transplantation are involved. Having worked for some years as an advisor to Quayle Munro (now, as an element of Houlihan Lokey, part of one of the largest global M&A outfits) I realise that smaller and smaller sales may not be considered a good thing, but I cannot resist the idea that seeding some future tech developments into your incubator environment is going to have some really beneficial long-term effects. It already has at Digital Science. As Clarivate learns from what Kopernio knows, it will help. As the magic of Wizdom.ai rubs off on T&F, it will help there.

But, again, we are begging a hundred questions. Can you really future-proof by buying innovation? Well, only to a limited extent, but by having innovators inside you can learn a lot, at least from their different perspective on your existing customers. Don’t you need to keep them from being crushed by the managerial bureaucracy of the rest of the business? Yes, but why not try to free up the arthritic bits rather than treating the flexible bits? What if you have bought the wrong future tech? Even the act of misbuying will give you useful pointers the next time round, but if you have bought the right people they will be able to change direction. What if software people and text publishing people do not get on? They will need to be managed – this is your test – since if we fail the future will be conditioned entirely by software giants licensing data from residual fixed income publishers.

Are there any conditioning steps I should be taking to ease into this future? Yes, forget ease and go faster. Look first at your own workflow. To what extent is workflow automated? Do you have optimum ways of processing text? Are people or machines taking the big burdens on proofreading, or desk editing or manuscript preparation? Is your marketing as digital as it could be? Are you talking the language of services, and designing solutions for your users, or are you giving your users reference sources and expecting them to find the answers? Indeed, do you talk the language of solutions, or the ritual language of format – book, journal, page, article? Are we part of the world our users are entering, or are we stuck in the world they are exiting? The exhibition I attended this month was the London Book Fair. I love it in all its inward-looking entrancement with itself, and its love affair with the title Publisher, the profession for which no qualification other than skill at explaining away unsuccess has ever been required. I can only take one day since I rapidly become depressed. But still there were very sparky moments – an impromptu discussion with the Chennai computer typesetter TNQ (www.tnq.co.in) about their ProofControl 3.0 service told me that these guys are on the ball. But moments like this were rare. More often I felt I was watching the future – of the industry in 1945!
