So there was a word for it after all. A kindly soul at a conference last week, seeing that I was unable to describe the strange digital burbling that took place when you dialled up a database in 1979 and inserted the telephone handset into the acoustic coupler, shouted out the correct expression – the noise was “gribbling”, and I was delighted to be reunited with a term which should never have been lost. And it allows me to remark, if I have not lost you already, that it is a mature industry whose terms of art, invented for a purpose, have now fallen into disuse because the processes they describe have become redundant. I expect to have to explain to my children how my typographer’s ruler works, or what slug setting, galleys, heavy leading or hot metal meant. The fact that the first-generation digital expressions are already themselves redundant (who last saw an acoustic coupler?) tells an important story.

And that story is particularly relevant to the fascinating conference that I was attending. Last week’s seminar on “Ready for Web 3.0?”, organized by ALPSP and chaired by Louise Tutton of Publishing Technologies, was just what the doctor ordered in terms of curing us of the idea that we still have time to decide whether or not to embrace the semantic web. It is here, and in scholarly publishing terms it is becoming the default embedded value, the new plateau onto which we must all struggle in order to catch our breath while building the next level of value-add that forms the expectation of users coming to grips with a networked information society today. And from the scholarly world it will spread everywhere. I will put my own slides from the introductory scene-setting on this site, but if you can find any of the meaty exemplar presentations from ALPSP (it is worth joining them if they are going to do more sessions of this quality) or elsewhere then please review them carefully. They are worth it.

Particularly noteworthy was a talk by Professor Terri Attwood and Dr Steve Pettifer from the University of Manchester (how good to see a biochemistry informatician and a computer scientist sharing the same platform!). They spoke about Utopia Documents, a next-generation document reader developed for the Biochemical Journal, which identifies features in PDFs and semantically annotates them, seamlessly connecting documents to online data. All of a sudden we are emerging onto the semantic web stage with very practical and pragmatic demonstrations of the virtues of Linked Data. The message was very clear: go home and mark up everything you have, for no one now knows what content will need to link to what in a web of increasing linkage universality and complexity. At the very least, everyone who considers themselves a publisher, and especially a science publisher, should read the review article by Attwood, Pettifer and their colleagues in the Biochemical Journal (Calling International Rescue: knowledge lost in literature and data landslide!, http://www.biochemj.org/bj/424/0317/bj4240317.htm). Incidentally, they cite Amos Bairoch and his reflections on annotation in Nature Precedings (http://precedings.nature.com/documents/3092/version/1), and this is hugely useful if you can generalize from the problems of biocuration to the chaos that each of us faces in our own domains.
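For those who have not yet had to do it, “marking up” here simply means attaching machine-readable statements (triples) to the things an article mentions, so that software can follow the links. Here is a minimal sketch using the Python rdflib library; the article identifier is a hypothetical example of my own invention, though the UniProt URI pattern, at least, is real:

    # A minimal Linked Data sketch with rdflib. The article URI is
    # hypothetical; the UniProt URI pattern is genuine.
    from rdflib import Graph, Literal, URIRef
    from rdflib.namespace import DCTERMS

    g = Graph()

    article = URIRef("http://example.org/articles/bj4240317")   # invented ID
    protein = URIRef("http://purl.uniprot.org/uniprot/P05067")  # a UniProt record

    # Two triples: the article has a title, and it references a protein record.
    g.add((article, DCTERMS.title, Literal("Calling International Rescue")))
    g.add((article, DCTERMS.references, protein))

    print(g.serialize(format="turtle"))

Once statements like these exist, any system that understands RDF can follow the link from the paper to the data, which is exactly the connection Utopia Documents makes visible to the reader.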

Two other aspects were intriguing. Utopia Documents had some funding from the European Commission, EPSRC, BBSRC, the University of Manchester and, above all, the BJ’s publisher, Portland Press. One expects the public bodies to do what they should be doing with the taxpayer’s cash: one respects a small publisher putting its money where its value is. And in another session, on the semantic web collaboration between the European Respiratory Society and the American Thoracic Society, felicitously called “Breathing Space”, we heard that the collaborators created some 30% of the citations in respiratory medicine, and that their work had the effect of “helping their authors towards greater visibility”. Since that is why the industry exists, it would seem that the semantic promise underpins the original publication promise. Publishers should be creating altars for the veneration of St Tim Berners-Lee and dedicating devotions to the works of Shadbolt and Hall, scholars of Southampton.

Sadly they are not, but coming out of this day of intense knowledge sharing one could not doubt that the semantic web, aka Linked Data, had arrived and taken up residence these several years in scientific academe. Now if it will only bite government information and B2B then we shall be on our way. And, as Leigh Dodds of Talis reminded us, we shall have to learn a new language on that way. Alongside new friends like ontologies and entity recognition and RDF, add RDFa, SKOS (Simple Knowledge Organization System to you!), XCRI education mark-up, OpenCalais (go to Thomson Reuters for more), triples, Facebook Open Graph, and Google Rich Snippets. Even that wonderful old hypertext heretic Ted Nelson got quoted later in the day: “Everything is deeply intertwingled”. And let’s remember, this is not a “let’s tackle these issues at our own pace when we think the market is ready” sort of problem: it is a “we are sinking under the weight of our own data and the lifeboat was needed yesterday” sort of problem. Publishers must tackle it: if the rest of us learn how to resolve it without intermediaries, then we certainly shall not need publishers.
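To demystify at least one of those new friends: a SKOS vocabulary is nothing more exotic than a set of triples describing concepts and how they relate to one another. A toy sketch, again in Python with rdflib, with concepts invented purely for illustration:

    # A toy SKOS subject vocabulary expressed as RDF triples with rdflib.
    # Both concepts are invented for illustration.
    from rdflib import Graph, Literal, Namespace
    from rdflib.namespace import RDF, SKOS

    EX = Namespace("http://example.org/vocab/")
    g = Graph()
    g.bind("skos", SKOS)

    g.add((EX.respiratoryMedicine, RDF.type, SKOS.Concept))
    g.add((EX.respiratoryMedicine, SKOS.prefLabel, Literal("Respiratory medicine", lang="en")))
    g.add((EX.asthma, RDF.type, SKOS.Concept))
    g.add((EX.asthma, SKOS.prefLabel, Literal("Asthma", lang="en")))
    g.add((EX.asthma, SKOS.broader, EX.respiratoryMedicine))  # hierarchy, machine-readable

    print(g.serialize(format="turtle"))

Behind the intimidating acronym, in other words, sits something any indexer would recognize: a controlled vocabulary, only now in a form the whole web can consume.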

The question, when it came, was loaded in a way that I had not guessed at in advance, though I knew that its appearance was inevitable. I was speaking at an excellent MarkLogic breakfast briefing (the slides are on this site) last week and had chosen Super-distribution as my theme. I wanted to explore the argument, which I now encounter fairly regularly, that simply turning content into “workflow” is insufficient. Few content owners have enough content for complete workflow sequences. Ergo, third-party and client content must be imported and used in conjunction with the process tools and content supplied by the solutions vendor. The best way to make this work is to open up the APIs, allow major customers to customize to their own workflow under JV or service agreements, and learn from this how to mass-customize for smaller clients. This speeds up the development track for solution development, and utilizes the experience and technology savvy of major customers, who likewise get the benefit of learning from third-party users. For the content provider it can provide a lock-in, a market differentiation from other content providers, and a defence against that most feared of competitors – one’s own customers.
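To make “open up the APIs” concrete, the sketch below shows the sort of minimal, read-only content endpoint a major customer might call from inside its own workflow tools. Everything in it is hypothetical and illustrative, written here in Python with Flask; a real service would of course add authentication, entitlement checks and rate limiting:

    # A minimal sketch of a content API for third-party workflow integration.
    # All names and data are invented; this shows the shape of the interface,
    # not any particular vendor's product.
    from flask import Flask, abort, jsonify

    app = Flask(__name__)

    # Stand-in for the publisher's content store.
    ARTICLES = {
        "example-article": {
            "title": "An illustrative article record",
            "subjects": ["biocuration", "text mining"],
            "links": ["http://purl.uniprot.org/uniprot/P05067"],
        },
    }

    @app.route("/api/articles/<article_id>")
    def get_article(article_id):
        """Return one article's metadata so a client can stitch it into its own workflow."""
        article = ARTICLES.get(article_id)
        if article is None:
            abort(404)
        return jsonify(article)

    if __name__ == "__main__":
        app.run(port=8000)

The point is the shape of the thing: stable identifiers and plain, structured responses that a customer’s own systems can consume without having to adopt our software stack.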

So, my questioner asked, you really do mean that most content has little worth in isolation and that paywalls are unlikely to succeed? “Yes, I do” was the answer, and almost before it was out of my mouth I heard an echo of a conversation that must be happening across the information provider world right now, between senior commercial managers like my questioner and their group main board colleagues. “Information commoditized?”, say the latter, “tell me this isn’t true. Tell me it applies to network johnny-come-latelies like the Murdochs in collapsing markets like newspapers. And tell me that it will never apply to the wonderful content we bought last year at 12X EBITDA and which we so badly needed to complete our dataset, enable us to expand in Central Asia and illustrate the profound difference between ourselves and our hated competitor.”

And my friend, if he knows what is good for him, will say “Just so” and “I could not agree more”, but increasingly he will try to insert into the conversation things like “Should we really be trying to build workflow on our own: might we look for allies at IBM, SAP or Oracle?” or “Maybe our historical hated competitor is really our future best friend?” or “Surely collaborating on tools with Autonomy or its ilk makes more sense than pretending we can re-invent and own the history of software?” Then he can reasonably say “This is the last squeeze of the lemon, if that is what the content model has become – and now at least we know about the development track that takes us to the next good place. And our business must be based on margin improvement and future visibility of returns, not upon some historic fixation with content which is increasingly remote from a network-based service industry.”

Will they listen? I don’t know, but I am certain that the newspaper world was deaf to this dialogue. And I was very interested to see approval for Project Canvas in the UK last week. This creates a platform for the web integration of all free-to-air television in the UK. The Murdochs will inevitably feel that this competitively impacts their Sky franchise, but presumably, since it is clear that neither the Times nor the Sun can claim (remember “it was the Sun wot won it”?) to have delivered the UK coalition government, their political influence is deflating at the same rate as their readership.

Finally, on the same platform was Andy Stevens of IOPP, giving a splendid example of agile publishing using MarkLogic to create mobile content sets around their journals data. As they say, check it out (http://www.marklogic.com/news-and-events/news.html).
