A sudden thought. Doing an interview with some consultants yesterday (we are fast approaching the season when some major STM assets will come back into the marketplace), I was asked where I had estimated Open Access would be now when I advised the House of Commons Science and Technology Committee back in 2007 on the likely penetration of this form of article publishing. Around 25%, I answered. Well, responded the gleeful young PhD student on the end of the telephone, our research shows it to be between 5% and 7%. Now, I am not afraid of being wrong (like most forecasters, I have plenty of experience of it!). But it is good to know why, and I suspect that I have been writing about those reasons for the last two years. Open Access, defined around the historic debate twixt Green and Gold, when Quixote Harnad tilted at publishers waving their arms like windmills, is most definitely over. Open is not, if by that we begin to define what we mean by Open Data, or indeed Open Science. But Open Access is now open access.

In part this reflects the changing role of the Article. Once the place of publisher solace as the importance of low-impact journals declined, it is now the vital source of the things that make science tick – metadata, data, abstracting, cross-referencing, citation, and the rest. It is now in danger of becoming the rapid act at the beginning of the process, the act which initiates the absorption of new findings into the body of science. Indeed, some scientists (Signalling Gateway provided examples years ago) prefer simply to have their findings cited, or to release their data for scrutiny by their colleagues. Dr Donald Cooper of the University of Colorado, Boulder, used F1000Research to publish a summary of data collected in a study that investigated the effect of ion channels on reward behavior in mice. In response to public referee comments he emphasized that he published his data set in F1000Research “to quickly share some of our ongoing behavioral data sets in order to encourage collaboration with others in the field” (http://f1000.com/resources/Open-Science-Announcement.pdf).

I have already indicated how important I think post-publication peer review will be in all of this. So let me now propose a four-stage Open Science “publication process” for your consideration:

1. The research team assembles the paper, using EndNote or another process tool of choice, but working in XML. They then make this available on the research programme or university repository, alongside the evidential data derived from the work.

2. They then submit it to F1000 or one of its nascent competitors for peer review at a fee of $1000. This review, over a period defined by them, will throw up queries, even corrections and edits, as well as opinions rating the worth of the work as a contribution to science.

3. Depending upon the worth of the work, it will be submitted/selected for inclusion in Nature, Cell, Science or one of the other top-flight branded journals. These will form an Athenaeum of top science, and continue to confer all of the career-enhancing prestige that they do today. There will be no other journals.

4. However, the people we used to call publishers and the academics we used to call their reviewers will continue to collect articles from open sources for inclusion in their database collections. Here they will perform entity extraction and other semantic analysis to build what they will claim as the classic environments that each specialist researcher needs to have online. They will provide search tools that let users search within the collection; within the collection plus all of the linked data held on the repositories where the original articles were published; or across the collection, the data, and every other article and data set that has been post-publication reviewed anywhere. They will become the Masters of Metadata, or they will become extinct. This is where, I feel, the entity or knowledge stores that I described recently at Wiley are headed. This is where old-style publishing gets embedded into the workflow of science.
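To make the metadata point concrete, here is a minimal sketch of that kind of entity extraction and indexing. The vocabulary, the article records and the repository identifiers are invented for illustration; a production service of the kind TEMIS or SciBite offer would rely on full semantic analysis rather than simple string matching.

```python
# Illustrative sketch of the "Masters of Metadata" idea: extract entities
# from open-access article abstracts and build a small searchable index
# linking articles to the entities they mention. Vocabulary and article
# records are hypothetical examples, not real repository data.

from collections import defaultdict

# Hypothetical controlled vocabulary of domain entities
ENTITY_VOCAB = {
    "ion channel": "CONCEPT",
    "reward behavior": "CONCEPT",
    "mice": "ORGANISM",
    "dopamine": "CHEMICAL",
}

# Hypothetical open-access articles harvested from repositories
articles = [
    {"id": "repo:0001",
     "abstract": "We studied the effect of ion channels on reward behavior in mice."},
    {"id": "repo:0002",
     "abstract": "Dopamine signalling in mice was measured after training."},
]

def extract_entities(text):
    """Naive dictionary-based entity extraction (case-insensitive)."""
    lowered = text.lower()
    return [(term, etype) for term, etype in ENTITY_VOCAB.items() if term in lowered]

# Build an inverted index: entity term -> set of article ids
index = defaultdict(set)
for article in articles:
    for term, _ in extract_entities(article["abstract"]):
        index[term].add(article["id"])

def search(entity_term):
    """Return ids of articles mentioning the given entity."""
    return sorted(index.get(entity_term.lower(), set()))

if __name__ == "__main__":
    print(search("ion channel"))  # ['repo:0001']
    print(search("mice"))         # ['repo:0001', 'repo:0002']
```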

So here is a model for Open Science that removes copyright in favour of CC licenses, gives scope for “publishers” to move upstream in the value chain, and to compete increasingly in the data and enhanced workflow environments where their end-users now live. The collaboration and investment announced two months ago between Nature and Frontiers (www.frontiersin.org), the very fast-growing Swiss open access publisher, seems to me to offer clues about the collaborative nature of this future. And Macmillan Digital Science’s deal on data with SciBite is another collaborative environment heading in this direction. And in all truth, we are all now surrounded by experimentation and the tools to create more. TEMIS, the French data analytics practice, has an established base in STM (interestingly, their US competitor, AlchemyAPI, seems to work mostly in press and PR analysis). But if you need evidence of what is happening here, then go to www.programmableweb.com and look at the listings of science research APIs. A new one this month is the BioMortar API, offering “standardized packages of genetic patterns encoded to generate disparate biological functions”. We are at the edge of my knowledge here, but I bet this is a metadata game. Or take ScholarlyIQ, a package to help publishers and librarians sort out what their COUNTER stats mean (endorsed by AIP); or the ReegleTagging API, designed for the auto-tagging of clean energy research; or, indeed, the OpenScience API, Nature Publishing’s own open access point for searching its own data.
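For readers who have not played with such services, the pattern is usually a simple HTTP query returning article metadata. The sketch below is purely illustrative: the endpoint, parameters and response shape are hypothetical stand-ins, not the documented interface of Nature’s OpenScience API or any of the services named above, so check each provider’s own documentation before building on it.

```python
# A hedged sketch of querying a science research API of the kind listed
# on programmableweb.com. Endpoint, parameters and response fields are
# hypothetical, used only to show the general pattern.

import requests

BASE_URL = "https://api.example-openscience.org/search"  # hypothetical endpoint

def search_articles(query, page=1, per_page=10):
    """Fetch one page of article metadata matching the query (illustrative only)."""
    response = requests.get(
        BASE_URL,
        params={"q": query, "page": page, "per_page": per_page},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()  # assume a JSON payload of article records

if __name__ == "__main__":
    results = search_articles("ion channels reward behavior")
    for record in results.get("articles", []):
        print(record.get("doi"), record.get("title"))
```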

And one thing I forgot. Some decades ago, I was privileged to watch one of the great STM publishers of this or any age, Dr Ivan Klimes, as he constructed Rapid Communications of Oxford. Then our theme was speed. In a world where conventional article publishing could take two years, by using a revolutionary technology called fax to work with remote reviewers, he could do it in four months. Dr Sam Gandy, an Alzheimer’s researcher, is quoted by F1000 as saying that his paper was published in 32 hours, and they point out that 35% of their articles take less than 4 days from submission to publication. As I prepare to stop writing this and press “publish” to instantly release it, I cannot fail to note that immediacy may be just as important as anything else for some researchers – and their readers.

Spring came late to Berlin this year, as elsewhere in Europe. But with the Spargel festival just starting, the trees in bud on Unter den Linden, the German courts ruling that you cannot re-sell an ebook and the German Government’s technical advisors indicating that government-funded research must be Open Access, it was clearly time to be there for the 10th annual Publishers’ Forum. Developed by Helmut von Berg and his colleagues at Klopotek, this has now clearly emerged as one of the leading places in Europe to talk about the future of what we are increasingly calling “networked publishing”. The meeting has moved from the Brandenburg Gate and the Pariserplatz back to the regenerating West Berlin of the Kurfurstendamm, but the urge to get to the roots of progressive development in what we once called the book business has not diminished.

By design and by accident (the loss of a keynoter) I played to more halls in this meeting than in any of the previous five that I have attended. Leave that to one side: my slideset is available under downloads on this site, and on the conference site at www.publishersforum.de you will find slides, summaries, images, videos and references (including a very interesting tweetstream at #publishersforum), as these meetings get increasingly blanket-documented with linked description, comment and commentary. Data, in fact. An audience of 350 people at work with speakers, organizers, and media to discuss and share. Collaboration. And that was the theme of the meeting – Collaboration in the Age of Data adds up to Networked Publishing.

And from these sessions it is now clear where we are headed. This Spring is definitive in ways that other Springs have not quite been. In every previous year you could be sure, here in thoughtful, conservative Germany, that someone would say that we were jumping the gun, that format would survive fragmentation, that the “book would never die”. No such voices this week. In an audience that loves books and lives by them, I felt an absolute certainty that while “book as comforting metaphor” would survive, my friends and colleagues in the body of the hall knew that they had entered the Age of Data. We described network publishing as allusive, particulate, and, above all, linked. We talked about workflow: our customers’ workflow as well as our own. This was the Age of Metadata as well as the Age of Data. Speaker after speaker spoke of the potential to release new value from content as data, and of the need for systems and services to support that monetization potential.

And the feedback loop was everywhere in evidence. The user, and the networked power of users, has completely shifted the balance from the editorial selectivity of gatekeeper producers to the individualized requirements of users. We once Pushed where now they increasingly Pull. But loyalty was not sacrificed on the way: if you provide solutions that fit user needs exactly, then you can experience what Jan Reichelt of Mendeley described in a private session as “amazing user love”. On the main agenda, Brian O’Leary spoke, with his usual lucid intelligence, on the disaggregation of supply, and amongst publishers Dan Pollock (formerly Nature, now Jordans) effectively defined the network publishing challenge (replete, like the auto industry, with a lack of standards), while Fionnuala Duggan of CourseSmart tracked the way in which the textbook in digital form becomes a change agent in conservative teaching societies while enabling the development of new learning tools. Kim Sienkiewicz of IIl demonstrated the semantic web at work in educational metadata. And Christian Dirschl of Wolters Kluwer Germany updated us on the continued development of the Jurion project, a landmark in semantic web publishing for lawyers.

Alongside the publishers stood the Enablers. Publishing seldom realises the value that it gets from its suppliers. Indeed, one of my current mantras is that the importance of software in the industry is now so great that few content players are not also software developers, and that the relationships they enter into with third parties are often no longer supplier agreements but really partnership, and often strategic alliance, agreements, and need to be recognized as such. They not only add value, but they materially affect the valuation of the content players themselves. It is no accident that it was Uli von Klopotek who opened this event for his company, and it was gratifying to see on the platform a range of services that are symptomatic of the re-birth described here. Hugh McGuire from Pressbooks in Canada exemplifies that enablement, as does Martin Kaltenbock of Austria’s Semantic Web Company. Jack Freivald of Information Builders, Adam DuVander of ProgrammableWeb, and Anna Lewis and Oliver Brooks of ValoBox were each able to demonstrate further value additionality through an elaboration of networked publishing. The result was a rich Gulaschsuppe of networked expedients (far more nourishing than the prevalent Currywurst of this city!).

The conference agenda spoke of momentum. Laura Dawson (Bowker), a prescient commentator, noted in her Open Book presentation how far we had gone. And if we still lack standards, we have people like BISG and EDItEUR on this agenda struggling towards them. One of the most attractive features of the old book business was its anarchic, “cottage industry” flavour. I think it will retain many anarchic and small-business qualities in the network, but it will be increasingly bounded by standards of networked communication.
