Jan 9
Post-Pub and Preprint - The Science Publishing Muddle
Filed Under B2B, Big Data, Blog, data analytics, healthcare, Industry Analysis, internet, Publishing, Reed Elsevier, Search, semantic web, STM, Uncategorized, Workflow | 2 Comments
New announcements in science publishing are falling faster than snowflakes in Minnesota this week, and it would be a brave individual who claimed to be on top of a trend here. I took strength from Tracy Vence's review, The Year in Science Publishing (www.the-scientist.com), since it did not mention a single publisher, confirming my feeling that we are all off the pace in the commercial sector. But it did mention the rise, or resurrection, of "pre-print servers" (an odd expression now, since no one has printed anything since Professor Harnad was a small boy, but a way of pointing out that PeerJ's PrePrints and Cold Spring Harbor's bioRxiv are becoming quick and favourite ways for life sciences researchers to get their data out there and into the bloodstream of scholarly communication). And Ms Vence clearly sees the launch of NCBI's PubMed Commons as the event of the year, confirming the trend towards post-publication peer review. Just as I was absorbing that, I also noticed that F1000, which still seems to me to be the pacemaker, had just recorded its 150,000th article recommendation (and a very interesting piece it was, on the effect of fish oil on allergic sensitization, but please do not make me digress…)
The important things about the trend to post-publication peer review are all about the data. Both F1000 and PubMed Commons demand the deposit or availability of the experimental data alongside the article, and I suspect that this will be a real factor in determining how these services grow. With reviewers looking at the data as well as the article, comparisons are already being drawn with other researchers' findings, and the evidential data throws up connections that do not appear if the article alone is searched in the analysis. F1000Prime now has 6,000 leading scientists in its Faculty (including two who received Nobel prizes in 2013) and a further 5,000 associates, but there must be questions still about the scalability of the model. And about its openness. One of the reasons why F1000 is the poster child of post-publication peer review is that everything is open (or, as they say in these parts, Open). PubMed Commons, on the other hand, has followed the lead of PubPeer and demanded strict anonymity for reviewers. While this follows the lead of the traditional publishing model, it does not allow the great benefit of F1000: if you know who you respect and whose research matters to you, then you also want to know what they think is important in terms of new contributions. The PubPeer folk are quoted in The Scientist as saying in justification that "A negative reaction to criticism by somebody reviewing your paper, grant or job application can spell the end of your career." But didn't that happen anyway despite blind, double blind, triple blind and even SI (Slightly Intoxicated) peer reviewing?
And surely we now know so much about who reads what, who cites what and who quotes what that this anonymity seems out of place, part of the old lost world of journal brands and Open Access. The major commercial players, judging by their announcements as we were all still digesting turkey, see where the game is going and want to keep alongside it, though they will farm the cash cows until they are dry. Take Wiley (www.wiley.com/WileyCDA/pressrelease), for example, whose fascinating joint venture with Knode was announced yesterday. This sees the creation of a Knode-powered analytics platform, provided as a service for learned societies and industrial research, allowing Wiley to deploy "20 million documents and millions of expert profiles" to provide society executives and institutional research managers with "aggregated views of research expertise and beyond". Anyone want to be anonymous here? Probably not, since this is a way of recognizing expertise for projects, research grants and jobs!
And, of course, Elsevier can use Mendeley as a guide to what is being read and by whom. Their press release (7 January) points to the regeneration of the SciVal services, "providing dynamic real-time analytics and insights into the… (Guess What?)… Global Research Landscape". The objective here is one dear to governments in the developed world for years: to help research managers benchmark themselves and their departments so that they know how they rank and where it will be most fruitful to specialize. We seem, quite predictably, to be entering an age where time to read is coming under pressure from the volume of available research articles and evidential data, so it is vital to know, and know quickly, what is important, who rates it, and where to put the most valuable departmental resources: time and attention span. And Elsevier really do have the data and the experience to do this job. Their Scopus database of indexed abstracts, all purpose-written to the same taxonomic standard, now covers some 21,000 journals from over 5,000 publishers. No one else has this scale.
The road to scientific communication as an open and not a disguised form of reputation management will have some potholes, of course. CERN found one, well reported in Nature's News on 7 January (www.nature.com/news) under the headline "Particle Physics papers set free". CERN's plan to use its SCOAP3 project to save participating libraries money, which was then to be disbursed to move journals to Open Access, met resistance, but from the APS rather than the for-profit sector. Meanwhile the Guardian published a long article (http://www.theguardian.com/science/occams-corner/2014/jan/06/radical-changes-science-publishing-randy-schekman) arguing against the views of Nobel laureate Dr Randy Schekman, the proponent of boycotts and bans for leading journals and supporters of impact factor measurement. Perhaps he had a bad reputation management experience on the way to the top? The author, Steve Caplan, comes out in favour of those traditional things (big brands and impact factors), but describes their practices in a way which would encourage an uninformed reader to support a ban! More valuably, the Library Journal (www.libraryjournal.com/2014/01) reports this month on an AAP study of the half-life of articles. Since this was done by Phil Davis it is worth some serious attention, and the question is becoming vital: how long does it take for an article to reach half of the audience who will download it in its lifetime? Predictably the early results are all over the map: health sciences are quick (6-12 months), but maths and physics, as well as the humanities, have much longer half-lives. So this is another log on the fire of the argument between publishers and funders over the length of Green OA embargoes. This problem would not exist, of course, in a world that moved to self-publishing and post-publication peer review!
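As a purely illustrative aside, and emphatically not Phil Davis's method or data, the half-life measure itself is easy to state in code: it is the point at which cumulative downloads first pass half of an article's lifetime total. A minimal sketch, with invented monthly figures:

```python
# Toy illustration of the download "half-life" question raised above.
# The monthly download counts are invented purely for the example.

def download_half_life(monthly_downloads):
    """Return the month (1-indexed) in which cumulative downloads
    first reach half of the article's lifetime total."""
    total = sum(monthly_downloads)
    running = 0
    for month, count in enumerate(monthly_downloads, start=1):
        running += count
        if running >= total / 2:
            return month
    return None

# A fast-decaying profile (health-sciences-like) versus a flat one (maths-like).
fast = [500, 300, 150, 80, 40, 20, 10, 5, 5, 5, 5, 5]
slow = [40] * 24

print(download_half_life(fast))  # 2 -- half the lifetime downloads within two months
print(download_half_life(slow))  # 12 -- a much longer half-life
```

The hard part, of course, is not the arithmetic but deciding when a "lifetime" of downloads can be said to be complete, which is exactly where the embargo argument bites.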
POSTSCRIPT For the data trolls who pass this way: the Elsevier SciVal work mentioned here is powered by HPCC (High-Performance Computing Cluster), now an Open Source Big Data analytics engine, but created by and for LexisNexis Risk to manage their massive data analytics tasks as ChoicePoint was absorbed and they set about creating the risk assessment system that now predominates in US domestic insurance markets. It is rare indeed among major information players to see technology and expertise developed in one area used in another, though of course we all think it should be easy.
Nov 29
Neon City
Filed Under B2B, Big Data, Blog, data analytics, Education, eLearning, Financial services, Industry Analysis, internet, mobile content, Publishing, Search, semantic web, social media, Uncategorized, Workflow | 2 Comments
Perhaps the one thing that Korean cities like Busan and Seoul have in common with Hong Kong is the neon. Looking out of a Hong Kong Club window earlier this week around 8 pm, I observed the office building light show, as each of the major buildings showed off its neon display in a winking cacophony of soundless light. Very impressive, and a good backdrop to the intellectual light show the next day, as the opening speaker at the Business Information Industry Association (BIIA) joint event with the Hong Kong Knowledge Management Society and Hong Kong Polytechnic University took us on an intellectual journey into artificial intelligence that the rest of us speakers struggled to emulate. But as an exercise in reconciling the thinking of CIO/CTO-level management with current developments in data analytics, the whole meeting could not have been better organized. As well as a vision of the potential futures in machine and system intelligence, it provided here-and-now guidance on the reasons why we need to start and continue down this path, and the benefits we may expect to gain from doing so.
But let me start at the beginning. The first speaker, Ben Goertzel, is an AI expert, innovator and entrepreneur. With his colleagues at Aidyia (www.aidyia.com), at Hong Kong Polytechnic University, and through his OpenCog Foundation (http://goertzel.org; http://opencog.org), he works both as a developer of new concepts and of applications in financial services and trading markets. In its own words, Aidyia "is developing advanced artificial intelligence technology to model and predict financial markets. Aidyia's predictive model will empower programmed trading systems for fund management." He gave as good a demonstration as you could wish for if you accept Ray Kurzweil's proposition that machine intelligence will exceed human intelligence by 2045. Yet much of his talk underlined the idea that you do not need belief: there is enough already in the marketplace to persuade us that we are progressively replacing certain tasks in society with machine intelligence, and that this is a beneficial process.
Here then was a session where the AGI element was centre stage, but you did not have to be a follower of Singularity theory or a robotics fanatic to carry away an enduring belief in the ability of men like Ben Goertzel to change entirely the basis upon which we look at the intelligence which we are currently building into the world of solutions and services. So it was heartening after this to hear Euan Semple drawing upon his valuable experience as head of social media at the BBC to persuade us that we need to grow up – and grow into our inheritance as the interpreters and re-users of the most valuable insights available in the unforced and natural communication within the social media network. In other words, social media are vital if we want to contextualize our analytics and give a reality check to what we are doing with data elsewhere.
Which nicely prepared us for the challenge of cloud computing, as presented by Professor Eric Tsui of HK Polytechnic University. But we were soon past conventional cloud environments: Professor Tsui believes, with great persuasive power, that we shall soon exhaust the cost reduction and compliance benefits of the current cloud collaboration. He calls this the "adolescent cloud", and excited all of us with a vision of the cloud playing a role in Open Innovation and in Connectionist learning: a Knowledge Cloud. The Tsui theory that the cloud will become a key element in new business model development and rapid iteration of service models is an attractive one, and it blends the cloud as a huge data repository firmly into other strands of developmental and analytical thinking. And these mental fireworks had scarcely died down before Professor Nicolas Lesca of the Université Claude Bernard in Lyon took the stage. His argument – and I suspect that we had the preliminaries of a presentation of several hours – is that data analytics now adds an extra dimension in a way that few of us had considered before. His arguments were all about the interpretation of weak signals: picking up messages from the data which might previously never have been heard or measured, let alone interpreted. How you amplify these signals, and separate noise from content, is the subject of Professor Lesca's research, and his thinking had a clear resonance with the debate in the room.
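Purely as an illustration of what picking up "weak signals" can mean in data-analytic terms, and not anything Professor Lesca presented, here is a toy sketch that flags values standing out against a rolling baseline; the window size and threshold are arbitrary assumptions:

```python
import statistics

def weak_signals(series, window=10, threshold=2.0):
    """Flag points that deviate from the rolling baseline by more than
    `threshold` standard deviations -- a toy stand-in for separating
    faint but meaningful signals from background noise."""
    flagged = []
    for i in range(window, len(series)):
        baseline = series[i - window:i]
        mean = statistics.mean(baseline)
        stdev = statistics.pstdev(baseline) or 1.0  # avoid division by zero
        if abs(series[i] - mean) / stdev > threshold:
            flagged.append((i, series[i]))
    return flagged

# Mostly steady noise with one faint shift near the end.
data = [10, 11, 9, 10, 10, 11, 9, 10, 10, 11, 10, 9, 10, 11, 10, 14, 10, 11]
print(weak_signals(data))  # [(15, 14)] -- the one point that stands out
```

The real research question, of course, is interpretation: deciding which of the flagged points carries meaning, which is a human and organizational problem as much as a statistical one.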
For those whose heads were aching with ideas overload, it was good that the last speaker was the present writer, trying to sum up and pull these themes together. Is there a dichotomy between knowledge management and so-called "Big Data"? Not in this conference. Speakers simply added richness and complexity to a view of knowledge management of increasing importance, one which subsumes the AI, social media, cloud computing and weak-signal themes. As a result, marketplaces for information grow more rewarding as well as more complex, and the skills base around knowledge work becomes ever more demanding. I hope the professors in our programme are as good at producing knowledge managers as they obviously are at knowledge research. And one last thought lingered in my mind. Several times in the day we hovered over search. The expression "needle in a haystack" was used, and we pointed out to each other how inappropriate it was. After all, if we knew we were looking for a needle, and that the place where it was to be found was a haystack, then the job was done. Bring in the metal detector! Yet the first image I recall from the early days of search was of a huge bale of documentation in an advertisement for BRS Search, with people crawling all over it: a veritable document haystack. If knowledge management has anything to do with that world then all is lost. If we have not disintegrated and disaggregated the document, then we are never going to get to a point of data granularity where this new world has a chance of working. At the moment, though no one said it here in Hong Kong, knowledge managers may be adherents of the brave new world when out of the office, but too many are prisoners of the wicked world of legacy document-based systems when they get home.
Please check the websites of BIIA (www.biia.com) and HKKMS (www.hkkms.hk) for further reports and the slide sets from this really interesting meeting.