Jan
18
Workflow from the Bottom Up
Filed Under Big Data, Blog, Industry Analysis, internet, Publishing, Reed Elsevier, Search, semantic web, social media, STM, Uncategorized, Workflow | 5 Comments
Trends and trending analysis are one thing; making an impact on the way people work is often quite another. So while I respectfully write up the huge progress being made to provide large scale tools for analytical discovery in unimaginable quantities of data, a small portion of me remains skeptical about the impact of these developments in the short term on the working lives of professionals. Look at researchers in science and technology: you can readily imagine the impact of Big Data on Big Pharma, but can you so easily imagine what this will mean in materials science? Or can you see how the workbench performance of the individual researcher in neuroscience might be impacted? It's tough, and because it is tough we go back to saying that the traditional knowledge components will last the course. So if you have a good library, access to a reasonable collection of journals and the ability to network with colleagues then that is enough. Or Good Enough, as we keep saying.
So when I read the words “This is important not only for the supplementary data accompanying one’s experiment, but even negative results” I came alive immediately and read consciously what I had hitherto skipped. You see, in all the years that I have spoken with and interviewed researchers, when we get off the formal ground of OA or conventionally published articles, or the iniquities of publishers and the inadequacy of librarians, we get back to some stubborn issues that cling to the bottom of the bucket. One is what to do with the remaining content derived from the research process which did not get into the article, where it was summarized and where conclusions were drawn from it. I mean the statistical findings, the raw computations, the observations and logs, the audio and video diaries, the discarded hypotheses and so on. Vital stuff, if anyone is going to walk that way again. Even more vital is the detritus of failure: the experiment which never made a paper since it demonstrated what we already know, or where the model proved inadequate to demonstrate what we sought to show. Researchers going back to find why a generation of research went astray from a finding that proved fallible often need this content: in terms of detective fiction it is the cold case evidence. Yet more often than not it is not available.
So here is what I found in the nearly discarded press release. Nature Publishing’s Digital Science company (yes, them again!) have refinanced figshare (http://figshare.com) and yesterday they relaunched it. What does it do? It archives all the stuff I have been talking about, providing a Cloud environment with unlimited public storage. They call it “a community-based open data platform for scientific research”. I call it a wonderful way of embedding research workflow into a researchable storage environment that eventually becomes a search magnet for researchers wanting to check the past for surprising correlations. At the moment it is just a utility, a safe place to put things. But if I just add a copy of the article itself then it becomes a record of a research process. Put hundreds of thousands of those together and then you have a Big Data playground. Use intelligent analytics and new insights can be derived, and science moves forward on the tessellate of previous experimentation, only quicker, with less effort and more productivity for the researcher. And much less is lost, including the evidence from the wrong turnings that turned out to be right turnings. (http://digital-science.com/press-releases/)
So will there be 20 of these? Well, there may be two, but if figshare gets an early lead perhaps there will only be one. After all, the reason researchers would come to value this storage would be having their content in close proximity to others in their field. And while early progress is likely to be quickest in Life Sciences, this application has relevance in every field of study. And it also calls into question ideas of what “publishing” actually is. By storing and making available these data, are figshare “publishing” them? They are certainly not editing or curating them. Network access alters many things and here, once again, it catches publishing on the hop. If traditional publishers confine themselves to making margins solely from the first appearance of an article then traditional publishing in this sector is in severe difficulty, whatever happens to the Open Access debate. Elsevier and Nature clearly get it: go upstream in value terms or drown in commoditized content where you are. But does anyone else see it? And why not?