<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DavidWorlock.com &#187; semantic web</title>
	<atom:link href="http://www.davidworlock.com/category/semantic-web/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.davidworlock.com</link>
	<description></description>
	<lastBuildDate>Mon, 06 Feb 2012 12:07:56 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>KISS &#8211; but don&#8217;t Tell</title>
		<link>http://www.davidworlock.com/2012/01/kiss-but-dont-tell/</link>
		<comments>http://www.davidworlock.com/2012/01/kiss-but-dont-tell/#comments</comments>
		<pubDate>Sun, 29 Jan 2012 20:09:32 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[B2B]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Financial services]]></category>
		<category><![CDATA[healthcare]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[mobile content]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[STM]]></category>
		<category><![CDATA[Thomson]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=1096</guid>
		<description><![CDATA[&#8220;Keep it Simple, Stupid&#8221; was an acronym I brought home from the first management course I ever attended yet it has taken me years to find out what it really means. There are, clearly, few things more complex than simplicity, and one man&#8217;s &#8220;Simple&#8221; is another man&#8217;s Higgs Boson. So I was very energised to [...]]]></description>
			<content:encoded><![CDATA[<p>&#8220;Keep it Simple, Stupid&#8221; was an acronym I brought home from the first management course I ever attended yet it has taken me years to find out what it really means. There are, clearly, few things more complex than simplicity, and one man&#8217;s &#8220;Simple&#8221; is another man&#8217;s Higgs Boson. So I was very energised to have a call last week from an information industry original who has been offering taxonomy and classification services to the information marketplace since 1983. When I first met Ross Leher in the late 1980s we were both wondering how far we would have to go into the 1990s until information providers recognized that they needed high quality metadata to make their content discoverable in a networked world. Ross had sold his camera shop to take the long bet on this, but he worked at his new cause with a near religious persuasion, as I realised when I went to see him in the 1990s at his base in Denver, Colorado. Denver at that time was home to IHS, whose key product involved researching regulatory material from a morass of US government grey literature. Denver people did metadata. It was a revolution waiting to happen.</p>
<p>So when I heard his voice on the phone last week my first emotion was relief &#8211; that he had not simply given up and retired to Florida &#8211; and then agreement. Yes, we were 15 years too early. And many of the people we thought were primary customers, like the Yellow Page companies, the phone books and the industrial directories, are now either dead or dying, or in the trauma of complete technological makeover. Ross&#8217;s company, WAND Inc (<a href="http://www.wandinc.com">www.wandinc.com</a>), is now very widely acknowledged as a market-leading player in horizontal and multi-lingual taxonomy and classification development. They are the player you go to if you have to classify content, if you are in a cross-over area between disciplines (he has a great case study around taxonomies for medical image libraries), and if you have real language problems (&#8220;make this search work just as effectively in Japanese and Spanish&#8221;). What they do is really simple.</p>
<p>Your taxonomy requirement is going to start with broad terms that define your content and its area of activity. These can then be narrowed and specified to give additional granularity in any specific field. These classifications can be incorporated into the WAND Preferred Term Code, given a number, and used in a programmatic, automated way to classify and mark up your content (<a href="http://www.datafacet.com">www.datafacet.com</a>). Preferred terms can be matched to synonyms, and the codes can be used to extend the process to very many different languages. So someone whose company, for example, was created in Spanish can be found in the same list as someone who has a Japanese outfit, as the result of a search made by a Chinese user working in Chinese.</p>
<p>And from synonyms we can extend the process to extended terms themselves, and then map the WAND system to third party maps &#8211; think of UNSPSC, Harmonized Codes or NAICS, as well as those superficial and now dwindling Yellow Page classifications. WAND can isolate and list attributes for a term, and can then add brand information. All of these activities add value to commoditized data, and one would think that the newspaper industry at least would have been deep into this for 15 years. Yet few examples exist which demonstrate this, Factiva being an honourable one.</p>
<p>Not the least interesting part of Ross&#8217;s account of the past few years was the interest now shown by major enterprise software and systems players in this field of activity. Reports from a variety of sources (IDC, Gartner) have highlighted the time being wasted in internal corporate search. Both Oracle and Microsoft have metadata initiatives relevant to this, and it still seems to me more likely that Big Software will see the point before the content industry itself. With major players like Thomson Reuters (Open Calais) deeply concerned about mark-up, there are signs that an awareness of the role of taxonomy is almost in place, but as the major enterprise systems players bump and grunt competitively with the major, but much smaller, information services and solutions players, I think this is going to be one of the competitive areas.</p>
<p>And there is a danger here. As we talk more and more about Big Data and analytics, we tend to forget that we cannot discard all sense of the component added value of our own information. We know that our content is becoming commoditized, but that is not improved by ignoring now conventional ways of adding value to it. We also know that the lower and more generalized species of metadata are becoming commoditized; look for instance at the recent Thomson Reuters agreement with the European Commission to widen the ability of its competitors to utilize its RICs equity listings codes. This type of thing means that, as with content, we shall be forced to increase the value we add through metadata in order to maintain our hold on the metadata &#8211; and content &#8211; which we own.</p>
<p>And, one day, the only thing worth owning &#8211; because it is the only thing people search and it produces most of the answers that people want &#8211; will be the metadata itself. When that sort of sophisticated metadata becomes plugged into commercial workflow and most discovery is machine to machine and not person to machine we shall have entered a new information age. Just let us not forget what people like Ross Leher did to get us there.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2012/01/kiss-but-dont-tell/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Manufacturing/Motoring/Media</title>
		<link>http://www.davidworlock.com/2012/01/manufacturingmotoringmediamadness/</link>
		<comments>http://www.davidworlock.com/2012/01/manufacturingmotoringmediamadness/#comments</comments>
		<pubDate>Thu, 26 Jan 2012 22:32:53 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[B2B]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Financial services]]></category>
		<category><![CDATA[healthcare]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[STM]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=1084</guid>
		<description><![CDATA[Here we sit, in a poor benighted island, slowly sinking into economic anonymity, in a great world where economic growth seems to be a property of lands we once called &#8220;under-developed&#8221;. A worthy come-uppance, and a suitable subject for Davos this week. Yet, as a persistent optimist, I somehow glimpse a glowing future for my [...]]]></description>
			<content:encoded><![CDATA[<p>Here we sit, in a poor benighted island, slowly sinking into economic anonymity, in a great world where economic growth seems to be a property of lands we once called &#8220;under-developed&#8221;. A worthy come-uppance, and a suitable subject for Davos this week. Yet, as a persistent optimist, I somehow glimpse a glowing future for my children&#8217;s children. Information services and solutions lie close to the heart of developmental growth, and I have written here repeatedly (too often for some readers!) about the necessary connection between injecting data/content into workflow and the regeneration of a post-industrial economy. For some reason the information industry has its eyes fixed on pure information usage (sometimes called &#8220;research&#8221;). In some areas, though &#8211; credit rating, risk management, automated financial trading systems, scientific research &#8211; we have come out of the bunker and begun to look at the way applied intelligence, often now derived from Big Data and analytics, can change the way that we view the operational logic of whole sectors of commercial and industrial life.</p>
<p>Now, let&#8217;s pull back a step further and see how information services change networked industry and society at large. I only have space for two examples. The first was driven home to me on Monday at a dinner given by the Real Time Club. The speaker, Dr Siavash Mahdavi (<a href="http://en.wikipedia.org/wiki/Siavash_Haroun_Mahdavi">http://en.wikipedia.org/wiki/Siavash_Haroun_Mahdavi</a>), spoke on 3D printing, and by the time he had finished, and we had examined printed hip joints and shoe inserts amongst other examples, the penny was beginning to drop for me. We are moving in the network from manufacturing by extrusion processes through moulds, the industrial revolution pre-digital world, to additive manufacturing, creating products in software and instructing printing devices to build them in extremely thin 2D layers one on top of the other until the desired shapes and structures are created. Medical implants have had the publicity here, but gold jewellery was mentioned as an application. This is a design-intensive, network-efficient manufacturing world in which design and the actual printer can be in totally different places. Printing can take place using any materials which can be chemically adapted to the process. Customization (the running shoe insert designed for the imprint and weight distribution of your own foot) and personalisation are at the centre of this. Every product can be made for you. However, it remains a requirement that everything we know about the performance, qualities and expectations of an artificial hip is brought to bear in the network upon the design process, as the information services world creates the bullets for manufacturing workflow to fire.
And all this is going strong now: the lead engineering player in 3D printing in the UK is Renishaw (<a href="http://www.renishaw.com/en/additive-manufacturing-news--15505">http://www.renishaw.com/en/additive-manufacturing-news--15505</a>) (and with eerie coincidence it announced today a strong trading year, with sales up 11%).</p>
<p>If this is not bizarre enough, I stumbled upon a Google story this week about automated motoring. Apparently Google&#8217;s own patented technology had racked up 200,000 autonomous driverless miles by the end of last year. This may just be another Google enthusiasm which runs out of steam, but it does have a history (<a href="http://en.wikipedia.org/wiki/Autonomous_car">http://en.wikipedia.org/wiki/Autonomous_car</a>), and a great deal of real research, and my bet is that it will happen in this over-crowded isle a lot quicker than the UK estimate of 2056. Extending the network to our over-populated motorways may be the only way to squeeze more capacity from infrastructure we do not have the space to rebuild, and to control scarce parking resource. Driving my car to the motorway and then surrendering control to a system that governs inter-car distance and speed until I leave is a likely first stage. And as the car becomes part of the network, then its ability to intelligently appraise where it is, where it is going and how it is feeling becomes a natural extension of a world of autos which are already computers on wheels. Information service solutions will be vital to feed this activity: important players like ITOWorld (<a href="http://www.itoworld.com">www.itoworld.com</a>) already assemble critical geospatial data, matched at the vital micro level by services like Elgin (<a href="http://www.elgin.gov.uk/" target="_blank">http://www.elgin.gov.uk/</a>) who can tell you about every road repair in Britain. At the moment this is part of the world of local government and planning: tomorrow it will have to be part of the knowledge base of your motor car.</p>
<p>When I think about examples like these I become more and more convinced that the new world of information service knowledge and intelligence will be more important than the old one, patrolled by intermediaries like librarians, and governed by quite irrelevant business models like advertising. And here is a world where the use shapes the content, and where suppliers are involved in developing solutions for sectors or even individual companies. Here the information services and solutions players have forgotten whether they are &#8220;content&#8221; or &#8220;software&#8221; players, because it has no bearing on the end result, and they had to have both elements to play in any case.</p>
<p>So who will do this stuff well? Undoubtedly the Indians and the Chinese and the Brazilians amongst others. But in many ways this future vision levels out a lot of the inequalities of the old and new worlds. You do not need a great deal of cheap labour to compete here. Capital too will have a different importance if you can custom manufacture close to the point of use, and avoid shipping and warehousing. I quite fancy the chances of this old island: good with design, strong start-up culture, great software development skills, good financial services investment culture, strong presence in information and education markets globally. Or at least I would, if our politicians did not think that modernity was returning to the railway investment mode of Britain in the 1840s, or aping the French and the Japanese high speed trains of the 1960s and 1970s. The infrastructure requirement here would be to create the most intensive high bandwidth broadband coverage in Europe. Fat chance of that while politicians think there are more votes to be got by shaving 34 minutes off the journey time from London to Birmingham!</p>
<p>Some of my friends call this type of article &#8220;futuristic madness&#8221; (and that was the polite one!). But, to me, the real madness lies in taking the formats of the Gutenberg age (books, newspapers etc), carefully wrapping them in software and delivering them in facsimile form across the network &#8211; and then calling these eFormats Innovation!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2012/01/manufacturingmotoringmediamadness/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Workflow from the Bottom Up</title>
		<link>http://www.davidworlock.com/2012/01/workflow-from-the-bottom-up/</link>
		<comments>http://www.davidworlock.com/2012/01/workflow-from-the-bottom-up/#comments</comments>
		<pubDate>Wed, 18 Jan 2012 20:46:09 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[Reed Elsevier]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[STM]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=1075</guid>
		<description><![CDATA[Trends and trending analysis are one thing; making an impact on the way people work is often quite another. So while I respectfully write up the huge progress being made to provide large scale tools for analytical discovery in unimaginable quantities of data, a small portion of me remains skeptical about the impact of these [...]]]></description>
			<content:encoded><![CDATA[<p>Trends and trending analysis are one thing; making an impact on the way people work is often quite another. So while I respectfully write up the huge progress being made to provide large scale tools for analytical discovery in unimaginable quantities of data, a small portion of me remains skeptical about the impact of these developments in the short term on the working lives of professionals. Look at researchers in science and technology: you can readily imagine the impact of Big Data on Big Pharma, but can you so easily imagine what this will mean in materials science? Or can you see how the workbench performance of the individual researcher in neuroscience might be impacted? It&#8217;s tough, and because it is tough we go back to saying that the traditional knowledge components will last the course. So if you have a good library, access to a reasonable collection of journals and the ability to network with colleagues then that is enough. Or Good Enough, as we keep saying.</p>
<p>So when I read the words &#8220;This is important not only for the supplementary data accompanying one&#8217;s experiment, but even negative results&#8221; I came alive immediately and read consciously what I had hitherto skipped. You see, in all the years that I have spoken with and interviewed researchers, when we get off the formal ground of OA or conventionally published articles, or the iniquities of publishers and the inadequacy of librarians, we get back to some stubborn issues that cling to the bottom of the bucket. One is what do you do with the remaining content derived from the research process which did not get into the article, where it was summarized and where conclusions were drawn from it. I mean the statistical findings, the raw computations, the observations and logs, the audio and video diaries, the discarded hypotheses etc. Vital stuff, if anyone is going to walk that way again. Even more vital is the detritus of failure: the experiment which never made a paper since it demonstrated what we already know, or where the model proved inadequate to demonstrate what we sought to show. Researchers going back to find why a generation of research went astray from a finding that proved fallible often need this content: in terms of detective fiction it is the cold case evidence. Yet more often than not it is not available.</p>
<p>So here is what I found in the nearly discarded press release. Nature Publishing&#8217;s Digital Science company (yes, them again!) have refinanced figshare (<a href="http://figshare.com">http://figshare.com</a>) and yesterday they relaunched it. What does it do? It archives all the stuff I have been talking about, providing a Cloud environment with unlimited public storage. They call it &#8220;a community-based open data platform for scientific research&#8221;. I call it a wonderful way of embedding research workflow into a researchable storage environment that eventually becomes a search magnet for researchers wanting to check the past for surprising correlations. At the moment it is just a utility, a safe place to put things. But if I just add a copy of the article itself then it becomes a record of a research process. Put hundreds of thousands of those together and then you have a Big Data playground. Use intelligent analytics and new insights can be derived, and science moves forward on the tessellation of previous experimentation &#8211; only quicker, with less effort and more productivity for the researcher. And much less is lost, including the evidence from the wrong turnings that turned out to be right turnings. (<a href="http://digital-science.com/press-releases/">http://digital-science.com/press-releases/</a>)</p>
<p>So will there be 20 of these? Well, there may be two, but if figshare gets an early lead perhaps there will only be one. After all, the reason researchers would come to value this storage would be having their content in close proximity to others in their field. And while early progress is likely to run quick in Life Sciences, this application has relevance in every field of study. And it also calls into question ideas of what &#8220;publishing&#8221; actually is. By storing and making available these data, are figshare &#8220;publishing&#8221; them? They are certainly not editing or curating them. Network access alters many things and here, once again, it catches publishing on the hop. If traditional publishers confine themselves to making margins solely from the first appearance of an article then traditional publishing in this sector is in severe difficulty, whatever happens to the Open Access debate. Elsevier and Nature clearly get it: go upstream in value terms or drown in commoditized content where you are. But does anyone else see it? And why not?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2012/01/workflow-from-the-bottom-up/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Take the Program to the Data</title>
		<link>http://www.davidworlock.com/2012/01/take-the-program-to-the-data/</link>
		<comments>http://www.davidworlock.com/2012/01/take-the-program-to-the-data/#comments</comments>
		<pubDate>Thu, 12 Jan 2012 20:24:07 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[B2B]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Financial services]]></category>
		<category><![CDATA[healthcare]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[Reed Elsevier]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=1058</guid>
		<description><![CDATA[It&#8217;s Big Data week, yet again. In the last two months we have seen all of the dramas and confusions attendant upon emerging markets, yet none of the emerging clarity which one might expect when a total sea change is taking place in the way in which we extract value from data content. Then this [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s Big Data week, yet again. In the last two months we have seen all of the dramas and confusions attendant upon emerging markets, yet none of the emerging clarity which one might expect when a total sea change is taking place in the way in which we extract value from data content. Then this week, with all the aplomb of an elephant determined not to be left behind in a world which has apparently decided that the hula hoop is the only route to sanity, Oracle announced its enterprise Big Data solution. Again. Only now it is called the Big Data Appliance. It started shipping on Tuesday. And the world will never be the same again.</p>
<p>At the heart of the Oracle launch is a Hadoop license. This baby elephant lies at the heart of almost everything. The two Hadoop-based commercializations have both raised finance in the lead-up to 2012: Cloudera ($40m) and Hortonworks ($20m), while other sector players like MapR who also exploit Hadoop found 2011 a really good time to raise money. And this had a radiating effect on the whole data handling sector. Neo4j, a database technology (Neo Technology, based in Malmo and Menlo Park) for graph storage and resolution, raised $10m in a round led by Fidelity. Meanwhile, Microsoft signed a deal with Hortonworks, IBM said it would launch Hadoop in the Cloud, EMC (Greenplum) went for MapR, Dell announced a Hadoop-based initiative, and the world waits and wonders what Hewlett Packard will do, now that it has Autonomy for analytics.</p>
<p>So now we have plenty of initiatives, and, as usual, not much idea of who the next generation of users will be. The first generation speak for themselves. We can see the benefits that Facebook derive from being able to use Hadoop-based tools to find connections and meanings in their content that would have been impossible to cost-effectively reveal in a prior age. And the same would be true of such unlikely bedfellows as the Department of Homeland Security, or Walmart, or Sony (think PlayStation Network), or the Israeli Defence Force, or the US insurance industry (via Lexis Risk), or Lexis Nexis (who announced a Big Data integration with MarkLogic), let alone the two players who effectively started all this: Yahoo! (Hadoop) and Google (MapReduce). So asking where it goes next is a legitimate question, but one which can only be answered if we accept that the next group of users are never going to recreate the Google server farms in order to break into these advantageous processing environments. The next group of intensive users will have their XML content on MarkLogic, or their graphical data on Neo4j. They will want to use the US census data remotely (so will contract with Amazon for process time on the Amazon web presence), and will use a large variety of third party content held in similar ways. Some of their own content will still be held locally on MySQL databases &#8211; like Facebook &#8211; while others will be working in part or fully in the Cloud, and combining that with their own NoSQL applications. But the essential point here is that no one will be building huge data warehousing operations governed by rigid and mechanistic filing structures. Literally, we are increasingly leaving the data where it is, and bringing the analytical software to it, in order to produce results that are independent of any single data source.</p>
<p>And this too produces another sort of revolution. The front door to working in this way is now the organizational software itself. When Lexis Risk announced at the end of last year that they were going to take HPCC open source, a number of critics saw that as turning their back on an exploitation opportunity. Yet it makes very real sense in the context of Oracle, Microsoft and IBM seeking to build their own &#8220;solutions&#8221;. Some businesses will want to run their own solutions, and will make a choice between open source Hadoop and open source HPCC. Others in systems integration will seek out open source environments to create unique propositions. But since it was always unlikely that Lexis Risk was going to challenge the enterprise software players in their own bailiwick, then open source is a way of getting a following, harvesting vital feedback, and earning not insignificant returns in servicing and upgrading users.</p>
<p>I am also delighted to see that other winners seem likely to be MarkLogic, since I have been proud of working with them and speaking at their meetings for a number of years. For publishers and information providers, it is now clear that XML remains the route forward. But MarkLogic 5 is clearly being positioned as the information service provider&#8217;s socket for plugging into the Big Data environment. Anyone who believes that scientists will NOT want to analyse all data in a segment, or engineers source all relevant briefs with their ancillary information, or lawyers cross examine all documentation regardless of location, or pharma companies examine research files in the context of contra-indications should stop reading now and take up fishing. My observation is that Big Data is like Due Diligence: once someone does it, even if the first results are not impressive, all competitors have to do it. The risk of not trying to find the indicative answer by the most advanced methods is too great to take.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2012/01/take-the-program-to-the-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Seven Pillars of Wisdom</title>
		<link>http://www.davidworlock.com/2012/01/seven-pillars-of-wisdom/</link>
		<comments>http://www.davidworlock.com/2012/01/seven-pillars-of-wisdom/#comments</comments>
		<pubDate>Thu, 05 Jan 2012 21:24:05 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[B2B]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[data protection]]></category>
		<category><![CDATA[eBook]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[eLearning]]></category>
		<category><![CDATA[healthcare]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[mobile content]]></category>
		<category><![CDATA[Pearson]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[Thomson]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=1045</guid>
		<description><![CDATA[My holiday reading, courtesy of Skip Pritchard who gave it to me, has been Michael Korda&#8217;s vast biography of T E Lawrence, and despite my familiarity with the story, I have found it an entrancing experience. Lawrence is almost impossible to reconstruct, since he shone a different light in the direction of every individual he [...]]]></description>
			<content:encoded><![CDATA[<p>My holiday reading, courtesy of Skip Pritchard who gave it to me, has been Michael Korda&#8217;s vast biography of T E Lawrence, and despite my familiarity with the story, I have found it an entrancing experience. Lawrence is almost impossible to reconstruct, since he shone a different light in the direction of every individual he met, and one is left feeling that nowhere does a real Lawrence exist. So very like the information game, then! Every observer sees a different fraction of play, and no one can predict the outcome. This comment is meant to mask my residual guilt at reading my book while my knee mended and not writing pages of forecasts and predictions for the amusement of readers, and to confirm my frailties as a prophet of anything.</p>
<p>Lawrence wrote &#8220;The Seven Pillars of Wisdom&#8221;, one of the world&#8217;s unread classics (and almost unreadable in parts: he lost the only copy of the full manuscript on Reading train station and had to recreate 200,000 words, during which he clearly became bored.) In 800 words I can communicate seven thoughts &#8211; not so much Pillars  as pillows, and not predictions but observations of this unknowable industry. Here goes:</p>
<p>1. Some think it is about content and others that it is about platforms and technology. For me it is still about communications, and the greatest challenge is still holding people&#8217;s attention, having gained their recognition. Even Facebook hits a plateau. The gods remain Reputation, Identity, and Attention.</p>
<p>2. You are either a communication company or you are not. News Corp is a format company. It does newspapers, film and television and has little corporate bandwidth for non-format communications. This cannot be changed by executive whim, and the collapse of Beyond Oblivion, its music initiative, before the holidays (<a href="http://www.guardian.co.uk/technology/2012/jan/04/music-service-beyond-oblivion-folds">http://www.guardian.co.uk/technology/2012/jan/04/music-service-beyond-oblivion-folds</a>), as well as the veil of silence around the performance of The Daily on the iPad, following on as they do the oblivion that was My Space, demonstrates all of this very well. Yet Mr Murdoch has signed on to Twitter. There is no evidence yet that the world can be saved with a single Tweet. There is no evidence yet that traditional media and information businesses can recreate themselves in new marketplaces without either starting afresh somewhere else  or by buying a new business and moving into it. Boinc.</p>
<p>3. Apple, according to MacRumors (<a href="http://www.macrumors.com/2012/01/03/apples-january-media-event-to-involve-digital-textbooks-and-education/">http://www.macrumors.com/2012/01/03/apples-january-media-event-to-involve-digital-textbooks-and-education/</a>), is about to enter the textbook market, maybe with Pearson and certainly via the iPad. This was apparently a dearly held dream of Steve Jobs, at least according to Walter Isaacson, who is shaping up to be not just the biographer but also the Delphic oracle. I have some doubts &#8211; not about the iPad as a display device, but about whether markets want textbooks re-invented. Learners would like learning re-invented, and made easier and more compelling. Textbooks are an extinct format. And learning should operate equally well on whatever platform you have available. What a waste of all this energy around eLearning if we abolish the old formats like textbooks and replace them with rigid device platforms. And yet I am sure that the analysts are right &#8211; there are only a few global growth markets and education is the largest.</p>
<p>4. Then I had a great comment from Brad Patterson at EduLang (<a href="http://www.edulang.com">www.edulang.com</a>). He points out that 500 million people are trying to learn English and only 50 million can afford textbooks, online or otherwise. So his business model for his interesting TOEFL and TOEIC Simulators is &#8220;pay what you can&#8221;, with half going to a reading charity. In many ways this is very neat &#8211; it reaches out to 450 million people with a trust relationship, and could be a really interesting business model to watch. Above all, how encouraging it is to see someone moving the goalposts &#8211; we did not score many goals in regular business model configurations so let&#8217;s applaud the courage of someone doing something different.</p>
<p>5. Semantic Web technology and deployment in mass markets is getting closer and closer. I took part in the beta of Garlik (<a href="http://www.garlik.com">www.garlik.com</a>) some 3 years ago, partly because of an interest in technology around identity, and partly out of interest in technologies derived from the University of Southampton Computer Science department, and blessed by such eminences as Wendy Hall, Nigel Shadbolt &#8211; and Sir Tim Berners-Lee himself. Two days before Christmas Garlik was sold to Experian, in a move that I think was as significant as Reuters buying ClearForest all those years ago. Garlik protects personal identity through web search, was founded by the men who built the UK online banks Egg and First Direct, and backed by Doughty Hanson. This is a straw in a wind which will go gale force.</p>
<p>6. But if the Semantic Web is going to be so clever, and linked data will recreate so many service environments, where is it now? Well, look at the obvious places. In most of our economies building and construction is the largest sector in terms of activity and players, large and small, and has great companies serving it with supplier and materials information. Thus, in a US market replete with Reed Construction, Hanley Wood and McGraw-Hill. But what if a semantic web-based environment were able to search all online catalogues and directories to produce a sweeping coverage of suppliers and products that was at once more detailed and more comprehensive than any directory-style database, and could include more metadata from suppliers and users to create a continually developing industry specification site, deliverable and self-formatting to every platform and device? That is what interests me about MaterialSource, (<a href="http://www.materialsource.com/about">http://www.materialsource.com/about</a>) as well as its use of SPARQL, Semantic Web Pages for faceted and graph-based browsing, smartphone and tablet Apps using HTML5, ontologies etc, etc. If they do it, someone will have to buy them!</p>
<p>7. I keep on thinking about the neglect of audio, so I was delighted to see SoundCloud (<a href="http://soundcloud.com/">http://soundcloud.com/</a>). There has to be room for an audio portal, and a community for sharing sound and cross-referencing its sources and users. I anticipate that they know things about users that Beyond Oblivion didn&#8217;t.</p>
<p>Last words of a predictive nature before I get back to real work. A correspondent asks &#8220;what technology are you following in 2012!&#8221; Since I say every week that I am not following technologies but users, I take mild offense at this, but I do admit to a penchant for 3D printing. Now that really could have an impact. Especially in medical workflow. I have also been asked by a venture capitalist who should know better what is likely &#8220;to be certain&#8221; to succeed this year. He is a serious man so I owe him a serious answer: anything that saves more time and money than it costs. The prime example this year in the UK has been Shutl, a delivery logistics service that gets your online purchases to you physically (average delivery time in London was 90 minutes, with a cost of £5). Is that all the queries? I am beginning to feel like an Agony Aunt!</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2012/01/seven-pillars-of-wisdom/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Science is a Network</title>
		<link>http://www.davidworlock.com/2011/11/science-is-a-network/</link>
		<comments>http://www.davidworlock.com/2011/11/science-is-a-network/#comments</comments>
		<pubDate>Sun, 27 Nov 2011 19:14:45 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[B2B]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[eLearning]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=994</guid>
		<description><![CDATA[The working lives of scientists are of greater interest today than at any time in human history. They seem, by closing the time gap between speculation  and remediation, to have completely changed roles in society. The person in the white lab coat is no longer obtuse, threatening or just eccentric &#8211; the scientist will now, with [...]]]></description>
			<content:encoded><![CDATA[<p>The working lives of scientists are of greater interest today than at any time in human history. They seem, by closing the time gap between speculation  and remediation, to have completely changed roles in society. The person in the white lab coat is no longer obtuse, threatening or just eccentric &#8211; the scientist will now, with a wave of his network, solve global warming, feed the unfed and cure us all of the illnesses we have yet to contract. The other day I was sent a fascinating article on Open Science by a researcher and software developer plainly angry that &#8220;Open Science&#8221; is getting such a popular exposure (<a href="http://gigaom.com/2011/10/31/why-the-world-of-scientific-research-needs-to-be-disrupted/">http://gigaom.com/2011/10/31/why-the-world-of-scientific-research-needs-to-be-disrupted/</a>) while the normal benefits of regularly networked science are being ignored. And it gets one thinking, because it raises a set of issues about the relationships of professionals and their lives in networked societies that has real consequences for all of us.</p>
<p>After I read the above note I then read Jack Stilgoe&#8217;s review of Michael Nielsen&#8217;s book in the Guardian (26.11.2011). While I have yet to read the book, my head is already in the debate in a micro-sample of three views and you, if indeed you are, make up a fourth. Whether you pass your views on to others or not, we are participating in a rapid sharing process which must have effects of its own on communication. If we were scientists and practising what Michael Nielsen preaches we would be sharing our thinking, and our results, in very much the same way, standing aside from the competitive sides of our nature to create progress by collaboration within the network. Question: when we say that living in a networked society will cause all sorts of changes to the way we communicate and act, do we mean that these will be changes for the better in our fundamental characteristics as people? Dear Reader, are you an optimist about the improvability of mankind through communication &#8211; in which case Facebook may be the saviour of the race? Or, do you believe, like some philosophers of evolution, that the changes that occur will be random mutations, from which some, over time, will become built into the prevalent response mode of network users?</p>
<p>This week I have been thinking a great deal about teachers as well as scientists. Teachers now accept the potential gains from sharing content in a way which would have been anathema to their predecessors. We now have approaching (early next year) 2 million teachers from all over the world sharing their own treasured and successful routines with each other on TES Connect (<a href="http://www.tes.co.uk/teaching-resources/">http://www.tes.co.uk/teaching-resources/</a>). This is a huge demonstration of altruism, and a strong desire to be recognized by peers. In appealing to his fellow scientists to adopt Open Science, Michael Nielsen seeks that same altruism, and argues well for the effectiveness of collaboration, but he is doing so in a context where peer recognition is baked into the way scientists report and publish. Of itself, the network will not change that, and all players (scholars, publishers, scholarly societies and librarians) have colluded willingly with the transfer of the networking of the paper-based world into the digital network with great enthusiasm.</p>
<p>So is there no effective collaborative science? Certainly there is. A very good example which I seem to have been writing about for a decade is Signalling Gateway (<a href="http://www.signaling-gateway.org/">http://www.signaling-gateway.org/</a>), where users greatly appreciate the need to share results &#8211; and analytical techniques and tools &#8211; in a very rapid time frame, but where participant research teams seem to retain identity (and probably funding sources). Nothing is more competitive in research than access to the money. Yet collaboration is present, and in neuroscience, or the Polymath mathematics project, or in the human genome research programme, there are good examples of collaborative success and altruistic sharing. So, if you think this is a desirable outcome, how do you break down the conservatism of scientists?</p>
<p>Much as you break down the conservatism of teachers, I imagine. You help them to create local, team or institution-based networking which returns real rewards in terms of workflow and productivity. Just as the school budget and timetable system, and resource sharing amongst a community of schools to raise standards through shared content have made a real impression on how schools run and teachers teach (I was impressed this week to see that every US state has now adopted iSchool standards which allow for virtual education systems) so I know that as research teams build better internal network usage and more effective control of content, then the confidence required for Michael Nielsen&#8217;s wider aims will emerge. So hopefully no government will start flinging funds at Open Science: it would be better spent mandating network compliance on the use of lab chemicals and ensuring that networked analytics were available to ensure that what is known to the network at present can be shared by all participants in the network.</p>
<p>And these are thoughts for publishers and information providers too. We are often faced with a radical urge to change emanating from the top of a deeply conservative community of users. Our task, surely, is to work on the infrastructure and let the profession in question take care of the timing. This can be hugely frustrating, but like Michael Nielsen, we too cannot force a model of change on marketplaces.</p>
<p>Michael Nielsen&#8217;s book is &#8220;Reinventing Discovery: the new era of networked science&#8221; (Princeton University Press). I note with pleasure that it was sponsored by George Soros, a man who has done more good than most on this planet, and whose belief in Sir Karl Popper&#8217;s Open Society theories, ingested from the great man himself at LSE, has been a lifelong inspiration. But every change has a precursor, and putting Open in front of something does not change anything. A recent Washington Post article on Virtual Schools was contributed by my best reader/editor: <a href="http://www.washingtonpost.com/local/education/virtual-schools-are-multiplying-but-some-question-their-educational-value/2011/11/22/gIQANUzkzN_story.html?wprss">http://www.washingtonpost.com/local/education/virtual-schools-are-multiplying-but-some-question-their-educational-value/2011/11/22/gIQANUzkzN_story.html?wprss</a>=</p>
<p>&nbsp;</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2011/11/science-is-a-network/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Voice is Another Country</title>
		<link>http://www.davidworlock.com/2011/11/voice-is-another-country/</link>
		<comments>http://www.davidworlock.com/2011/11/voice-is-another-country/#comments</comments>
		<pubDate>Sat, 19 Nov 2011 18:13:52 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[B2B]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[eBook]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[eLearning]]></category>
		<category><![CDATA[Financial services]]></category>
		<category><![CDATA[healthcare]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[mobile content]]></category>
		<category><![CDATA[news media]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=982</guid>
		<description><![CDATA[It’s obvious, isn’t it? Any voice application is bound to be a winner. We all love being spoken to in leisure or learning moments. What is the easiest way in which to absorb information? Have it spoken to you. From the audio book to the sat nav machine, voice works. As humans, we can project [...]]]></description>
			<content:encoded><![CDATA[<p>It’s obvious, isn’t it? Any voice application is bound to be a winner. We all love being spoken to in leisure or learning moments. What is the easiest way in which to absorb information? Have it spoken to you. From the audio book to the sat nav machine, voice works. As humans, we can project so much onto a voice. Its “colour” gives instant clues, and even the road directions to Southend-on-Sea can become injected with implied threat or promise. And hearing things is restful, even absorbing. Having a novel read in one ear can be superbly engrossing, and while there is always the risk of being alienated by the reader’s interpretation, chances are that the audio book will be the way we “see” that text, once we have heard it, for ever. I have an old record of T S Eliot reading The Waste Land which I can no longer play because I have no form of media that will play it. So I naturally became an early user of the App, which has 9 versions of the poem being read, including the poet himself. Most of them are far better, but because I heard it first, when I read the poem aloud myself, I find that I use the poet’s cadence and timing. In other words, voice imprints and can be unforgettable.</p>
<p>Which brings me to Siri. The Apple iPhone voice App has now had three months of shrill publicity (<a href="http://www.transhumanistic.com/2011/10/new-iphone%E2%80%99s-killer-app-%E2%80%93-voice-controlled-personal-assistant/" target="_blank">http://www.transhumanistic.com/2011/10/new-iphone%E2%80%99s-killer-app-%E2%80%93-voice-controlled-personal-assistant/</a>) and (<a href="http://www.youtube.com/watch?v=3uo5CUgEYKI&amp;noredirect=1" target="_blank">http://www.youtube.com/watch?v=3uo5CUgEYKI&amp;noredirect=1</a>).</p>
<p>Given its ability with natural language searching, which gives it a degree of “intelligence”, reviewers think this should be a winner, and I agree on one level. On another I have some reservations, and these are largely concerned with our apparent inability to position and market voice services effectively.</p>
<p>Twenty years ago a senior executive at Random House told me that I was wasting my time with “Multimedia”, which was what we were then working on for CD-ROM. All the market wanted, he said, were good audio readings to play in the car on long distance travel, and he introduced me to his bright young manager who was providing just that. That manager told me two things that have stuck with me: one was the now obvious reflection that publishers were rubbish at marketing anything at all, and this would never change since they believed that they could sell anything. The second was that voice markets appeared to him to be finite: you quickly reached the voice susceptible segment, then growth got very hard. It is a thought that comes back as even Barnes and Noble discover digital (<a href="http://www.publishersweekly.com/pw/by-topic/industry-news/bookselling/article/49567-barnes--noble-sees-bright-future-in-digital.html" target="_blank">http://www.publishersweekly.com/pw/by-topic/industry-news/bookselling/article/49567-barnes--noble-sees-bright-future-in-digital.html</a>). And who would have thought that would happen!</p>
<p>My young friend of then is now the manager of an important media venture fund, so I will preserve his anonymity. And I do not want to argue that eBook or digital versioning is similarly finite. But I do want to suggest that voice is a vital component of the network and thus of digital service provision, that we grossly neglect its impact in product and service development, and that but for two unfortunate voice misuse environments we would be using a great deal more in more intelligent environments. I am told for example that voice search is now a really easy application to roll out in many service contexts. However, the reason given for its relatively modest showing is the prevalence of hugely annoying telephone voice menu systems, which daily have reasonable people howling in frustration. Having discovered a rare four tier example this week in a hospital group, I am tempted to initiate an award scheme for organizations who employ human beings to answer the phone. The second is automated public service messaging in airports and elsewhere, but in terms of both the problem is not voice, but marketing. I even encountered an airport lounge in my October travels which announced, every five minutes, that no flight departure announcements would be made and that passengers should consult the information screens!</p>
<p>For all of these reasons the future of voice is vital. Siri may point the direction towards intelligent guidance, but completely voice-directed computing has been feasible for a long time and must be a part of the five year scenario. And you do not need to have a Babelfish in your ear to believe in voice/language text translation, which the network is begging for in countless sectors and which is increasingly feasible at a basic level. Slowly we will edit out poor voice practices and it will become as rare for web environments to lack audio components as it is for them now to lack video activity. I have had the pleasure recently to work with a group in Dublin who are creating virtual environments to help students pass tests in proficiency in spoken languages. There is an early example at <a href="http://www.examspeak.com" target="_blank">http://www.examspeak.com</a> but there is much more to come. The network is the ideal environment for voice-based training, language learning and virtual voice service development. Eventually the digital communications revolution will come full circle and re-integrate voice as the critical element in networked communications that it always has been, and we shall wonder why this component took so long to fall into place.</p>
<p>And then, we shall call the health insurer through the network and hear his computer say, “Forget all those options and numbers – tell me how I can help&#8221;!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2011/11/voice-is-another-country/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>I can see so clearly now&#8230;</title>
		<link>http://www.davidworlock.com/2011/10/i-can-see-so-clearly-now/</link>
		<comments>http://www.davidworlock.com/2011/10/i-can-see-so-clearly-now/#comments</comments>
		<pubDate>Thu, 13 Oct 2011 21:36:47 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[eBook]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[Reed Elsevier]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[STM]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=903</guid>
		<description><![CDATA[In case anyone has doubts, this is a continuing stream of (un)consciousness arising from my earlier Dogpatch thoughts about innovation and STM. And, of course, in my enthusiasm for the new, I neglected some of the &#8220;slightly older but just as valid&#8221; new. Thanks everyone for reminding me of this. We shall go there anon, but I [...]]]></description>
			<content:encoded><![CDATA[<p>In case anyone has doubts, this is a continuing stream of (un)consciousness arising from my earlier Dogpatch thoughts about innovation and STM. And, of course, in my enthusiasm for the new, I neglected some of the &#8220;slightly older but just as valid&#8221; new. Thanks everyone for reminding me of this. We shall go there anon, but I wanted to start at the STM Association dinner the night before the events described in my last blog. There I had the pleasure of sitting next to Rhonda Oliver, now running publishing at the Royal College of Nursing, but doing so after leaving Portland Press, where she was CEO. And it was at Portland Press, a distinguished but not yet world dominant player in biochemistry publishing, that I first learnt of really interesting forays into the world of semantic-based publishing. Here is what I wrote about them in this blog last year:</p>
<p>&#8220;Particularly noteworthy was a talk by Professor Terri Attwood and Dr Steve Pettifer from the University of Manchester (how good to see a biochemistry informatician and a computer scientist sharing the same platform!). They spoke about Utopia Documents, a next generation document reader developed for the Biochemical Journal which identifies features in PDFs and semantically annotates them, seamlessly connecting documents to online data. All of a sudden we are emerging onto the semantic web stage with very practical and pragmatic demonstrations of the virtues of Linked Data. The message was very clear: go home and mark-up everything you have, for no one now knows what content will need to link to what in a web of increasing linkage universality and complexity. At the very least every one who considers themselves a publisher, and especially a science publisher, should read the review article by Attwood, Pettifer and their colleagues in Biochemical Journal (Calling International Rescue: Knowledge Lost in the Literature and information Landslide  <a href="http://www.biochemj.org/bj/424/0317/bj4240317.htm">http://www.biochemj.org/bj/424/0317/bj4240317.htm</a>). Incidentally, they cite Amos Bairoch and his reflections on Annotation in Nature Precedings (<a href="http://precedings.nature.com/documents/3092/version/1">http://precedings.nature.com/documents/3092/version/1</a>) and this is hugely useful if you can generalize from the problems of biocuration to the chaos that each of us faces in our own domains.&#8221;</p>
<p>And the reference to Steve Pettifer recalled to mind my old friend Jan Velterop, once agent-provocateur in Springer&#8217;s thrust into OA (how grateful they should be to him now, given that his work drew them alongside BMC, and thus to real growth in this year of OA and eBooks compensating for negative trends elsewhere). Dr Pettifer advises Utopia Documents (<a href="http://getutopia.com">http://getutopia.com</a>), who have been developing in parallel to Labiva and Mendeley in the workflow space for PDFs. Each is different, though they have common attributes. The fact that there are now three environments in this space is a strength for all of them. Isolated good ideas rarely work out. Constantly re-iterated solutions &#8220;invented&#8221; separately in several places show a sector responding to the same calls from many customers &#8211; &#8220;Help me out of here &#8211; I am losing control!&#8221;.</p>
<p>Utopia Documents is also running a public trial on Elsevier&#8217;s SciVerse environment. This is critical, and prompts a question: if Nature and Elsevier see this, why doesn&#8217;t everyone else? And I think this may be in part because we have been confusing the workflow utility of PDF handling with the strange world of scientific networking. In one of the many frank and helpful comments made by Annette Thomas in the interview I referred to earlier this week, she remarked that much of what Nature had done to &#8220;create&#8221; networking between scientists had shown very modest results. She said that while scientists showed a modest appetite for networking via news and blog comments, she thought that Nature Networks did not succeed because they lacked the immediacy and involvement of workflow tools, and it was more likely that in this context real contact between self-formed interest groups would take place. Here she seems to be moving closer to the Mendeley (<a href="http://www.mendeley.com">www.mendeley.com</a>) position, but with a qualification. She clearly feels that you build the utilities first, and then see how interest groups develop their own dynamic using the shared information created by the toolset. Crowd-sourcing a la Mendeley is good, but self determination may be better.</p>
<p>Thinking about Portland Press and Jan Velterop also took me back to Jan&#8217;s company, Academic Concept Knowledge Ltd (AQnowledge &#8211; <a href="http://aqnowledge.tumblr.com">http://aqnowledge.tumblr.com</a>). The semantic search environment created here is now embedded in Utopia Documents. But this is not what strikes me most emphatically about Jan&#8217;s work in recent years. Here is a hugely experienced academic research publisher who is not format bound and can think beyond the book, the journal, and even the article. Integrating antibodies-online.com, with its 300,000 antibodies and related products for concept matching shows that he and his team are creating a small player with an eye for data and for what research workflow really entails. By putting together all of the laboratory supply sources and the raft of descriptive material that they generate AQnowledge may be doing more for using article stores as a live element in workflow than any of their peers. Yet it has taken a company like BioRAFT  (<a href="http://www.bioraft.com">www.bioraft.com</a>) to push this home with compliance information, demonstrating once again that we are in the sectoral tools age of workflow, unable as yet to envisage the full desktop of tools and utilities, or the way they link together, or indeed the Electronic Lab Manual to which they in all probability lead.</p>
<p>Finally, STM now has major players &#8211; think of MarkLogic, TEMIS and SilverChair to name but three &#8211; quite capable of deploying the technology to drive towards the Big Data vision which I referenced in my previous piece. So, with all of this in the wings, why do the publishers still want to pursue the parochial and eschew the visionary?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2011/10/i-can-see-so-clearly-now/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The Road to Dogpatch Labs</title>
		<link>http://www.davidworlock.com/2011/10/the-road-to-dogpatch-labs/</link>
		<comments>http://www.davidworlock.com/2011/10/the-road-to-dogpatch-labs/#comments</comments>
		<pubDate>Tue, 11 Oct 2011 20:21:17 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[eBook]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[mobile content]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[STM]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=897</guid>
		<description><![CDATA[This week is Frankfurt, and thus the pleasure of interviewing Annette Thomas, Macmillan CEO on the STM conference agenda, traditional forerunner of the Frankfurt Book Fair. And I find a hint of nostalgia in the conference programme which precedes our event. It has a traditional flavour. For whenever STM publishers sit down to discuss the [...]]]></description>
			<content:encoded><![CDATA[<p>This week is Frankfurt, and thus the pleasure of interviewing Annette Thomas, Macmillan CEO, on the STM conference agenda, traditional forerunner of the Frankfurt Book Fair. And I find a hint of nostalgia in the conference programme which precedes our event. It has a traditional flavour. For whenever STM publishers sit down to discuss the twin evils of Open Access and Peer Review (or those who slight it) they do so with a lip-smacking relish which is more akin to tucking into Christmas turkey than a logical discussion of the issues facing scholarly communication. Indeed I sometimes wonder if &#8220;science publishing&#8221; has gone off on its own, leaving &#8220;scholarly communication&#8221; to the scholars.</p>
<p>Let me try to illustrate what I mean. The looming crisis in STM, in my warped view, is the data crisis. In every other sector it is rapidly becoming clear that increasingly sophisticated data mining and extraction techniques will come into play as users seek to extract new meaning from existing files, and further discovery as they cross search those files with currently unstructured content held elsewhere. STM, it seems to me, is peculiarly susceptible to this Big Data syndrome, for behind the proprietary content stores of perfectly preserved published research articles &#8220;owned&#8221; by publishers lies the terra incognita of research data and findings held in labs and on research networks. Future scholars will want to search everything together, and will be impatient with barriers which prevent this. Once the tools and utilities which comprise research workflow become generally available and the techniques and value of semantic searching lock into this, the urge becomes irresistible, and scholarly article data gets versioned, commoditized, &#8220;outed&#8221;. It does not really matter if it is located on the open web, the closed web, or in the cloud or in a university repository.</p>
<p>The implications of this are vast. Scholars want to be published by prestigious branded journals as a way of being noted: they also want to be searched in the bloodstream of science. They will make sure they are everywhere, and that their data is where it needs to be as well. The metadata may note that this article was Gold OA and that one was published by Science, but this may be of most interest to the filtering interface in the workflow environment, which uses the information to rank or value results. And there is a finding from 25 years ago which continues to haunt me in STM, which alleges that most searches are performed not to find claims or results, but to discover, check and compare experimental methodologies and techniques. In a world where regulation and compliance grow ever more powerful, this is unlikely to diminish.</p>
<p>So I have come to feel that Open Access (one participant asked me what market share it would eventually have, and was appalled when I said 15% &#8211; before it becomes wholly irrelevant) and Peer Review (increasingly all research validation exercises will be multi-metric, so even the traditional argument collapses) are more about the preservation of publishers than the future of scholarly communication. Not that I object to that preservation, but I really did sit up as Annette Thomas, in her interview, began to describe some of the game changing activity that Digital Science, child of Nature, is doing as an investor in a variety of workflow-enhancing technologies built by bench researchers for themselves (<a href="http://digital-science.com/products" target="_blank">http://digital-science.com/products</a>).</p>
<p>And in particular the announcement, made during the session, that Labtiva, a Digital Science investment at Harvard (sited in Dogpatch Labs) was launching ReadCube as an App (<a href="http://www.readCube.com" target="_blank">http://www.readCube.com</a>). If anything bespeaks workflow then it is the App. And what does this one do? It allows researchers to order their current world of articles as a personal content library, free and Cloud-based, with features like a filing system for PDFs, fast download from a university or institutional login, the ability to save and re-read annotations, cite and create references, and a personalised recommendation service. In other words, a smart App, worthy of the world of iPad, which solves the distressing everyday issues of finding what you once downloaded and recalling what you once thought about it, and finding more of the same. What could be more simple? But in simplicity like this there is a form of beauty. An App is definable as a workload tool which takes clumsy pieces of multi-stage routine out of daily interactions with work &#8211; and makes sure you do not have to remember next time the cumbersome process you had to perform to do that.</p>
<p>So, whatever the introspective mood in the room, here is one publisher setting off on the migration to new values, determinedly seeking the pain points in the researchers&#8217; working life and seeking to solve them. And indeed, other publishers (including Elsevier with their SciVerse and SciVal developments) are heading in the same direction. Yet the contrast between this and the generality of players in the sector is profound. At one point in the meeting I found myself in a discussion about what was going right with STM in a difficult marketplace dependent on government finance. Well, said one very knowledgeable source, we are doing a great deal with eBooks, selling them into places we never thought we would reach. Enhanced with video or audio? No, just reversioning of text. And library subscriptions are holding up really quite well, said another, and the market seems to have been able to absorb some limited price increases. And so I took away a picture of a sector holding its breath and hoping that things would revert to normal, and traditional business models would prevail. But we all knew in our hearts that when &#8220;normal&#8221; came back it would be different. Postponing the trek down the road to Dogpatch Labs only loses first mover advantage, the experience born of re-iteration, and ensures that it will be more difficult to change successfully in the long term.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2011/10/the-road-to-dogpatch-labs/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Dog Days in the Data Mine</title>
		<link>http://www.davidworlock.com/2011/09/dog-days-in-the-data-mine/</link>
		<comments>http://www.davidworlock.com/2011/09/dog-days-in-the-data-mine/#comments</comments>
		<pubDate>Thu, 22 Sep 2011 18:17:26 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[B2B]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Financial services]]></category>
		<category><![CDATA[healthcare]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[Reed Elsevier]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[STM]]></category>
		<category><![CDATA[Thomson]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=873</guid>
		<description><![CDATA[It reminds one superficially of mineral extraction. Who owns the seam of diamonds &#8211; the miner or the landowner? When rights are not clear or landownership in dispute? But this business of text or data mining is not really like that at all, and I was reminded this week by blogging contributions from two old friends [...]]]></description>
			<content:encoded><![CDATA[<p>It reminds one superficially of mineral extraction. Who owns the seam of diamonds &#8211; the miner or the landowner? When rights are not clear or landownership in dispute? But this business of text or data mining is not really like that at all, and I was reminded this week by blogging contributions from two old friends that who owns the results of data extraction, from thousands or millions of unstructured files, where the data retrieved from individual datasets may be tiny (well within most fair usage provisions) but the contribution to the whole value may be huge, remains at issue. Play this in the context of Big Data and real questions emerge.</p>
<p>Let&#8217;s go back to the beginning. Here are a couple of top-of-head examples of life on the planet that give a clue to what is worrying me:</p>
<p>* According to research quoted by the UK&#8217;s National Centre for Text Mining &#8220;fewer than 7.84% of scientific claims made in a full text article are reported in the abstract for that article&#8221;. This, they point out, makes cross-searching of articles using data mining and extraction techniques very important to science research. Fortunately the JISC organization, which licenses all journal article content from publishers on behalf of UK universities, permits researchers to data mine these files, and no doubt this was agreed with the publishers within the licence(?). But the question in my mind is this: who owns the product created by the data mining, and is this a new value which can be resold to someone else?</p>
<p>* Lexis Risk Management use many hundreds of public and private US data resources in their Big Data environment to profile people and companies. Both private and public data is researched, and, of course, it will often be the case that unique connections will be thrown up which encourage or discourage users from doing business with the data subject. Clearly Lexis own the result of the custom sweep of the data, and clearly it needs to be updated and amended over time as a result of fresh data becoming available, or more data being licensed into the mine. But do Lexis, or any other data extractor, own the result of the extraction process? They are able to sell a value derived from it, and that value emerges directly from the search activity and the weighting of the answers that they have accomplished. But do they own or need to own the content (which may be different in ten minutes&#8217; time when another search is done on the same subject)? And can the insurance company who buys that result as part of their risk management model resell the data content itself to a third party?</p>
<p>I have put up two examples because I do not wish to polarize the argument into publishers v government. The issue arises in the UK, as the media lawyer&#8217;s lawyer, Laurie Kaye, has pointed out, because the Hargreaves Review of copyright law recommends the retention of rights with the data miner &#8211; so you can make new products by recombining other people&#8217;s data. The UK government has adopted this recommendation with its usual emphatic &#8220;maybe&#8221;. Elsewhere in the world of August which I deserted to take a holiday, the UK government has come out with a storming approval of Open Data, and, as Shane O&#8217;Neill has repeatedly pointed out in his blogs, this contrasts sharply with the content retention policies pursued by UK civil servants, even now creating a Public Data Corporation in order to frustrate the political drive of its masters (how easily a licensing authority becomes a restricting body!).</p>
<p>There are two really troubling aspects of this to me. In the first instance we are not going to get the data revolution, the Berners-Lee dream of linked data, the creation of hybrid workflow content modelling, or the Big Data promise of new product and service development unless there is a primary assumption in our society that all Open Web content, and all government or taxpayer funded content, is available for data cross searching, unless there are national security considerations. And that it is a standard expectation for data leasing that discovery from multiple files creates new services for the person putting the intellectual effort into that discovery, and hopefully new wealth and employment in our society. If we simply continue to debate copyright as if it connotes the transfer of real world rights into the digital network then we shall constrain the major hope of intellectual property development this century.</p>
<p>And the second thing? Well, I am realist enough to know, after 20 years of lobbying this point, that it is unreasonable to expect the UK government to change its attitude to an information society in my lifetime. So maybe we can undermine these guardians of &#8220;my information is my power&#8221; by saying that we do not want their content &#8211; just the right to search it. After all if it is good enough for the universities and the progress of science, it should be good enough for Ordnance Survey and the Land Registry!</p>
<p><strong>References</strong></p>
<p>Making Open Data Real (<a href="http://www.data.gov.uk/opendataconsultation">www.data.gov.uk/opendataconsultation</a>)</p>
<p>The Public Data Corporation (<a href="http://discuss.bis.gov.uk/pdc/">http://discuss.bis.gov.uk/pdc/</a>)</p>
<p>Response to the Hargreaves Report (<a href="http://www.bis.gov.uk/assets/biscore/innovation/docs/g/11-1199-government-response-to-hargreaves-review">http://www.bis.gov.uk/assets/biscore/innovation/docs/g/11-1199-government-response-to-hargreaves-review</a>)</p>
<p>National Centre for Text Mining (<a href="http://www.bis.gov.uk/assets/biscore/innovation/docs/g/11-1199-government-response-to-hargreaves-review">http://www.bis.gov.uk/assets/biscore/innovation/docs/g/11-1199-government-response-to-hargreaves-review</a>)</p>
<p>Laurence Kaye (<a href="http://laurencekaye.typepad.com/">http://laurencekaye.typepad.com/</a>)</p>
<p>Shane O&#8217;Neill (<a href="http://www.shaneoneill.co.uk/">http://www.shaneoneill.co.uk/</a>)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2011/09/dog-days-in-the-data-mine/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

