<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>DavidWorlock.com</title>
	<atom:link href="http://www.davidworlock.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.davidworlock.com</link>
	<description></description>
	<lastBuildDate>Tue, 15 May 2012 19:15:04 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Decline and Fall of the Google Empire: Revisited</title>
		<link>http://www.davidworlock.com/2012/05/decline-and-fall-of-the-google-empire-revisited/</link>
		<comments>http://www.davidworlock.com/2012/05/decline-and-fall-of-the-google-empire-revisited/#comments</comments>
		<pubDate>Fri, 11 May 2012 18:13:41 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[B2B]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Financial services]]></category>
		<category><![CDATA[healthcare]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[mobile content]]></category>
		<category><![CDATA[news media]]></category>
		<category><![CDATA[online advertising]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[STM]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=1247</guid>
		<description><![CDATA[I have been waiting to write this post for four months. Ever since I wrote a piece with this title in January 2011 friends and colleagues have been asking &#8220;And now&#8230;?&#8221;, and this has intensified since Google&#8217;s results announcement in January 2012. 25% revenue growth? Breaking $10 billion revenue in a single quarter? In anyone else&#8217;s [...]]]></description>
			<content:encoded><![CDATA[<p>I have been waiting to write this post for four months. Ever since I wrote a piece with this title in January 2011 friends and colleagues have been asking &#8220;And now&#8230;?&#8221;, and this has intensified since Google&#8217;s results announcement in January 2012. 25% revenue growth? Breaking $10 billion revenue in a single quarter? In anyone else&#8217;s results statement this would have been sparkling news in a recession. Google&#8217;s shares dropped 10% on the news. And then the analysis. Cost-per-click &#8211; Google&#8217;s revenue from advertisers &#8211; fell 8% in the quarter, and the same amount in the previous quarter. This is a company still totally dependent on advertising. Imagine a newspaper company whose yield from classifieds fell 8% per quarter to see the wonderful way in which &#8220;velocity&#8221;, as Larry Page describes growth, disguises performance.</p>
<p>When I last wrote on this subject I was trying to describe an advertising-based search company that was trying to kick the habit and migrate elsewhere. Clearly Android, now on 250 million handsets, is the most obvious escape hatch. Analysts forecast that 2012 will see Android account for 12% of gross revenues, which demonstrates that migration is slow and old habits die hard. So if my grandchildren do not grow up thinking of Google as a phone company, as I suggested in the original blog, what will they think of the mature Google, shuffling along in the carpet-slippers of 10% growth? Well, they could imagine it as an operating system &#8211; Chrome is still growing strongly and Chrome OS has not been fully exploited. Or they could think of it as a social network environment: Google+ is now up to 90 million members, still a fraction of Facebook, but up from 40 million the previous quarter. Indeed, social networking may be a &#8220;must win&#8221;, or at least a &#8220;must compete strongly&#8221; environment for Google if the search-advertising market is to be prolonged long enough for these other options to emerge from under the strategy umbrella. With Google taking the axe to so many of its product development fields directly related to search, this requirement is exacerbated.</p>
<p>However, what really gets me writing this evening is the strong suspicion that Google themselves think that the answer is elsewhere. An interview with Ben Fried, the Google CIO, in the Wall Street Journal yesterday has him saying that the Cloud is reaching a tipping point (<a href="http://blogs.wsj.com/cio/2012/05/10/google-cio-ben-fried-says-cloud-tipping-point-is-at-hand/?mod=google_news_blog">http://blogs.wsj.com/cio/2012/05/10/google-cio-ben-fried-says-cloud-tipping-point-is-at-hand/?mod=google_news_blog</a>). Google clearly feel that Cloud computing, in the age of ubiquitous broadband (whenever that happens), will be their route to a business base in individual and small business sectors. As Google has used the Cloud to take costs out of its own core business, which given the comments above it has needed to do, so it can use its global data centre coverage to do the same for others. In this world, where we can fondly imagine two remotely sited workers watching each other&#8217;s real time edits on a document in Google Docs, small development teams can access a wide range of tools and pursue the sort of &#8220;fail fast&#8221;, constantly re-iterating, development strategies beloved of major corporates.</p>
<p>But this is a place where the competition is established, hot and strong, and despite Google&#8217;s history as a solutions developer, Apple and Microsoft go back further. iCloud, dependent on a syncing environment rather than the broadband, moves all the files to the Cloud, with users retaining copies and, as Steve Jobs is always quoted as saying, demoting &#8220;the PC to be just a device&#8221;. There is a different philosophy of Cloud here, but one that seems more based on now than when. And then again there is Amazon, inspired, as was Google, by the long struggle to use the Cloud to solve its own back office issues, now offering AWS as a solution in the very markets that Google thinks should be its own.</p>
<p>So it cannot be just the Cloud that Google see as their exit-from-advertising-dependence platform. But the Cloud and Big Data? This article&#8217;s timing is much influenced by the announcement of Google BigQuery, which, although semi-publicly trialled since December last year, was formally launched on 1 May (<a href="http://www.zdnet.com/blog/big-data/googles-bigquery-goes-public/405">http://www.zdnet.com/blog/big-data/googles-bigquery-goes-public/405</a>). Since it covers databases of up to two terabytes (seems big to me!), this has been described as a business intelligence tool by some commentators, who expected larger database environments from the inventors of MapReduce (working in petabytes), who kicked off this Big Data thing to begin with and are clearly working here as elsewhere from the &#8220;solve our own problems, then generalize to solve yours&#8221; standpoint indicated above. But here is a real irony: if you are working in a Big Data context much of what you will be looking for is indexed on Google, but not searchable in a Google Cloud context. Again, contrast Amazon, where they have now begun adding public databases to their Cloud offering, searchable in their EC2 (Elastic Compute Cloud) context. Here are some of the first offerings:</p>
<ul>
<li>&#8220;<a href="http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2315"><strong>Annotated Human Genome Data provided by <em>ENSEMBL</em></strong><br />
</a>The Ensembl project produces genome databases for human as well as almost 50 other species, and makes this information freely available.</li>
</ul>
<ul>
<li><a href="http://developer.amazonwebservices.com/connect/kbcategory.jspa?categoryID=248"><strong>Various US Census Databases from <em>The US Census Bureau</em></strong><br />
</a>United States demographic data from the 1980, 1990, and 2000 US Censuses, summary information about Business and Industry, and 2003-2006 Economic Household Profile Data.</li>
</ul>
<ul>
<li><a href="http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2283"><strong>UniGene provided by <em>the National Center for Biotechnology Information </em></strong><br />
</a>A set of transcript sequences of well-characterized genes and hundreds of thousands of expressed sequence tags (EST) that provide an organized view of the transcriptome.</li>
</ul>
<ul>
<li><a href="http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2320"><strong>Freebase Data Dump from <em>Freebase.com</em></strong><br />
</a>A data dump of all the current facts and assertions in the Freebase system. <a href="http://www.freebase.com">Freebase</a> is an open database of the world’s information, covering millions of topics in hundreds of categories. Drawing from large open data sets like Wikipedia, MusicBrainz, and the SEC archives, it contains structured information on many popular topics, including movies, music, people and locations – all reconciled and freely available.&#8221;</li>
</ul>
<p>In all, Google now face a struggle. As they move to a new service environment, we need to remember that they created the original company not by inventing search but improving it. Page ranking was a big step forward in its day and created a meteoric growth company. From this they built an Empire, now maturing. Edward Gibbon, commenting upon the fall of Rome and the rise of its rivals, marked a certain point of no return. &#8220;If all the barbarian conquerors had been annihilated in the same hour, their total destruction would not have restored the empire of the West: and if Rome still survived, she survived the loss of freedom, of virtue, and of honour.&#8221;</p>
<p>Is this where Google now is, and can its still youthful originators recreate it?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2012/05/decline-and-fall-of-the-google-empire-revisited/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fiddling with your Devices</title>
		<link>http://www.davidworlock.com/2012/05/fiddling-with-your-devices/</link>
		<comments>http://www.davidworlock.com/2012/05/fiddling-with-your-devices/#comments</comments>
		<pubDate>Tue, 08 May 2012 13:26:18 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[B2B]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[data protection]]></category>
		<category><![CDATA[eBook]]></category>
		<category><![CDATA[eLearning]]></category>
		<category><![CDATA[healthcare]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[mobile content]]></category>
		<category><![CDATA[news media]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=1234</guid>
		<description><![CDATA[In our shuffling ascending spiral motion up the great Tower of Time, we do denial at every turn of the stair. A year ago: &#8220;Devices are just display tech and will never replace real multi-functional office computing&#8221;. Today: &#8220;Everything goes to the Cloud&#8221;. Last year: &#8220;Everybody must build all the functionality into Apps&#8221;. Now: &#8220;Personalization will [...]]]></description>
			<content:encoded><![CDATA[<p>In our shuffling ascending spiral motion up the great Tower of Time, we do denial at every turn of the stair. A year ago: &#8220;Devices are just display tech and will never replace real multi-functional office computing&#8221;. Today: &#8220;Everything goes to the Cloud&#8221;. Last year: &#8220;Everybody must build all the functionality into Apps&#8221;. Now: &#8220;Personalization will overtake Apps before Apps take over publishing&#8221;. The result is familiar. Let me see if I can deepen the gloom and make the waters more muddy for a moment, knowing that our only hope of insight comes from bafflement and obscurity.</p>
<p>It seems to me for a start that publishers really do not like Apps. They are convenient, developers love them, they work at the subscription level, but as information products they are not very satisfying. Many of them lack the linkability which has now become a habit of mind for network users. They are certainly Workflow, and invaluable if you are buying a train ticket or booking an hotel. Elsewhere they are often Shortcuts to Nowhere. The statistics tell us that the vast majority of App downloads are never used twice. Since they are tied to devices and the formatting demanded by device manufacturers they do not meet the expectation that we encouraged the former print world to accept: go digital neutral and cover every channel of distribution. They work well for community and clubs, where they can act as a holding point for shared content and a jumping off point for discussion, but I am becoming so unsure of the hegemony of Devices that it is undermining my faith in Apps as well.</p>
<p>The last straw was a note on Pebble in the Guardian (8 May 2012). Pebble is a wristwatch lookalike device based on eInk and providing email and text access on Android and iPhone. (<a href="http://www.wired.co.uk/news/archive/2012-04/12/pebble-e-ink-smartwatch">http://www.wired.co.uk/news/archive/2012-04/12/pebble-e-ink-smartwatch</a>). This really hurts. I have been going round for years telling everyone that the reason my children do not seem to wear watches is that they are people of a modern age who work (albeit late) on network time. But despite the founders of Pebble raising £5 million in funding through product pre-purchases, this only convinces me even more that we are wrong if we start &#8220;publishing to&#8221; devices as if they were a platform or a channel. I think that our efforts need to be directed elsewhere, while we watch devices morph into new forms and bifurcate across functions. The day will come when we shall each have several (I have unfortunately already arrived) and they will be dedicated to use purposes in our lives &#8211; this for flying, that for taking to meetings, this other for holidays etc. The device spec will be governed by our purposes and requirements in these functions, not by any attempt to put every function into every device. The device in the car will have different requirements from the one in the kitchen, though of course some of the functionality will be the same.</p>
<p>All of which rather begs the question of what environments we should be publishing for if not specifically for Apps and devices. And the answer, of course, is the personalized Cloud. The environments we should be watching are Apple&#8217;s iCloud, and Amazon (AWS)&#8217;s CloudSearch. In this sense, current battles in the book sector are simply a kindergarten warm-up for the big battles out in the playground at lunch hour. Current popular neurosis about privacy (an odd but real phenomenon, since the security services have always had unfettered access to our deepest secrets, at least since Sir Francis Walsingham bought his first thumb screw) and the business drive to Cloud computing will come together in Personal Cloud. There I will have my library, my searchable subscriptions and, above all things, my Cloud Server. This will end all questions about the Web as a service venue &#8211; it will become a place for browse and research, not a full service zone. That I will control for myself, as well as all the data derived from it, and on that server I will decide what access to content derived from me and my activities I give to third parties (using long available services like Paoga &#8211; <a href="http://www.paoga.com">www.paoga.com</a> &#8211; to do this). The device that measures my blood pressure files the results in my Cloud, gives me well-informed medical guidance from the selection of service vendors that I trust and subscribe to, but only releases my actual results to my physician at regular periodicity &#8211; and his monitoring devices tell him when we need to talk. Come to think of it, the Pebble strapped round my wrist could handle the pulse for a start!</p>
<p>I have too little space here to demonstrate the full extent of my ignorance more than superficially. My feeling from reading is that Amazon&#8217;s announcements last month now put them a little in the lead over Apple and Google (<a href="http://www.readwriteweb.com/cloud/2012/04/amazon-beats-google-to-a-cloud.php">http://www.readwriteweb.com/cloud/2012/04/amazon-beats-google-to-a-cloud.php</a>). Apple&#8217;s concern was content sharing across devices (<a href="https://www.apple.com/uk/iCloud">https://www.apple.com/uk/iCloud</a>). Google&#8217;s, of course, was search, but clearly both Amazon and Google are alike in the vastness of their server farm environments and their ability to support global personal and corporate Cloud usage. And Amazon, having started AWS in 2006, may be said to have the experience, and the readiness to move into these new worlds. We are entering the age of &#8220;he is so old he can remember when Amazon was a bookseller&#8221;. An annual rental of CloudSearch costs 100 USD.</p>
<p>So has my once upon a time dream of the consolidated omni device completely faded? Probably so, though we are likely to be bewildered by the range of device offerings and their narrow differentiation for many years to come. Meanwhile, the next virtual world builds quietly in the Cloud, and demands our total attention.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2012/05/fiddling-with-your-devices/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Open Up Your APIs!</title>
		<link>http://www.davidworlock.com/2012/04/open-up-your-apis/</link>
		<comments>http://www.davidworlock.com/2012/04/open-up-your-apis/#comments</comments>
		<pubDate>Fri, 27 Apr 2012 20:58:12 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[B2B]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[eBook]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[eLearning]]></category>
		<category><![CDATA[Financial services]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[mobile content]]></category>
		<category><![CDATA[news media]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[STM]]></category>
		<category><![CDATA[Thomson]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=1229</guid>
		<description><![CDATA[In this industry five years is enough to benchmark fundamental change. This week I have been at the 9th Publishers&#8217; Forum, organized as always by Klopotek, in Berlin. This has become, for me, a must attend event, largely because while the German information industry is one of the largest in Europe, German players have been [...]]]></description>
			<content:encoded><![CDATA[<p>In this industry five years is enough to benchmark fundamental change. This week I have been at the 9th Publishers&#8217; Forum, organized as always by Klopotek, in Berlin. This has become, for me, a must attend event, largely because while the German information industry is one of the largest in Europe, German players have been marked by a conservative attitude to change, and a cautious approach to what their US and UK colleagues would now call the business model laws of the networked information economy. At some level this connects to a deep German cultural love affair with the book as an object, and how could that not be so in the land that produced Gutenberg? On another level, it demonstrates that German business needs an overwhelming business case justification to institute change, and that it takes time for these proofs to become available. Which is not to say that German businesses in this sector have not been inventive. An excellent two part case study run jointly by Klopotek and de Gruyter was typical: de Gruyter are the most transformed player in the STM sector because they have seized upon distribution in the network and selling global access as a fast growth path, and Klopotek were able to supply the necessary eCommerce and back office attributes to make this ambition feasible. And above all, in a room of more than 300 newspaper, magazine and book executives, we were at last able to fully exploit the language and practice of the network in information handling terms. This dialogue would have been impossible in Germany five years ago. A huge attitudinal change has taken place. Now we can deploy our APIs and allow users to get the value and richness of our content, contextualised to their needs, instead of covering them with the stuff and hoping they get something they want.</p>
<p>In some ways the Day 2 Keynote from Andrew Jordan, CTO at Thomson Reuters GRU business, exemplified the extent of this. The incomparable Brian O&#8217;Leary had started us off on Day 1 in good guru-ish style by placing context in its proper role and reminding us that it is not content as such but its relationships that increasingly concern us. You could not listen to him and still believe that content was the living purpose of the industry, or that the word &#8220;publishing&#8221; had not changed meaning entirely. With Michael Healy of CCC and Peter Clifton of +Strategy following him to hammer home the new world of collaboration and licensing, and the increasing importance of metadata in order to identify and describe tradeable entities, we were well on the way towards a recognition of new realities, ferried there before dinner by Jim Stock of MarkLogic using the connected content requirements of BBC Sport in an Olympic year to get us started in earnest on semantic approaches to discovery and our urgent needs to create appropriate platform environments to allow us to use our content fluently in this context.</p>
<p>So the ground was well-prepared for Andrew Porter. He took us on a journey from the acquisition of ClearForest by Reuters while it was being acquired by Thomson, to the use of this software by the new company to create OpenCalais, allowing third parties (over 60 of them) to get into entity extraction (events and facts, essentially) and then into the creation of complex cross-referencing environments, and finally to the use of this technology by Thomson Reuters themselves in the OneCalais and ContentMarketplace environments. So here was living proof of the O&#8217;Leary thesis, on a vast scale, building business-orientated ontologies, and employing social tagging in a business context. Dragging together the whole data assets of a huge player to service the next customer set or market gap. And no longer feeling obliged to wrap all of this in a single instance database, but searching across separately-held corporate datasets in a federated manner using metadata to find and cross-reference entities or perform disambiguation mapping. Daniel Mayer of Temis was able to drive this further and provide a wide range and scale of cases from a technology provider of note. The case was made &#8211; whether or not what we are now doing is publishing, it is fundamentally changed once we realize that what we know about what we know is as important as our underlying knowledge itself.</p>
<p>And of course we also have to adjust our business models and our businesses to these new realities &#8211; patient Klopotek have been exercising expertise in enabling that systems re-orientation to take place for many years. And we must recognize that we have not arrived somewhere, but that we are now in perpetual trajectory. One got a real sense of this from an excellent presentation to a very crowded room by Professor Tim Bruysten of richtwert on the impact of social media, and, in another way, from Mike Tamblyn of Kobo when he spoke of the problems of vertical integration in digital media markets. And, in a blog earlier this week, I have already reported on the very considerable impact of Bastiaan Deblieck of TenForce.</p>
<p>Speaking personally, I have never before attended a conference of this impact in Germany. Mix up everything in the cocktail shaker of Frank Gehry&#8217;s great Axica conference centre alongside the Brandenburg Gate, with traditional book publishers rubbing shoulders with major information players, and chatting to software gurus, industry savants, newspaper and magazine companies, enterprise software giants and business service providers and you create a powerful brew in a small group. Put them through separate German and English streams, then mix them up in Executive Lounge seminars and discussion Summits and the inventive organizers give everyone a chance to speak and to talk back. This meeting had real energy and, for those who look for it, an indication that the changes wrought by the networked economy and its needs in information/publishing terms, now burn brightly in the heart of Europe.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2012/04/open-up-your-apis/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Link Arms for Linked Data</title>
		<link>http://www.davidworlock.com/2012/04/link-arms-for-linked-data/</link>
		<comments>http://www.davidworlock.com/2012/04/link-arms-for-linked-data/#comments</comments>
		<pubDate>Wed, 25 Apr 2012 20:41:00 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[B2B]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[data protection]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[Financial services]]></category>
		<category><![CDATA[healthcare]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[mobile content]]></category>
		<category><![CDATA[news media]]></category>
		<category><![CDATA[Reed Elsevier]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[Thomson]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=1217</guid>
		<description><![CDATA[Now we are entering the post-competitive world (with a few exceptions!) it is worth pausing for a moment to consider how we are going to get all of the content together and create the sources of linked data which we shall need to fuel the service demand for data mining and data extraction. Of course, [...]]]></description>
			<content:encoded><![CDATA[<p>Now we are entering the post-competitive world (with a few exceptions!) it is worth pausing for a moment to consider how we are going to get all of the content together and create the sources of linked data which we shall need to fuel the service demand for data mining and data extraction. Of course, this is less of a problem if you are Thomson Reuters or Reed Elsevier. Many of the sources are relationships that you have had for a long time. Others can be acquired: reflect on the work put in by Complinet to source the regulatory framework for financial services prior to its acquisition by Thomson Reuters, and reflect that relatively little of this data is &#8220;owned&#8221; by the service provider. Then you can create expertise and scale in content sourcing, negotiating with government and agency sources, and forming third party partnerships (as Lexis Risk Management did with Experian in the US). But what if you lack these resources, find that source development and licensing would create unacceptable costs, but still feel under pressure to create solutions in your niche which will reflect a very much wider data trawl than could be accomplished using your own proprietary content?</p>
<p>The answer to this will, perhaps, reflect developments already happening in the education sector. Services like Global Grid for Learning, or the TES Connect Resources which I have described in previous blogs, give users, and third party service developers (typically teachers&#8217; centres or other &#8220;new Publishers&#8221;), the ability to find quality content and re-use it, while collaborations like Safari and CourseSmart allow customization of existing textbook products. So what sort of collaborations would we expect to find in B2B or professional publishing which would provide the quarries from which solutions could be mined? They are few and far between, but, with real appreciation for the knowledge of Bastiaan Deblieck at TenForce in Belgium, I can tell you that they are coming.</p>
<p>Let&#8217;s first of all consider Factual Inc (<a href="http://www.factual.com">www.factual.com</a>). Here are impeccable credentials (Gil Elbaz, the founder, started Applied Semantics and worked at Google) and a VC-backed attempt to corner big datasets, apply linkage and develop APIs for individual applications. The target is the legion of mash-up developers and the technical departments of small and medium sized players. Here is what they say about their data:</p>
<p>&#8220;Our data includes comprehensive <a href="/data/t/global">Global Places</a> data, with over 60MM entities in 50 countries, as well as deep dives in verticals such as <a href="/data/t/restaurants-us">U.S. Restaurants</a> and <a href="/data/t/health-care-providers-us">U.S. Healthcare Providers</a>. We are continually improving and adding to our data; feel free to <a href="/product/data">explore</a> and <a href="/partners/register">sign up</a> to get started!</p>
<p>Factual aggregates data from many sources including partners, user community, and the web, and applies a sophisticated machine-learning technology stack to:</p>
<ol>
<li>Extract both unstructured and structured data from millions of sources</li>
<li>Clean, standardize, and canonicalize the data</li>
<li>Merge, de-dupe, and map entities across multiple sources.</li>
</ol>
<p>We encourage our partners to provide edits and contributions back to the data ecosystem as a form of currency to reduce the overall transaction costs via exchange.&#8221;</p>
<p>As mobile devices proliferate, this quarry is for the App trade, and here is, in the opinion of Forbes (19 April 2012), another Google in potential in the field of business intelligence (<a href="http://www.forbes.com/sites/danwoods/2012/04/19/how-factual-is-building-an-data-stack-for-business/2/">http://www.forbes.com/sites/danwoods/2012/04/19/how-factual-is-building-an-data-stack-for-business/2/</a>).</p>
<p>But Los Angeles is not the only place where this thinking is maturing. Over in Iceland, now that the banking has gone, they are getting serious about data. DataMarket (<a href="http://datamarket.com">http://datamarket.com</a>), led by Hjalmar Gislason from a background of startups and developing new media for the telco in Iceland, offers a very competitive deal, also replete with API services and revenue sharing with re-users. Here is what they say about their data:</p>
<p>&#8220;DataMarket&#8217;s unique data portal &#8211; DataMarket.com &#8211; provides access to thousands of data sets holding hundreds of millions of facts and figures from a wide range of public and private data providers including the United Nations, the World Bank, Eurostat and the Economist Intelligence Unit. The portal allows all this data to be searched, visualized, compared and downloaded in a single place in a standard, unified manner.</p>
<p>DataMarket’s data publishing solutions allow data providers to easily publish their data on DataMarket.com and on their existing websites through embedded content and branded versions of DataMarket’s systems, enabling all the functionality of DataMarket.com on top of their own data collections.&#8221;</p>
<p>And finally, in Europe we seem to take a more public interest-type view of the issues. Anyway, a certain amount of impetus seems to have come from the Open Data Foundation, a not-for-profit which also has a connection and has helped to stimulate sites like OpenCharities, OpenSpending (how does your government spend your money?), and OpenlyLocal, designed to illuminate the dark corners of UK local and regional government. All of these sites have free data, available under a Creative Commons-style licence, but perhaps the most interesting, still in beta, is OpenCorporates. Claiming to have data on 42,165,863 companies (as of today) from 52 different jurisdictions, it is owned by Chrinon Ltd, and run by Chris Taggart and Rob McKinnon, both of whom have long records in the Open Data field. This will be another site where the API service (as well as a Google Refine service) will earn the value-add revenues (<a href="http://api.opencorporates.com/">http://api.opencorporates.com/</a>). Much of the data is in XML, and this could form a vital source for some user- and publisher-generated value-add services. The site bears a recommendation from the EC Information Society Commissioner, Neelie Kroes, so we should also record that TenForce (<a href="http://www.tenforce.com/">http://www.tenforce.com/</a>) themselves are leading players in the creation of the Commission&#8217;s major Open Data Portal, which will progressively turn all that &#8220;grey literature&#8221;, the dandruff of bureaucracy, back into applicable information held as data.</p>
<p>We seem here to be at the start of a new movement, with a new range of intermediaries coming into existence to broker our content to third parties, and to enable us to get the licences and services we need to complete our own service developments. Of course, today we are describing start-ups: tomorrow we shall be wondering how we provided services and solutions without them.</p>
<p>&nbsp;</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2012/04/link-arms-for-linked-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Who Needs Editors?</title>
		<link>http://www.davidworlock.com/2012/04/who-needs-editors/</link>
		<comments>http://www.davidworlock.com/2012/04/who-needs-editors/#comments</comments>
		<pubDate>Sun, 22 Apr 2012 21:49:17 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[B2B]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[mobile content]]></category>
		<category><![CDATA[news media]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=1209</guid>
		<description><![CDATA[It&#8217;s the language that gets you first. CEO in &#8220;brutal cull&#8221; of Johnston Press editors (http://www.guardian.co.uk/media/2012/apr/12/scotsman-editor-in-chief-johnston-press) is a great way to treat editors and subs as they have always treated the world &#8211; with a degree of lofty disdain. And I did not really catch on to the deep underlying question until I read Peter [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s the language that gets you first. CEO in &#8220;brutal cull&#8221; of Johnston Press editors (<a href="http://www.guardian.co.uk/media/2012/apr/12/scotsman-editor-in-chief-johnston-press">http://www.guardian.co.uk/media/2012/apr/12/scotsman-editor-in-chief-johnston-press</a>) is a great way to treat editors and subs as they have always treated the world &#8211; with a degree of lofty disdain. And I did not really catch on to the deep underlying question until I read Peter Preston&#8217;s commentary on this (Observer 22 April, 2012). I usually regard that great ex-Guardian editor as my sanity check, so it was a real shock to find that he had it completely wrong too. No commentary that I have seen has grasped the essence of what Ashley Highfield is doing by this mass firing of senior (and very expensive) editorial potentates at Johnston, or what it realistically recognizes about the nature of news online.</p>
<p>Let me first declare an initial prejudice. During a five year tenure as non-executive Chairman of Fish4, when it was owned by the regionals themselves, it was my observation that Editors were an embattled barrier to digital progress. This is a dangerous generalization, but invariably Editors wanted to run Web presence as if it were the newspaper, were reluctant in those days to allow their own digital media to scoop the paper, used their role as protectors and developers of the brand to diminish and hold down their digital presence, and all too often regarded digital as a subordinate medium which must reflect and emulate print, not create an entirely new approach to the way in which news and comment is digested and responded to by its ultimate users.</p>
<p>So I love Ashley for doing this. It would have been a shade better if he had used Cromwell&#8217;s words when dismissing the Rump Parliament &#8211; &#8220;I beseech you in the bowels of Christ be gone &#8230;&#8221; and, pointing to the green eyeshade rather than the mace &#8211; &#8220;&#8230;and take that bauble with you.&#8221; But one cannot have it all. At a stroke, some mighty expenses have returned to the bottom line and a space has been cleared where the CEO can set to work re-inventing the company. So why will he get closer to getting it right without editors than with them?</p>
<p>It is the nature of digital news services to replace the editor by the reader. The key considerations are concerned with collecting information and relating it to the interests and needs of a targeted audience. Writing stories needs sub-editorial skills, but a great deal of future story creation will be automated (I have already commented here on Narrative Science and Selerity). The critical marketing input will be the interfaces offered to users to customize and personalize the content flow. The key feature of that activity will be the mark-up, tagging and metadata added to the content in process of uploading. The editorial function will be ensuring its accessibility by everyone, whatever their angle of approach. The skill will come in making those interfaces appetising &#8211; a marketing role and not an editorial one if ever I saw one. And a role performed by the same marketing team who will manage the digital brand and explain what it is.</p>
<p>At this point I hear Mr Preston straining to get into the argument, for his article is all about the importance of the &#8220;leader&#8221; article, and the controversy which Polly Toynbee attracts with her views as a commentator in the Guardian. I have no doubt at all that Miss Toynbee, who is, or deserves to be, a national institution, will glide controversially forward through time until she reaches her own Diamond Jubilee. And online we shall have many of her ilk. Lots of bloggers, many outraged citizens, lots of local councillors defending the indefensible, and pressure and lobbying groups special pleading all over the place. And we shall have all of the social media and social tagging attributes that run alongside this. This flow of activity will be open to all and separate from the news flow &#8211; something which newspapers cannot seem to manage. In the process of story selection and arrangement throughout the paper, they editorially flavour the news, giving it a &#8220;meaning&#8221; to readers even though the reader is buying the proposition of fair and proper treatment.</p>
<p>Which brings me to the Editorial page itself. If the views available online are catholic and wide-ranging, and multi-sourced &#8211; then finding out what the newspaper or its online version thinks is irrelevant, and Mr Preston, in a circumspect way, seems to be approaching this view as well. I would go further and ask what place the Editorial column has had in the regional press in the past two decades. In truth it has been the most unread section of the paper and has no place at all online. Do I know what the view of the Bucks Free Press is on Mr Murdoch? No, and it would mean little if it did have a view. And I would find it out of place on my smartphone or tablet. Do you, like me, smile when you come across the comment column in the Waste and Pollution Management Journal and find them battling with the issues of the day? And, like me, you probably have that experience less often now, because the editorial pitch, the idea that the organ has to stand its brand value behind a clear profile of views and arguments, has almost gone, and with it went the need for the editors whose pride and joy the curation of those views once was.</p>
<p>Change has a price, but I could argue that Mr Murdoch, aided and abetted by his chums Mr Coulson and the Flame Haired Temptress, got there first by turning the Sun, the NoW and eventually the Old Thunderer itself into another way of expressing the powerful urges of a controlling proprietor. But he did not do this to the regional press or the B2B subscription magazine: they did it to themselves. Ashley Highfield has recognized that, and what he must do if he is to start over, and this clearing of the decks is a very appropriate starting point.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2012/04/who-needs-editors/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Sunset in Fiesole</title>
		<link>http://www.davidworlock.com/2012/04/sunset-in-fiesole/</link>
		<comments>http://www.davidworlock.com/2012/04/sunset-in-fiesole/#comments</comments>
		<pubDate>Sun, 15 Apr 2012 19:12:21 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[Blog]]></category>
		<category><![CDATA[eBook]]></category>
		<category><![CDATA[Education]]></category>
		<category><![CDATA[eLearning]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[STM]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=1203</guid>
		<description><![CDATA[The 14th Fiesole Retreat for academic librarians, publishers and researchers continues to provide an accurate gauge of the direction and rate of change. Looking down over Florence from the European University Institute is to be reminded that renaissance and reformation come to all who wait, only in the digital world they come quicker. So the [...]]]></description>
			<content:encoded><![CDATA[<p>The 14th Fiesole Retreat for academic librarians, publishers and researchers continues to provide an accurate gauge of the direction and rate of change. Looking down over Florence from the European University Institute is to be reminded that renaissance and reformation come to all who wait, only in the digital world they come quicker. So the conference agenda had librarians morphing into anything but librarianship, publishing defending the indefensible, and scholarship apparently rooted in the minds of both as pursuing a very narrow track of priorities and activities. While a new world was clearly waiting in the wings, we were all reluctant to signal the Last Post. And that&#8217;s just the problem with these civilized events &#8211; they are so civilized!</p>
<p>Bruno Racine, French cultural politician and Director of the Bibliotheque Nationale de France, set the style from the kick-off. In an untroubled world, his great priorities, alongside building greater audio collections and newspaper archives, were developing the great French Gallica collection and furthering the cause of Europeana. Like a bandsman on the Titanic, so much in our media minds this week, this sounded like an invocation to keep playing. Fortunately the untroubled water was soon disturbed by Carol Tenopir, quiet revolutionary of many years standing, who started to throw some hand grenade facts into the water. Did we know how completely the scholars had deserted the library? Well, we do now. In a world where between 78 and 88 per cent of articles read are read digitally, 62% of those readings are in the laboratory, 26% at home and 10% while travelling. Only 2% are conducted on library premises. As each subsequent librarian presentation began with a picture of ever newer and more lavishly appointed buildings, one deep psychological gap yawned open. The scholars have gone nomadic, but the services that support them are rooted in expensive real estate.</p>
<p>But not always. In a brilliant demonstration of how lateral thinking is not confined to certain roles or age groups, Sylvia Van Peteghem, Chief Librarian at the University of Ghent, talked about relocating her library, or, rather, her users&#8217; MyLibrary, in the Cloud. She underlined the importance of the Amazon announcement on CloudSearch (I have a fantasy of my grandchildren saying that I am so old that I could remember when Amazon was a bookseller!). She spoke of what her team had learned through the Los Alamos SharedCanvas experimentation, and she emphasized many times the collaborative nature of the whole enterprise. In fact, when her conclusions emerged as &#8220;provide detailed metadata for free; publish for machines; create stable and durable links and URLs&#8221; I knew that I was listening to a publishing presentation after all. She said that when she first saw what Google could do &#8220;I became a Humble Librarian&#8221;. I find that very affecting, but not wholly true. I suspect that she found at that moment that the professional divisions of the real world had fallen away, and it was perfectly permissible for anyone to do anything now. Clear the way, we need to find this lady a place in the Titanic lifeboats right now!</p>
<p>But in some ways Sylvia&#8217;s theme had already been established. Deanna Marcum, now running Ithaka S+R after her years running the Library of Congress, got us thinking about Knowledge Navigators, and the importance of capturing the art and lore of collections specialists before it was lost. Mike Sweet of Credo had reminded the preconference that there is nothing wrong with discovery services that cannot be fixed in the reference layer, and Alix Vance of GeoScience World alongside Fiona Murphy of Wiley illustrated the collaborative nature of niche content provision. But it was one of the questions that triggered a key idea: do we brand library services successfully? Then I knew that a prognostication of the first Fiesole meeting that I attended 12 years ago was becoming true: librarians were becoming publishers, but what on earth would publishers become?</p>
<p>If the presentations of Blaise Simqu and Stephen Barr, respectively CEO and International President of Sage, were anything to go by, then the answer would be &#8220;Really Nice People&#8221;. And responsible executives moving along the track of providing users with what they apparently want &#8211; several Open Access options, plenty of scope in pricing models to deal with individual or small scale users, quality peer review, and grateful authors willing to be interviewed on video expounding the importance of having risk capital available to support new journals. No trace here then of the facile commentary in last week&#8217;s Economist on journal publishing margins (for which that worthy journal should be deeply ashamed). Or of the price-gouging, excessive profitability commentary which has marked comment on this sector this year. Sara Miller McCune sold her air-con unit for $500 in 1965 to found Sage, and has left the company in trust to three charities. The problem is not here at all. It lies in the formats to which companies like Sage have become subject (journal, article etc) and the necessity to keep the present business model going until a new one can be put in its place. And while we all pay obeisance to the primacy of the research article, do we not sometimes fear its commoditization? What happens when Mendeley or ReadCube become the interfaces of choice &#8211; less full text reading, better current awareness, more visualization? And a powerful diminution of quality control exercised by peer review as the only indicative guideline to quality itself? We are on the very thin edge of a very long wedge.</p>
<p>But publishing is relatively easy to do and offers low barriers to entry. Later on in the agenda Svante Kristensson, Director of Sweden&#8217;s Boras University library, showed what a creative publisher can do online with collections that demand the full scope of digital resources &#8211; the Swedish School of Textiles. And Gino Roncaglia of Tuscia University demonstrated how layering enables more productive scholarly eBooks &#8211; and eLibraries. As we came to a giddy end I reflected that the challenge of the linked data world has not yet fully sunk in &#8211; but that a good number of librarians are as close to identifying the range of user expectations in the network as their publishing colleagues are. But for the researchers in the last year who have spoken to me and doubtless many others of their need to discover quickly sources of unpublished articles which confirm experimental results, or find and use data underlying published experiments, or obtain lab videos on procedures, or to get updates on compliance and best practice procedures, I have no answers. The problem is, as Carol Tenopir reveals, that they are all at home or in the lab researching.</p>
<p>The Fiesole Retreats, which only meet in Fiesole every few years, are wonderful wherever they meet and always cast light into the gloom in the way that small meetings usually do. The Charleston Company and Casalini Libri started all this: to them goes the honour of a lifeboat all to themselves.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2012/04/sunset-in-fiesole/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Lost Chord</title>
		<link>http://www.davidworlock.com/2012/04/the-lost-chord/</link>
		<comments>http://www.davidworlock.com/2012/04/the-lost-chord/#comments</comments>
		<pubDate>Mon, 09 Apr 2012 20:20:29 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[B2B]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Financial services]]></category>
		<category><![CDATA[healthcare]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[Reed Elsevier]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[STM]]></category>
		<category><![CDATA[Thomson]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=1192</guid>
		<description><![CDATA[As we in the information services market start to get our thinking right about the influence of Big Data and our current obsession with workflow, then I am beginning to think that we will need to revise our whole approach to collaborative working in marketplaces. At the moment we are playing all the old tunes [...]]]></description>
			<content:encoded><![CDATA[<p>As we in the information services market start to get our thinking right about the influence of Big Data and our current obsession with workflow, then I am beginning to think that we will need to revise our whole approach to collaborative working in marketplaces. At the moment we are playing all the old tunes but none of them seem to quite fit the customer mood. Like that old vaudeville star, Jimmy &#8220;Schnozzle&#8221; Durante, we need to tinkle those ivories again and again until we find it. The Lost Chord!</p>
<p>So here is a sample of my keyboard doodling. I reason that we cannot &#8220;productize&#8221; information services for ever. Our customers are now too clever, and as we open our APIs and let them self-actualize or customize, we face real dangers. At the top end of most markets in most sectors the top 10 customers are well-equipped at the skills level, and are surrounded by systems integrators who can service them expensively but effectively. And amongst the medium and small enterprises in our client base, the cost of doing anything but allowing them to customize for themselves is prohibitive. And we are sitting in the middle of this, talking passionately about selling solutions and always seeking stickiness, while our client base shows dangerously independent tendencies.</p>
<p>There are two answers. We could sell less. Just license everything, put the APIs in place, let the user community get on with it. For me, this is like sleep-walking on a cliff edge. Our only potent quality as service providers has been our knowledge of what users do with our data and how they work. Make the relationship one of pure licensing and we cut off the feedback loop and isolate ourselves from the way in which workflow software is being tweaked and refined, and the way our data grows, or diminishes, in importance as a result. Or we could go to the opposite extreme, way past the current middle ground where we build &#8220;solutions&#8221; and customers adopt and install them as applications, with all the difficulties described above. The &#8220;opposite extreme&#8221; is equally difficult, but at least keeps us in the game.</p>
<p>So what is the opposite extreme? Simply this: that we go on building solutions, but we increasingly customize them for our major customers, working in partnership with systems integrators and our software solution partners whose Big Data environment, or analytics, or data mining is part of the key to our service specification. Setting up our own systems integration, by alliance or as an in-house installation, could be vital to our ability to stay sticky, to bring the client&#8217;s own data and resources into play, and to learn where the market is going to go. I hear cries of &#8220;We are a content company, not a software house!&#8221;. Not so for the major players in B2B and STM, who have been fully invested in software for five years or so, and are more likely these days to buy a tool-set than a data-set. Much more cogent are the protests of those who do not want to get into ownership of major pieces of systems software: the answer there is strategic alliance. Discussing the pharma market the other day, where size is very important, I found myself advocating approaches to major customers for outsourcing large areas of non-research process which offered real productivity gains to the user, and gave the services solutions player and his systems software partner the ability to work inside the firewall and grow with the client need.</p>
<p>There may be 1000 major global clients across all verticals with whom this approach would work. It certainly works in government and financial services, traditionally the targets of the major players in Big Data software. But it again exposes two new problems. It leaves the bulk of the market behind in medium and small players unable to afford this type of soup-to-nuts solutioning. This, again, is a real opportunity for solution packaging with a systems integrator, either externally or internally to the content player. This will enable 3-5 year contracts with upgrades, data updating and maintenance. And in some instances integration will go further and permit scaled down custom solutions that parallel what the major players are doing. The trick will be to start by seeking to sell in the standard integration package, and then respond to the smaller customer&#8217;s need for customization. And there is a market of small players and consortia where this type of solutioning has been working for some time. It&#8217;s Education, and the service area to watch is Pearson Learning Solutions.</p>
<p>And the other problem for the bigger data content players? Simply that there are killer whales out there! As the major enterprise software vendors see what is happening, they will feel that this type of solutioning undermines some sacred territory. We see that with Oracle in particular, but also IBM and SAP are always ready to buy on a vast scale. Some of today&#8217;s Big Data ex-start-ups, in the 5-10 year old Valley vintages, will be absorbed into these big players, which could be difficult &#8211; or an opportunity &#8211; if your content solution is tied to that newly acquired player. In fact, if the major content providers are not talking regularly to the mighty enterprise software players about how these worlds come together then they are less smart than I think they are. At the moment, in my experience, some at least of the enterprise software players are saying &#8220;We should probably buy some of them &#8211; but we have no experience of managing content.&#8221; If ever you find yourself saying &#8220;I never imagined that Springer or Elsevier or Wiley would end up as part of the solutions division at Oracle&#8221; then I hope that you will recall an article that went right to that point. And at least that would integrate all access at all points!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2012/04/the-lost-chord/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Big Data: Six of the Best</title>
		<link>http://www.davidworlock.com/2012/04/big-data-six-to-watch/</link>
		<comments>http://www.davidworlock.com/2012/04/big-data-six-to-watch/#comments</comments>
		<pubDate>Tue, 03 Apr 2012 19:47:00 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[B2B]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[data protection]]></category>
		<category><![CDATA[Financial services]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[Reed Elsevier]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[Thomson]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=1181</guid>
		<description><![CDATA[So the UK government has decided to monitor every tweet and every email and every social network connection, all in the good cause of the greater security of the citizen. While I am up to my eyes in articles defending the civil liberties of the citizen (at least some of whom are more afraid of [...]]]></description>
			<content:encoded><![CDATA[<p>So the UK government has decided to monitor every tweet and every email and every social network connection, all in the good cause of the greater security of the citizen. While I am up to my eyes in articles defending the civil liberties of the citizen (at least some of whom are more afraid of the police than the terrorists) I see little commentary on the logistics of all of this, and at best guess estimates that owe more to powerful imagination than logistical reason. My mind goes to the software involved, and that prompts a wider question: while we are now familiar with Hadoop and the techniques used by the cloud-based systems of Yahoo!, Google, Amazon and Facebook, what deployable software is there in the market which works at a platform level and interfaces information systems with very large data aggregations on the one side, and user interfaces on the other?</p>
<p>In the media and information services area the obvious answer is MarkLogic (<a href="http://www.marklogic.com">www.marklogic.com</a>). Now a standard for performance in its sector, MarkLogic chose media alongside the government sector as its two key areas of market exposure in the development years. Throughout those years I have worked with them and supported their efforts to &#8220;re-platform&#8221; the industry. MarkLogic 5.0 is just about as good as it gets for information services going the semantic discovery route, and the testimony to this is installations  in differing information divisions in every global and many national information service providers. So when MarkLogic roll out the consultancy sell these days, they do so with almost unparalleled experience of sector issues. I have no prior knowledge, but I am sure that they would be players in that Home Office contract.</p>
<p>Other potential players come from outside the media sector and outside of its concentration on creating third party solutions. In other words, rather than creating a platform for a content holder to develop client-side solutions, their experience is directly with the end-user organization. Scanning the field, the most obvious player is Palantir (<a href="http://www.palantir.com">www.palantir.com</a>). A Palo Alto start-up of the 2004 vintage (Stanford and PayPal are in its genes), this company targeted government and finance as its key starter markets, and has doubled in size every year since foundation. It raised a further $90m in finance in the difficult year of 2010, and informal estimates of its worth are now over $3 billion. It does very familiar things in its ability to cross search structured, unstructured, relational, temporal and geospatial data, and it now seems to be widening its scope around intelligence services, defense, cyber security, healthcare and financial services, where its partner on quant services is Thomson Reuters (QA Studio). This outfit is a World Economic Forum 2012 Tech pick &#8211; we all love an award &#8211; and as we hurry along to fill in the forms for the UK intelligence service, I expect to find them inside already measuring the living space &#8211; and the storage capacity.</p>
<p>My next pick is something entirely different. Have a look at <a href="http://www.treato.com">www.treato.com</a>. This service, from First Life Research, is more Tel Aviv than Palo Alto, but it provides something that UK security will be wanting &#8211; a beautifully simple answer to a difficult question. Here the service analysed 160,00 US blog sites and health portals comment sections to try to trap down what people said about the drugs they were taking. They have now examined 600 m posts from 23 million patients commenting on 8500 drugs, and the result, sieved through a clinical ontology-based system, is aggregated patient wisdom. When you navigate this, you know that this will have to find a place in evidence-based medicine before too long, and that the global service environment is on the way. In the meanwhile, since the UK National Health Service cannot afford this, let&#8217;s apply it to the national email systems, and test the old theory that the British only have two subjects, their symptoms and the weather.</p>
<p>We started with two Silicon Valley companies, so it makes sense next to go to New Zealand. Pingar (<a href="http://www.pingar.com">www.pingar.com</a>) starts where most of us start &#8211; getting the metadata to align and work properly. From automating meta tagging to automatic taxonomy construction, this semantic-based solution, while clearly one of the newest players on the pitch, has a great deal to offer. As with the other players I will come back to Pingar in more detail and give it the space it deserves but in the meanwhile I am very impressed by some indicative uses. Its sentiment analysis features will surely come in useful in this Home Office application, as we search to find those citizens more or less likely to create a breach of the peace. If there are few unique features &#8211; here or anywhere in these services, then there is a plenitude of tools that can make a real difference. Growing up in the shadow of MarkLogic and Palantir is a good place to be if you can move fast/agile.</p>
<p>But there are others. Also in the pack is Digital Reasoning (<a href="http://www.digitalreasoning.com">www.digitalreasoning.com</a>), Tim Estes&#8217; company from Franklin TN. Their Synthesys product has scored considerable success, in, guess where? The US government. Some analysts see them as Palantir&#8217;s closest competitor in size terms, and here is how they define the problem:</p>
<p>&#8220;Synthesys is the flagship product from Digital Reasoning that delivers Automated Understanding for Big Data. Enterprise and Government customers are awash with too much data. This data has three demanding characteristics – it is too big (volume), it is accumulating too fast (velocity) and it is located in many location and forms (variety). Solutions today have attempted to find ever better methods of getting the user to the “right” documents. As a result, data scientists and data analysts today are confronted with the dilemma of an ever-increasing need to read to understand. This is an untenable problem.&#8221;</p>
<p>I hear the UK department of spooks saying &#8220;hear, hear&#8221; so I guess we shall see these gentlemen in the room. But I must turn now to welcome a wonderfully exciting player, which, like Pingar, seems to have emerged at the right place at the right time. In 1985 I became a founder member of the Space Society. This could have been my recognition of the vital task of handling remotely sensed data, or the alluring nature of the Organizing Secretary who recruited me. She moved on, and so did I, ruefully reflecting that no software environment yet existing could handle the terabytes of data  that poured from even the early satellites. Now we have an order of magnitude more data, but at last practical solutions  like SpaceCurve (<a href="http://www.spacecurve.com">www.spacecurve.com</a>) from Seattle. Here is the conversation we all wanted then: pattern recognition systems, looking at parallel joins between distributed systems and indexing geospatial polygons&#8230; working on multi-dimensional, temporal, geospatial data, data derived from sensors, and analysis of social graphs. Now, if I thread together the third of the words on their website that I understand, I perceive that large scale geospatial has its budding solutions too, and its early clients have been in commodities (the goal of all that geospatial thinking years ago) and defense. Of course.</p>
<p>So I hope to see them filling in their applications as well. In the meanwhile, I shall study hard and seek to produce in the next few months a more detailed analysis of each. But in the meanwhile, if you are gloomy about the ability of the great information companies to survive the current firestorm of Change, reflect on this. Three of my six &#8211; Palantir, Treato and SpaceCurve &#8211; share a common investor in Reed Elsevier Ventures. They should take a bow for keeping their owners anchored within the framework of change, and making them money while they do it.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2012/04/big-data-six-to-watch/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Abundance and Scarcity</title>
		<link>http://www.davidworlock.com/2012/03/abundance-and-scarcity/</link>
		<comments>http://www.davidworlock.com/2012/03/abundance-and-scarcity/#comments</comments>
		<pubDate>Wed, 28 Mar 2012 19:47:42 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[B2B]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[Financial services]]></category>
		<category><![CDATA[healthcare]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[online advertising]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=1174</guid>
		<description><![CDATA[I sat down to write a glowing note on the Digital Science conference at London&#8217;s glorious Royal Institution last night. &#8220;Inventing the Future&#8221; was a huge success and underlined the creative quality of the debate on the digital future in this city. As I stared ruminatively at my blank screen, an alert crossed it: Emap [...]]]></description>
			<content:encoded><![CDATA[<p>I sat down to write a glowing note on the Digital Science conference at London&#8217;s glorious Royal Institution last night. &#8220;Inventing the Future&#8221; was a huge success and underlined the creative quality of the debate on the digital future in this city. As I stared ruminatively at my blank screen, an alert crossed it: Emap have decided to split themselves into three parts, to be called (no, I am not kidding) Top Right Group (something to do with graphs?) for the whole outfit, i2i Events for the (you guessed it!) events division, 4C Group for the information division (&#8220;Fore-see&#8221;, geddit?), and, triumphantly, EMAP Publishing for the magazines. Given that they did not waste any of that expensive rebranding budget on the magazines we can guess that this lot are for sale first (though a rumour today also gives that honour to the CAP automotive data unit). The best guess is that everything is for sale, and some reports are already citing advisory appointments in a variety of places.</p>
<p>Meanwhile, the philosophers of the night before had been talking of the very nature of the digital, networked society. Their threnody was &#8220;Open&#8221;. JP Rangaswami, Chief Scientist at Salesforce.com (I have heard this man twice in a week and would be happy to go again for more tomorrow) set the tone. We have to realize that the network has turned our media picture on its head. Now we have to understand the ways in which consumers are re-using and reshaping content. The social networks are ways of amplifying and diminishing those responses, filtering and distilling them. The publisher&#8217;s role is to get out of the way &#8211; this is not a push world anymore &#8211; but to act as a distributor and reproducer of excellence without doing harm or trying to outbid the creativity of end users. Stian Westlake of NESTA, looking at this from a policy viewpoint, saw the need to rebalance the investment, to innovate in areas of strength like the UK financial services markets, and to make education fit the requirement of a networked economy. As JP said, re-quoting Stewart Brand, &#8220;information wants to be free&#8221;. We have it in abundance, while we have scarce resources for shaping and forming it as users want it, and enabling them to do that in their own contexts.</p>
<p>It turns out, of course, that some of the data we want is held by government. The third speaker was Professor Nigel Shadbolt, Professor of AI at Southampton, Director of the new Open Data Institute, and Sir Tim Berners-Lee&#8217;s vice-gerent and apostolic delegate to the UK government&#8217;s Open Data programme here on earth. He mercifully skated across the difficulties of getting governments to do what they have said they will do, while pointing out that despite the fad of Big Data, linked data was now a vital component at all levels, big and small, in delivering the liberating effect of making compatible data available for remixing. With these three speakers we were in the magic territory of platform publishing. Here it was unthinkable not to promulgate your APIs. Here was a collaborative world of licensing and data sharing. Here was a vision of many of the things we shall be doing to create a data-driven world in the networks for the net benefit of all of us.</p>
<p>And then I read the EMAP announcement, and it brings home the way in which the present and the future are pulling apart radically at the moment. No one looked at the EMAP holdings through the eyes of customers, buyers, or users. Channel and format, the classifications of the past, are the only way that current managers can see their businesses. So we divide into three channels what needed to be seen as a platform environment, created by ripping out all the formats and making all of the data neutral and remixable in any context. So the building and construction marketplace at EMAP, which has magazines, data and events (events &#8211; the greatest source of data yet discovered on earth), becomes a way of shaping and customizing content for users large and small, directed by them and driven by their requirements. But the advisors cannot understand anything but ongoing businesses, the strategy has no place in the IM, the McGraw-Hill failure to do this at Dodge and Sweets is not encouraging, so we divide the stuff into parcels that can be sold, and sell it off at a small portion of its worth, while blaming the technology that could save it for &#8220;disrupting&#8221; it to death.</p>
<p>Maybe this is right. Maybe the old world has to be purged before the new one takes over. Maybe we have to go through the waste of redundancies, the dissipation of content, the loss of continuity with users/readers/customers before they are able to show us once again what we really should be doing. But now, when we know so much about &#8220;inventing the future&#8221; this seems a very rum way of proceeding. Incidentally, last night&#8217;s conference host, Digital Science, is a very exciting Macmillan start-up whose business it is to invest in software developed by users in science research to support their work. Truly then a new player with more than a whiff of the zeitgeist of this conference in its nostrils. Those of us with long memories remember an older Macmillan, however. One that owned the Healthcare and nursing magazine market, and lapped up the jobs advertising cream in the days when users (or the NHS), could not use the web as an advertising environment. So Macmillan sold its magazine division before the advertising crash &#8211; to EMAP. It is people, decisions and the choices made by users that change things. It is hardly new to note that lack of a tide table can create serious risk of drowning, but it could be true.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2012/03/abundance-and-scarcity/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Crowds, Voices and no Ties</title>
		<link>http://www.davidworlock.com/2012/03/crowds-voices-and-no-ties/</link>
		<comments>http://www.davidworlock.com/2012/03/crowds-voices-and-no-ties/#comments</comments>
		<pubDate>Thu, 22 Mar 2012 21:01:54 +0000</pubDate>
		<dc:creator>dworlock</dc:creator>
				<category><![CDATA[B2B]]></category>
		<category><![CDATA[Blog]]></category>
		<category><![CDATA[data protection]]></category>
		<category><![CDATA[Financial services]]></category>
		<category><![CDATA[Industry Analysis]]></category>
		<category><![CDATA[internet]]></category>
		<category><![CDATA[mobile content]]></category>
		<category><![CDATA[online advertising]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[Publishing]]></category>
		<category><![CDATA[Search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[social media]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Workflow]]></category>

		<guid isPermaLink="false">http://www.davidworlock.com/?p=1166</guid>
		<description><![CDATA[Two days were enough this week to encompass an industry in the making and in transition. Many participants at the London Web Summit on Monday, as well as at the IXXUS Future of Publishing meeting on Tuesday, would describe themselves as being in the Information Industry (aka media, publishing, information services and solutions etc). I [...]]]></description>
			<content:encoded><![CDATA[<p>Two days were enough this week to encompass an industry in the making and in transition. Many participants at the London Web Summit on Monday, as well as at the IXXUS Future of Publishing meeting on Tuesday, would describe themselves as being in the Information Industry (aka media, publishing, information services and solutions etc). I went to both, and as I staggered home on Tuesday night I could only reflect that this is not one industry but a hundred, and the cultural differences between the pieces are now profound. In fact, this industry is a Lilliputian version of the whole world around it, which sounds the way it should sound. But doing the breadth in two days? Very frightening.</p>
<p>For a start there were 1000 delegates in the Old Brewery in Chiswell Street on Monday. Our ebullient hosts, Paddy Cosgrave and Mike Butcher (TechCrunch), compered it with the energy of a variety show in the Edwardian music halls. And they had a band that provided a 10 bar intro/exit for every speaker. It had something of everything, and, at King Paddy&#8217;s command, no ties were allowed (Yes, this is the sort of thing you do have to tell the English). And like a variety show (vaudeville) it was good in parts and not in other parts. The panels, despite some good appearances, were often so hurried and poorly moderated that it was hard to extract meaning at all. And the audience was very mixed &#8211; investors networked less easily here with a vast crowd of start-up hopefuls than they did at last year&#8217;s similarly sized NOAH show, but the same message was available. The energy is back in the London market, just as it is in Berlin and Barcelona, but London is the place to get the finance and finger the future. My Investor of the Day award goes to Niklas Zennstrom of Atomico: despite the questions from his moderator he came across as someone who had learnt real lessons from Skype and Joost, and knew how to listen to the next crazy and apply the right degree of enthusiasm, tolerance and sophisticated discouragement. And my Thinker of the Day would have to be J P Rangaswami, Chief Scientist at Salesforce.com. His observation that we would at last overcome the entrapment of the Qwerty keyboard, and that the future of work was only understandable if we saw it as a massively integrated multiplayer videogame was delightful, as was his insistence that knowledge work on the network was &#8220;bursty&#8221; &#8211; so we invented the need for meetings to fill the gaps between activities.</p>
<p>Also high quality was the discussion on the future of money. We had two credit card-based services ranged against two chip-based money transfer services. I give the latter my vote, but questions like cost-free money transfer, the death of cash and the removal of some of the key roles of banks played very well, as did the notion that with digital money comes the end of money-handling privacy. Gareth Williams did a great job of persuading us that the Edinburgh-based, Scottish Equity Partners-backed online travel service SkyScanner would break into the Expedia/Kayak marketplace, but in truth its revenues of £2.5-3.0 m per month on a lead gen/referral business model, from 20 million unique monthly users, show that it is well on the way. Offices in Singapore and now Hong Kong emphasize where the growth is, and 7 million apps testify to the mobile nature of the challenge. But is Google waiting to pounce on all of this?</p>
<p>So what else did I learn? That YAiA stands for &#8220;Yet another iPad App&#8221;. That 50% of Turkish shopping for consumer goods is now online. That Google only has 20% of the Russian search market, and Facebook is only the fourth most popular online service. That FAB has 3 million members (50% social network, 40% mobile) and sold 111,111 products last month on the way to revenues of $110m this year. So some of the players in the hall were definably big already. But you could not say that of Nick D&#8217;Aloisio, aged 16, funded to the tune of £350k, and launching his service (<a href="http://www.summly.com">www.summly.com</a>) to provide artificial intelligence support to people doing research online who needed to summarize what they had read. When he said that he was going to take two years off to do his A level school exams, there was a palpable sigh of relief from the 20 year old entrepreneurs in the audience.</p>
<p>It didn&#8217;t matter to me, proudly sporting the only grey beard in the room. But I have to admit that I felt relief amongst my peers in the IXXUS event, held in sunshine on the Kensington roof garden, which is improbably furnished with ducks and flamingos (live). An audience of technocrats from all of the leading information services players were looking at the issues surrounding what seems to me the key question of the hour &#8211; how do we effectively re-platform in ways that add to our asset value, increase our ability to act fast to change our service dimensions in times of torrential market change and still stay within a broad avenue of standards now established and extending from XML right through to RDF and SPARQL. We can now discuss these things in London &#8211; they are of the present and I was delighted to hear John Powell of Alfresco (a real ornament to the Open Source model) and the IXXUS team under Steve Odart providing practical advice and guidance to real and urgent questions from the audience. Three years ago I would not have been allowed vocabulary like &#8220;ontologies&#8221; or &#8220;triples&#8221; in a publishing context: today this is coinage of the conversation and I rejoice in it.</p>
<p>And one last observation. Go to a conference of 1000 web developers and investors and what happens? From breakfast to dinner I never arrived at a boxed food table in time to find a box left to consume. Good for your figure, you may observe. Yes, but I made up for it the next day. They may have their drawbacks but publishers do know how to eat, and IXXUS responded to their proclivities very well indeed.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.davidworlock.com/2012/03/crowds-voices-and-no-ties/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>

