“Keep it Simple, Stupid” was an acronym I brought home from the first management course I ever attended, yet it has taken me years to find out what it really means. There are, clearly, few things more complex than simplicity, and one man’s “Simple” is another man’s Higgs Boson. So I was very energised to have a call last week from an information industry original who has been offering taxonomy and classification services to the information marketplace since 1983. When I first met Ross Leher in the late 1980s we were both wondering how far we would have to go into the 1990s until information providers recognized that they needed high-quality metadata to make their content discoverable in a networked world. Ross had sold his camera shop to take the long bet on this, but he worked at his new cause with a near-religious conviction, as I realised when I went to see him in the 1990s at his base in Denver, Colorado. Denver at that time was home to IHS, whose key product involved researching regulatory material from a morass of US government grey literature. Denver people did metadata. It was a revolution waiting to happen.

So when I heard his voice on the phone last week my first emotion was relief – that he had not simply given up and retired to Florida – and then agreement. Yes, we were 15 years too early. And many of the people we thought were primary customers – the Yellow Page companies, the phone books and the industrial directories – are now either dead or dying, or in the trauma of complete technological makeover. Ross’s company, WAND Inc (www.wandinc.com), is now very widely acknowledged as a market-leading player in horizontal and multi-lingual taxonomy and classification development. They are the player you go to if you have to classify content, if you are in a cross-over area between disciplines (he has a great case study around taxonomies for medical image libraries), and if you have real language problems (“make this search work just as effectively in Japanese and Spanish”). What they do is really simple.

Your taxonomy requirement is going to start with broad terms that define your content and its area of activity. These can then be narrowed and specified to give additional granularity in any specific field. These classifications can be incorporated into the WAND Preferred Term Code, given a number, and used in a programmatic, automated way to classify and mark up your content (www.datafacet.com). Preferred terms can be matched to synonyms, and the codes can be used to extend the process to very many different languages. So someone whose company, for example, was created in Spanish can be found in the same list as someone who has a Japanese outfit, as the result of a search made by a Chinese user working in Chinese.
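
For readers who think in code, the idea can be sketched in a few lines of Python. This is only a minimal illustration of numbered preferred terms with per-language synonyms, not WAND's actual system or data: the term codes, the class and function names, the company listings and the naive substring matching are all my own assumptions.

```python
import unicodedata
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


def normalize(text: str) -> str:
    """Fold case and Unicode form so that matching is consistent."""
    return unicodedata.normalize("NFKC", text).casefold()


@dataclass
class PreferredTerm:
    code: int                      # hypothetical numeric term code
    label: str                     # preferred term
    parent: Optional[int] = None   # broader term, if any
    synonyms: Dict[str, Set[str]] = field(default_factory=dict)  # language -> synonyms


class Taxonomy:
    def __init__(self) -> None:
        self.terms: Dict[int, PreferredTerm] = {}

    def add_term(self, code: int, label: str, parent: Optional[int] = None) -> None:
        self.terms[code] = PreferredTerm(code, label, parent)

    def add_synonym(self, code: int, lang: str, text: str) -> None:
        self.terms[code].synonyms.setdefault(lang, set()).add(normalize(text))

    def classify(self, lang: str, text: str) -> Set[int]:
        """Return the codes whose synonyms (in this language) occur in the text.

        Naive substring matching: enough for a sketch, too crude for production.
        """
        haystack = normalize(text)
        return {
            term.code
            for term in self.terms.values()
            for synonym in term.synonyms.get(lang, ())
            if synonym in haystack
        }


def search(taxonomy: Taxonomy, listings: List[Tuple[str, str, str]],
           query_lang: str, query: str) -> List[str]:
    """Match query and listings on term codes rather than on words."""
    wanted = taxonomy.classify(query_lang, query)
    return [name for name, lang, description in listings
            if taxonomy.classify(lang, description) & wanted]


taxonomy = Taxonomy()
taxonomy.add_term(1000, "Photographic equipment")             # broad term
taxonomy.add_term(1001, "Camera retailers", parent=1000)      # narrower term
taxonomy.add_synonym(1001, "en", "camera shop")
taxonomy.add_synonym(1001, "es", "tienda de cámaras")
taxonomy.add_synonym(1001, "ja", "カメラ店")
taxonomy.add_synonym(1001, "zh", "相机店")

# Invented listings: (company name, language, description).
listings = [
    ("Fotografía Ruiz", "es", "Tienda de cámaras en Madrid"),
    ("Sakura Camera", "ja", "東京のカメラ店"),
]

results = search(taxonomy, listings, query_lang="zh", query="相机店")
print(results)
```

Because both the query and the listings are reduced to the same code before they are compared, the Chinese query finds the Spanish and the Japanese company without any word-level translation taking place.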

And from synonyms we can extend the process to extended terms themselves, and then map the WAND system to third-party classification schemes – think of UNSPSC, Harmonized Codes or NAICS, as well as those superficial and now dwindling Yellow Page classifications. WAND can isolate and list attributes for a term, and can then add brand information. All of these activities add value to commoditized data, and one would think that the newspaper industry at least would have been deep into this for 15 years. Yet few examples exist which demonstrate this – Factiva is an honourable exception.
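
As a rough sketch of what that layering might look like in practice, the Python below attaches a crosswalk to external schemes, a list of attributes and a list of brands to a single term, and then uses them to enrich a plain content record. Everything in it is assumed for illustration: the record layout, the function names, the attributes and brands, and above all the external codes, which are deliberate placeholders rather than real UNSPSC, NAICS or Harmonized System values.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TermRecord:
    code: int                                                  # hypothetical term code
    label: str
    crosswalk: Dict[str, str] = field(default_factory=dict)   # scheme -> external code
    attributes: List[str] = field(default_factory=list)
    brands: List[str] = field(default_factory=list)


TERMS: Dict[int, TermRecord] = {
    1001: TermRecord(
        code=1001,
        label="Digital cameras",
        # Placeholder strings, NOT real codes from these schemes.
        crosswalk={
            "UNSPSC": "UNSPSC-PLACEHOLDER",
            "NAICS": "NAICS-PLACEHOLDER",
            "HS": "HS-PLACEHOLDER",
        },
        attributes=["sensor size", "lens mount"],    # illustrative attributes
        brands=["Example Brand A", "Example Brand B"],  # invented brands
    ),
}


def external_code(term_code: int, scheme: str) -> Optional[str]:
    """Translate an internal term code into a third-party scheme, if mapped."""
    term = TERMS.get(term_code)
    return term.crosswalk.get(scheme) if term else None


def find_by_external(scheme: str, code: str) -> List[int]:
    """Reverse lookup: which internal terms carry this external code?"""
    return [t.code for t in TERMS.values() if t.crosswalk.get(scheme) == code]


def enrich(item: dict, term_code: int) -> dict:
    """Return a copy of a content record with taxonomy metadata attached."""
    term = TERMS[term_code]
    return {
        **item,
        "term_code": term.code,
        "term_label": term.label,
        "classification": dict(term.crosswalk),
        "attributes": list(term.attributes),
        "known_brands": list(term.brands),
    }


enriched = enrich({"headline": "New compact camera announced"}, 1001)
print(enriched)
```

The point of the design is that the content record itself never changes: the value is added entirely by the metadata wrapped around it, and one internal code opens the door to every external scheme it has been mapped to.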

Not the least interesting part of Ross’s account of the past few years was the interest now shown by major enterprise software and systems players in this field of activity. Reports from a variety of sources (IDC, Gartner) have highlighted the time being wasted in internal corporate search. Both Oracle and Microsoft have metadata initiatives relevant to this, and it still seems to me more likely that Big Software will see the point before the content industry itself. With major players like Thomson Reuters (Open Calais) deeply concerned about mark-up, there are signs that an awareness of the role of taxonomy is almost in place, but as the major enterprise systems players bump and grunt competitively with the major, but much smaller, information services and solutions players, I think this is going to be one of the competitive areas.

And there is a danger here. As we talk more and more about Big Data and analytics, we tend to forget that we cannot discard all sense of the component added value of our own information. We know that our content is becoming commoditized, but that is not improved by ignoring now conventional ways of adding value to it. We also know that the lower and more generalized species of metadata are becoming commoditized; look for instance at the recent Thomson Reuters agreement with the European Commission to widen the ability of its competitors to utilize its RICs equity listings codes. This type of thing means that, as with content, we shall be forced to increase the value we add through metadata in order to maintain our hold on the metadata – and content – which we own.

And, one day, the only thing worth owning – because it is the only thing people search and it produces most of the answers that people want – will be the metadata itself. When that sort of sophisticated metadata becomes plugged into commercial workflow and most discovery is machine to machine and not person to machine we shall have entered a new information age. Just let us not forget what people like Ross Leher did to get us there.


Here we sit, in a poor benighted island, slowly sinking into economic anonymity, in a great world where economic growth seems to be a property of lands we once called “under-developed”. A worthy come-uppance, and a suitable subject for Davos this week. Yet, as a persistent optimist, I somehow glimpse a glowing future for my children’s children. Information services and solutions lie close to the heart of developmental growth, and I have written here repeatedly (too often for some readers!) about the necessary connection between injecting data/content into workflow and the regeneration of a post-industrial economy. For some reason the information industry has its eyes fixed on pure information usage (sometimes called “research”). In some areas, though – credit rating, risk management, automated financial trading systems, scientific research – we have come out of the bunker and begun to look at the way applied intelligence, often now derived from Big Data and analytics, can change the way that we view the operational logic of whole sectors of commercial and industrial life.

Now, let’s pull back a step further and see how information services change networked industry and society at large. I only have space for two examples. The first was driven home to me on Monday at a dinner given by the Real Time Club. The speaker, Dr Siavash Mahdavi (http://en.wikipedia.org/wiki/Siavash_Haroun_Mahdavi), spoke on 3D printing, and by the time he had finished, and we had examined printed hip joints and shoe inserts amongst other examples, the penny was beginning to drop for me. We are moving in the network from manufacturing by extrusion processes through moulds, the industrial revolution pre-digital world, to additive manufacturing, creating products in software and instructing printing devices to build them in extremely thin 2D layers one on top of the other until the desired shapes and structures are created. Medical implants have had the publicity here, but gold jewellery was mentioned as an application. This is a design-intensive, network-efficient manufacturing world in which design and the actual printer can be in totally different places. Printing can take place using any materials which can be chemically adapted to the process. Customization (the running shoe insert designed for the imprint and weight distribution of your own foot) and personalisation are at the centre of this. Every product can be made for you. However, it remains a requirement that everything we know about the performance, qualities and expectations of an artificial hip is brought to bear in the network upon the design process, as the information services world creates the bullets for manufacturing workflow to fire. And all this is going strong now: the lead engineering player in 3D printing in the UK is Renishaw (http://www.renishaw.com/en/additive-manufacturing-news–15505) (and with eerie coincidence it announced today a strong trading year, with sales up 11%).

If this is not bizarre enough, I stumbled upon a Google story this week about automated motoring. Apparently Google’s own patented technology had racked up 200,000 autonomous driverless miles by the end of last year. This may just be another Google enthusiasm which runs out of steam, but it does have a history (http://en.wikipedia.org/wiki/Autonomous_car), and a great deal of real research, and my bet is that it will happen in this over-crowded isle a lot quicker than the UK estimate of 2056. Extending the network to our over-populated motorways may be the only way to squeeze more capacity from infrastructure we do not have the space to rebuild, and to control scarce parking resource. Driving my car to the motorway and then surrendering control to a system that governs inter-car distance and speed until I leave is a likely first stage. And as the car becomes part of the network, then its ability to intelligently appraise where it is, where it is going and how it is feeling becomes a natural extension of a world of autos which are already computers on wheels. Information service solutions will be vital to feed this activity: important players like ITOWorld (www.itoworld.com) already assemble critical geospatial data, matched at the vital micro level by services like Elgin (http://www.elgin.gov.uk/) who can tell you about every road repair in Britain. At the moment this is part of the world of local government and planning: tomorrow it will have to be part of the knowledge base of your motor car.

When I think about examples like these I become more and more convinced that the new world of information service knowledge and intelligence will be more important than the old one, patrolled by intermediaries like librarians, and governed by quite irrelevant business models like advertising. And here is a world where the use shapes the content, and where suppliers are involved in developing solutions for sectors or even individual companies. Here the information services and solutions players have forgotten whether they are “content” or “software” players, because it has no bearing on the end result, and they had to have both elements to play in any case.

So who will do this stuff well? Undoubtedly the Indians and the Chinese and the Brazilians amongst others. But in many ways this future vision levels out a lot of the inequalities of the old and new worlds. You do not need a great deal of cheap labour to compete here. Capital too will have a different importance if you can custom manufacture close to the point of use, and avoid shipping and warehousing. I quite fancy the chances of this old island: good with design, strong start-up culture, great software development skills, good financial services investment culture, strong presence in information and education markets globally. Or at least I would, if our politicians did not think that modernity was returning to the railway investment mode of Britain in the 1840s, or aping the French and the Japanese high speed trains of the 1960s and 1970s. The infrastructure requirement here would be to create the most intensive high bandwidth broadband coverage in Europe. Fat chance of that while politicians think there are more votes to be got by shaving 34 minutes off the journey time from London to Birmingham!

Some of my friends call this type of article “futuristic madness” (and that was the polite one!). But, to me, the real madness lies in taking the formats of the Gutenberg age (books, newspapers etc), carefully wrapping them in software and delivering them in facsimile form across the network – and then calling these eFormats Innovation!

