Big Data – Charm or Seduction [invited article by Mike Cushman].
“The Allure of Big Data”, the 14th Social Study of ICT workshop, held at LSE on 25 April 2014, pointed to answers to some questions but left others unaddressed. Two in particular were left hanging: ‘How new is Big Data?’ and ‘What is Big Data?’
How new is Big Data?
Like many themes in the fields of Management and Information Systems, it is both new and not new, and both the ‘gee-whiz’ and the ‘we’ve seen it all before’ reflexes are incomplete.
In important respects Big Data is a re-packaging and re-selling of Data Warehousing, Data Mining, Knowledge Management, e-science and many other items from the consultants’ catalogues of past decades. Each of these, especially KM, is itself a re-badging of previous efforts. But to say only that is to miss that the growth of processing power and cheap, and ever-cheaper, storage is changing the uses to which accumulations of data can be put. In addition, previous iterations did not have available to them the current quantities of social media content and GPS attributes generated by the growth of mobile computing. The development of innovative algorithms to analyse the growing quantity and variety of data affords new possibilities, even if many of them, but far from all, just look like expanded versions of the old routines.
What is Big Data?
Much of the discussion at the workshop was compromised by the lumping of too many distinct phenomena under one heading. Big Data is not one thing; what follows is a preliminary attempt at a typology of Big Data.
- Big Data is the business. Companies like Google and Facebook are, in essence, defined by their ability to analyse the data provided by their users in return for free provision of services. Discussions about such companies should lead to discussion about the role of advertising in the economy and society. While newspapers and magazines have always been dependent upon advertising revenue, this revenue is far more central to Google and its peers.
- Big Data for marketing. The collection of customer data through loyalty cards allows retailers to design promotions at a national, store and individual customer level and makes CRM systems far more powerful.
- Big Data for cost control. The collection of data on every aspect of the business allows the elimination of unnecessary cost, making cost accounting far more effective and supporting lean manufacturing approaches.
- Big Data for workforce management. Employers now have access to far more data about employees’ histories and performance. This has led to the spread of both performance-related pay and more intrusive disciplinary codes.
- Big Data for performance ranking and comparison. It has become accepted that heterogeneous organisations can be listed in meaningful league tables with standardised measures as easily as football teams can. The result of a football match is unambiguous, subject to moderately competent refereeing. The performance of a school, university or hospital is less easily agreed. LSE moves alarmingly up and down national and international rankings according to the measures, and their weightings, selected by a particular newspaper. Big Data is the cement that holds together the conceit that these league tables are a sensible activity and that they are sufficiently meaningful to obliterate the harm they do. Because Big Data and the tables are assumed to be necessary, the data must be constructed and collected regardless of cost and disruption, so the Research Excellence Framework is allowed to dominate university life and only education measurable in GCSEs is understood to be a valuable product of school efforts.
- Big Data for product development. The collection of data about products in use in industries like motor manufacturing can feed back into product design to eliminate design faults and weaknesses and better meet customer demands.
- Big Data for science. The growth in computing capacity is necessary not only for data-rich experiments like those at CERN but also for the collection of far greater quantities of observational data, both in hard sciences like meteorology and in social science, leading to the production of new scientific knowledge.
- Big Data for policy development. Policy in areas like housing, transport, education and health has always depended on large data sets like the national census and the general household survey (the degree of faithfulness of any particular policy to the data that is claimed to support it has always been, and always will be, a matter for political argument). Whether the development of bigger data will improve policy development or only intensify the politicisation of data use is a matter for conjecture.
- Big Data for surveillance. There has long been a recognition that states collect data on their citizens. Each state announces loudly the data collection practices of its opponents while, generally, concealing its own. ‘Totalitarian’ states have been more willing to publicise their surveillance in order to intimidate their populations; ‘liberal democracies’ try to minimise knowledge about their own practices, claiming it is only ‘them’ about whom dossiers are compiled – criminals, terrorists, subversives, and paedophiles. The admitted categories have always been elastic according to political priorities, so may also be widened to include groups such as trade unionists; benefit claimants; or immigrants, refugees and aliens. While groups are added, there is great institutional resistance to slimming down the list. Edward Snowden revealed that even ‘liberal democracies’ regard every citizen as potentially hostile and a surveillance subject ‘just in case’.
There are continuing ethical and privacy concerns about Big Data. These are made more complex and irresolvable because Big Data is too often discussed as one thing. Regarding it as many distinct phenomena, with each domain having its own ethical and privacy requirements, will allow more clarity.
Mike Cushman
29 April 2014
Mike Cushman is a retired colleague from the LSE who also specialises in Information Systems and their social and organisational implications.