This is the trailer for a movie called "Moneyball", which is based on the true story of how a low-budget baseball team was assembled from players picked through statistical analysis of a large pool of players. It is an example of a "big data" application. Big data is the systematic analysis of enormous amounts of information and the extraction of knowledge (or wisdom) from that analysis.
More data are being recorded than ever. The internet in all its forms, mobile phones and sensors are continuously recording all kinds of data in a standardized digital format: the heart rate of runners, traffic in cities, web searches, weather, news, emotions (through social media), the location of smartphones, and many other details that there is no point in listing here. Recording by itself does not produce any knowledge, and this amount of information is not manageable by humans alone. This is the playground of computerized processors! Humans invent algorithms that analyze the available data with the help of computers, within the limits of our current imagination, of course.
The data are stored, though, and with the new "cloud" technologies they remain available forever. They will be there for generations to come. Computers are starting to learn, simply because they have unlimited time to look into and combine all this information, and they are far faster than humans at analyzing huge amounts of raw data. When a new algorithm becomes available, they can look back at all the stored data with this new "eye" and enrich their knowledge. So the possibilities are only expanding.
This is a trend that cannot be reversed, simply because our lives are improving through this process. Some examples:
More convenience in everyday life (for example, location services and personalized information on demand and in real time)
Longer-range weather forecasts
Superhumanly fast stock market reactions and complex financial services
Better pricing and distribution of products at retail (supermarkets, for example, can plan their stock according to the weather forecast or customer behavior analysis!)
Urban crime control through analysis of events in correlation with various parameters such as historical arrest patterns, paydays, sporting events, rainfall and holidays!
And for all of these you do not need to conduct a survey, as you did in the past. Most likely all the data anyone will ever need are already there. You just have to think of the best way to "dig" into them and extract the knowledge you need.
Of course, as written in a very enlightening New York Times article: "Data is tamed and understood using computer and mathematical models. These models, like metaphors in literature, are explanatory simplifications. They are useful for understanding, but they have their limits. A model might spot a correlation and draw a statistical inference that is unfair or discriminatory, based on online searches, affecting the products, bank loans and health insurance a person is offered, privacy advocates warn." But even this kind of problem can be solved much faster than in the past. Remember... the data are there forever, and there are only more and more of them!
The main issue that must be addressed and resolved is access to the available data. I believe that recording is not bad for anyone, as long as access to the records is open to everyone! This is the only way that true wisdom can be extracted from all this information. This is the "Open Data" movement, which affects all of us! The Guardian has been encouraging this movement for some time now. This is a societal issue, and in my opinion it should rank very high in our priorities for the future. I would even call it a political issue!
Of course, there’s a major shortage of analytical talent. According to the McKinsey Global Institute (MGI), this is already an issue in the private sector. MGI predicts that in the US alone there will be an annual shortage of graduates in deep analytical fields of 140,000 to 190,000 by 2018. A very good article on this subject is "The Promise and Challenge of Using Big Data to Address World Problems" by Linda Rosencrance of the Spotfire Blogging Team.
In the end, big data is all about wisdom and evolution! And obviously this cannot be left in the hands of a few, or to the processing power of computers alone!
From the day we are born, certain role models are imposed on us and we try to be like them. Society accepts those who have a car, own a house in a city, have a phone, a TV, a mobile phone. Society admires people educated to be lawyers, doctors, scientists, engineers, teachers. Trendy clothes and brands have to be worn.
A contrast: What happens when you own a bike? (BBC News - Cycling industry gives economy £3bn boost http://bbc.in/qiwMpM) When you commute on your own two feet, by bike, or by public transportation for longer distances? How would things be if we did not own the houses we live in in the cities, but rented them from the state according to our income (maybe as part of the taxes we pay)? What if phones and mobile phones were developed to common standards and specifications, and the focus was on the service that operators provide through them? What if there was only one TV set per house? How about admiring farmers, dancers, painters, musicians, carpenters, palaeontologists and plumbers as much as engineers, mathematicians, doctors and lawyers? What if clothes were picked for their functionality and not for the little label on the chest?
I bet the world would be a much better place! Let's try to imagine:
Houses built to certain standards by the state, maintained by the state, and demolished and rebuilt every 50-60 years by the state! The costs would be covered through the taxes we pay. Families would not have to invest part of their budget (usually 30-40% of the family income) in the housing project. Standards could be observed more closely. Speculation, price inflation and housing market bubbles would be unknown terms! After all, apart from the brief construction period, building a house is not a very productive way to invest one's resources! This model would apply in all cities. Parks and public spaces could be organized better. Long-term planning of the community would be much easier. Architecture would have unlimited space to develop new technologies. Energy consumption could be more reasonable, helping the environment. Resources and wealth beyond our imagination, which are now "frozen" in the value of bricks, walls and private swimming pools, would be made available for investment in the productive economy!
The commuting vision: People would be able to move around the cities on safe roads by bike, on foot or in other human-powered vehicles. For longer distances, or when circumstances are not favorable, they would be able to use public transportation such as trains, trams and electric buses. Accidents, which cost state budgets endless millions and cause unbelievable personal misery, would be reduced! Public health would improve, with fewer obese citizens and better air quality. Unreasonable social discrimination based on the kind of vehicle that someone owns would be unknown. Cities would have more public space. The state would also save the big chunk of its budget that goes to maintaining roads and building new ones to accommodate the increasing number of cars and their parking needs. Automotive companies would focus on engine technologies and not on pointless chassis design projects that offer nothing substantial to our lives. And finally, personal finances "blown in the wind" on a new car, which loses its value as soon as you open the door for the first time, would be put to better use. Public transportation would require a bigger budget from the state, but it would be far lower than what is needed for new avenues and their maintenance, or for addressing the public health issues created by the low quality of life in the cities or by car accidents.
Singers, dancers, painters and gardeners would be just as important as all the other professions, because society would have more free time and more resources to invest in public welfare and culture. We would prevent overcrowding in certain professions and allow personalities to develop better through the education system. Universities and schools could be real cultural centers and not mass production facilities for brains that think in the same way and lose most of their capability to question and improvise! Social discrimination based on one's job title would be reduced, and social value would be attributed mostly to the quality of one's skills. Excellence at a personal level would be encouraged and rewarded by society according to the skills and not the perceived value of the profession.
Many more examples could be presented here to describe this ideal world. The few cases I described above, utopian at first sight, highlight some simple imbalances in our lifestyle. Automotive companies compete in the fields of chassis design and financial solutions (basically loans) for their cars, instead of engineering breakthroughs for the engines that move these cars. People take out loans that will outlast their productive lifetime to buy bigger and more energy-consuming houses than they need. They suppress their creativity and their natural predisposition towards developing skills in order to be socially accepted. They end up serving in non-productive positions in multinational companies, without any clear view of the outcome of their day-to-day work, just because these positions are socially accepted (probably because of the 6-digit salaries or the collection of credit cards). Because of the intense social focus on certain sectors of the economy, value inflation is making this endless pursuit of the "dream" even more unobtainable. Consumption without any real need behind it is the hollow foundation of the development of modern societies.
People need to open their eyes and realize the real value of the skills, items and services that they spend their time and resources trying to acquire. We are wasting our lives and resources in pursuit of goals that we have not chosen! And we are dragging the whole planet down with us...
(By the way: the housing model described above has been successfully implemented in Singapore for the past 30 years, and the automotive model is the one under which that industry started more than a century ago, before it was lost along the way.)
The value of any medium is directly proportional to the culture of those who use it. Following this logic, I present the path followed by social network services and the internet in general. Understanding the direction in which new media are evolving will help the community evaluate and use them realistically. It will also help overcome the prejudice and cultural barriers that exist. If there is a shared vision of what we want to achieve, then the effectiveness of the media will be optimal.
Flood of Information
Web sites - Social Media - Augmented reality - Location based services - Blogs - Facebook - Twitter - Internet of things. Terms and concepts which lead to a certain outcome: plenty of information about people, places, objects and personal views on any matter the human mind can conceive. A vast universe of information, occasionally useful, that can result in knowledge or, even better, wisdom, as long as we become aware of high-quality data at the right time. But this immense stream of data is flowing around us and goes untapped, though thankfully it is recorded and registered. To turn this around and exploit the full potential of the available data, they must be recorded, formatted and categorized so that they become understandable and accessible (in terms of volume and quality), and then they should be made easily available to anyone interested! In brief, I would say that recorded data should be given a meaning (or more than one!).
Current technology has given us the means of recording (PCs, networks, open space, GPS, cameras, mobile phones, RFID), and the architecture of the cloud provides ample storage space and computer processing power. Over the past few years the recording of information about both the natural world and the spiritual one has begun at a demonic pace that exceeds our imagination. Tools have also been developed that can relay this information to all interested parties at any given time. This trend actually started back in the late '90s, when platforms like "Autonomy" made their appearance. The strength of this trend is obvious from the number of doctoral theses that have been "invested" in this direction of the information technology market around the world.
Nowadays, the growth rate of available information is beyond the monitoring and retrieval capability not only of the average person, but even of specialists or teams of specialists! This rate will certainly continue to accelerate, and this inflated wealth of information will only grow, as there is now a steady flow of data even from objects and devices (the internet of things). Under these circumstances, the stored information does not, for the moment, create knowledge as fast as one would imagine.
Knowledge from the information
Obviously the facts mentioned above highlight the importance of procedures and techniques for the visualization and dissemination of the knowledge (not simply the information) that can be derived from all available data. The average individual in contemporary society should be able to take advantage, on an everyday basis, of the knowledge stemming from the data. The "nervous system" that transfers the information recorded by the "sensors" is already available. By "nervous system" I mean the communication networks, and by "sensors" the persons and devices that transmit information. The "recording memory" also exists; I am referring to the central storage of information, as well as to individual, personal capabilities.
Technology should give us the capability to retrieve, visualize and, to some extent, automate the transmission of knowledge. We should be able to "see" solutions in front of us the moment we need them: solutions that have been used in the past, or solutions that may arise from a combination of relevant stored information.
This requires an analysis of the meaning of the stored information so that the necessary links are established. The meaning can be derived and assigned from measurable physical characteristics or from data analysis.
The simple first step is to georeference information, so that interested parties can access information by geographical scope. This is already starting to happen.
A second step is semantic categorization, which is achieved through keywords or special tags and markings (see hashtags on Twitter, for example). These two steps have been underway in recent years.
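To make these first two steps concrete, here is a minimal Python sketch. It is only an illustration under my own assumptions, not the way any particular service works: the `Record` type, the sample places and the 5 km radius are all hypothetical. It pulls hashtags out of a text and keeps only the records that lie within a given distance of a point.

```python
import math
import re
from dataclasses import dataclass


@dataclass
class Record:
    """A hypothetical piece of recorded information with a position."""
    text: str
    lat: float
    lon: float


HASHTAG = re.compile(r"#(\w+)")


def extract_tags(text):
    """Semantic step: collect the hashtags in a text, lowercased."""
    return {tag.lower() for tag in HASHTAG.findall(text)}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearby(records, lat, lon, radius_km):
    """Geographic step: keep only records within radius_km of a point."""
    return [r for r in records
            if haversine_km(r.lat, r.lon, lat, lon) <= radius_km]


records = [
    Record("Rain again #Weather #athens", 37.98, 23.73),
    Record("Sunny by the sea #weather", 40.64, 22.94),
]
close = nearby(records, 37.97, 23.72, radius_km=5)
print([sorted(extract_tags(r.text)) for r in close])
```

Real georeferencing would rely on a spatial index instead of checking every record, but the principle is the same: position and tags become attributes that can be queried.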
The most difficult step is classifying attributes that relate to identification based on shape, color, material and size, or even smell. This means that objects or persons are automatically recognized by their appearance! This will enable automatic connections (e.g., automatic grouping of all clothing, all vehicles or all edible mushrooms!).
An indirect evaluation of a piece of information can be derived from how other interested parties have used it in the past. Every time a piece of information is used for a purpose, the nature of this use should be recorded and added to its attributes.
Another level could be its "lifeline", the evolution of its use over time. I am referring to the activity or value timeline as it is recorded through search engines, social networking or sensor recording. The first hints of the use of such "life-cycle" values are the various trend-reporting services presented by engines like Google or Twitter, etc.
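These two indirect signals, the purposes a piece of information has served and its activity over time, could be kept in something as simple as the log sketched below. This is a hypothetical structure of my own, not a description of how Google or Twitter compute their trends: the `UsageLog` class, the item name and the purposes are invented for the example.

```python
from collections import Counter
from datetime import date


class UsageLog:
    """Hypothetical log of every use made of a piece of information."""

    def __init__(self):
        self.events = []  # list of (item_id, purpose, day) tuples

    def record(self, item_id, purpose, day):
        """Store one use of an item, with its purpose and date."""
        self.events.append((item_id, purpose, day))

    def purposes(self, item_id):
        """How often the item was used for each purpose."""
        return Counter(p for i, p, _ in self.events if i == item_id)

    def timeline(self, item_id):
        """Uses per (year, month), in chronological order: the 'lifeline'."""
        counts = Counter((d.year, d.month)
                         for i, _, d in self.events if i == item_id)
        return sorted(counts.items())


log = UsageLog()
log.record("bridge-photo", "navigation", date(2011, 5, 2))
log.record("bridge-photo", "tourism", date(2011, 5, 20))
log.record("bridge-photo", "navigation", date(2011, 6, 1))
print(log.purposes("bridge-photo"))
print(log.timeline("bridge-photo"))
```

The purposes become extra attributes of the item, while the monthly counts give a rough picture of whether interest in it is rising or fading.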
Then we come to the visualization of the relevant information. The presentation must be based on the associated "meanings". It should also be as straightforward as possible, requiring neither search procedures across sources (which can be practically infinite) nor evaluation of data (which should be automatic). This way location, quality and identity of information are combined and the user sees only relevant, useful entries, probably ranked according to some criteria.
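As a rough sketch of what "ranked according to some criteria" could mean, the Python below combines distance, matching tags and past use into one score and hides everything that scores zero. The `Entry` type, the sample entries and the scoring formula are arbitrary assumptions chosen for illustration; a real system would have to tune or learn such criteria.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    """Hypothetical entry that already carries its assigned 'meanings'."""
    title: str
    distance_km: float
    tags: frozenset
    past_uses: int


def score(entry, wanted_tags, max_km):
    """Illustrative relevance score; 0.0 means 'do not show'."""
    if entry.distance_km > max_km:
        return 0.0
    overlap = len(entry.tags & wanted_tags)
    if overlap == 0:
        return 0.0
    closeness = 1.0 - entry.distance_km / max_km
    return overlap * (1.0 + closeness) * (1 + entry.past_uses) ** 0.5


def relevant(entries, wanted_tags, max_km=10.0):
    """Return only the relevant entries, best first."""
    scored = [(score(e, wanted_tags, max_km), e) for e in entries]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [e for s, e in scored if s > 0]


entries = [
    Entry("Bike repair shop", 1.0, frozenset({"bike", "repair"}), 4),
    Entry("Bike rental", 8.0, frozenset({"bike"}), 0),
    Entry("Street food market", 2.0, frozenset({"food"}), 10),
    Entry("Bike museum", 50.0, frozenset({"bike"}), 100),
]
for entry in relevant(entries, {"bike", "repair"}):
    print(entry.title)
```

The user never sees the market or the distant museum, and does not have to search or evaluate anything: the filtering and ordering happen before the presentation.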
It is obvious that the amount of available information can make visualization complicated (especially if it includes historical factors). Hence the need to come up with new technological solutions and visualization methods that will make the flow of data comprehensible. Existing examples of such user interfaces include overlapping layers projected onto real-life scenery, graphic representations of streams on a timeline, etc. But we need to design new interfaces that make browsing handy for users.
In this evolution we should include new software and applications that link the meaning from many sources and create a "flow" of information and trends. New hardware is also needed (transparent organic displays, or maybe even body implants). A new generation of graphics is an obvious solution. The advantage of graphic representations is that they are compatible with the existing culture and educational system of most users.
The ethics of knowledge
If we focus on the last two points (mentioned above) of the procedures that assign meaning to stored information, we can instantly understand that they are subjective and dependent on user behavior. In a world with a culture of collective wisdom and knowledge, each person who uses the information and its derivatives has a responsibility towards the community for its proper, ethical use. By this I do not imply in any way the manipulation of free will and cultural variety, but rather a methodology of thinking and of linking it with information. Actions based on the available information must have rational thinking behind them, and contributions to the knowledge base must be made with a sense of social responsibility and not recklessly.
The danger of creating very conservative societies and manipulating the way many people think is real, though. This can happen through the formulation of knowledge that will be applied by systems, as well as through the extensive analysis of data and situations. As knowledge is derived by systems and methods, it incorporates a certain degree of prediction and evaluation. This is conservative by nature, since any change in the status quo would have to be acceptable to, and thus predicted by, the system itself! There are obvious conservative political repercussions. On this account, faith in the technologically augmented system model becomes a reason to defend the status quo. This problem constitutes a whole subject by itself and can be a very serious discussion on cybernetics. A relevant article is "The Limits of Big Data" on ReadWriteCloud, and there is also a BBC documentary: "All Watched Over by Machines of Loving Grace".
The "classes" of the knowledge society
This process of collective knowledge will lead to some new "classes" of people and systems.
Authors / creators = those who produce wisdom at the first level. I am referring to the "birth" of knowledge through a process of analysis, study, awareness and reporting.
Repeaters = those who forward, promote and to some degree enrich the knowledge which is produced by systems or "creators". People, brands or companies with influence and an effective distribution network.
Consumers = readers and users of knowledge as it is produced and broadcast.
Although the class of authors is particularly valuable, all the "classes" add value to information, even through simple usage. Even the statistical analysis of their activity, or its evaluation, will add more levels to the knowledge base. So even the simple "consumer" can and should be aware of his or her role and importance in this chain of "knowledge building".
Our culture has already moved on from an information society to a knowledge society. The power now lies with the "owner" of knowledge as it is derived from publicly available information. And of course this leads to the effective application of knowledge to everyday decisions and procedures, or even to more complicated ones.