How does Kaggle make money

What can you ask in a world full of big data?



Companies and organizations can launch a kind of tender on Kaggle's website: they describe their problem and announce what data they have collected. Whoever solves the problem best wins a prize.

Recently, for example, the health company Heritage wanted to use Kaggle to find out which patients in its database will have to go to hospital in the coming year (3 million dollars in prize money). The music company EMI Music wanted to find out which piece of music would be the next big hit ($10,000). An awareness-raising campaign about dangers on the Internet even asked which Twitter users are likely to be psychopaths ($1,000). Sometimes it sounds playful, sometimes academic, but Goldbloom says: "You can work out how valuable it can be for a bank or an insurance company to predict whether you will wreck your car in the coming year."

With Kaggle you can also see how much the analysis of large data sets has so far been a mixture of science and tinkering. Around 45,000 data detectives have registered with Kaggle to crack problems there - "and our experience is that physicists and electrical engineers do best, as do those with a certain amount of common sense. Oh, and there's a glacier researcher who regularly achieves good results."

Why? Goldbloom shrugs his shoulders: It's not all just advanced mathematics and pure science, you also need intuition and practical understanding. "One of my favorite examples was a competition for a very large used car dealer in the United States," says Goldbloom. "These people brought us ten years of historical data and wanted to know: Which used cars will prove to be particularly durable in the long term? And it turned out that it was not the number of kilometers driven or the size of the engine that made the difference. Instead, cars in unusual colors proved to be the most durable." One can only speculate about the reason. But statistically, the result holds up and is extremely valuable to the dealer.

So the data tinkerers at Kaggle, scientific organizations and a growing number of high-tech companies are trying to find out, one problem at a time, what can be asked in a world full of big data.

Big data for the police

At the Fraunhofer Institute for Intelligent Analysis and Information Systems (IAIS) in Sankt Augustin, experts are currently working on an ambitious project that is meant to support the police and emergency services one day: The plan is to evaluate all available cell phone data and Internet information such as Twitter messages - and then automatically inform the emergency services whether their help is needed somewhere in the country, or whether a major event is getting out of hand. For this purpose, gatherings of people must be examined to see whether they deviate from past gatherings - in other words, you need historical databases. And Twitter messages, for example, have to be searched to see whether they express joy - or panic. The big data systems of the future, Fraunhofer is convinced, will be able to interpret human language well.

This immediately raises the next unsolved problem: What about data protection? The processing of personal data can affect several basic rights at the same time, says Jan Philipp Albrecht, a Member of the European Parliament. Many researchers have also recognized the problem. Stefan Wrobel from the Fraunhofer Institute says: "It is by no means sufficient to just remove surname, first name, age and address from a data record in order to anonymize it. Even if you separate the name from the movement data, you just have to look where the signal is at night, then you know where the owner of the cell phone lives, so you can easily identify most people."
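Wrobel's point about night-time signals can be illustrated with a few lines of Python. This is only a minimal sketch under stated assumptions, not Fraunhofer's method: the function name `infer_home_cell`, the cell labels, the sample timestamps and the definition of "night" as 22:00 to 06:00 are all invented for illustration.

```python
from collections import Counter
from datetime import datetime


def infer_home_cell(pings, night_start=22, night_end=6):
    """Return the location cell seen most often at night, or None.

    `pings` is a list of (timestamp, cell_id) pairs with the owner's
    name already removed. The night window is an assumption.
    """
    night_cells = Counter(
        cell
        for ts, cell in pings
        if ts.hour >= night_start or ts.hour < night_end
    )
    if not night_cells:
        return None
    # The most frequent night-time cell is a good guess for "home".
    return night_cells.most_common(1)[0][0]


# Made-up, name-free movement record of a single phone.
pings = [
    (datetime(2013, 1, 7, 23, 15), "cell_A"),
    (datetime(2013, 1, 8, 2, 40), "cell_A"),
    (datetime(2013, 1, 8, 9, 5), "cell_B"),
    (datetime(2013, 1, 8, 14, 30), "cell_B"),
    (datetime(2013, 1, 8, 15, 30), "cell_B"),
    (datetime(2013, 1, 9, 1, 10), "cell_A"),
    (datetime(2013, 1, 9, 3, 0), "cell_C"),
]

home = infer_home_cell(pings)
print(home)
```

Although the phone is seen most often overall in the daytime cell, the night-time filter singles out a different cell as the likely home, and a home address is usually enough to narrow a record down to a household.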

Fraunhofer has found a technical solution to this problem. "Roughly speaking, the data sets were broken down into parts and these parts were then shuffled." But this one solution alone took several years to develop. In other words: Big data can be lengthy and expensive, especially in Europe, where data protection is taken very seriously.

And there is another fundamental problem that makes the profitable use of big data solutions difficult: It has to do with the decision-making structures in corporations.

New way of making decisions

The future of modern corporate management can be viewed at the headquarters of the big data experts SAS Institute in Heidelberg. In the conference room, a projector casts the business figures of an internationally operating toy company onto the wall - sales of teddy bears, toy cars and the like. Broken down by region, by profit, by delivery time; differentiated by season, by locations with a lot of competition and with little competition: the possibilities seem endless.

All of this can be sorted with a few clicks of the mouse, and you can even paddle around at random in the group's databases: the computer then automatically creates a few clear graphics again and again, from which you can learn a little more about the business, challenges and opportunities of the toy trade.

It's a demo. But corporate management today can run in much the same way - at least that is what big data companies promise. The executive's office is transformed into a kind of command bridge, on which the computer provides an overview of the corporate ship and, if necessary, rummages through all the data from the company and its environment. You can try out scenarios in a playful way: What if you offered the teddy bears at a lower price in China? What if the suppliers in Eastern Europe went on strike? "Big data brings the possibility of a whole new way of decision-making," three researchers from the McKinsey Global Institute wrote recently. "With controlled experiments, companies can test hypotheses and let the results guide their business and investment decisions."

So we are talking about a further push toward scientific methods in corporate headquarters. The problem is that a truly consistent orientation toward the data clearly contradicts the established management structures.

So far, data has been a decision-making aid, but the decision itself was made by the boss, say Andrew McAfee and Erik Brynjolfsson, two MIT experts who research business in the digital age. However, they suggest that this should change in the age of big data: Bosses should retrain their people immediately and lead by example themselves, the experts demand. "When you make an important decision, make it a habit to ask: What does the data say?"