What are big data and sparks

Everything you always wanted to know about big data

Dorothea Heymann-Reder, February 02, 2016
Hardly any other term is currently associated with so many hopes and concerns. Yet "big data" is really only the logical consequence of an exponential increase in data volume, networking, and storage and processing capacities.

Definition of big data

"Big data" refers to large amounts of data that have different formats, structures and sources and are constantly changing. The data is too extensive and complex to be processed using "conventional" means. Data mining tools and processes can then help to extract useful information from the mass of data and to gain insights. The result would be "smart data" - the next big hype.

Big data sources

  • The volume of data has been growing exponentially for a long time. In addition to data that is knowingly generated and stored, such as emails, Office files, photos and social media posts, there is a growing amount of automatically generated data. The internet is a bottomless barrel of data. On top of that, there are log files from telecommunication connections as well as protocol and version data from office and scientific applications.
  • Smartphones and tablets permanently transmit data: communication data, location data and much more. Apps often require access to a plethora of data sources before they can be used. The consumer pays for usage with their data instead of money.
  • The latter also applies to loyalty cards with which customers can collect "points" or discounts. Specialist companies meticulously keep records of the purchases and prepare the data for marketing purposes.
  • Portable devices, so-called wearables, are increasingly being used. Examples are fitness bracelets that are networked via smartphone apps or directly with their providers. Ostensibly created to give consumers a better training experience, they also work in the background to send data about body functions to advertisers and health insurance companies.
  • Industry 4.0 and the Internet of Things (IoT) are based on the fact that objects are networked with one another, communicate via the network and control their functions, logistics, maintenance, and even their own assembly without human intervention. A prominent current example is the "self-driving car", which collects and passes on terabytes of data every minute.
  • Tracker programs on the Internet as well as cookies on the PC or mobile device track the surfing behavior of the user and make it available to advertisers.
  • RFID transponders transmit location data and more. They can be found in everything from automobile immobilizers to time recording systems: warehouse management, animal identification, banknotes, garbage disposal. These little jacks-of-all-trades are in use almost everywhere.
  • Wherever sensors are in use, data is collected and - often - saved.

Use of big data

Two important fields in which Big Data is used are:
  • Advertising and marketing - Advertisers are of course electrified by the prospect that big data could one day bring them the "transparent customer". They could continually "make him offers that he cannot refuse". In theory, not much would be missing if all customer data from the various sources were linked and processed in real time. In practice, however, a consumer who has bought a suitcase is annoyed for three years that he is shown nothing but suitcases. He doesn't need another one.
  • Counterterrorism - The collection and monitoring of telecommunications and social media data, for example, can help thwart terrorist attacks in advance. However, data retention and the surveillance of people are controversial issues in Western democracies. Not every data collection measure that is possible is also allowed.
Other possible areas of application for big data are scientific research, the detection of irregularities in financial transactions (fraud detection) and medical applications, e.g. in epidemiology.
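To make the idea of detecting irregularities in financial transactions a little more concrete, here is a minimal sketch in plain Python. It is not taken from any specific fraud detection product; the function name `flag_irregular`, the threshold of 3.5 and the sample amounts are all assumptions chosen for illustration. It flags values that lie far from the median, measured in units of the median absolute deviation (a common robust outlier heuristic).

```python
from statistics import median


def flag_irregular(amounts, threshold=3.5):
    """Return the amounts whose robust z-score exceeds the threshold.

    Hypothetical helper for illustration only: real fraud detection
    combines many more signals than a single amount column.
    """
    med = median(amounts)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = median([abs(a - med) for a in amounts])
    if mad == 0:
        # All values are (almost) identical, so nothing can be ranked as unusual.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [a for a in amounts if 0.6745 * abs(a - med) / mad > threshold]


# Made-up sample data: five ordinary payments and one conspicuous one.
amounts = [12.0, 15.0, 14.0, 13.0, 16.0, 950.0]
flagged = flag_irregular(amounts)
print(flagged)
```

In this made-up sample only the payment of 950.0 is flagged. On real data, such a rule would at best be a first filter that hands suspicious cases to further checks.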


The greatest challenges for the use of big data are:
  • Skilled labor shortage - Computer scientists and specialists who can handle big data technologies are hard to find.
  • Poor data quality - The data is so heterogeneous (some of it multimedia data), so varied in informative value and so volatile that it is difficult to classify it correctly and to condense it in a purposeful manner.
  • Unsatisfactory results - Companies sometimes underestimate how difficult it is to obtain usable information and translate it into entrepreneurial action. The results are below expectations.
  • Legal hurdles - Fortunately they do exist, but for data collectors they are an obstacle: laws such as the German Federal Data Protection Act, in which the right to informational self-determination is anchored.
  • Technical difficulties - Big data tools and techniques are developing rapidly, and using and administering these technologies is not child's play.

Tools and Services

A market overview of big data tools would go beyond the scope of this short guide, especially since these tools also form a heterogeneous landscape. There are visualization tools, databases, analysis tools, data and process mining tools and many more. All of them contribute to turning unstructured big data into purpose-oriented smart data.

The Microsoft Azure world

Big data and BI tools and services from Microsoft are briefly summarized here.
Azure services work with HDInsight and include the following:
  • Stream Analytics - Helps with real-time analysis of streaming data.
  • Data Factory - Create, plan, and monitor scalable data services. Provides a data pipeline (Data Movement-as-a-Service) and allows automated processes to be integrated.
  • Data Catalog
  • Event Hubs - Collect sensor data and information from websites with Event Hubs, forward it to Stream Analytics, query it with SQL and make it available to users.
  • SQL Database Elastic Pool - Database pooling; Azure SQL Data Warehouse offers a data warehouse as a service.
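The flow described above (collect sensor events, analyze them as a stream, query the result) can be sketched conceptually in plain Python. This is not the Azure API and makes no claim about how Stream Analytics is implemented; the function `tumbling_window_average`, the event layout `(timestamp, sensor_id, value)` and the sample readings are assumptions for illustration. The sketch groups events into fixed, non-overlapping time windows and averages the values per sensor, roughly what a windowed aggregation query over a stream would return.

```python
from collections import defaultdict
from statistics import mean


def tumbling_window_average(events, window_seconds):
    """Average sensor values per fixed, non-overlapping time window.

    events: iterable of (timestamp_in_seconds, sensor_id, value) tuples
            (a hypothetical layout chosen for this sketch).
    Returns a dict mapping (window_start, sensor_id) to the mean value.
    """
    buckets = defaultdict(list)
    for timestamp, sensor_id, value in events:
        # Integer division assigns every event to exactly one window.
        window_start = (timestamp // window_seconds) * window_seconds
        buckets[(window_start, sensor_id)].append(value)
    return {key: mean(values) for key, values in sorted(buckets.items())}


# Made-up sensor readings: (timestamp, sensor_id, value)
events = [
    (0, "s1", 20.0),
    (4, "s1", 22.0),
    (7, "s2", 5.0),
    (12, "s1", 30.0),
]
averages = tumbling_window_average(events, window_seconds=10)
for (window_start, sensor_id), avg in averages.items():
    print(window_start, sensor_id, avg)
```

A real streaming service computes such windows continuously as events arrive. The batch version here only illustrates the grouping logic.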

Dorothea Heymann-Reder

Dorothea Heymann-Reder writes blog posts, advice articles and white papers. Her specialist articles deal with commercial and business issues as well as the entire spectrum of digitization.
