Infrastructure for Advanced Analytics and Machine Learning
- Description
The ongoing data deluge, driven by the increasing digitalization of science, society and industry, is leading to a significant increase in demand for data storage, processing and analytics in several industrial domains. Science and industry are overwhelmed by the need to store large amounts of transactional and machine-generated data resulting from customer, service and manufacturing processes. Examples of machine-generated data are server logs and sensor data, which are generated at ever finer granularities and higher frequencies. Further, datasets are often enriched with web and open data from social media, blogs or other open data sources. The Internet of Things (IoT) will further blur the boundaries between the physical and the digital world, enlarging the world's digital footprint even more.
In this course, we will learn about data applications and their requirements. Further, we will discuss the core infrastructure necessary to handle large data volumes and analytical problems. As part of the exercises, students will use different frameworks, e.g. MapReduce and Spark, to implement different algorithms.
This class will cover the following topics:
  - Data Applications in Industry and Sciences
  - Resource Management: YARN, Mesos and Kubernetes
  - Hadoop Processing Engines: Spark, Flink
  - SQL on Hadoop: Impala, Hive, Spark, Presto
  - Stream Processing: Kafka, Spark Streaming, Flink, Heron
  - Fault Tolerance: CAP Theorem, Eventual Consistency, Quorum Protocols, Apache ZooKeeper
  - Data in the Cloud: Elastic MapReduce, Azure HDInsight, Google Cloud Dataflow
  - Machine Learning (MLlib)
  - Natural Language Processing
  - Deep Learning: Convolutional Neural Networks
The course will be offered as a block lecture.
- Institute
- Institut für Informatik
- Lecturer
- Assistant
- External homepage
- http://www.nm.ifi.lmu.de/teaching/Vorlesungen/2020ss/data-analytics/
- Course participants
- 22 of 20
- Registration
Tue 25 Feb 2020 00:00 – Sat 14 Mar 2020 23:59
Deregistration only until Sat 04 Apr 2020 23:59
- Application instructions
Attendance of the lectures Rechnernetze und verteilte Systeme (Computer Networks and Distributed Systems), Betriebssysteme (Operating Systems), Rechnerarchitektur (Computer Architecture), or comparable knowledge, is required. Programming skills in Python and familiarity with the Linux command line are required. Please briefly explain in a letter of motivation why you are interested in the course, and show that you have the necessary prior knowledge.
- Material
The course material is visible only to members of the course, e.g. participants, tutors, graders and administrators.
- Exams