Friday, October 24, 2014

New Big Data Courses from MIT on edX

Some time ago, the MIT website published a piece about an interesting MIT initiative on edX: MIT Professional Education is launching the first session of its course Tackling the Challenges of Big Data on edX, and it will be open to anyone who wants to enroll.

Using Big Data in the Workplace -- MIT Professional Education Courses

It is worth noting that this six-week course is paid and costs $545. Frankly, that is a fairly high price compared with the typical cost of paid MOOCs, although the lineup of MIT instructors is certainly impressive. The first session starts on November 4, 2014 and runs until December 16, 2014. The second session starts on February 3, 2015 and runs until March 17, 2015. More details about the course and its benefits can be found on its edX page, which also has the full list of instructors, reviews from students who have already completed the course, and a fairly detailed FAQ section.

Course Syllabus

Module One: Introduction and Use Cases

The introductory module aims to give a broad survey of Big Data challenges and opportunities and highlights applications as case studies.

Introduction: Big Data Challenges (Sam Madden)

  • Identify and understand the application of existing tools and new technologies needed to solve next generation data challenges
  • Challenges posed by the ability to scale and the constraints of today's computing platforms and algorithms
  • Addressing the universal issue of Big Data and how to use the data to align with a company’s mission and goals

Case Study: Transportation (Daniela Rus)

  • Data-driven models for transportation
  • Coresets for Global Positioning System (GPS) data streams
  • Congestion-aware planning

Case Study: Visualizing Twitter (Sam Madden)

  • Understand the power of geocoded Twitter data
  • Learn how Graphics Processing Units (GPUs) can be used for extremely high throughput data processing
  • Utilize MapD, a new GPU-based database system for visualizing Twitter in action

Module Two: Big Data Collection

The data capture module surveys approaches to data collection, cleaning, and integration.

Data Cleaning and Integration (Michael Stonebraker)

  • Available tools and protocols for performing data integration
  • Curation issues (cleaning, transforming, and consolidating data)

Hosted Data Platforms and the Cloud (Matei Zaharia)

  • How performance, scalability, and cost models are impacted by hosted data platforms in the cloud
  • Internal and external platforms to store data

Module Three: Big Data Storage

The module on Big Data storage describes modern approaches to databases and computing platforms.

Modern Databases (Michael Stonebraker)

  • Survey data management solutions in today’s marketplace, including traditional RDBMS, NoSQL, NewSQL, and Hadoop
  • Strategic aspects of database management

Distributed Computing Platforms (Matei Zaharia)

  • Parallel computing systems that enable distributed data processing on clusters, including MapReduce, Dryad, Spark
  • Programming models for batch, interactive, and streaming applications
  • Tradeoffs between programming models

NoSQL, NewSQL (Sam Madden)

  • Survey of new emerging database and storage systems for Big Data
  • Tradeoffs between reduced consistency, performance, and availability
  • Understanding how rethinking the design of database systems can lead to order-of-magnitude performance improvements

Module Four: Big Data Systems

The systems module discusses solutions to creating and deploying working Big Data systems and applications.

Security (Nickolai Zeldovich)

  • Protecting confidential data in a large database using encryption
  • Techniques for executing database queries over encrypted data without decryption

Multicore Scalability (Nickolai Zeldovich)

  • Understanding what affects the scalability of concurrent programs on multicore systems
  • Lock-free synchronization for data structures in cache-coherent shared memory

User Interfaces for Data (David Karger)

  • Principles of and tools for data visualization and exploratory data analysis
  • Research in data-oriented user interfaces

Module Five: Big Data Analytics

The analytics module covers state-of-the-art algorithms for very large data sets and streaming computation.

Fast Algorithms I (Ronitt Rubinfeld)

  • Efficiency in data analysis

Fast Algorithms II (Piotr Indyk)

  • Advanced applications of efficient algorithms
  • Scale-up properties

Data Compression (Daniela Rus)

  • Reducing the size of the Big Data file and its impact on storage and transmission capacity
  • Design of data compression schemes such as coresets to apply to Big Data sets

Machine Learning Tools (Tommi Jaakkola)

  • Computational capabilities of the latest advances in machine learning
  • Advanced machine learning algorithms and techniques for application to large data sets

Case Study: Information Summarization (Regina Barzilay)

Applications: Medicine (John Guttag)

  • Utilize data to improve operational efficiency and reduce costs
  • Analytics and tools to improve patient care and control risks
  • Using Big Data to improve hospital performance and equipment management

Applications: Finance (Andrew Lo)

  • Learn how big data and machine learning can be applied to financial forecasting and risk management
  • Analyze the dynamics of the consumer credit card business of a major commercial bank
  • Recognize and acquire intuition for business cases where big data is useful and where it isn't

A large list of online courses on Data Science: Data science online courses
