Build a data platform to the industry-leading standards set by Microsoft’s own infrastructure.
In Data Engineering on Azure you will learn how to:
Pick the right Azure services for different data scenarios
Manage data inventory
Implement production-quality data modeling, analytics, and machine learning workloads
Handle data governance
Use DevOps to increase reliability
Ingest, store, and distribute data
Apply best practices for compliance and access control
Data Engineering on Azure reveals the data management patterns and techniques that support Microsoft’s own massive data infrastructure. Author Vlad Riscutia, a data engineer at Microsoft, teaches you to bring engineering rigor to your data platform and ensure that your data prototypes function just as well under the pressures of production. You'll implement common data modeling patterns, stand up cloud-native data platforms on Azure, and get to grips with DevOps for both analytics and machine learning.
Purchase of the print book includes a free eBook in PDF, Kindle, and ePub formats from Manning Publications.
About the technology
Build secure, stable data platforms that can scale to loads of any size. When a project moves from the lab into production, you need confidence that it can stand up to real-world challenges. This book teaches you to design and implement cloud-based data infrastructure that you can easily monitor, scale, and modify.
About the book
In Data Engineering on Azure you’ll learn the skills you need to build and maintain big data platforms in massive enterprises. This invaluable guide includes clear, practical guidance for setting up infrastructure, orchestration, workloads, and governance. As you go, you’ll set up efficient machine learning pipelines, and then master time-saving automation and DevOps solutions. The Azure-based examples are easy to reproduce on other cloud platforms.
What's inside
Data inventory and data governance
Assure data quality, compliance, and distribution
Build automated pipelines to increase reliability
Ingest, store, and distribute data
Production-quality data modeling, analytics, and machine learning
About the reader
For data engineers familiar with cloud computing and DevOps.
About the author
Vlad Riscutia is a software architect at Microsoft.
Table of Contents
PART 1 INFRASTRUCTURE
PART 2 WORKLOADS
7 Machine learning
PART 3 GOVERNANCE
9 Data quality
11 Distributing data