Microsoft today unveiled Azure Purview, a new data governance solution in public preview. Additionally, the company announced that Azure Synapse Analytics is now generally available.
Azure Purview automates data discovery and cataloging while minimizing compliance risk. Purview helps businesses map all their data, no matter where it resides, and provides an end-to-end view of their data estate. Azure Synapse Analytics, meanwhile, uses on-demand or provisioned resources to ingest, prepare, manage, and serve data for business intelligence. It changes how enterprises store data and gain insight by bringing together data warehousing, big data, data integration, and AI.
Businesses are increasingly leveraging data as a strategic asset, which makes data services critical. Data needs to not only be stored and managed, but also discovered and analyzed at ever-growing volumes. Having designed services that do exactly that for itself, Microsoft is comfortable selling access to them.
“One of the things we’ve done for the past several years as we worked with our customers who are going through their digital transformation is to understand where is it that the pain exists in terms of the challenges of ‘We need to become a data-driven company and build up the data culture,’” Azure Data CVP Rohan Kumar told VentureBeat. “One of the key foundational elements of that is to really have a data platform that essentially allows you to generate insights very quickly by breaking down silos of data.”
Purview catalogs data from on-premises, multi-cloud, or software-as-a-service (SaaS) locations. Purview will let Azure customers understand exactly what data they have, manage its compliance with privacy regulations, and derive insights. Purview aims to maximize the compliant use of a company’s own data by understanding it, how it moves, and who it is shared with.
“This launch is around mapping of your entire data estate catalog in both the physical and the business assets,” Kumar said. “So for every data asset that exists in your organization, having a very good understanding of where did it originate from, what were the changes that were made, who made those changes. Based on that, you can make decisions around can you trust this data, which all these become very, very important again when you think about becoming a data-driven organization.”
Azure Purview includes three main components:
- Data discovery, classification, and mapping: Azure Purview will automatically find all of an organization’s data on-premises or in the cloud, even data managed by other providers, and evaluate its characteristics and sensitivity.
- Data catalog: Azure Purview enables all users to search for trusted data using a simple web-based experience. Visual graphs let users quickly see if data of interest is from a trusted source.
- Data governance: Azure Purview provides a bird’s-eye view of a company’s data landscape, enabling data officers to efficiently govern data use. This enables key insights such as the distribution of data across environments, how data is moving, and where sensitive data is stored.
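To make the three components concrete, here is a toy sketch in Python. It is not Purview's API or its classification logic: the `Asset` type, the two regex patterns, and the sample data are all hypothetical, chosen only to show how classification, catalog search, and a governance summary fit together.

```python
import re
from dataclasses import dataclass, field

# Hypothetical sensitivity patterns (assumption); real classifiers use far richer rules.
PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "us_ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
}


@dataclass
class Asset:
    name: str
    location: str          # e.g. "on-premises", "azure", "other-cloud"
    sample_values: list    # a few sampled values used for classification
    labels: set = field(default_factory=set)


def classify(asset):
    """Discovery and classification: label an asset if any sample matches a pattern."""
    for label, pattern in PATTERNS.items():
        if any(pattern.fullmatch(str(value)) for value in asset.sample_values):
            asset.labels.add(label)
    return asset


def build_catalog(assets):
    """Data catalog: index classified assets by name."""
    return {asset.name: classify(asset) for asset in assets}


def search(catalog, term):
    """Catalog search: case-insensitive substring match on asset names."""
    term = term.lower()
    return sorted(name for name in catalog if term in name.lower())


def sensitive_by_location(catalog):
    """Governance view: where do assets carrying sensitive labels live?"""
    summary = {}
    for asset in catalog.values():
        if asset.labels:
            summary.setdefault(asset.location, []).append(asset.name)
    return {location: sorted(names) for location, names in summary.items()}


# Made-up sample assets for illustration only.
catalog = build_catalog([
    Asset("crm.customers.email", "azure", ["ana@example.com", "bo@example.org"]),
    Asset("hr.employees.ssn", "on-premises", ["123-45-6789"]),
    Asset("web.clicks.page", "other-cloud", ["/home", "/pricing"]),
])
```

Calling `sensitive_by_location(catalog)` groups the labeled assets by where they are stored, which is a small-scale version of the "where sensitive data is stored" insight described above.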
Microsoft says these capabilities will help break down the internal barriers that have traditionally complicated and slowed data governance. Furthermore, Purview’s roadmap includes governance policies to help with compliance for the EU’s GDPR and California’s CCPA.
“It’s not just about ensuring that you’re using the best AI and machine learning,” Kumar said. “Getting insights, that’s all great. But if you’re doing it on datasets which your customers haven’t given you consent, then you could be in serious trouble. Our customers across all verticals understand that. This is where the combination of Synapse and Purview is really game-changing for them.”
Azure Synapse Analytics
Microsoft unveiled Azure Synapse Analytics in November 2019, promising to help organizations use their own data and deploy AI. The goal was to let anyone in an organization access its analytics, thus freeing up skilled tech workers from having to manage data infrastructure.
Azure Synapse can query both relational and non-relational data at “petabyte-scale” using SQL. Since the service was announced, Microsoft says the number of Azure customers running petabyte-scale workloads has increased fivefold.
“We support both SQL engines and Spark engines natively integrated because we see a ton of traction that Spark is getting with the data scientists,” Kumar said. “They need to collaborate very well with the data engineers to build their machine learning models on the dataset.”
Features like intelligent workload management, workload isolation, and limitless concurrency optimize the performance of queries in real time, and deep integration with Power BI and Azure Machine Learning simplifies the sharing of cleansed and processed data.
“Azure Machine Learning has a drag-and-drop experience where you basically point to the data that you want to train the model on,” Kumar said. “You pick the attributes that are very important as a part of the training. And the advantage over there is once you train the model you can automatically use that model within the SQL and the Spark queries that you’re doing within Synapse, with no additional work.”
Azure Synapse Studio provides tools for data prep, data management, data warehousing, big data, and AI tasks. It also lets users manage data pipelines and build proofs of concept while securely accessing datasets and custom control interfaces. On the security side, Azure Synapse features automated threat detection and always-on data encryption, and it offers fine-grained access controls and column- and row-level security.
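The idea behind column- and row-level security can be illustrated with a minimal Python sketch. This is a conceptual model only, not how Azure Synapse implements or configures it: the role names, the `POLICIES` table, and the sample rows are hypothetical.

```python
# Made-up sample rows for illustration only.
ROWS = [
    {"region": "EU", "customer": "A", "revenue": 100, "email": "a@example.com"},
    {"region": "US", "customer": "B", "revenue": 250, "email": "b@example.com"},
]

# Hypothetical per-role policies (assumption): which columns a role may read,
# and a predicate deciding which rows it may see.
POLICIES = {
    "eu_analyst": {
        "columns": {"region", "customer", "revenue"},   # no access to "email"
        "row_filter": lambda row: row["region"] == "EU",
    },
    "admin": {
        "columns": {"region", "customer", "revenue", "email"},
        "row_filter": lambda row: True,
    },
}


def query(role, rows):
    """Return only the rows and columns the given role is allowed to read."""
    policy = POLICIES[role]
    return [
        {key: value for key, value in row.items() if key in policy["columns"]}
        for row in rows
        if policy["row_filter"](row)
    ]
```

In this sketch, `query("eu_analyst", ROWS)` yields a single EU row without the `email` column, while `query("admin", ROWS)` returns both rows in full.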
Kumar says hundreds of companies have adopted Azure Synapse Analytics over the past year, including major players like FedEx, Procter & Gamble, and Wolters Kluwer. “It has been one of the fastest-growing data services that we have in Azure,” Kumar said. “We have seen significant growth both in terms of customers and the usage as they’ve started relying more and more on the analytics that is coming from Synapse.”