Acante with Unity Catalog on Databricks: Data security observability & access governance made easy

Learn how Acante helps accelerate the Databricks adoption journey and democratize access to data while confidently adopting Unity Catalog as your primary repository for access governance policies.

February 12, 2024
Balaji Vasu
Jon Brisbin

The modern data stack is largely built as a set of as-a-service platforms delivered on top of public cloud infrastructure. Databricks, S3, ADLS, a variety of vector databases, dbt, Kafka, Fivetran and others have become the new core technologies for analytics, model training, transformation and ingestion. 

Unfortunately, traditional cloud security capabilities are blind to the fluid data and dynamic access mechanisms in these platforms. They lack the relevant data context (schemas, content type, sensitivity, lineage, leakage risk, tags, and privileges, for example) to provide the appropriate level of data security and governance. This lack of visibility and context leaves today’s data platform teams in a constant struggle with their security counterparts as they try to democratize access to data while still meeting security and compliance mandates.

Unity Catalog: The New Data Governance Portal from Databricks

Databricks has taken a huge step forward toward simplifying data management with the evolution of the Unity Catalog for unified governance for data. “With Unity Catalog, organizations can seamlessly govern their structured and unstructured data, machine learning models, notebooks, dashboards and files on any cloud or platform. Data scientists, analysts and engineers can use Unity Catalog to securely discover, access and collaborate on trusted data and AI assets, leveraging AI to boost productivity and unlock the full potential of the lakehouse architecture.”

Unity Catalog is clearly the go-forward path for all Databricks customers. Given that, Acante has built a seamless and deep integration with Unity Catalog.

Setting up the Acante Data Security Intelligence Platform in a Unity Catalog environment is simple. The Acante deployment is distributed as a notebook – the Metadata Discovery notebook. It is supported by a second provisioning notebook or, alternatively, a Terraform module. The provisioning notebook automates the entire setup: it creates the necessary resources, service principals and configurations, and sets up the Metadata Discovery notebook as a job. Databricks customers can onboard multiple workspaces at once and discover all the catalogs automatically. The setup also enforces a tight security model that prevents customer data within Databricks from leaving the customer's environment. Running the single provisioning notebook completes the whole process in less than five minutes.

Acante Captures the Extensive Telemetry Exposed by Unity Catalog

Unity Catalog exposes an extensive set of telemetry – all of which is automatically captured by Acante. This includes platform configurations, catalog schemas, identities, access policies, data lineage, audit logs, metadata about all workloads, cleanrooms, delta shares and other security information. This telemetry collection is fairly involved. It’s captured from multiple sources in Databricks and requires significant transformations to derive relevant security insights. Some examples of telemetry sources include: 

  • system.information_schema: This includes a variety of tables that carry details about the metastores, catalogs, schemas, masking and filter functions, access privileges, Shares and much more.
  • Control Plane APIs: There are endpoints available to gather information such as identities and groups, external locations, view definitions and details about all the workloads such as notebooks, jobs, dashboards and pipelines, including their sharing and permissions.
  • system.access.audit: This provides granular query-level audit logs with multiple options for logging verbosity.
  • system.access.column_lineage and system.access.table_lineage: Unity Catalog natively provides data lineage information down to the column level, along with the source that transformed the data.
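As a rough illustration of what reading these sources can look like, the sketch below builds three SQL queries against the system tables listed above. It is not Acante's collection code. The function names are hypothetical, and the column names (grantee, privilege_type, event_date, user_identity.email, source_table_full_name and so on) are taken from the Databricks system table documentation as we understand it, so verify them against your workspace before relying on them.

```python
import re
from datetime import date

# Assumption: catalog names are plain identifiers (letters, digits, underscores).
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_identifier(name: str) -> str:
    """Reject anything that is not a plain identifier before it reaches SQL text."""
    if not _IDENTIFIER.match(name):
        raise ValueError(f"not a plain identifier: {name!r}")
    return name


def table_privileges_query(catalog: str) -> str:
    """Query for who holds which privilege on the tables of one catalog."""
    catalog = _check_identifier(catalog)
    return (
        "SELECT grantee, table_catalog, table_schema, table_name, privilege_type "
        "FROM system.information_schema.table_privileges "
        f"WHERE table_catalog = '{catalog}'"
    )


def audit_events_query(since: date) -> str:
    """Query for audit events (actor, service, action) on or after a date."""
    return (
        "SELECT event_time, user_identity.email AS actor, service_name, action_name "
        "FROM system.access.audit "
        f"WHERE event_date >= DATE '{since.isoformat()}'"
    )


def table_lineage_query(since: date) -> str:
    """Query for table-to-table lineage edges observed on or after a date."""
    return (
        "SELECT source_table_full_name, target_table_full_name, entity_type, event_time "
        "FROM system.access.table_lineage "
        f"WHERE event_date >= DATE '{since.isoformat()}' "
        "AND source_table_full_name IS NOT NULL "
        "AND target_table_full_name IS NOT NULL"
    )


if __name__ == "__main__":
    print(table_privileges_query("main"))
    print(audit_events_query(date(2024, 2, 1)))
    print(table_lineage_query(date(2024, 2, 1)))
```

In a Databricks notebook, each string could be passed to spark.sql(...) by a principal that has been granted access to the system schemas. The functions only build query text, so they can be reviewed and tested outside a cluster.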

From Data (telemetry) to Insights 

By analyzing and stitching together all this rich telemetry, Acante is able to generate a host of powerful data security and access insights across the Databricks lakehouse and its ecosystem, including ingress / egress systems, external volumes and connected cloud datastores (e.g. AWS S3, RDS and others). The Acante Dynamic Identity-Data Threat Graph™ is at the core of the platform, powering these insights, and is the only solution in the industry for modern data stacks that brings together data security observability, access intelligence and access governance in a single platform. The security analytics empower data teams to:

  • Easily approve, provision and right-size data access privileges (Data Privilege Access Management), all with complete data risk context
  • Automatically discover, classify and assess risk for all sensitive data at petabyte-order scale
  • Track sensitive data flows and prevent leakage of the data by any identity – including by users, service principals or workloads
  • Implement granular data security guardrails to ensure compliant and secure data use
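To make the idea of stitching telemetry together concrete, here is a minimal sketch that joins two of the inputs described earlier, access grants and table lineage, to answer one question: which identities can read a table that is derived from a sensitive source? This is an illustration under our own assumptions, not a description of how the Acante graph is implemented. The function names, the tuple shapes and the sample table names are all hypothetical.

```python
from collections import defaultdict, deque


def downstream_tables(lineage_edges, start):
    """Return start plus every table reachable by following source -> target edges."""
    children = defaultdict(set)
    for source, target in lineage_edges:
        children[source].add(target)

    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first walk over the lineage graph
        table = queue.popleft()
        for nxt in children[table]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def sensitive_read_exposure(grants, lineage_edges, sensitive_tables):
    """Map each grantee to the tables it can SELECT that derive from sensitive data.

    grants: iterable of (grantee, table, privilege) tuples (hypothetical shape).
    lineage_edges: iterable of (source_table, target_table) tuples.
    sensitive_tables: tables already classified as sensitive.
    """
    tainted = set()
    for table in sensitive_tables:
        tainted |= downstream_tables(lineage_edges, table)

    exposure = defaultdict(set)
    for grantee, table, privilege in grants:
        if privilege == "SELECT" and table in tainted:
            exposure[grantee].add(table)
    return dict(exposure)


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    lineage = [
        ("main.raw.customers", "main.silver.customers_clean"),
        ("main.silver.customers_clean", "main.gold.churn_features"),
    ]
    grants = [
        ("analysts", "main.gold.churn_features", "SELECT"),
        ("etl_sp", "main.raw.customers", "MODIFY"),
        ("interns", "main.gold.sales", "SELECT"),
    ]
    print(sensitive_read_exposure(grants, lineage, {"main.raw.customers"}))
```

With the sample data, only the analysts group is reported, because it holds SELECT on a table two lineage hops downstream of the sensitive source. A real analysis would also need to account for group membership, inherited privileges, masking and filter functions, and column-level lineage.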

Ultimately, this empowers data teams to deliver fast, secure and compliant access to data. They can accelerate their Databricks adoption journey and democratize access to their data while confidently adopting Unity Catalog as their primary repository for access governance policies. Acante’s tight integration with Unity Catalog allows organizations to get the most out of their Databricks investments. 

This is the first blog in a series that will discuss the above capabilities and use cases in more detail.

Unveiling the Challenge

As our digital footprint expands, so do the challenges of securing our data assets. Acante.ai recognizes the exponential proliferation and constant change in data access patterns, creating blind spots for traditional security approaches.

The Acante.ai Difference

At Acante.ai, our approach to data security marks a paradigm shift in the industry. Unlike traditional security models that often succumb to the static nature of data threats, Acante.ai thrives on dynamism. We believe that true security evolves with the challenges, and that's precisely what sets us apart. The Acante.ai difference lies in our commitment to providing security teams with more than just a shield; we offer a strategic ally that anticipates, adapts, and fortifies against the unpredictable proliferation of data access patterns. Our solution doesn't just keep pace with the digital transformation journey; it propels it forward. But what truly defines the Acante.ai difference goes beyond technology; it's ingrained in our culture. We are a collective of thoughtful, compassionate, and collaborative individuals on a shared mission to disrupt the security industry. With deep expertise from major brands and startups, we've collectively built over 10 startups, resulting in category-creating businesses, acquisitions, and IPOs. Our success is a testament to the collaborative spirit within our team, where every member contributes to shaping our culture and the future of data security. Join Acante.ai, and experience the difference that drives us to redefine the limits of protection in the digital age.

Dynamic Data Security

Explore the cutting-edge realm of dynamic data security with Acante.ai. In an era where the digital landscape is in a perpetual state of flux, Acante.ai's comprehensive approach to data security becomes not just a solution but a strategic imperative. Imagine a security system that not only reacts to the ever-changing data access patterns but anticipates and adapts in real-time. This level of sophistication is what sets Acante.ai apart. Our solution not only seamlessly integrates with the native controls of your data lakes and warehouse ecosystems but also evolves with them. It's not just about protecting your data; it's about empowering it. Acante.ai's dynamic data security solution is not confined by static parameters; it's a living, breathing shield that moves in harmony with the pulse of your data. As businesses navigate the complexities of the modern data landscape, Acante.ai provides not just a safeguard but a strategic ally, ensuring that security is not a hindrance but an enabler of progress.

Conclusion

In a world where data is both a valuable asset and a potential liability, Acante.ai emerges as a beacon of innovation. Join us on this exploration of the future of data security and discover how Acante.ai is empowering organizations to navigate the evolving landscape with confidence.