The SunnyData Blog

Explore insights and practical tips on mastering Databricks Data Intelligence Platform and the full spectrum of today's modern data ecosystem.

Databricks, Cost, AI Jeronimo Acosta

Winter Is So Over: Quick Guide on the 18 Big Announcements

Here we present the 18 biggest announcements from Databricks Summit 2024, including Generative AI tools, AI-powered BI dashboards, open-source Unity Catalog, no-code data ingestion, serverless compute, secure data collaboration, built-in AI functions for data warehousing, and the all-new Delta Lake 4.0. Discover how these advancements can unlock the full potential of your data. Get the details and see how SunnyData can help. Read now!

Databricks, Cost, AI Josue Bogran

Cost Saving Best Practices For Databricks Workflows

Discover how to manage pipeline costs effectively with Databricks Workflows. This article offers practical tips to reduce total cost of ownership without sacrificing performance, and provides insights into understanding your costs better. Learn strategies like using job compute and spot instances, setting warnings and timeouts, leveraging task dependencies, and implementing autoscaling and tagging. Optimize your resource usage and get the most out of your Databricks environment. Read on for actionable advice to streamline your data processes.

Databricks, Architecture, AI, Snowflake Josue Bogran

Navigating Data Governance with Unity Catalog: A Practical Exploration

This comprehensive guide delves into the essential features of Unity Catalog by Databricks, highlighting its role in enhancing security, automating data documentation, and streamlining ML and AI governance. Learn practical tips on integrating this powerful tool into your data strategy to boost productivity and ensure compliance. Whether you're scaling up or enhancing data discoverability, this article is your roadmap to leveraging data for innovation while maintaining robust security.

Databricks, Architecture, AI Kailash Thapa

Unity Catalog and Enterprise Data Governance Tools: How Should They Fit In Your Stack

In this blog we address whether Unity Catalog can replace existing enterprise catalogs when integrated with Databricks. We clarify that while Unity Catalog excels at centralizing governance and enhancing data management within Databricks, it complements rather than replaces established catalogs like Alation or Collibra if they already add significant value.

With extensive experience in data solutions, our CEO Kai notes that Unity Catalog is indispensable for managing permissions and access across data, ML, and AI assets effectively. However, for broader governance needs, using it alongside other data catalogs ensures comprehensive management across all data systems.

Databricks, Architecture, AI, Snowflake Josue Bogran

Navigating Data Governance with Unity Catalog: Enhancing Security and Productivity

Unity Catalog from Databricks is revolutionizing how businesses manage their data, providing a unified governance platform that centralizes control over data and AI assets. It enhances productivity, bolsters security, and streamlines compliance by offering a single, searchable repository for all data assets.

The platform automates data documentation with Generative AI, easing the workload on data stewards and enriching data management with semantic searches and interactive visualizations. Additionally, Unity Catalog's Lakehouse Federation integrates data across multiple platforms, ensuring seamless data accessibility. Its advanced data lineage capabilities offer clear visibility into data movements, crucial for compliance and informed decision-making, making it a strategic asset for any data-driven organization.

Databricks, ETL, AI, Hadoop, Snowflake Jeronimo Acosta

As the snow melts, it's time to build a data skyscraper with Databricks

Funny or not, building a secure, governed, and scalable data platform that supports multiple types of use cases, along with the data management processes and practices around it, is very similar to building a skyscraper: the higher the building grows and the more units and people it supports, the greater the complexity.

This guide will help you understand the complexities of Databricks, ensuring your data skyscraper stands tall and proud.

Databricks, Cost, AWS, AI Kailash Thapa

Databricks Model Serving for end-to-end AI life-cycle management

In the evolving world of AI and ML, businesses demand efficient, secure ways to deploy and manage AI models. Databricks Model Serving offers a unified solution, enhancing security and streamlining integration. This platform ensures low-latency, scalable model deployment via a REST API, perfectly suited for web and client applications. It smartly scales to demand, using serverless computing to cut costs and improve response times, providing an effective, economical framework for enterprises navigating the complexities of AI model management.

Migrations, Databricks, ETL, AWS, Hadoop Kailash Thapa

Hadoop to Databricks: A Guide to Data Processing, Governance and Applications

In the intricate landscape of migration planning, it is imperative to map processes and prioritize them according to their criticality. This requires a strategic process for determining the sequence in which processes should be migrated according to business needs.

In addition, organizations will have to decide whether to follow a "lift and shift" approach or a "refactor" approach. The good news is that here we help you choose the best option for your scenario.

Databricks, Migrations, ETL, EDW, AWS, Hadoop Kailash Thapa

Migrating Hadoop to Databricks - a deeper dive

Migrating from a large Hadoop environment to Databricks is a large, complex project. In this blog we dive into the main areas of the migration process and the challenges the customer should plan for in each: Administration, Data Migration, Data Processing, Security and Governance, and Data Consumption (tools and processes).

Migrations, Databricks, EDW, AWS, AI, Hadoop Kailash Thapa

Hadoop to Databricks Lakehouse Migration Approach and Guide

Over the past 10 years of big data analytics and data lakes, Hadoop has proven unscalable, overly complex (in both its on-premises and cloud versions), and unable to give the business easy data consumption or meet its innovation aspirations.

Migrating from Hadoop to Databricks will help you scale effectively, simplify your data platform and accelerate innovation with support for analytics, machine learning and AI.
