Empowering Business Insights at a Regional Bank with an open and scalable Databricks Lakehouse platform

Key Highlights

  • The Delta Lakehouse replaced several legacy operational and analytic data stores, centralizing all core data assets for analytics and data science use cases - simplifying the business users' data journey.

  • ~4X reduction in overall data analytics platform TCO
     

INDUSTRY: Banking & Financial Services

SOLUTION: Customer segmentation, Transaction enrichment, Customer 360

PLATFORM USE CASE: Delta Lake, Data Science, Machine Learning, ETL
 

Financial institutions (banks, credit unions, asset managers) need, now more than ever, to deeply understand their customers' spending habits and identify fast-moving market signals across their B2B and B2C customers and prospects, so they can offer highly tailored products and solutions that differentiate them from the competition. Traditional banks also face disruption from digital banks and banking solutions and from crypto offerings, coupled with tough macroeconomic conditions: high interest rates and an economic slowdown. It is therefore imperative for financial institutions to build a comprehensive view of their customers, their products, and customer spend in order to survive, thrive and grow.

Most banks are still managing their business processing and data analytics/insights needs with legacy systems. Digital transformation, along with data management modernization, has therefore become front and center for most banks seeking to meet their customers where they are: digital offerings, hybrid customer experiences (digital plus physical), and faster product delivery. A critical component of digital transformation is a comprehensive store of all operational data from business applications, with easy access to curated data that serves digital needs for internal use or external customer use.


Customer challenges prior to the new data consumption platform


The regional bank that SunnyData worked with was experiencing customer growth and wanted to ensure that their customers were offered products/services based on their preferences. 

However, legacy business applications running on mainframe systems, combined with legacy analytics solutions, data silos and heavy reliance on IT, were constraining the bank's growth targets and dragging down customer satisfaction scores.

The regional bank consulted with business partners across the organization to understand the steps needed to transform analytics at the bank in a way that would provide the best experience for its customers.



The bank identified these major business challenges, which were inhibiting growth and contributing to lower customer satisfaction scores than competitors:

  • Lack of a holistic view of the customer, which slowed customer service responses and limited customer insights.

  • Incomplete understanding of how customers were leveraging their products/services.

  • Ineffective marketing campaigns: Who should they be targeting from a marketing/product/service perspective?

  • Time to market for new products: delays in the marketing insights provided by data resulted in slower time to market for new products.

 

Technology and business leaders jointly concluded that the data infrastructure for analytics, self-service and digital needs was inadequate and needed to be modernized to meet customer needs.

  • The overall data infrastructure TCO (total cost of ownership) was still growing at double digits because of legacy processes running on expensive on-premises systems.

  • Need for speed - IT and data teams were constantly overworked and over budget, reacting to ever-changing business needs.

  • Business teams weren’t empowered for self-service with the centralized data infrastructure and thus were building data silos themselves. 

  • The data analytics team was not innovating fast enough, and data science as a function was lacking.

 

Path of Transformation

The bank explored various cloud analytics and data platforms as part of its data analytics and modernization journey. Cloud was the future for the bank, and it was going to be a multi-year journey. SQL-based platforms were the primary candidates because of the data and business teams' familiarity with the language. The following requirements were must-haves as the bank reviewed several cloud platforms:

  • Ability to support current SQL workloads and ease of migration of existing data marts.

  • Support for data science use cases - the bank did not yet have a data science competency, but this function was going to be critical.

  • Minimize disruption to current BI and report consumption from the warehouse when the platform switch happened. 

  • Ease of data ingestion from various legacy applications and data sources.

  • Scalability and agility, with the ability to scale compute up and down.

  • Support for various structured and unstructured data sources.

  • High security and compliance, with the ability to tag and classify data and meet the bank's audit needs.

  • Ease of data discovery and self-service for the business. 

 

The Databricks Lakehouse met the requirements. Based on the criteria above, the vision for the platform, and the proofs of concept that were executed, the bank selected Databricks as its partner, and the transformation journey began.

The bank embarked on a journey to integrate Databricks into its infrastructure, aiming to improve analytics capabilities, foster greater collaboration, streamline data processing, and increase data quality (via validation and check results in event logs). It was a multi-year journey, with an iterative approach to data and process migration and use-case onboarding, along with end-user enablement and adoption, that resulted in the following benefits.

 

Key benefits post adoption of the Databricks Lakehouse

Advanced Analytics and Data Ingestion

  1. We developed a repeatable pattern for data ingestion and data transformation through the various layers (bronze, silver, gold) to provide a standard consumption layer, allowing new use cases to be easily onboarded for various business functions.

  2. We built an ingestion framework that covered all our data load and transformation needs. It had built-in controls that maintained all the metadata and security settings needed to meet the bank's enterprise security requirements.

  3. The ingestion process accepted a JSON file that specified the source, target and required transformations. It supported both simple and complex transformations.

  4. We also leveraged Airflow, but given the complexity of the DAGs, we built our own custom framework on top of it, which accepted a YAML file describing each job and its interdependencies.

  5. Change data capture from data sources - for managing schema-level changes in Delta tables, we built a custom framework that automated DDL operations without requiring developers to rewrite processes. This also helped us implement audit controls on the data store.

  6. Example use cases included analyzing customer behavior and market trends and identifying potential risks.
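The configuration-driven ingestion described in items 2 and 3 can be sketched in a few lines of Python. The config fields (`source`, `target`, `transformations`) and the plan steps below are illustrative assumptions, not the bank's actual framework:

```python
import json

# Illustrative ingestion config; field names are assumptions, not the
# bank's actual schema.
CONFIG = """
{
  "source": {"format": "csv", "path": "/landing/accounts/"},
  "target": {"catalog": "bank", "schema": "silver", "table": "accounts"},
  "transformations": [
    {"type": "rename", "from": "acct_no", "to": "account_number"},
    {"type": "cast",   "column": "balance", "to": "decimal(18,2)"}
  ]
}
"""

def build_plan(config: dict) -> list[str]:
    """Turn a declarative ingestion config into an ordered list of steps."""
    src, tgt = config["source"], config["target"]
    plan = [f"read {src['format']} from {src['path']}"]
    for t in config.get("transformations", []):
        if t["type"] == "rename":
            plan.append(f"rename {t['from']} -> {t['to']}")
        elif t["type"] == "cast":
            plan.append(f"cast {t['column']} as {t['to']}")
        else:
            raise ValueError(f"unknown transformation: {t['type']}")
    plan.append(f"write to {tgt['catalog']}.{tgt['schema']}.{tgt['table']}")
    return plan

plan = build_plan(json.loads(CONFIG))
for step in plan:
    print(step)
```

In a real framework each plan step would map to a Spark read/transform/write call; the value of the pattern is that new sources are onboarded by adding a JSON file, not by writing new pipeline code.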

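The YAML-on-Airflow framework mentioned above boils down to resolving job interdependencies into a valid run order, which the Python standard library can sketch directly. The job names and the `depends_on` field are hypothetical:

```python
from graphlib import TopologicalSorter

# A YAML job file like the one described might look like:
#
#   jobs:
#     load_accounts:       {depends_on: []}
#     load_transactions:   {depends_on: []}
#     enrich_transactions: {depends_on: [load_accounts, load_transactions]}
#     customer_360:        {depends_on: [enrich_transactions]}
#
# After yaml.safe_load, it would become the dict below (job and field
# names are illustrative assumptions).
jobs = {
    "load_accounts": {"depends_on": []},
    "load_transactions": {"depends_on": []},
    "enrich_transactions": {"depends_on": ["load_accounts", "load_transactions"]},
    "customer_360": {"depends_on": ["enrich_transactions"]},
}

def execution_order(jobs: dict) -> list[str]:
    """Resolve job interdependencies into a valid execution order."""
    graph = {name: set(spec["depends_on"]) for name, spec in jobs.items()}
    return list(TopologicalSorter(graph).static_order())

order = execution_order(jobs)
print(order)
```

A framework like this would then emit the corresponding Airflow operators and dependency edges, so pipeline authors declare jobs in YAML instead of hand-writing DAG code.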
Machine Learning:

The bank developed and deployed its first machine-learning model to production, for credit risk assessment, using Delta tables, Spark workflows and MLflow.

Data Governance and Data Discovery through Unity Catalog:

Unity Catalog was implemented to ensure the various workspaces and technical metadata had appropriate access control measures conforming to the bank's security needs, with plans to classify assets with appropriate tags in the future. The data assets within Databricks were also easily discoverable for business end-user consumption through Unity Catalog.
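Access control in Unity Catalog is ultimately expressed as SQL grants on catalogs, schemas and tables. A minimal sketch of generating such grants from a policy map - group names and asset names are hypothetical, not the bank's:

```python
# Sketch: render Unity Catalog GRANT statements from a policy map.
# Group names and asset names are hypothetical.
policy = {
    "analysts":       {"privileges": ["SELECT"],
                       "on": "TABLE bank.gold.customer_360"},
    "data_engineers": {"privileges": ["SELECT", "MODIFY"],
                       "on": "SCHEMA bank.silver"},
}

def render_grants(policy: dict) -> list[str]:
    """Render one GRANT statement per group/asset pair."""
    return [
        f"GRANT {', '.join(spec['privileges'])} ON {spec['on']} TO `{group}`;"
        for group, spec in policy.items()
    ]

grants = render_grants(policy)
for stmt in grants:
    print(stmt)
```

Driving grants from a central policy definition, rather than ad-hoc statements, is one way to keep access auditable as workspaces and assets multiply.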

Scalability and performance:

The Lakehouse platform gave the bank the ability to scale up and down as workloads increased or decreased, paying only for compute time as needed.
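This pay-as-you-go elasticity is typically configured per cluster. A sketch of an autoscaling cluster spec as it might be passed to the Databricks Clusters API - the node type and runtime version values are illustrative assumptions:

```python
# Illustrative autoscaling cluster spec (values are assumptions).
# The "autoscale" block lets Databricks add or remove workers with load,
# and "autotermination_minutes" shuts down idle clusters so compute is
# paid for only while it is actually needed.
cluster_spec = {
    "spark_version": "13.3.x-scala2.12",
    "node_type_id": "i3.xlarge",
    "autoscale": {"min_workers": 2, "max_workers": 8},
    "autotermination_minutes": 30,
}
print(cluster_spec["autoscale"])
```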

Collaborative Environment:

The bank was able to leverage the collective expertise of its teams to solve complex problems its customers were experiencing and to drive continuous innovation within its analytical capabilities.

TCO Reduction:

~4X reduction in overall data analytics infrastructure, operations and staffing costs by migrating to Databricks.

Challenges in Adoption

The biggest challenge during the implementation was minimizing disruption to the business teams' data and analytics consumption while showing them a more efficient way to self-serve or make a change request. The teams involved in the transition were accustomed to traditional analytics tools and legacy database architecture paradigms, so change management and enablement programs had to be instituted in parallel as new data assets and data products went live, to ensure that business teams weren't disrupted.

 

Post Adoption - Optimization and onboarding of new use cases

Business lines from across the organization were ecstatic about the transformation - the combination of a modern technology platform and, more importantly, speed of execution against changing requirements and self-service. Business application data that used to take days or weeks to reach the analytics systems is now fed into the Databricks Lakehouse in near-real time. Business lines can make data-driven decisions with up-to-date information, which has helped improve overall customer growth and satisfaction. This has also positively impacted job satisfaction and employee retention, allowing the bank to focus on process optimization and other strategic initiatives.

 

What’s next?

The Databricks Lakehouse has set up the bank to easily expand to other use cases: ingesting additional new data sources (structured and unstructured), applying ML in other areas of the business, and proving out GenAI by testing LLMs for several high-value use cases, leveraging the data in the lakehouse to feed the models with the appropriate controls required for compliance.