Birlasoft built an enterprise-level Central Data Repository to allow multi-function analytics
Birlasoft used DaaS (Data as a Service) to embed data and analytics into core processes and decision-making. By migrating siloed legacy data warehouse and data processing systems to a data lake on the AWS cloud, Birlasoft resolved the problem of unintegrated and redundant data sets, making data easier to access and more secure.
The solution centered on integrating siloed data and replacing personal data processing and old ETL systems. The new data lake was built on AWS S3, which also served as the upstream source for a Snowflake data warehouse on AWS. Data was processed with AWS EMR, PySpark, and Lambda to draw meaningful insights, and AWS Athena provided easy query access to the data in S3. Orchestration was handled with AWS Step Functions, Lambda, and CloudWatch. Finally, access to the data lake was managed with AWS IAM users and roles, and code was versioned in Bitbucket.
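As an illustration of the orchestration described above, the following is a minimal sketch of how a Step Functions workflow that chains Lambda tasks could be defined in Amazon States Language. It is not Birlasoft's actual workflow: the state names, the account ID, the region, and the Lambda ARNs are all hypothetical placeholders, and the sketch only builds the JSON definition without deploying anything to AWS.

```python
import json


def build_pipeline_definition(steps):
    """Build an Amazon States Language definition that runs tasks in order.

    steps: list of (state_name, lambda_arn) tuples. Every value passed in
    this example is a hypothetical placeholder, not a real resource.
    """
    if not steps:
        raise ValueError("at least one step is required")

    states = {}
    for index, (name, arn) in enumerate(steps):
        state = {"Type": "Task", "Resource": arn}
        if index + 1 < len(steps):
            # Hand over to the next state in the list.
            state["Next"] = steps[index + 1][0]
        else:
            # The last state ends the execution.
            state["End"] = True
        states[name] = state

    return {
        "Comment": "Illustrative data lake pipeline (assumed layout)",
        "StartAt": steps[0][0],
        "States": states,
    }


# Hypothetical Lambda functions: land raw files in S3, submit the PySpark
# job to EMR, then refresh the catalog that Athena queries.
ARN_PREFIX = "arn:aws:lambda:us-east-1:123456789012:function:"
PIPELINE_STEPS = [
    ("IngestToS3", ARN_PREFIX + "ingest-to-s3"),
    ("TransformOnEmr", ARN_PREFIX + "submit-emr-step"),
    ("RefreshCatalog", ARN_PREFIX + "refresh-catalog"),
]

definition = build_pipeline_definition(PIPELINE_STEPS)
definition_json = json.dumps(definition, indent=2)
print(definition_json)
```

The resulting JSON is the kind of document that would be supplied as the definition when creating a state machine. Retries, error handling, and CloudWatch alarms are left out to keep the sketch short.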
The data lake solution addressed the issue of data swaps, eliminated redundancy, and improved data governance. The previously siloed data now sits in a central data repository, which improves cross-departmental partnerships and integration opportunities. Furthermore, access to confidential and restricted data is managed seamlessly using AWS IAM.
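To make the IAM-based access management more concrete, here is a minimal sketch of a read-only IAM policy document scoped to one prefix of a data lake bucket. The bucket name, the prefix, and the helper function are assumptions made for illustration; the actual policies used in this project are not described in the case study.

```python
import json


def read_only_prefix_policy(bucket, prefix):
    """Return an IAM policy document granting read-only access to one prefix.

    bucket and prefix are hypothetical values supplied by the caller.
    """
    prefix = prefix.strip("/")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing is a bucket-level action, so it is limited to
                # the prefix with a condition key.
                "Sid": "ListPrefix",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}/*"]}},
            },
            {
                # Reading is an object-level action, so the resource
                # itself is limited to the prefix.
                "Sid": "ReadObjects",
                "Effect": "Allow",
                "Action": "s3:GetObject",
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}/*",
            },
        ],
    }


# Hypothetical bucket and prefix holding restricted finance data.
policy = read_only_prefix_policy("example-data-lake", "finance/curated/")
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

A policy like this could be attached to an IAM role assumed by analysts of a single department, so that restricted data under other prefixes stays out of reach.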
Technology stack: AWS IAM, EMR, S3 buckets, Lambda, Step Functions, Athena, Data Catalog, GitHub, Snowflake on AWS.