Data is the most important asset of an enterprise. It is no secret that data is growing at a pace that forces enterprises not just to take notice but to stop, re-strategize and change their game plan, or risk falling behind. Growth in volume is one thing, but other aspects around data are also difficult to manage: formats, storage, security, and consumption in the form of analytics and business insights.
Therefore, every enterprise needs a data management strategy to address the following:
• Data Governance
• Data Operations
• Data Delivery
Key benefits of a data management strategy include:
• Higher Data Quality
• Better Business Insights driving informed business decisions
• Improved Data Security
Why Hybrid Data Management?
To understand hybrid data management better, it is important to understand the differences between traditional and emerging data & analytics platforms.
|Component||Traditional||Emerging|
|Database||RDBMS||NoSQL, Open Source|
|Data Platform||Data Warehouse||Data Lake|
|Data Management Software (ETL, Reporting, Visualization, Advanced Analytics)||Commodity Software||Open Source|
Key differences between the traditional and emerging data management approaches include:
|Aspect||Traditional||Emerging|
|Cost||Startup costs are higher and ongoing costs may be lower||Startup costs are lower and ongoing costs will be higher (OPEX)|
|Time to start||Can start later, as budget becomes available||Need to start fast to maximize ROI|
|Disaster Recovery (DR)||DR can be an additional cost||DR capability is mostly built in|
|Scalability||Not adequate; scaling takes additional cost and time||Flexibility to scale with business needs; however, if not planned accurately, cloud fees can add up quickly|
|Data Security||Complete control of highly confidential information, data privacy, and infrastructure risks & outages||Confidentiality control depends on the cloud vendor|
|Technical Community||Limited number of subject matter experts||Open Source community is growing, with quicker turnaround on issues and challenges|
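The cost row above can be made concrete with a small break-even calculation. The following is a minimal sketch, not a costing model: every figure in it is hypothetical and chosen only to illustrate the pattern of "higher startup, lower ongoing" versus "lower startup, higher ongoing" costs, and the function names are made up for this example.

```python
def cumulative_cost(startup: float, monthly: float, months: int) -> float:
    """Total spend after the given number of months."""
    return startup + monthly * months


def break_even_month(trad_startup, trad_monthly,
                     emerg_startup, emerg_monthly, horizon=120):
    """Return the first month in which the traditional option is cheaper
    in cumulative terms, or None if that never happens within the horizon."""
    for month in range(1, horizon + 1):
        traditional = cumulative_cost(trad_startup, trad_monthly, month)
        emerging = cumulative_cost(emerg_startup, emerg_monthly, month)
        if traditional < emerging:
            return month
    return None


# Hypothetical figures, for illustration only (not taken from any real quote).
month = break_even_month(
    trad_startup=500_000, trad_monthly=10_000,   # high CAPEX, low run cost
    emerg_startup=50_000, emerg_monthly=30_000,  # low CAPEX, high OPEX
)
print(month)  # 23 with the figures above
```

With these assumed numbers the emerging option is cheaper for nearly the first two years and more expensive afterwards, which is one reason neither approach wins outright.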
As the comparison above shows, each approach has advantages and disadvantages. This drives the need for a hybrid data management approach that combines the advantages of both. Other factors also call for a hybrid approach:
• Mixed analytical workloads
• Long-tail usage of traditional data platforms, since moving to emerging platforms is not easy
• Time spent on data preparation & movement rather than on transformations and analytical processing
This hybrid approach can scale with the growing enterprise needs, increase agility, enable innovation, increase predictability, improve forecasting accuracy, detect new behavioral patterns and deliver analytical insights relevant to the business processes and applications.
How to Deploy Hybrid Data Management?
Deployment of hybrid data management can start at any time, triggered by an immediate business need or by a challenge such as growing data volume or the limits of a physical data center/platform setup. Other considerations before deploying include:
• Impact on the current technical landscape, including on-premises infrastructure, data availability, data access, and data movement & processing
• Impact on business processes
• Investment considerations
• Technology choices & adoption
The following diagram illustrates a hybrid data management reference architecture for a data warehouse and a data lake platform; however, the architecture is not limited to what is in the diagram and can be customized further based on the enterprise architecture:
Hybrid data management combines features from both the traditional and emerging platforms. For example, the following combinations can be incorporated, as shown in the above diagram.
• On premises + Cloud
• Structured + Unstructured data
• Enterprise Data Sources + External Source APIs like Social Media, Weather, etc.
• SQL + NoSQL
• Data warehouse + Data Lake
• Commodity + Open Source
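To make the "Structured + Unstructured" and "SQL + NoSQL" combinations more tangible, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not a reference implementation: an in-memory SQLite table stands in for the data warehouse, a list of JSON documents stands in for a data lake or document store, and all table names, fields and records are hypothetical.

```python
import json
import sqlite3

# Structured side: a relational table standing in for the data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, region TEXT)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Acme", "EMEA"), (2, "Globex", "APAC")],
)

# Semi-structured side: schema-less JSON documents standing in for the
# data lake (for example, events pulled from external source APIs).
raw_events = [
    '{"customer_id": 1, "channel": "social", "text": "Great service"}',
    '{"customer_id": 2, "channel": "social", "tags": ["delay", "refund"]}',
    '{"customer_id": 1, "channel": "weather_api", "temp_c": 31}',
]
events = [json.loads(doc) for doc in raw_events]


def events_per_region(conn, events):
    """Count lake events per warehouse region by joining on customer id."""
    region_by_id = dict(conn.execute("SELECT id, region FROM customers"))
    counts = {}
    for event in events:
        region = region_by_id.get(event["customer_id"], "unknown")
        counts[region] = counts.get(region, 0) + 1
    return counts


print(events_per_region(conn, events))  # {'EMEA': 2, 'APAC': 1}
```

The documents do not share a schema, yet they can still be enriched with the governed, structured customer data, which is the essence of combining the two kinds of platform.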
In principle, it is extremely important for CTOs and CDOs to consider a hybrid data management strategy, so that the business has a robust data & analytics platform that drives agility, data quality and the insights needed to run and grow the business.
Author – Satish Pala
Satish Pala is a digital transformation leader with 19 years of experience in the IT industry and possesses solid experience in IT service delivery including extensive Program & Project Management, People Management, Client Management, Pre-Sales & Solutioning across various domains. He has delivered many cutting-edge analytics solutions from within India as well as from client locations for some of the top Fortune 500 companies. He is a passionate technologist & trainer with keen interest in Advanced Analytics, Big Data, Cloud and Next-gen Business Intelligence. He has worked in leading software services companies like DXC Technology, Hewlett Packard and Infosys. At Indium, he leads the digital practice which includes Big Data Engineering, Advanced Analytics, Blockchain and Product Development. He is responsible for overall delivery, capability and growth within the digital practices. Satish holds a Bachelor’s degree (B.Tech.) from IIT Madras. He is a certified Project Management Professional from PMI.