Before diving into why a Data Management Platform (DMP) is relevant to data engineers, it is essential to understand what a DMP is.
A DMP (Data Management Platform), in layman's terms, is a platform built to manage marketing data. Data about devices and users (prospects and customers) is ingested from different data providers. Audiences (unique sets of users or devices) are created based on segmentation criteria. These audiences are then targeted with campaigns for customer conversion, up-sell, cross-sell, etc.
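To make the idea of an audience concrete, here is a minimal sketch of applying segmentation criteria to a set of profiles. The `Profile` fields, the `engaged_us_travellers` rule and the sample data are all hypothetical, chosen only for illustration; a real DMP exposes its own rule builder and profile schema.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    # Hypothetical profile attributes, not taken from any specific DMP.
    profile_id: str
    country: str
    page_views_30d: int
    interests: set = field(default_factory=set)


def build_audience(profiles, rule):
    """Return the IDs of all profiles that satisfy the segmentation rule."""
    return {p.profile_id for p in profiles if rule(p)}


def engaged_us_travellers(p):
    # Hypothetical segmentation criteria: engaged US visitors interested in travel.
    return p.country == "US" and p.page_views_30d >= 5 and "travel" in p.interests


profiles = [
    Profile("d1", "US", 12, {"travel", "sports"}),
    Profile("d2", "US", 2, {"travel"}),
    Profile("d3", "DE", 20, {"travel"}),
    Profile("d4", "US", 7, {"finance"}),
    Profile("d5", "US", 5, {"travel"}),
]

audience = build_audience(profiles, engaged_us_travellers)
print(sorted(audience))  # ['d1', 'd5']
```

Because the audience is a set of unique IDs, it can be handed to a targeting platform without worrying about duplicate users or devices.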
Most of the early DMP providers were niche players that were later acquired by bigger players. This allowed big marketing companies to bridge the gap and sell complete marketing solutions to their customers, e.g. Adobe acquired Demdex, Oracle acquired BlueKai, Salesforce acquired Krux, etc.
The major functionalities of a DMP include defining complex segmentation criteria; integrating first-, second- and third-party data; cross-device ID stitching to identify unique profiles; stitching and sessionization to identify, track and optimize the user journey; and, lastly, look-alike modeling to identify prospects similar to the organization's high-value customers.
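Sessionization, one of the functionalities listed above, can be sketched as grouping each user's events into sessions separated by a period of inactivity. The 30-minute timeout and the `(user_id, timestamp)` event shape are assumptions for this example, not something a particular DMP prescribes.

```python
from collections import defaultdict

SESSION_GAP_SECONDS = 30 * 60  # assumed inactivity timeout


def sessionize(events, gap=SESSION_GAP_SECONDS):
    """Group (user_id, unix_timestamp) events into per-user sessions.

    A new session starts whenever the time since the user's previous
    event exceeds `gap` seconds. Returns {user_id: [[ts, ...], ...]}.
    """
    by_user = defaultdict(list)
    for user_id, ts in events:
        by_user[user_id].append(ts)

    sessions = {}
    for user_id, stamps in by_user.items():
        stamps.sort()
        user_sessions = [[stamps[0]]]
        for ts in stamps[1:]:
            if ts - user_sessions[-1][-1] > gap:
                user_sessions.append([ts])    # gap exceeded: open a new session
            else:
                user_sessions[-1].append(ts)  # still within the current session
        sessions[user_id] = user_sessions
    return sessions


events = [("u1", 0), ("u1", 600), ("u1", 4000), ("u2", 100), ("u1", 4100)]
result = sessionize(events)
print(result["u1"])  # [[0, 600], [4000, 4100]]
```

At DMP scale the same logic would typically run as a distributed job rather than in a single process, but the session-boundary rule stays the same.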
A DMP is usually fed with three kinds of data: first-party, second-party and third-party data. First-party data is the organization's own data, such as data from its owned websites, CRM data, POS data, surveys, etc. Second-party data is the organization's partner data. A partner is an organization that is willing to share its data, under a legal agreement or partnership, with the organization that wants to acquire it. Lastly, third-party data is data supplied by third-party providers such as eXelate, Acxiom, etc., which place third-party cookies on various websites and collect user data across different websites and platforms. The data collected can yield valuable insights, which proves extremely beneficial in enabling businesses across
Technical Architecture of DMP
A DMP is primarily intended to create the right audiences for campaigns to run against. Looking at the typical internal architecture of the major DMP providers reveals the following commonalities:
• As a DMP stores and processes huge amounts of non-PII data for multiple clients, on-premise infrastructure makes little sense. Therefore, most DMP providers host and process the data in the cloud, with the majority of them using AWS cloud offerings
• Metadata about segments, sources and destinations is stored in a relational or hierarchical NoSQL database
• Incoming batch feeds are staged in cloud storage like S3
• Message queues like Kafka are used to stream in real-time data from websites and other real-time sources
• Spark and related technologies are used to perform parallel batch and real-time processing of the incoming data
• Wide-column NoSQL databases like HBase are used to ingest web streaming data
• Cleaned, aggregated and sessionized data is sent to a cloud-hosted MPP columnar data warehouse like Redshift, where the analytics is performed
• Cross-device ID matching and profile merging are done using deterministic modeling. Here the DMP vendor searches its complete data lake for devices tied to the same known identifier, such as a login ID
• Cross-device ID matching and profile merging are also done using probabilistic modeling, which applies machine learning techniques to device attributes and user behavior
• Look-alikes are identified using machine learning models (users/devices exhibiting similar attributes and behavior are identified based on the look-alike percentage and the reach the customer is looking for)
• Visualization capabilities are integrated into the DMP tools, and the reports query the MPP data warehouse directly. A standard set of analytical reports is displayed in the dashboard, providing insights into traits, segments, campaign performance, etc.
• Once the users and segments are processed, this information is streamed out, either in real time or via batch feeds, to DSPs, ad servers, personalization tools, email campaign tools, CDPs (Customer Data Platforms), partners, data warehouse platforms, reporting platforms for advanced analytics and visualization, data science initiatives for advanced prediction, modeling and analytics, etc.
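The deterministic cross-device matching mentioned in the list above can be illustrated with a union-find (disjoint-set) structure: any two devices observed with the same known identifier end up in one profile. This is a minimal sketch, not how any specific vendor implements it; the `(device_id, hashed_login_id)` observation format and the sample IDs are hypothetical.

```python
from collections import defaultdict


class UnionFind:
    """Minimal disjoint-set structure over arbitrary hashable nodes."""

    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def stitch_profiles(observations):
    """Merge devices that share a login into unified profiles.

    observations: iterable of (device_id, hashed_login_id) pairs.
    Returns a sorted list of sorted device-ID lists, one per profile.
    """
    uf = UnionFind()
    for device_id, login in observations:
        # Tag nodes so a device ID can never collide with a login ID.
        uf.union(("device", device_id), ("login", login))

    groups = defaultdict(set)
    for node in list(uf.parent):
        kind, value = node
        if kind == "device":
            groups[uf.find(node)].add(value)
    return sorted(sorted(devices) for devices in groups.values())


observations = [
    ("phone-1", "h_a"),
    ("laptop-1", "h_a"),
    ("tablet-1", "h_b"),
    ("laptop-1", "h_b"),  # links login h_b to the same person as h_a
    ("tv-9", "h_c"),
]
stitched = stitch_profiles(observations)
print(stitched)  # [['laptop-1', 'phone-1', 'tablet-1'], ['tv-9']]
```

Probabilistic matching would replace the exact shared-identifier test with a model score over attributes and behavior, but the merging step can stay the same.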
DMP for Data Engineers
Looking at the technologies, toolsets and functionalities above, it quickly becomes clear that this is exactly what data engineers are capable of and specialize in.
Apart from building DMP products, data engineers can also work on custom requirements, building similar or more advanced capabilities and adding the specific features a client is looking for. A CDP (Customer Data Platform) is another offering to consider when the need is a 360-degree customer view and certain PII data is required for deeper, extended analysis and targeting.
Apart from building DMP tools and functionality, the following are other areas where data engineers' expertise around DMPs comes in handy:
• During the DMP fusion and design phase, the organization identifies the data sources it wants to bring into the DMP. It is the data engineers' job to connect to those sources and make sure the data is fed in properly. Data may need to be ingested by calling APIs, by consuming HTTP endpoints, or from custom data sources, databases, data warehouses, website tags, second-party data feeds, third-party data feeds, CRM feeds, batch feeds, etc.
• The DMP vendor might help establish the data connections, but it is always better to have an in-house specialist who knows all the connection touchpoints
• Ensure and validate that the data is coming in
• Implement user stitching and verify that it is done properly, so that the user journey is captured seamlessly
• Implement ID unification and verify that IDs are merged into a single profile using deterministic and probabilistic modeling techniques
• Implement and verify that the segmentation rules are properly defined
• Send audiences from DMP to the targeting platform
• Understand the data extract needs. Does the data need to be pushed in real-time or in batch mode?
• Understand the data which the DMP holds, its relevance, use and applicability during the campaign.
• Understand the specifics of the information the DMP can send out
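One of the tasks above, validating that data is actually coming in, can be sketched as a basic check on a batch feed. The CSV layout, the `REQUIRED_COLUMNS` schema and the validation rules here are assumptions made for illustration; a real feed would be checked against whatever contract was agreed with the source.

```python
import csv
import io

# Hypothetical feed schema, assumed for this example.
REQUIRED_COLUMNS = {"user_id", "event_type", "timestamp"}


def validate_feed(text):
    """Run basic sanity checks on a CSV batch feed and return a report."""
    reader = csv.DictReader(io.StringIO(text))
    missing = REQUIRED_COLUMNS - set(reader.fieldnames or [])
    if missing:
        return {"ok": False, "missing_columns": sorted(missing),
                "rows": 0, "bad_rows": []}

    rows, bad_rows = 0, []
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        rows += 1
        # Flag rows with an empty user ID or a non-numeric timestamp.
        if not row["user_id"] or not (row["timestamp"] or "").isdigit():
            bad_rows.append(line_no)

    return {"ok": rows > 0 and not bad_rows, "missing_columns": [],
            "rows": rows, "bad_rows": bad_rows}


sample = (
    "user_id,event_type,timestamp\n"
    "u1,page_view,1700000000\n"
    ",click,1700000050\n"
    "u3,page_view,not-a-time\n"
)
report = validate_feed(sample)
print(report["rows"], report["bad_rows"])  # 3 [3, 4]
```

An empty feed is reported as not OK, which catches the case where a connection is established but no data is flowing.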
Other, more complex areas where data engineers can extend the functionality provided by DMP tools are:
• The architecture and design of a standard DMP tool are fixed, and changes often take time to be incorporated into the tool. If a fixed process has a major performance impact, data engineers can step in to create a customized alternative process or approach to achieve the desired result
• Break up terabyte-sized files into smaller chunks for faster parallel ingestion and processing. If the SLA for a particular process is defined by the product and the result is needed faster, a customized/parallel job can be written to perform the action
• Because data engineers understand the underlying DMP technical process, they can create workaround solutions to achieve the desired result
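The chunking idea above can be sketched as follows: split the input into fixed-size chunks, process the chunks with a worker pool, and merge the partial results. The `user_id,event` line format and the per-user count are hypothetical, and a thread pool is used only to keep the sketch self-contained; CPU-bound work would more likely run on separate processes or a framework such as Spark.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice


def iter_chunks(lines, chunk_size):
    """Yield successive lists of at most `chunk_size` lines."""
    it = iter(lines)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


def count_events(chunk):
    """Count events per user in one chunk of 'user_id,event' lines."""
    counts = {}
    for line in chunk:
        user_id, _event = line.rstrip("\n").split(",", 1)
        counts[user_id] = counts.get(user_id, 0) + 1
    return counts


def parallel_event_counts(lines, chunk_size=1000, workers=4):
    """Process chunks on a worker pool and merge the partial counts."""
    totals = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_events, iter_chunks(lines, chunk_size)):
            for user_id, n in partial.items():
                totals[user_id] = totals.get(user_id, 0) + n
    return totals


lines = [f"u{i % 3},page_view" for i in range(10)]
totals = parallel_event_counts(lines, chunk_size=4, workers=2)
print(totals)  # {'u0': 4, 'u1': 3, 'u2': 3}
```

Note that `Executor.map` submits every chunk up front, so a job over a genuinely terabyte-sized file would need to bound the number of in-flight chunks or split the file by byte ranges instead.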
Data engineers, when wearing the hat of a data analyst, can:
• Understand and analyze the source data. As data engineers/analysts work with data constantly, they understand it well and know the different kinds of data
• Analyze and infer which data points and data sources are more meaningful for the marketing campaigns
• Analyze and suggest which third party providers’ data can further enhance the existing first party and second party data
• Understand the targeted users, segmentation criteria, marketing criteria, marketing rules and marketing campaigns. After analyzing all of this information and the incoming data in the DMP, data analysts and engineers can provide much better input on what and where to focus
• Analyze the DMP reports, segment performance and campaign performance, and work in conjunction with the marketing analysts to enhance marketing performance
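As one example of this kind of analysis, the question of which third-party provider best complements the existing first-party data can be approached by measuring the match rate, i.e. the share of first-party IDs that also appear in a provider's feed. This is only a rough sketch: the provider names and IDs are made up, and match rate is just one of several criteria (data quality, cost, attribute coverage) that would matter in practice.

```python
def match_rate(first_party_ids, provider_ids):
    """Fraction of first-party IDs that also appear in the provider's data."""
    first_party = set(first_party_ids)
    if not first_party:
        return 0.0
    return len(first_party & set(provider_ids)) / len(first_party)


# Hypothetical first-party IDs and provider feeds.
first_party = {"u1", "u2", "u3", "u4"}
providers = {
    "provider_a": {"u1", "u2", "x9"},
    "provider_b": {"u1", "u2", "u3", "y7"},
}

rates = {name: match_rate(first_party, ids) for name, ids in providers.items()}
best = max(rates, key=rates.get)
print(rates)  # {'provider_a': 0.5, 'provider_b': 0.75}
print(best)   # provider_b
```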
After weighing the options, some organizations might decide to use industry-standard DMP tools to get a quick jump start. In this case most of the DMP functionality will be taken care of by the platform (which is itself built by data engineers). But there is always scope and a requirement for additional functionality and capabilities that the DMP vendors do not provide.
DMP engineers and analysts have a big role to play alongside the marketing analysts in a successful DMP and marketing campaign implementation, and this fact can't be discounted when an organization leaps into a DMP implementation.