Data analytics software tools enable businesses to analyze vast stores of data for competitive advantage. Such software can mine data that tracks a diverse array of business activity, from current sales to historic inventory, and process it based on data scientists' queries.
Analytics Insight has engaged in an exclusive interview with Pete Goddard, CEO of Deephaven Data Labs.
Deephaven is a data software company. The initial version of its engine was developed as an in-house product at a quantitative hedge fund. Its impact on that team's productivity and its ability to service an extreme range of use cases encouraged the founding engineers and the company's CEO to spin it out and form an independent software company, Deephaven Data Labs.
Data that changes. Its engine is best of breed at the intersection of real-time data and any or all of ML, data science, application development, analytics, and BI. Because new data often need context to be valuable, Deephaven excels at workflows that combine both dynamic data (streams) and static data (batches).
Diverse, data-driven teams. The framework around the Deephaven engine empowers a large range of people to be productive on a shared platform. Quants, AI scientists, data developers, and dashboard users are all equipped to be directly productive with real-time data. They'll respectively use different parts of the suite of Deephaven tools. This benefits collaboration and accelerates the innovation cycles fundamental to the enterprise.
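The idea of combining dynamic data (streams) with static data (batches) can be sketched in a few lines of plain Python. This is an illustration only, not Deephaven's API: the function name `enrich_ticks`, the field names, and the sample values are all assumptions made for the example.

```python
def enrich_ticks(ticks, reference_by_symbol):
    """Join each streaming tick with static reference data for its symbol."""
    for tick in ticks:
        reference = reference_by_symbol.get(tick["symbol"])
        if reference is None:
            continue  # No static context for this symbol; skip the tick.
        yield {**tick, **reference, "notional": tick["price"] * tick["size"]}


# Static (batch) data: loaded once. Values are made up for illustration.
reference_by_symbol = {"AAPL": {"sector": "Technology"}}

# Dynamic (stream) data: this iterator stands in for an unbounded feed.
ticks = iter([
    {"symbol": "AAPL", "price": 10, "size": 3},
    {"symbol": "ZZZZ", "price": 5, "size": 1},
])

enriched = list(enrich_ticks(ticks, reference_by_symbol))
```

Each incoming tick picks up its context (here, a sector) from the static table as it arrives, which is the sense in which new data need context to be valuable.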
Deephaven's architecture has evolved with a "dynamic-data first" focus. This contrasts greatly with the bulk of the data software industry, which was built to serve static data and is now trying to adapt to the new world order.
Because of this, Deephaven has fundamentals within the engine that others do not – ones that cannot be "bolted on". The three that are most empowering to users are:
An update model that is constantly tracking changes to tables, rather than simply static snapshots thereof. This matters a lot because many operations and calculations can be much faster and more efficient when tracking changes instead of looking at a whole new snapshot of a table.
Overt capabilities that allow users to deliver chains of business or AI logic to data. This contrasts with typical SQL solutions and has benefits in ease of use, in the support for complex use cases, and in the way intermediate calculations can be made available to teammates and enterprise applications.
Solutions for delivering dense dynamic data to browsers and other downstream applications with a full embrace of open formats. Deephaven has extended a popular open project called Apache Arrow Flight to support table data that changes. Deephaven provides users with the ability to publish quickly changing real-time tables, periodically updating tables, and static tables to downstream applications and consumers using all the same methods and APIs. Further, it leads the industry in its browser integrations. (Previously, handling quickly changing data and massive tables in browsers had only subpar solutions.)
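To make the first of these points concrete, here is a minimal sketch in plain Python of the difference between recomputing an aggregate from a full snapshot and maintaining it from row-level changes. It is not Deephaven's engine or API; the names `sum_by_key_snapshot` and `IncrementalSumByKey` and the (symbol, quantity) row shape are assumptions made for illustration.

```python
from collections import defaultdict


def sum_by_key_snapshot(rows):
    """Recompute totals from a full snapshot; work grows with table size."""
    totals = defaultdict(int)
    for key, quantity in rows:
        totals[key] += quantity
    return dict(totals)


class IncrementalSumByKey:
    """Hypothetical aggregator that applies only added and removed rows."""

    def __init__(self):
        self.totals = defaultdict(int)
        self.row_counts = defaultdict(int)

    def apply(self, added=(), removed=()):
        # Work is proportional to the size of the change, not the table.
        for key, quantity in removed:
            self.totals[key] -= quantity
            self.row_counts[key] -= 1
            if self.row_counts[key] == 0:
                # Last row for this key is gone, so drop the key entirely.
                del self.totals[key]
                del self.row_counts[key]
        for key, quantity in added:
            self.totals[key] += quantity
            self.row_counts[key] += 1
        return dict(self.totals)


# Illustrative data only: (symbol, quantity) rows.
table = [("AAPL", 100), ("MSFT", 50)]
aggregator = IncrementalSumByKey()
aggregator.apply(added=table)

# One new row arrives; only that row is processed.
new_rows = [("AAPL", 25)]
table = table + new_rows
incremental_totals = aggregator.apply(added=new_rows)
```

Both approaches arrive at the same totals, but the incremental version touches one row per update while the snapshot version rescans the whole table. That gap widens as tables grow and updates become more frequent.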
The product itself needed to evolve to service a greater variety of usage patterns. With an in-house product, one can largely dictate how people use a system because they work for you. As a software vendor, one must match prospects' expected workflows and minimize the pain of change, while offering compelling value and differentiators. To do this, Deephaven focused on deeply partnering with a small number of very sophisticated clients. By listening intently to their needs and partnering with them on priorities and solutions, a de facto round table of contributors – a "private community" – shaped Deephaven's engine and framework into what it is today.
Finally, all of this happened within a quickly moving environment. The role of respective CS languages, the cloud, containerization, data streams, community software, and open-source business models has changed a lot over the few years since Deephaven was formed. All of these impacted the expectations of users and enterprise decision-makers. Deephaven established a commitment to be modern and to embrace these trends. Getting the surfboard to the tip of that wave took some effort.
Five massive themes are swirling together today:
The synergies and capabilities offered by cloud solutions.
The seeming tension between solutions respectively servicing unstructured and structured data.
The implications of SQL vs. NoSQL solutions, as implied by the tension noted above.
The role of streaming data and real-time solutions in a historically batch-dominated world.
The capabilities of AI to address previously unsolved challenges and to offer new efficiencies.
Consensus suggests that growth is toward cloud, streaming data, and solutions that can handle unstructured data and support AI. The greatest values, however, are delivered by solutions that embrace these trends while appropriately incorporating the need to service legacy workflows, code, and data structures. For these reasons, the interoperability of one's system matters most for those focused beyond the short term. Commercial ecosystems are trying to become all-encompassing solutions, but open-source software and open formats are the real stories.
Deephaven just released its open, community version. The company's paramount priority is to explain its value proposition, partner with users, and evolve the project and product in a direction that best serves the community. Invariably this effort combines storytelling, support, and R&D, and Deephaven is committed to, and is resourcing, all three.
From a product evolution perspective, because Deephaven services a range of personas and use case flavors, the project plan incorporates a few inter-connected priorities:
Further evolving Deephaven libraries to deliver the engine's excellence with real-time data to popular AI libraries (like PyTorch and TensorFlow).
Increasing the suite of data ingestion and exhaust capabilities, with particular focus on SQL, ODBC, and CDC integrations, on-disk column stores (ORC, Iceberg), popular data lakes, and enhanced Kafka-related capabilities.
Continuing to deliver elegant workflows for data scientists and developers. Adding features to our web IDE and bringing to market the Jupyter integration we have been baking are important, as is a thoughtful approach to more richly integrating with VS Code and JetBrains' IDEs.
Promoting and enhancing Barrage, the (open-source) protocol for dynamic data that we have married to the Apache Arrow Flight project. We look forward to partnering with that community to extend all of their data science utility toward dynamic and real-time data.
Continuing to invest in performance. The definitions of "fast" and "fast enough" will continue to move, and it is important that Deephaven users can magically keep themselves at that leading edge.
Delivering Deephaven-as-a-cloud-service. Turnkey, elastic, configurable.
Supporting peer-to-peer publishing and consumption of dynamic tables.
I think some fundamentals address the question:
For any organization, the most expensive cost is slowness, missed opportunity, and lack of innovation. These costs hit top lines and can create an existential problem.
It is almost always the case that people cost more money than computers, particularly in today's world of elastic cloud workloads. Ensuring people are productive is key.
People are most productive when their workflows are collaborative; when the friction to their work is reduced; when their tools are empowering, easy, and (hopefully) fun; and when it's a breeze to get data, resources, or help from colleagues or others.
All of the above points to choosing:
open-first software solutions that prioritize interoperability,
open data formats,
"fewer systems that do more things well enough" rather than "a patchwork of solutions that each do a narrow thing impeccably".
To make this approach future-proof, we further underscore the importance of architectures that embrace data that changes and APIs that communities can evolve and use to easily deliver new components and experiences.