A recent cover story in ‘The Economist’ emphasized the growing importance of data, stating “the world’s most valuable resource is no longer oil, but data.” Machine learning, which is bringing about the most dramatic advancements in artificial intelligence, is a data-intensive technique. Large amounts of data are required to create, train and test AI systems. As AI gains importance in the business world, so does data.
Financial firms are leveraging AI to advise customers on their investment choices, automakers are using it to build autopilot systems, and virtual assistants similar to Siri and Cortana are being introduced. AI now has immense potential to boost economic growth across various sectors. Data possession and AI-driven analysis have become important areas for businesses looking to compete with each other.
In India, a DRDO organization named the ‘Centre for Artificial Intelligence and Robotics’ was established as early as 1986. So artificial intelligence is not something we have only just discovered. Capabilities in AI and machine learning remained dormant for a long time because large volumes of data from multiple sources were unavailable. Now we not only have great volumes of data from varied sources but also the ability to analyze massive data sets for trends, patterns and associations in milliseconds. Developments in the field of ‘Big Data’ have significantly transformed the scope and future of AI.
We can list five main reasons for seeing big data as a critical enabler of AI.
1. The Big Computing Power
Computational capacity plays a central role in transforming data from a compliance burden into a business asset. Until recently, technologies involving large-scale cluster computing or analytic algorithms were deemed too costly and time-consuming. Thanks to the exponential rise in computing speed, millions of records can now be processed in a tiny fraction of the time once required. CPUs and GPUs, with their sequential and parallel computing capabilities, help process data in real time and subsequently derive rules for AI-based applications.
2. ‘Data First’ Approach
Agile, ready access to large volumes of data is driving a rapid revolution in AI-based applications. Earlier, data scientists and statisticians were limited to working with ‘sample datasets’. Big data has relieved them of this constraint: they can now work with the real data itself, in all its nuance and detail. Big data enables iteration-based data discovery, and with excellent indicative and predictive analytics tools now available, more and more organizations are moving away from a hypothesis-based approach towards a data-first approach.
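The difference between working from a sample and working from the full data can be illustrated with a minimal Python sketch. The transaction amounts below are synthetic, and the helper names `make_transactions` and `summarize` are hypothetical, introduced only for this illustration. The idea is that a handful of rare, very large values are always visible in the full dataset, while a small random sample may easily miss them.

```python
import random
import statistics


def make_transactions(n, seed=0):
    """Generate synthetic transaction amounts (an assumption for illustration)."""
    rng = random.Random(seed)
    amounts = [round(rng.uniform(10, 100), 2) for _ in range(n)]
    # Inject five rare, very large values that a small sample can easily miss.
    for i in range(0, n, n // 5):
        amounts[i] = 10_000.0
    return amounts


def summarize(values):
    """Return a few basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "max": max(values),
    }


full = make_transactions(100_000)
sample = random.Random(1).sample(full, 100)  # the old 'sample dataset' approach

full_summary = summarize(full)
sample_summary = summarize(sample)

print("full  :", full_summary)
print("sample:", sample_summary)
```

With only 100 of the 100,000 records, the sample has a small chance of containing any of the five large values, so its maximum and mean can differ noticeably from those of the full data.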
3. NLP for Big Data
Natural language processing (NLP) technologies are being leveraged in a variety of interactive applications such as Siri, Alexa, online banking service bots and the like. Learning from human communication is an integral part of AI. Human datasets are voluminous, spanning a variety of languages and dialects. NLP for big data is used to automatically find relevant information or summarize large volumes of content to obtain collective insights. In the ever-increasing stores of content, big data can also help reveal patterns and trends across disparate data sources.
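As a rough illustration of finding recurring themes across many pieces of text, the sketch below counts how many messages each term appears in. It uses only the Python standard library; the sample messages, the tiny stopword list and the function names are assumptions made for this example, and a production NLP pipeline would handle multiple languages, stemming and far larger volumes.

```python
import re
from collections import Counter

# A deliberately tiny stopword list, assumed for this illustration only.
STOPWORDS = {"the", "a", "is", "to", "and", "my", "i", "was", "for", "on", "it", "in"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def document_frequency(docs):
    """Count in how many documents each term appears at least once."""
    counts = Counter()
    for doc in docs:
        counts.update(set(tokenize(doc)))
    return counts


# Hypothetical customer messages, as a service bot might receive them.
messages = [
    "My card was declined at the store",
    "Card declined again, please help",
    "How do I reset my password",
    "The app declined my card payment",
]

df = document_frequency(messages)
# Terms that occur in at least three of the four messages.
top_terms = sorted(term for term, n in df.items() if n >= 3)
print(top_terms)
```

Even this simple count surfaces the dominant theme in the messages (declined cards), which is the kind of pattern that becomes valuable when the same idea is applied across very large and disparate text sources.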
4. Programming Languages and Platforms
Among developers’ favorite programming languages for AI development, Python is recommended for its simplicity, syntax and versatility. It is a very portable language and can be used on platforms like Linux, Mac OS, Windows and UNIX. It offers excellent statistical data analysis capabilities and an extensive variety of libraries and tools. For commercial-scale operations, big data platforms like Hadoop can also be used. The choice of language and process will depend on the desired level of functionality of the AI application being developed.
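To give a small taste of the statistical capabilities mentioned above, here is a minimal sketch using Python’s built-in `statistics` module. The response-time figures are made up for illustration; larger projects would typically reach for third-party libraries instead.

```python
import statistics

# Hypothetical response times in milliseconds, with one slow outlier.
response_times = [12, 15, 11, 14, 98]

mean_time = statistics.mean(response_times)      # sensitive to the outlier
median_time = statistics.median(response_times)  # robust to the outlier
spread = statistics.stdev(response_times)        # sample standard deviation

print(mean_time, median_time, round(spread, 1))
```

The gap between the mean and the median shows how a single extreme value can distort an average, which is a common first check when exploring real data.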
5. Balance Between Cost and Performance
Memory devices such as DRAM and NAND flash now make efficient storage and retrieval of big data possible. In-memory server architectures, increasingly used in advanced databases and high-speed analytics shops, favor DRAM as the preferred solution. Upmem, a French company, has come out with a method to offload actual processing to DRAM for AI workloads. By connecting thousands of DRAM processing units (DPUs) to a traditional processor, workloads are offloaded to the DPUs where, under the same power envelope, they are expected to run twenty times faster. Similarly, NAND memory caches have lowered data center power consumption by significant margins while being deployed at a fraction of the cost. Enterprise executives should know whether they need high performance or high efficiency.
Big data’s impact goes well beyond simple data and analytics. The waves of innovation and disruption are being heightened by the powerful combination of AI and big data. These two are among the most promising technology paths that businesses will tread in the future. The first wave of big data was all about flexibility and speed. The second wave will be all about leveraging the power of AI by understanding its convergence and interdependence with big data.