Today, the adoption of machine learning has become a key development trend for software companies: the use of AI increasingly determines how competitive a product is on the global IT market. However, engineers who are just starting out in AI face a great variety of questions about which basic algorithmic approaches to choose for their problems. In this article, I’d like to describe the main features and use cases of these approaches in AI software development.
How do you supply a learning system with a sufficient amount of high-quality training data? This is the central problem of applying machine learning in real projects, and almost everyone who tries to use ML runs into it. It is widely believed that the choice of tools is limited to variations of deep neural networks and ways of combining them into a single architecture.
This is how the Generative Adversarial Networks (GAN) approach emerged. By pitting two networks against each other, it implicitly and automatically generates additional training samples, significantly reducing the amount of real data required for training.
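To make the adversarial idea concrete, here is a minimal sketch of a GAN training loop on toy one-dimensional data. This is an illustration only, written with PyTorch; the data distribution and network sizes are arbitrary choices, not any particular production architecture:

```python
# A toy GAN: the generator learns to mimic samples from N(4, 1.25),
# while the discriminator learns to tell real samples from generated ones.
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))                # generator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())  # discriminator
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    real = torch.randn(32, 1) * 1.25 + 4.0   # "real" training data
    fake = G(torch.randn(32, 8))             # implicitly generated samples

    # Discriminator step: push real toward 1, fake toward 0
    opt_d.zero_grad()
    loss_d = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
    loss_d.backward()
    opt_d.step()

    # Generator step: try to fool the discriminator (fake toward 1)
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(32, 1))
    loss_g.backward()
    opt_g.step()
```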
There is also another approach that differs considerably from neural networks: probabilistic reasoning and probabilistic inference. It places minimal requirements on the training data set, but its drawback is a high entry threshold for practical implementation. It is this approach that has allowed a number of start-ups to distinguish themselves on the AI software development market.
One of them was Gamalon, a company that named its technology “Bayesian Program Synthesis” (BPS).
Its marketing promotion used the following pitch: “Bayesian Program Synthesis (BPS) technology versus state-of-the-art deep learning. We show that the Gamalon BPS system learns from only a few examples, not millions. It can learn using a tablet processor, not hundreds of servers. It learns right away while we play with it, not over weeks or months. And it learns from just one person, not from thousands. Someday soon you might even have your own private machine intelligence running on your mobile device”.
Sounds great. So what is the main idea of this approach?
Let’s look at the problem of machine learning from a different angle. A typical task is to build a universal algorithm that learns, from training data, to produce the right output for a given input.
The initial success of neural networks was based on the idea of a universal algorithm that can reproduce any functional dependency to arbitrary precision and can therefore, in principle, be trained to solve any task. This is an inductive approach: a general model is derived from a large number of specific instances.
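As a toy illustration of this inductive path, the sketch below fits a small network to samples of sin(x). The network never sees a formula, only input/output examples, and induces a general approximation from them (this assumes scikit-learn is available; sin(x) stands in for any unknown dependency):

```python
# Inductive learning: recover an unknown dependency purely from examples.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-np.pi, np.pi, size=(500, 1))  # specific instances ...
y = np.sin(X).ravel()                          # ... of the hidden dependency

model = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000, random_state=0)
model.fit(X, y)                                # generalize from the examples

print(model.predict([[1.0]]))  # close to sin(1.0) ≈ 0.841
```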
However, the more complicated the dependency, the more training examples are needed to pin it down. Modern neural networks that classify thousands of object categories in images are trained on hundreds of millions of examples.
The opposite of learning from samples is a deductive approach, which moves from the general to the specific. Here the central role is played by a conceptual model: prior knowledge of how the inputs and outputs of the system being built can be related. This is the model-based approach to inference.
In this setting, training reduces to estimating the parameters of the relations already encoded in the model, so it does not require a large amount of training data. The most successful models of this kind are Bayesian models, which relate the prior probability of a hypothesis to its posterior probability conditioned on the available data.
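At the heart of any such model is Bayes’ rule, which turns a prior and a likelihood into a posterior:

P(model | data) = P(data | model) · P(model) / P(data)

Here P(model) encodes what we believe before seeing any data, P(data | model) describes how the conceptual model explains observations, and P(model | data) is the updated belief once the data arrive.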
Such models can solve not only direct prediction problems but also inverse problems: tasks of recovering the hidden causes behind the observed relations between inputs and outputs.
For example, given a series of tosses, they let us infer how biased a coin is.
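Here is a minimal sketch of that coin example in plain NumPy, using a conjugate Beta prior so the posterior can be written down exactly (the toss data is made up for illustration):

```python
# An inverse problem: infer the probability of heads from observed tosses.
import numpy as np

tosses = np.array([1, 1, 0, 1, 1, 1, 0, 1, 1, 1])  # 1 = heads; a suspicious run
heads = tosses.sum()
tails = len(tosses) - heads

# Beta(1, 1) prior (no opinion about the bias); the Beta-Binomial update is exact
a_post, b_post = 1 + heads, 1 + tails

print("posterior mean bias:", a_post / (a_post + b_post))  # 9/12 = 0.75
```

Only ten examples, and the model already quantifies how biased the coin looks, along with the uncertainty encoded in the full Beta(9, 3) posterior.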
These models have been known for a long time, but their use has always been hampered by serious difficulties in computing the transformation prescribed by Bayes’ rule. Only recent developments have made the approach practical, namely the emergence of probabilistic programming languages: program construction tools that describe a conceptual model of the relations between input and output data in numerical and probabilistic terms, and that come with a built-in engine for probabilistic inference.
Anyone who masters probabilistic programming can become a specialist in developing highly effective AI systems.
Modern probabilistic programming languages are built on top of expressive general-purpose languages. Popular examples include Figaro, embedded in Scala, and Edward, embedded in Python. Edward is built on the popular TensorFlow library and even allows neural networks to be included in the conceptual model.
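For illustration, here is roughly how the coin-bias example above might look in Edward: the conceptual model is written as a few lines of code, and the built-in inference engine does the rest. Note that this targets the Edward 1.x API (the project has since been folded into TensorFlow Probability), so treat the signatures as indicative rather than definitive:

```python
# Probabilistic programming sketch (Edward 1.x style): declare the model, then infer.
import numpy as np
import tensorflow as tf
import edward as ed
from edward.models import Bernoulli, Beta

# Conceptual model: an unknown bias and ten tosses conditioned on it
theta = Beta(1.0, 1.0)
x = Bernoulli(probs=theta, sample_shape=10)

# A variational approximation to the posterior over the bias
q_theta = Beta(tf.nn.softplus(tf.Variable(1.0)),
               tf.nn.softplus(tf.Variable(1.0)))

# Condition the model on observed data and run variational inference
data = {x: np.array([1, 1, 0, 1, 1, 1, 0, 1, 1, 1])}
inference = ed.KLqp({theta: q_theta}, data=data)
inference.run(n_iter=1000)
```

The point is the division of labor: the programmer states how inputs and outputs are related, and the language’s inference engine handles the Bayes’-rule computation.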
This symbiosis of existing approaches gives hope for the rapid emergence of AI systems that combine the benefits of deductive Bayesian models with the straightforward but powerful inductive approach of deep neural networks.