
Modern AI algorithms in the shadow of LLMs


Current AI news is dominated mainly by generative AI and Large Language Models (LLMs). Although these models deliver impressive performance, they also have their limitations: LLMs are often heavy (they require a lot of resources), expensive to train and sometimes unsuitable for applications that demand fast and reliable answers, such as high-risk environments. That is why we have looked further afield, at AI topics in which there are also many interesting developments but for which LLMs are not necessarily suitable.

In this blog we highlight three AI examples where LLMs are not used.

Sales Forecasting

First, the challenges and methods within sales forecasting, a specific application of time series forecasting. Sales are shaped by various effects, such as seasonality, discounts and cannibalization. To model these effects you need a good sales forecasting model, one that accurately predicts what you are going to sell. For these predictions you can use classical methods, such as linear regression and support vector machines, but also more modern techniques, such as deep learning. There are different types of models, each with its own advantages and disadvantages.
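As a minimal sketch of the classical approach, the snippet below fits a linear regression on hypothetical weekly sales data, encoding seasonality with sine/cosine features alongside a discount effect (the data, column names and effect sizes are invented for illustration):

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical weekly sales with a seasonal cycle and a discount effect.
rng = np.random.default_rng(0)
week = np.arange(156)                                # three years of weekly data
discount = rng.choice([0.0, 0.1, 0.2], size=156)
units = (100 + 20 * np.sin(2 * np.pi * week / 52)    # seasonality
         + 80 * discount                             # promotion uplift
         + rng.normal(0, 5, size=156))               # noise

# Encode the seasonal cycle so a linear model can pick it up.
X = pd.DataFrame({
    "sin_season": np.sin(2 * np.pi * week / 52),
    "cos_season": np.cos(2 * np.pi * week / 52),
    "discount": discount,
})
y = pd.Series(units, name="units_sold")

model = LinearRegression().fit(X[:-12], y[:-12])     # hold out the last 12 weeks
print("R^2 on held-out weeks:", model.score(X[-12:], y[-12:]))
print(dict(zip(X.columns, model.coef_.round(1))))    # recovered effect sizes
```

Such a model is transparent: the coefficient on `discount` directly estimates the promotion uplift.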

Linear regression models like this one are often simple to train and easy to explain, but their predictions tend to be anti-conservative (overconfident). Neural networks can seem an obvious alternative, even though they are larger, harder to train and less easy to explain. CNNs (Convolutional Neural Networks), for example, are good at capturing local effects (such as short-term events), while RNNs (Recurrent Neural Networks) are naturally better at taking seasonal effects into account.
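For comparison, here is a minimal sketch of the neural route: a tiny GRU-based recurrent forecaster in PyTorch (the window length, layer sizes and dummy data are arbitrary illustrative choices, not a recommended configuration):

```python
import torch
import torch.nn as nn

class SalesRNN(nn.Module):
    """Tiny GRU forecaster: reads a window of past sales, predicts the next value."""
    def __init__(self, hidden_size: int = 32):
        super().__init__()
        self.gru = nn.GRU(input_size=1, hidden_size=hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, window, 1) -> last hidden state -> scalar forecast
        _, h = self.gru(x)
        return self.head(h[-1])

model = SalesRNN()
window = torch.randn(8, 52, 1)           # batch of 8 one-year windows (dummy data)
forecast = model(window)                 # (8, 1) next-step predictions
loss = nn.functional.mse_loss(forecast, torch.randn(8, 1))
loss.backward()                          # ready for a standard training loop
```

The recurrent state lets the model carry information across the whole window, which is what makes this family naturally suited to seasonal patterns.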

Which of these models works best depends on the available dataset. Classical, smaller algorithms are still widely used, but neural networks are often the most accurate.

Self-driving Cars

Tesla’s self-driving cars also do not use LLMs, but Tesla’s HydraNet, a neural network developed in-house and built specifically to perform many tasks simultaneously, process information quickly and respond quickly. With 8 cameras and without radar, so on visual input alone, Teslas can drive entirely on their own.

Tesla uses complex models to process the visual data from the 8 cameras, identify objects and anticipate changes in the driving environment. The so-called HydraNet is built from various convolutional neural networks (CNNs), Transformer models and other components. RegNets are used as feature extractors, and their output is fed into a BiFPN that combines features from the high and the low layers of the RegNets. In this way the different layers can, as it were, talk to each other, combining semantic information (from the high layers) with high resolution (from the low layers).

The features coming out of the BiFPN are then fed to a transformer with just one transformer block, which merges the input from all 8 cameras into a single feature space. Up to this point only separate images have been processed, but driving of course requires working with video. For that reason the features in this feature space are pushed into a video queue at regular intervals (roughly once every 20 ms, and also per distance driven). The features from the video queue are finally fed into a spatial RNN. This RNN can update only selected positions in its output space, so the car only changes the relevant pixels (where something changes, for example).
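To make the multi-task idea concrete, here is a heavily simplified sketch of the "hydra" pattern: one shared backbone whose features are computed once and reused by several task heads. This is not Tesla's code; the backbone, head names and sizes are all invented for illustration:

```python
import torch
import torch.nn as nn

class HydraSketch(nn.Module):
    """Shared feature extractor with multiple task-specific heads,
    the basic pattern behind multi-task networks like HydraNet."""
    def __init__(self):
        super().__init__()
        # Stand-in for the RegNet feature extractor.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Each head is trained for its own task on the shared features.
        self.heads = nn.ModuleDict({
            "objects": nn.Linear(64, 10),    # e.g. object classes
            "lanes": nn.Linear(64, 4),       # e.g. lane parameters
            "depth": nn.Linear(64, 1),       # e.g. a depth estimate
        })

    def forward(self, image):
        features = self.backbone(image)      # computed once, reused by every head
        return {name: head(features) for name, head in self.heads.items()}

outputs = HydraSketch()(torch.randn(1, 3, 128, 128))
print({name: out.shape for name, out in outputs.items()})
```

The payoff of the pattern is efficiency: the expensive feature extraction runs once per frame, however many tasks consume it.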

All in all, the design of the model is very interesting and it shows how much engineering sometimes goes into solving complex problems.

Multi-Agent Pathfinding

The third example of AI without LLMs is multi-agent pathfinding, where multiple agents must navigate simultaneously without getting in each other's way. Global route planning becomes less and less attractive as the number of agents increases: while the central plan is being computed, the agents are standing still, and the distance they could have covered together in that time keeps growing. In addition, an unexpected deviation by one agent can invalidate the optimal path of another. In many situations, starting to move quickly is therefore worth much more than an optimal route. Deep learning can help improve the decision-making of the individual agents, for both the global calculation and the local behavior. It can recognize, for example, that an area has few obstacles, so a direct route is the best choice, or that there are narrow passages with many obstacles, so a path must be calculated that accounts heavily for congestion and detours, where the fastest and shortest routes lie far apart. This alleviates the problem of global coordination, but does not solve it.
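As a sketch of what local, per-agent planning can look like, the snippet below lets each agent compute only its own next step with A* on a grid, treating the other agents' current positions as temporary obstacles (the grid, goals and tie-breaking are simplified for the example):

```python
import heapq

def a_star(grid, start, goal, blocked):
    """A* on a 4-connected grid; `blocked` holds the other agents' current cells."""
    def h(p):  # Manhattan-distance heuristic
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    frontier = [(h(start), 0, start, [start])]
    seen = set()
    while frontier:
        _, cost, pos, path = heapq.heappop(frontier)
        if pos == goal:
            return path
        if pos in seen:
            continue
        seen.add(pos)
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (pos[0] + dx, pos[1] + dy)
            if (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
                    and grid[nxt[0]][nxt[1]] == 0 and nxt not in blocked):
                heapq.heappush(frontier,
                               (cost + 1 + h(nxt), cost + 1, nxt, path + [nxt]))
    return None  # no path around the current obstacles

# Each agent replans every step and sees the others only as moving obstacles.
grid = [[0] * 6 for _ in range(6)]                 # empty 6x6 grid
agents = {"a": (0, 0), "b": (5, 5)}
goals = {"a": (5, 0), "b": (0, 5)}
for name, pos in agents.items():
    others = {p for n, p in agents.items() if n != name}
    path = a_star(grid, pos, goals[name], others)
    if path and len(path) > 1:
        agents[name] = path[1]                     # take exactly one step
print(agents)
```

No agent waits for a global plan; each one simply starts moving and replans as the others move.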

AI in Gaming

To work efficiently with many agents, global coordination must make way for autonomous agents. That is easier said than done. A major challenge in training autonomous behavior is that it often depends on arbitrary elements of the training process. To illustrate this, consider an AI model that learns to play Pokémon. It gets stuck in a number of places where the rewards for actions unintentionally block the discovery of existing solutions: the AI avoids the fastest way to recover after a battle, for example, because in that location it can "lose" Pokémon by storing them on the PC. The problem is that the reward meant to encourage the desired end behavior has an enormous influence on the learning process.
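A toy version of this reward-shaping pitfall (the numbers and the deposit event are made up; only the mechanism mirrors the Pokémon example): if the reward is simply the summed level of the party, any action that moves a Pokémon out of the party looks like a punishment, even when it is harmless.

```python
def party_reward(party_levels):
    """Naive shaped reward: total level of the Pokémon in the party."""
    return sum(party_levels)

party = [12, 9, 7]
print(party_reward(party))                 # 28

# Healing at the Pokémon Center is the desired behaviour, but if the agent
# accidentally deposits a Pokémon to the PC there, the reward drops sharply...
party_after_deposit = [12, 9]
print(party_reward(party_after_deposit))   # 21

# ...so the agent learns to avoid the whole location, not just the deposit.
```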

The new DreamerV3 approach offers a solution for this. It is able to learn to play a wide variety of games from one universal starting configuration. The key to this success lies in splitting what it tries to learn across three separate models. The first model tries to predict the future purely on the basis of visual input; it does not matter how good that future is for the score, only that the prediction is accurate given the inputs it receives. This yields candidate actions that fit the predictions as well as possible.

The second model tries to estimate the rewards of actions as accurately as possible, which encourages a better mapping of actions that appear negative. It is separate from the third model, which chooses actions based on the first two and decides how much risk to take, for example because it has not found anything of better value for a long time, or because it has found a new advantageous action that it wants to exploit for as long as that pays off. In this way the combination of models can switch between exploring new options and exploiting the options it already has.
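A schematic sketch of that three-way split is shown below; the module names and interfaces are ours, and the real DreamerV3 components are far richer than these single linear layers:

```python
import torch
import torch.nn as nn

class WorldModel(nn.Module):
    """Model 1: predicts the next latent state from the current one plus an action."""
    def __init__(self, latent=32, actions=4):
        super().__init__()
        self.dynamics = nn.Linear(latent + actions, latent)

    def forward(self, state, action):
        return torch.tanh(self.dynamics(torch.cat([state, action], dim=-1)))

class Critic(nn.Module):
    """Model 2: estimates the reward/value of a latent state."""
    def __init__(self, latent=32):
        super().__init__()
        self.value = nn.Linear(latent, 1)

    def forward(self, state):
        return self.value(state)

class Actor(nn.Module):
    """Model 3: picks actions, trained on imagined rollouts from the other two."""
    def __init__(self, latent=32, actions=4):
        super().__init__()
        self.policy = nn.Linear(latent, actions)

    def forward(self, state):
        return torch.softmax(self.policy(state), dim=-1)

# One imagined step: the actor proposes, the world model dreams, the critic scores.
state = torch.zeros(1, 32)
action = Actor()(state)
next_state = WorldModel()(state, action)
print(Critic()(next_state))
```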

This division enables the system to gradually build up an ever better picture of how actions affect the world, and thus to find precisely those actions that lead to something new. Instead of teaching an AI one specific task, it can help to break the learning process into separate parts, which can lead to a much more generic solution. A model that had learned to collect diamonds in Minecraft, for example, could handle a wide variety of starting situations. This allows an agent to be trained to perform more complex tasks independently, without the training having to be tailored exactly to the task.

Once these agents can act independently in an environment with other agents, the solution scales linearly with the number of agents. Where a central calculation with a certain number of agents runs into a wall of sky-high computational costs as soon as one agent is added, independent agents avoid that wall because each agent gets by on a predictable amount of computing power.
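The difference is easy to put in numbers. With, say, 5 possible moves per agent per step, a central planner searches a joint action space of 5^n combinations, while independent agents each consider only their own 5 (a back-of-the-envelope illustration, not a formal complexity result):

```python
moves_per_agent = 5
for n_agents in (1, 2, 5, 10, 20):
    joint = moves_per_agent ** n_agents        # central planner: one big joint action space
    independent = moves_per_agent * n_agents   # decentralised: linear in the number of agents
    print(f"{n_agents:>2} agents: joint={joint:>15,}  independent={independent}")
```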

This was a glimpse of the developments taking place outside LLMs in the world of AI. Even if generative models are not applicable in your domain, it is certainly worthwhile to keep an eye on what AI is innovating in a broader sense.

Jip Maijers and Jaap Rutten
