The basics of Machine Learning

After shedding some light onto Symbolic AI in the previous article, we’re now moving on to take a closer look at Machine Learning (ML). When it comes to Symbolic AI, breaking down a problem as minutely as possible is key for successfully solving it.

Only this enables the computer to correctly access the “learned” answers (be it in the form of a decision tree or a table) and lets the program solve the problem in a comprehensible way. However, this also means that an algorithm can only provide the solution to a very specific problem. The moment the problem changes, the algorithm must be modified (or rewritten).

This is exactly where machine learning really comes in. The aim of machine learning is to allow the computer to formulate and solve the problem - i.e. to enable it to take over the “unpleasant part” of adjustment and fine-tuning itself. To do this, a machine learning program is equipped with two different algorithms, a learning algorithm and an inference algorithm. Equipped with those, the same (or very similar) program can ideally solve many different problems, while applying the same method - instead of programming everything from scratch.

On a side note, the term “learning” should be used with caution: People tend to anthropomorphise, i.e. we like to attribute human qualities to inanimate objects: Everyone has probably scolded their computer at one point or another - knowing fully well that it neither hears nor understands what is said. If we now say that computers (or rather: algorithms) “learn”, then there is a great danger of us understanding this learning process as analogous to the human learning process. But as we’ll soon discover, machines “learn” quite differently. That’s why most ML developers prefer to talk about “training” their algorithms.

The most important ML terms, explained simply

The best way to understand how ML works? Going over an example step by step. But before we plunge right into it, we need some essential ML vocabulary first:

Algorithm: An algorithm is a fixed, finite sequence of instructions, used to perform a particular calculation. Each computer program consists of a number of different algorithms. An algorithm can be very simple (“Find the smallest number in this list of numbers”) or very complex (“Train this neural network”). Most ML methods work with two algorithms: The so-called training algorithm trains the program based on the available data, the so-called inference algorithm applies the gained " findings " and delivers results.
Parameter: A value that is learnt during the training of a machine learning program. Based on existing parameters, the training algorithm “trains” and tries to continuously improve the parameters. The inferencel algorithm uses the parameters to calculate results.
Model: A trained machine learning method. A model is essentially a (sometimes very large) set of parameters and instructions on how to use them - a ready-to-run ML procedure and the result of the training process. The training algorithm generates the model, the inference algorithm uses it to perform a task.
Inference: Running a model using a second algorithm. This algorithm uses the model to create a prediction based on an input, classify an input or create some interesting value based on an input. Hence the term “inference algorithm” or “prediction algorithm”. Inference is a technical term for “execute the ML model and output a result”.

As we see, Machine learning use the combination of a training algorithm and a prediction (or inference) algorithm. The training algorithm uses data to gradually determine parameters. The set of all learned parameters is called a model, basically a “set of rules” established by the algorithm, applicable even to unknown data. The inference algorithm then uses the model and applies it to any given data. Finally it delivers the desired results.

Procedure of a Machine Learning Training

Equipped with the right vocabulary, we can take a closer look at the execution of a machine learning project:

We select the machine learning method for which we want to train a model. The choice will depend on the problem to be solved, the available data, the experience and also on gut feeling.
Then we divide the available data into two parts: The training data and the test data. We apply our training data and thus obtain our model. The model is checked on the unknown test data. It is most important that the test data aren’t used during the training phase under any circumstances. The reason is obvious: Computers are great at learning by heart. Complex models like neural networks can actually start to memorize by themselves. The following results might be quite remarkable. There’s only one flaw: They’re not based on a model formulated by the program, but on “memorized” data. This effect is called “overfitting”. However, the test data are supposed to simulate the “unknown” during quality control and to check whether the model has really “learned” something. A good model achieves about the same error rate on the test data as on the training data without ever having seen it before.
We use the training data to develop the model with the learning algorithm. The more data we have, the “stronger” the model becomes. Using up all available data for the training algorithm is called an “epoch”.
In order to test it, the trained model is used on the test data unknown to it and makes predictions. If we did everything right, the predictions on unknown data should be as good as on the training data - the model can generalize and solve the problem. Now it is ready for practical use.

A birds-eye-view of Machine Learning:
Selection of a method,
Splitting raw data into training and test set,
The training algorithm trains on the training data,
The inference algorithm tests with the test data and performs predictions.

A Machine Learning project - step by step

Now that we have gone through the individual steps theoretically, we should put them to use on a specific example. Doing this we also want to introduce the most important ML algorithm - Linear Regression.

The most important ML algorithm: Linear Regression

Applying the steps we just established, we first have to decide on the most suitable algorithm for the particular problem. Selecting an algorithm is the first step in our Machine Learning process. We described the algorithm as a guideline, specifying how the computer should handle certain data. An especially important algorithm in Machine Learning is the so-called Linear Regression. All the basic components of ML may be found here. Even though other methods are far more complicated, they all work according to the same principle. You might also know linear regression from spreadsheets by the term “trend line”.

So what does the impressive term “Linear Regression” actually mean? A regression is a problem that - based on several entered variables or values - “spits out” one or more numerical values. An example of a regression, for example, is a tax return: After entering a series of values, the end result is the tax to be paid. Or the calculation of a braking distance: The speed is entered, the expected braking distance is given.

The inner workings of the most simple linear Regression: The query is entered on the lower axis, the answer can be read from the left axis using the line of the linear model.

So a regression can predict numbers. The term “linear” refers to the mathematical principle. Linear regression uses linear equations to do this. In simpler terms, we take a series of known data points, try as hard as we can to put a line through them and then use this trend line to make a prediction for new, unknown data.

Not wanting to complicate things further (and in order to make our example easier to display), we will limit ourselves here to the simplest form of linear regression, where there’s only one value entered and one value output. A certain value on the x-axis results in a corresponding value on the y-axis. The values can be displayed as points in a coordinate system.

This simple procedure has merely two parameters: On the one hand, there is the slope and the offset of the line. These two values are gradually adjusted by the training algorithm.

In this case the inference algorithm is very simple as well: Based on the input (the x-value) the output is calculated, by looking at the corresponding y-value of the line at this point.

Putting ML into praxis - Linear Regression, step by step

Lying on a meadow, looking into the night sky, listening to the chirping of the crickets… we ask ourselves: How often per second do the crickets chirp? We assume that crickets (as insects) like it warm and that the intensity of chirping depends on the temperature. We want to calculate this for all temperatures, but we cannot control the weather to get the data. Besides, we obviously don’t want to lie around night after night with a microphone and a thermometer.

Step 1 - Choosing the right ML procedure

So we have two readings that seem to be linearly dependent on each other. The warmer it gets, the more intense the chirping of the crickets seems to be: Before our inner eye we see a linearly ascending line - and decide to use linear regression as the probably best method!

Step 2 - Training and test data for the training algorithm

In order to “learn” something, our training algorithm needs at least some data. So we take heart, pack our measuring equipment, visit the meadow and collect the training and test data for our ML method. (Or we simply download the data here). Before we “feed” the model with the first batch of data, it’s parameters are set to a value first. These initial values for the parameters of the model are either random numbers or zeros (depending on the parameters and the ML method). Without having learned anything, the model can at least spit out values without crashing. Though, of course, these values are totally wrong.

Linear regression before the Training data — Initial state of training the cricket model: The line is random and does not predict the data well.

Step 3 - Training the ML model

To start the actual training session, we enter a set of training data into the model and calculate the (probably entirely wrong) results. Because of the random initial values, the model first guesses the results at random. The random values in this initial formula cause the model to spit out meaningless numbers. But at least we have a result, even if it is wrong.

Now we can use a trick: We calculate the error. The larger the amount of test data, the clearer the picture of the errors and deviations becomes. The inference algorithm calculates the current prediction of the model. The training algorithm compares the prediction with the actual correct result in the training data and computes how much they differ. This deviation is used to adjust the parameters in order to improve the next prediction. The algorithm gradually improves our linear function, monitors the deviations and “registers” whether they are increasing or decreasing, depending on how the parameters are adjusted. In short, it states how we need to change our model to better match the data.

Linear Regression, fed with some Training data — After using some Training data to move the line, it already predicts the data much better.

Step 4 - Checking against the test data

Without immersing ourselves too deeply in mathematics, let us put it this way: With the right formulas, the training data and the initial (wrong) output of the model, the algorithm can optimize the parameters and thus get a better solution next time. Using linear regression, as we did in our case, this is done by moving the line closer to the training points.

Linear Regression, precise prediction — After using up all of the training set, the line can be used to predict unknown data points pretty well.

This process is repeated again and again until the result is good enough:

We take a batch of training data, calculate the result.
We calculate the error and observe how much the result is wrong (e.g. by simply computing the difference to the desired result).
Based on the error and clever mathematics (mostly derivatives) we calculate how we have to change the model to minimize the error.
When we are satisfied with the result, we stop - if not, we go back to 1. and we repeat all the steps with another part of the training data.

So what would happen if we used up all the training data and are still not happy with the result? Easy: We initiate a new epoch (see above) after putting the data in a new random order. Ideally there is so much data as to never reach the end of an epoch. In practice, however, data is often used several times, i.e. the model is trained over several epochs. It’s somewhat similar to a human learning a foreign language: We often have to go through the new vocabulary several times (and if possible in alternating order) until we have learned it.

Linear regression and ML dealing with “high-dimensional input”

Maybe people might think: “But that’s cheating! Drawing a line through a few points isn’t AI!” However, you have to keep in mind that the same method works not only with a simple one-dimensional function, i.e. one input and output value, but also with so-called “high-dimensional input”.

This rather pompous term merely implies that one has several input variables. It is therefore more related to a tax return than to science fiction: Every tax return has a multidimensional input (wage, independent income, number of children, …). A letter is also addressed by multidimensional entries: Name, surname, company, address supplement, street, house number, postal code, city, country - here we are at a 9-dimensional input!

Our cricket example had only one input and one output value, so that we could display the data sets two-dimensionally as a graph. But the same mathematics that can be employed to train a simple linear regression with one input and output will also work for any number of dimensions. Referring to the underlying mathematics, it is still called “linear” - even though the drawing of the model isn’t and can’t even be depicted graphically as a line - if at all.

ML in practice: As simple as possible, as complicated as necessary

We conclude: With proper pre-processing, a Linear Regression can be amazingly performant. That’s why this simple algorithm still has so much practical relevance. On the one hand as a element of more complex systems - as we will see later, no Neural Network can do without Linear Regression. On the other hand, simple methods often perform better than complex ones, when it comes to Machine Learning. The quality of the result usually depends less on the method, but rather on the training data. Also, the performance gained by a complex procedure can be so low that the increased effort (computing time, storage space, programming effort, fine-tuning, …) for the complex procedure is not really worthwhile.

In any case, linear methods are always well suited to define a “baseline”: What can be achieved with minimal effort using existing data? Many projects start with a linear method as a baseline in order to better estimate the quality of later results.

Of course, there are a lot of things that linear regression cannot do - otherwise there would be no need for Deep Learning or Neural Networks. But as with the methods of Symbolic AI we have presented before, the following holds true: As simple as possible, as complicated as necessary. In a real project, the aim is to achieve the best possible result while wasting as few resources as possible.

So what happens next? We will look further at Machine Learning and discuss how to further categorize ML procedures. We will also consider their advantages and disadvantages.

25 Feb 2025
Working with Ollama, Part 2
In the first part of our article on Ollama, we demonstrated how to install Ollama and local models. In this second part, we cover advanced usage of Ollama by customizing modelfiles and integrating with the AnythingLLM frontend. We show how these tools make managing and utilizing local AI models more efficient.
weiterlesen
24 Feb 2025
Working with Ollama, Part 1
In the first installment of our two-part series “Working with Ollama,” we introduce the open-source, cross-platform solution Ollama, which simplifies both the management and usage of AI models.
weiterlesen
08 Apr 2024
Whisper 3 Large for JAVA
For an internal product prototype we have traced OpenAI’s Whisper 3 model from Huggingface and made it usable under JAVA via DJL.
weiterlesen
14 Jun 2023
ChatGPT for Teams: Privacy-Compliant Use in the Workplace
In today’s digital business world, AI-powered communication platforms like ChatGPT are essential for tasks such as answering complex code questions or creating top-notch texts for offers. However, in companies dealing with sensitive customer data, using ChatGPT can lead to a data protection dilemma. While ChatGPT offers an option to prevent the use of chat conversations for training purposes, it comes with certain limitations. Moreover, as of June 2023, there is no way to manage multiple team members or users through a company account. Each user must register individually and use their own email, phone number, and credit card. If you want to use ChatGPT+, for example, you cannot pay for all users with one credit card. Individual invoices also end up with individual users, creating an organizational and accounting nightmare. We at DIVISO have also grappled with this issue and went in search of a solution.
weiterlesen
25 Oct 2021
Git as a management tool for training data and experiments in ML
In this part of the series of articles on MLOps, we start with information that will be familiar to most of you: With the basics of Git. However, to give a different perspective on the well-known tool, these basics provide the basis to highlight the function and benefits of Git for machine learning (ML) and the difference in managing training data.
weiterlesen
02 Aug 2021
MLOps: Establishment and operation of an AI
With Machine Learning Operations (MLOps) we ensure that data is efficiently and strategically integrated into business processes through regular and automated training, thus contributing to increased revenue. The challenge is to establish and maintain these automated processes.
weiterlesen
31 Aug 2020
Types of Artificial Neural Networks
In our real-world example, we used a “feed-forward neural network” to recognise handwritten numbers. This is probably the most basic form of a NN. In reality, however, there are hundreds of types of mathematical formulas that are used – beyond addition and multiplication – to compute steps in a neural network, many different ways to arrange the layers, and many mathematical approaches to train the network.
weiterlesen
17 Jul 2020
Amazon DJL - a new DL framework for Java
Developers who wanted to explore neural networks and deep learning using the JVM, and especially Java, had little choice so far. Those who wanted to focus exclusively on Java could not get around DL4J until now. If it had to be the JVM, but not necessarily Java, the MXNet Scala Frontend was also an option. Finally, if a little Python didn’t scare you, you could try a hybrid solution, combining TensorFlow and Java just like we already explained in previous articles.
weiterlesen
29 Jun 2020
NLP, NLU and NLG: AI and text
So far, we have generally steered clear of the areas of text comprehension and text generation by ML in our practical examples for the basic understanding of AI. For good reason, we have focused primarily on two types of problems: classification of images and prediction of numerical values.
weiterlesen
23 Jun 2020
Neural networks - The five most common mistakes
AI and especially Neural Networks or Deep Learning have been the technological hype topic for some years now. However, since the subject is quite abstract – one could say it is uncharted territory for most people – we want to clear up some mistakes that we often encounter in our work.
weiterlesen
02 Jun 2020
What are Neural Networks and how do they work?
In our past articles we mainly covered the basics of current AI research and tried to shed some light on them in a way that is understandable for non-IT scientists. We are now proceeding to the probably “hottest” current AI topic: Neural Networks (NN).
weiterlesen
12 May 2020
Deep Java Learning Introduction - Part 1: NDManager & NDArray
After our first presentation of Amazon’s new Deep Learning Framework for Java, DJL, we now want to introduce the basics of Deep Learning under Java with DJL step by step in a series of beginner posts. This is not about quickly copying code snippets, but about really understanding the framework and the concepts.
weiterlesen
11 May 2020
Deep Fakes - How to spot faked Images
A (fairly) new kind of neural networks, so-called Generative Adversarial Networks or GANs, are nowadays capable of generating deceptively real images of people that do not actually exist. These fake images are indistinguishable from real photos at first glance. Fortunately, you might still uncover them if you look closely – if you know what to look for!
weiterlesen
28 Jun 2019
Recap: ML Conference 2019 in Munich
On 17.06. another round of the semi annual ML Conference started in Munich. As usual, it started with a day-long workshop with joint live coding, giving the participants an approachable introduction into Machine Learning and Deep Learning.
weiterlesen
24 May 2019
Understanding AI - Part 5: Supervised & Unsupervised Learning in ML
In the previous article we introduced the basic concepts of Machine Learning and how the training of an ML model works, using a simple but practical algorithm. Next, we want to take a closer look at the different types of Machine Learning.
weiterlesen
14 May 2019
BGL symposium 2019 - lecture 'AI and Magic'
“Any sufficiently advanced technology is indistinguishable from magic.” – Arthur C. Clarke JAX 2019 is barely over, but Christoph is already on the podium for the next talk. At the symposium of the BLG (Federal Association of Industrial Photographic Laboratories), his lecture will cover “AI and Magic – How does Artificial Intelligence work?
weiterlesen
29 Apr 2019
Jax 2019 Recap
JAX 2019 is approaching and once again Christoph is contributing two sessions. This year he’s focussing on Neural Networks and explains how to use TensorFlow-Training while working with JVM.
weiterlesen
08 Apr 2019
Understanding AI - Part 3: Methods of symbolic AI
In the previous article we added two distinctions to our initial definition of AI: On the one hand we distinguish between strong and weak AI (Terminator & Science Fiction vs. the scientific status quo). Also we pointed out the difference between symbolic AI and Machine Learning.
weiterlesen
21 Mar 2019
Understanding AI - Part 2: Symbolic AI, Neural Networks and Deep Learning
Artificial Intelligence (AI) is as old as computer science itself. Calculations, logical deductions, complex assignments… all this was once restricted to humans, until computers came forth.
weiterlesen
07 Mar 2019
Understanding AI - Part 1: What is AI?
From household help to doomsday scenario - there’s hardly a topic where public perception, state of research and reality seem so incongruent as with artificial intelligence. Reason enough to shed some light onto this subject with a series of articles.
weiterlesen
06 Aug 2018
DL4J Workshop at the ML Summit in Berlin
On October 1st and 2nd the first ML Summit takes place in Berlin. In 12 workshops in three parallel tracks, experts impart practical knowledge on the topics Applications for Business, Machine Learning Basics & Tools and Specialized Topics.
weiterlesen
23 Apr 2018
Jax 2018 - Talks about DL4J and more
Christoph will give two talks about Java and Machine Learning at JAX 2018
weiterlesen
29 Jan 2018
Enterprise TensorFlow 4 - Executing a TensorFlow Session in Java
A TensorFlow Session can be executed in Java in the same way as in Python. This post shows how.
weiterlesen
23 Jan 2018
Enterprise TensorFlow 3 - Loading a SavedModel in Java
Part 3 in the series about Java / TensorFlow Interoperability, showing how to load a TensorFlow SavedModel in Java
weiterlesen
22 Jan 2018
Enterprise TensorFlow 2 - Saving a trained model
Part 2 in the series about Java / TensorFlow Interoperability, discussing how to save a model so it can be reused in a different environment.
weiterlesen
11 Jan 2018
TensorFlow and Java - An interview with entwickler.de
Our CTO was interviewed about TensorFlow / Java Interoperability while at ML Conference 2017 in Berlin.
weiterlesen
08 Jan 2018
Enterprise Tensorflow: Code Examples
Overview over the example projects for TensorFlow / Java integration
weiterlesen
30 Nov 2017
Enterprise Tensorflow - Java vs. Python
This is the first part of a series of posts about Java and Tensorflow interop. It is a more extensive version of my talk at ML Conference 2017 in Berlin
weiterlesen
15 Nov 2017
ML Conference 2017 in Berlin
An announcement for my presentation at the ML Conference 2017 in Berlin
weiterlesen

The basics of Machine Learning

The most important ML terms, explained simply

Procedure of a Machine Learning Training

A Machine Learning project - step by step

The most important ML algorithm: Linear Regression

Putting ML into praxis - Linear Regression, step by step

Step 1 - Choosing the right ML procedure

Step 2 - Training and test data for the training algorithm

Step 3 - Training the ML model

Step 4 - Checking against the test data

Linear regression and ML dealing with “high-dimensional input”

ML in practice: As simple as possible, as complicated as necessary

Working with Ollama, Part 2

Working with Ollama, Part 1

Whisper 3 Large for JAVA

ChatGPT for Teams: Privacy-Compliant Use in the Workplace

Git as a management tool for training data and experiments in ML

MLOps: Establishment and operation of an AI

Types of Artificial Neural Networks

Amazon DJL - a new DL framework for Java

NLP, NLU and NLG: AI and text

Neural networks - The five most common mistakes

What are Neural Networks and how do they work?

Deep Java Learning Introduction - Part 1: NDManager & NDArray

Deep Fakes - How to spot faked Images

Recap: ML Conference 2019 in Munich

Understanding AI - Part 5: Supervised & Unsupervised Learning in ML

BGL symposium 2019 - lecture 'AI and Magic'

Jax 2019 Recap

Understanding AI - Part 3: Methods of symbolic AI

Understanding AI - Part 2: Symbolic AI, Neural Networks and Deep Learning

Understanding AI - Part 1: What is AI?

DL4J Workshop at the ML Summit in Berlin

Jax 2018 - Talks about DL4J and more

Enterprise TensorFlow 4 - Executing a TensorFlow Session in Java

Enterprise TensorFlow 3 - Loading a SavedModel in Java

Enterprise TensorFlow 2 - Saving a trained model

TensorFlow and Java - An interview with entwickler.de

Enterprise Tensorflow: Code Examples

Enterprise Tensorflow - Java vs. Python

ML Conference 2017 in Berlin