Are decision trees AI?

Science Year 2019: »Artificial Intelligence« — The working worlds of the future will be decisively shaped by the development of artificial intelligence. Science Year 2019 addresses the opportunities and challenges of this technology.

For many users and interested parties, artificial intelligence algorithms are impenetrable black boxes. Some procedures are indeed difficult to grasp, even for experts. At the same time, however, the inner workings of many powerful algorithms are relatively easy to understand without having to be an AI expert.

In this blog post, I would like to look at three such methods in more detail and, above all, help potential users of AI to better assess AI methods, making it easier for them to get started with their own use case.

1. Decision trees

Decision trees are among the simplest such algorithms. Decision options are laid out like the branches of a tree crown.



The example decision tree shown here is intended to decide whether a person is physically fit or not. To do this, a series of questions is run through, for example by a chatbot that interviews customers. Each answer determines which question is asked next. In our example: if the observed person is 35, the next question is whether they are active in sports. If they are 17, a question about eating habits, such as frequently eating pizza, is more relevant. At the end of each path there is an assignment to the class "fit" or "not fit".
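Such a tree can be written down as nested if/else branches. The following is a minimal sketch of the example above; the exact questions, the age threshold of 30 and the class assignments at the leaves are assumptions for illustration, not the tree from the figure.

```python
# A hand-written sketch of the example tree: each branch is a "question",
# each leaf assigns the class "fit" or "not fit".
def classify_fitness(age, eats_much_pizza, does_sport):
    if age < 30:
        # For younger people, eating habits are asked next.
        return "not fit" if eats_much_pizza else "fit"
    else:
        # For older people, physical activity is asked next.
        return "fit" if does_sport else "not fit"

print(classify_fitness(35, eats_much_pizza=False, does_sport=True))   # -> fit
print(classify_fitness(17, eats_much_pizza=True, does_sport=False))   # -> not fit
```

A learned tree works the same way at prediction time; the difference is that the questions and thresholds are chosen by the algorithm rather than by hand.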

The key point is that the algorithm itself learns which questions to ask and how to arrange them depending on the answers. This happens without active human involvement. It "learns" not by actually asking anyone questions, but by optimizing internally according to certain criteria (without the user noticing). One such criterion is the quality of division: the questions (or splits) are chosen so that the clearest possible separation between the two target classes ("fit" or "not fit") is achieved. Age, eating pizza and physical activity certainly have a great influence on a person's athletic condition and are weighted accordingly.
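One common way to measure this "quality of division" is the Gini impurity: a split is good if each resulting group is as pure as possible, i.e. contains almost only one class. The following sketch shows how an algorithm could pick an age threshold this way; the ages and labels are invented for illustration.

```python
# Sketch of a split-quality criterion based on Gini impurity.
def gini(labels):
    """Gini impurity of a list of class labels (0 = perfectly pure)."""
    if not labels:
        return 0.0
    p_fit = labels.count("fit") / len(labels)
    return 2 * p_fit * (1 - p_fit)

def split_quality(values, labels, threshold):
    """Weighted impurity after splitting on value < threshold (lower is better)."""
    left = [l for v, l in zip(values, labels) if v < threshold]
    right = [l for v, l in zip(values, labels) if v >= threshold]
    n = len(labels)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

# Hypothetical training data: ages with known fitness labels.
ages = [17, 22, 28, 35, 41, 50]
labels = ["fit", "fit", "fit", "not fit", "not fit", "not fit"]

# The algorithm tries candidate thresholds and keeps the one with the
# lowest weighted impurity; here any threshold between 29 and 35
# separates the two classes perfectly (impurity 0).
best = min(range(15, 55), key=lambda t: split_quality(ages, labels, t))
```

A real decision tree learner repeats this search at every branch, over every available feature, which is exactly the "internal optimization" described above.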

2. Linear regression

This procedure will certainly be familiar to some readers. Imagine trying to predict the university grade average of future students based on their school grade averages.



The black dots represent knowledge from the past: the school grade averages of students whose university grade average is already known. The algorithm calculates an optimal best-fit line (shown here in red; optimal means that the total deviation is as small as possible). This best-fit line captures the past relationship between school grades and university grades and can be transferred to new generations of students. In this way, our algorithm can forecast the university grade average.
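The best-fit line can be computed directly with the classic least-squares formulas, which minimize the sum of squared deviations between the known points and the line. Here is a minimal sketch in pure Python; the grade data is invented for illustration.

```python
# Least-squares fit of a line y = a*x + b through known (x, y) points.
def fit_line(xs, ys):
    """Return slope a and intercept b of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    a = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) \
        / sum((x - mean_x) ** 2 for x in xs)
    b = mean_y - a * mean_x
    return a, b

# School grade averages (x) and known university grade averages (y).
school = [1.0, 1.5, 2.0, 2.5, 3.0]
university = [1.3, 1.6, 2.2, 2.4, 3.0]

a, b = fit_line(school, university)
# Forecast for a new student with a school average of 1.8:
forecast = a * 1.8 + b
```

The same idea generalizes to more than one input variable (multiple regression), but the principle of minimizing total deviation stays identical.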

3. k-Nearest-Neighbor-Algorithm (kNN)

With this algorithm, similar data points ("neighbors") are found for a given data point, which is then assigned to the class that predominates among its neighbors.

What sounds abstract is easy to illustrate: imagine that you are a dermatologist and have classified many moles as benign or malignant in the past. You suspect that the length and width of a mole play a decisive role, and you know from past diagnoses which combinations of length and width led to which result. kNN can support your future diagnoses. Plot the width on the x-axis and the length on the y-axis. Blue means benign, red means malignant. The yellow point stands for a not-yet-classified mole; kNN helps you make a diagnosis for it.

Source: Fraunhofer IAO


The k in the name kNN stands for the number of neighbors to be considered. Of course, the algorithm can come to different results depending on how k is chosen.

If you choose k = 3, the algorithm classifies the new mole (the yellow point) as benign, i.e. assigns it to the blue class, because two of the three nearest neighbors are blue and only one is red. If, on the other hand, you choose k = 5, the red neighbors predominate (three out of five) and the new mole is classified as malignant.

Finding the optimal k is a non-trivial task and depends heavily on the application. You also have to think about how to define "distance" or "neighborhood". This is easy with numbers in two-dimensional space (as in the example), but standard distance measures also exist for higher-dimensional spaces.
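The whole procedure fits in a few lines. The following sketch uses the Euclidean distance; the mole coordinates are invented so that, as in the figure, k = 3 and k = 5 lead to different diagnoses.

```python
# Minimal kNN classifier with Euclidean distance.
from collections import Counter

def knn_classify(point, data, k):
    """Assign `point` the majority class among its k nearest neighbors."""
    def dist(p, q):
        return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
    neighbors = sorted(data, key=lambda item: dist(point, item[0]))[:k]
    classes = Counter(cls for _, cls in neighbors)
    return classes.most_common(1)[0][0]

# (width, length) -> "benign" (blue) or "malignant" (red); invented data.
moles = [
    ((1.0, 0.0), "benign"),
    ((0.0, 2.0), "benign"),
    ((1.5, 0.0), "malignant"),
    ((0.0, 2.5), "malignant"),
    ((2.6, 0.0), "malignant"),
]
new_mole = (0.0, 0.0)

print(knn_classify(new_mole, moles, k=3))  # -> benign (two of three neighbors)
print(knn_classify(new_mole, moles, k=5))  # -> malignant (three of five)
```

Swapping in another distance measure (e.g. for higher-dimensional data) only requires changing the `dist` function; the voting logic stays the same.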

Further use cases of the above methods include

  • automatic classification of incoming documents (e.g. invoice, reminder or appointment request)
  • automated extraction of names, addresses, dates, etc.
  • prediction of current or future processes, for example manufacturing processes

Once you have understood the basic principle, the step to the next generalization or further development of a method is no longer that great. Even with more complex algorithms, the crucial thing is not to understand their exact inner workings, but to know which methods suit which problems and which tuning parameters each one offers to improve its predictive quality.

If you want to delve deeper into the subject, you are cordially invited to attend our free seminars in November or to speak to me directly - I look forward to your questions and your AI application ideas!

Blog series: Business Innovation Engineering Center BIEC - Using Artificial Intelligence
Medium-sized companies face the challenge of thinking ahead about tomorrow's products, organizational forms and business models, even while business is good today. How can your own business processes be improved with the help of AI? What potential for new business models lies dormant in AI applications? In this blog series, BIEC, as an innovation partner for medium-sized companies, provides answers to these and many other questions about digitization and transformation.
