Tim van Lier

Why data preparation is the key to success in every process mining project

Process mining is a fantastic tool to gain insights into how your processes run. Tools like SAP Signavio or ARIS enable deep process analyses. However, it is the underlying data that can make or break your project, mostly because data isn't automatically extracted in the correct format. How can you best prepare the source data to ensure the success of your process mining project? In this blog, I'll guide you through the various options so you can make the best choice for your project.

Data preparation

Data preparation is the process of turning raw data into an event log. An event log is what a process mining tool needs to display valuable flows and dashboards. In many cases, data preparation consumes about 80% of the total project time. Therefore, I dare say that data preparation is the Achilles' heel of a process mining project.
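To make the target format concrete, here is a minimal sketch of what an event log looks like: one row per event, with at least a case ID, an activity name, and a timestamp. The purchase-order activities and timestamps below are illustrative examples, not real SAP data.

```python
# A minimal event log: each row records which activity happened,
# for which case (here: a purchase order), and when.
event_log = [
    {"case_id": "PO-1001", "activity": "Create Purchase Order", "timestamp": "2024-03-01 09:14"},
    {"case_id": "PO-1001", "activity": "Receive Goods",         "timestamp": "2024-03-05 11:02"},
    {"case_id": "PO-1001", "activity": "Post Invoice",          "timestamp": "2024-03-07 16:45"},
    {"case_id": "PO-1001", "activity": "Pay Invoice",           "timestamp": "2024-03-20 08:30"},
]

# A process mining tool replays these rows per case, ordered by timestamp,
# to reconstruct the process flow and its variants.
for event in sorted(event_log, key=lambda e: e["timestamp"]):
    print(event["case_id"], event["activity"])
```

Everything a process mining tool shows you, from flow diagrams to throughput-time dashboards, is derived from rows in this shape, which is why getting the data into this format is so decisive for the project.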

Why is it so complex? SAP tables contain only codes, so the information is hidden. Additionally, there isn't one table that contains all the process data from A to Z. To track an object, various columns from different tables are needed. For example, a purchase order goes through several phases and is stored in different tables before the payment finally takes place. Linking and interpreting raw data is the most challenging part of a process mining project. Below, you'll see the three most commonly used methods for preparing data. Visit our expertise page for more information about process mining.

Option 1: Manual construction of the event log in the process mining tool

After extracting the data from the source system, you build the process logic in the process mining tool. This is done using SQL queries. This option requires by far the most manual work and is relatively time-consuming and costly. However, this option gives you the advantage of complete flexibility within one tool.
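As a sketch of this manual approach, the snippet below loads two raw tables into an in-memory SQLite database and stitches them into an event log with a UNION of SQL queries, one per activity. The table and column names (`po_header`, `invoices`) are hypothetical simplifications for illustration, not real SAP table definitions; in a real project you would write many such queries against the extracted source tables.

```python
import sqlite3

# Stand-ins for extracted raw tables; real SAP extracts would have
# coded columns and many more tables.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE po_header (po_id TEXT, created_at TEXT);
    CREATE TABLE invoices  (po_id TEXT, posted_at TEXT);
    INSERT INTO po_header VALUES ('PO-1001', '2024-03-01');
    INSERT INTO invoices  VALUES ('PO-1001', '2024-03-07');
""")

# Each SELECT maps one raw table to one activity; the UNION combines
# them into a single case_id / activity / timestamp event log.
rows = con.execute("""
    SELECT po_id AS case_id, 'Create Purchase Order' AS activity, created_at AS ts
    FROM po_header
    UNION ALL
    SELECT po_id, 'Post Invoice', posted_at
    FROM invoices
    ORDER BY case_id, ts
""").fetchall()

for case_id, activity, ts in rows:
    print(case_id, activity, ts)
```

Writing and maintaining dozens of these queries by hand is exactly where the manual effort of this option goes.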

Option 2: Using standard connectors

All mature process mining tools offer 'connectors.' These are bundles of SQL queries that format the data from a standard process. The advantage of these connectors is that they save a lot of time because you don't start with a blank sheet of paper. The disadvantage is that a standard process almost never exists. You'll have to manually correct for deviations from the standard after installing the connector. In practice, this often means resolving various error messages.

Option 3: Konekti: a no-code platform that produces event logs

In my opinion, the best option to transform the data into an event log is using an external tool. At McCoy, we like to work with the no-code platform Konekti. The biggest advantage of Konekti is that you don't need to be an SQL guru to create an event log. It also offers more flexibility than a standard connector: you can fully customize the data model to your specific situation. Finally, the generated event logs can also be used in other analysis tools.

In conclusion

As you can see, all three methods of data preparation have their own pros and cons. Whatever option suits your needs best, we find it essential to be able to quickly move on to the most important part of the project: identifying and realizing process optimizations.

Would you like to know more about making this choice? Or do you have other questions about process mining? Please visit our process mining page to discover more. Prefer to personally discuss this topic? Feel free to contact Tim van Lier.