The Celonis Process Query Language (PQL) Turned 5 in February

Celonis Engineering Blog

With our Process Query Language, better known as PQL, and its execution engine, the Celonis SaolaDB, turning 5 this February, we have reached a real milestone.

What are PQL and SaolaDB, and why are they important?

The Process Query Language is, together with the Celonis SaolaDB, the backbone of our Execution Management System (EMS) platform. Every time our customers use the Celonis EMS to analyze what is happening inside their operations and processes, and then to improve them, they are using PQL and its execution engine. Users formulate questions in PQL and put them to the Celonis SaolaDB, our in-memory database, which answers them. SaolaDB is highly optimized for process mining, combining state-of-the-art techniques from relational and graph databases. Questions in PQL are built from terms called “operators”. There are now more than 180 operators, ranging from process-specific functions to mathematical operators, which allow users to analyze all facets of a business process in detail, as well as to detect and implement process improvements.

But it’s when you combine operators that the magic happens. For example, it takes a combination of just three operators to find the optimal payment terms with your suppliers, and therefore the best terms to put in new supplier contracts. This maximizes cash discounts for on-time payment and preserves working capital by not paying too soon. In another use case, a sales-based organization wants to predict the likely revenue from all sales opportunities currently in the CRM. This also needs just three operators: AVG, CASE WHEN and CALC_THROUGHPUT. In other words, operators in PQL are building blocks you can combine to build queries for new use cases. It’s a bit like building with Lego.
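
To make the building-block idea concrete, here is a minimal Python sketch of three small functions that compose in the same spirit. It is not PQL and does not reproduce how SaolaDB evaluates queries: the function names merely echo the operators named above, and the event log, the activity names and the 14-day threshold are all made-up assumptions.

```python
from datetime import datetime
from statistics import mean

# Toy event log: (case id, activity, timestamp). All values are hypothetical.
EVENT_LOG = [
    ("A", "Create Opportunity", datetime(2021, 1, 1)),
    ("A", "Close Deal", datetime(2021, 1, 11)),
    ("B", "Create Opportunity", datetime(2021, 1, 5)),
    ("B", "Close Deal", datetime(2021, 2, 4)),
    ("C", "Create Opportunity", datetime(2021, 1, 7)),  # still open
]


def calc_throughput(log, start, end):
    """Days from the first `start` to the last `end` activity per case.

    Cases missing either activity map to None.
    """
    starts, ends = {}, {}
    for case, activity, ts in log:
        if activity == start:
            starts[case] = min(ts, starts.get(case, ts))
        elif activity == end:
            ends[case] = max(ts, ends.get(case, ts))
    cases = sorted({case for case, _, _ in log})
    return {
        c: (ends[c] - starts[c]).days if c in starts and c in ends else None
        for c in cases
    }


def case_when(values, predicate, then, otherwise):
    """Map each per-case value through one of two branches."""
    return {k: then(v) if predicate(v) else otherwise(v) for k, v in values.items()}


def avg(values):
    """Average of the per-case values, ignoring missing (None) entries."""
    present = [v for v in values.values() if v is not None]
    return mean(present) if present else None


# Compose the three blocks: share of closed deals that closed within 14 days.
throughput_days = calc_throughput(EVENT_LOG, "Create Opportunity", "Close Deal")
fast_flag = case_when(
    throughput_days,
    predicate=lambda d: d is not None and d <= 14,
    then=lambda d: 1,
    otherwise=lambda d: None if d is None else 0,
)
share_fast = avg(fast_flag)
print(throughput_days, share_fast)
```

Each function takes and returns plain per-case values, which is what lets them be stacked in any order; the open case C simply drops out of the average.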

PQL – designed to be efficient, designed to be fast

PQL was constructed to help you get to these sorts of insights faster and with fewer operators. We are also proud that the language helps produce answers for our customers extremely quickly – even when interrogating the largest databases.

Flexibility, speed and scale are some of the most important reasons why we opted to develop our own query language in the first place. New problems need new solutions. In 2014, we used a predecessor language of PQL which was an extension to SQL (Structured Query Language), a long-established, well-understood relational database query language. Back then, this extension compiled down to SQL, and the result was executed on top of a relational database. This provided reasonable flexibility for our purposes but lacked process-specific algorithms and delivered somewhat underwhelming performance. SQL, devised in the early 1970s, and more importantly general-purpose databases were not designed to look at processes. Building queries with SQL relevant to our customers’ needs required some workarounds and an amount of code that seemed bulky to us, and it produced response times that we felt could be significantly improved.

That’s what led us to establish four design goals for the new language. It had to be easy to use (especially versus SQL). It had to be event-log centered (an event log is the core of process mining and our new language focuses on it). It had to be flexible. And it had to be business focused: we were not looking to create an academic language.

Some of those goals are self-explanatory, but perhaps not the emphasis on event logs. Without event log awareness, any process mining tool is incomplete and lacks the full picture of what is really going on. Celonis focuses on event logs: they are built in and automatically applied, and therefore ensure full optimization capability.

The new PQL that emerged into the light in February 2016 is inspired by SQL but tailored to process-related queries. It is processed by our custom analytical database and can translate business questions into executable process queries. That gives us the advantage of owning and optimizing the code for our specific use cases, and the freedom to add the latest process mining algorithms. And, as you can see from this example, it is far more efficient than SQL in terms of the amount of code needed to generate the desired answer.

Figure: SQL vs. Celonis PQL

The latest PQL developments

Celonis PQL’s maturity and value to our customers are demonstrated every day by the thousands of users across multiple industries who apply it to various process types and vast amounts of event data, running millions of queries per day. At Celonis, we emphasize capturing and delivering the improvements our customers want. It’s a process of continual improvement that is never finished, of course, as new ideas and requirements arise all the time.

A real strength at Celonis is that we are driven by the world’s most extensive process mining user base: we have more than ten times as many users as our nearest competitor. And that means we get a lot of feedback. About a year ago, we identified and prioritized the next wave of enhancements, with almost all delivered during 2020. This included some great ideas, for example a range of important new operators like CURRENCY_CONVERT_SAP, which gives a view of the value in queries involving multiple currencies and allows for changing exchange rates over the period of the query. It dynamically converts all values within the scope of the query into a chosen target currency based on the exchange rate at a given conversion date. This means that if a document is, let’s say, three years old, we take the conversion rate from three years back to avoid making out-of-date comparisons.
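
The idea of converting at the rate that applied on the document date can be sketched in a few lines of Python. This is only an illustration of date-based rate lookup, not the CURRENCY_CONVERT_SAP operator or its signature; the rate table, the invoice records and the helper names `rate_on` and `convert` are hypothetical.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical EUR-per-USD rates, each valid from the given date onward.
# Must be sorted by date; the numbers are made up for illustration.
USD_TO_EUR = [
    (date(2018, 1, 1), 0.80),
    (date(2019, 1, 1), 0.87),
    (date(2020, 1, 1), 0.90),
]


def rate_on(rates, conversion_date):
    """Return the most recent rate valid on or before conversion_date."""
    valid_from = [d for d, _ in rates]
    idx = bisect_right(valid_from, conversion_date) - 1
    if idx < 0:
        raise ValueError(f"no exchange rate known for {conversion_date}")
    return rates[idx][1]


def convert(amount, conversion_date, rates=USD_TO_EUR):
    """Convert an amount using the rate that applied on conversion_date."""
    return round(amount * rate_on(rates, conversion_date), 2)


# Two invoices with the same USD amount but different document dates.
documents = [
    {"id": "INV-1", "amount_usd": 1000.0, "date": date(2018, 6, 15)},
    {"id": "INV-2", "amount_usd": 1000.0, "date": date(2020, 3, 1)},
]
converted = {doc["id"]: convert(doc["amount_usd"], doc["date"]) for doc in documents}
print(converted)
```

The two invoices end up with different EUR values even though their USD amounts match, which is exactly the effect of using each document's own conversion date rather than today's rate.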

We also significantly improved the PQL editor with compelling upgrades for people writing queries. These include auto-completion for table and column names, enhanced code diagnostics for real-time error reporting as you type, a zoom function (to look at code segments in closer detail), and context-sensitive suggestions while typing. There are also some great new ways of formatting code, easily accessed with shortcuts, to prettify (that’s what my colleagues call it) your query and see the structure more clearly. If that weren’t enough, we’ve also included code folding, so you can collapse sections and focus on the relevant part of the code, and warnings about deprecated code that will still return a valid answer but will not be supported in the future. The new editor is available in the Celonis Studio.

In terms of impact on business users, probably the most important new feature in PQL is the Multi-Event Log (MEL). Before MEL, analyses with PQL were confined to a single process, for example Purchase to Pay (P2P). MEL is a radical step forward because it allows you to interrogate multiple processes in one data model. Sticking with the purchasing example, that could be P2P, Order to Cash (O2C) and Accounts Payable (AP). By visualizing multiple processes as one sequential process, or as parallel processes, you get a complete process overview, showing how bottlenecks in one process, either upstream or downstream, affect performance in another. Parallel process visualization is possible within the Multi-Event Log Process Explorer, available in the Celonis Studio.
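
As a rough illustration of viewing several processes as one sequence, the Python sketch below interleaves two event logs by timestamp. It is not how the Multi-Event Log is implemented; the two toy logs, the shared `order-1` key that links them, and the `merged_log` helper are assumptions made for the example.

```python
from datetime import datetime
from heapq import merge

# Two toy event logs, each already sorted by timestamp.
# Tuple layout: (linking key, process, activity, timestamp). All values are hypothetical.
O2C = [
    ("order-1", "O2C", "Receive Sales Order", datetime(2021, 3, 1)),
    ("order-1", "O2C", "Ship Goods", datetime(2021, 3, 9)),
]
P2P = [
    ("order-1", "P2P", "Create Purchase Order", datetime(2021, 3, 2)),
    ("order-1", "P2P", "Receive Goods", datetime(2021, 3, 8)),
]


def merged_log(*logs):
    """Interleave several timestamp-sorted event logs into one sequence."""
    return list(merge(*logs, key=lambda event: event[3]))


combined = merged_log(O2C, P2P)
sequence = [f"{process}: {activity}" for _, process, activity, _ in combined]

# Time the sales order spent waiting on purchasing (upstream effect on O2C).
timestamps = {activity: ts for _, _, activity, ts in combined}
waiting_days = (timestamps["Receive Goods"] - timestamps["Create Purchase Order"]).days
print(sequence, waiting_days)
```

Once the events sit in one ordered sequence, a delay in the purchasing steps becomes visible as part of the time between receiving the sales order and shipping the goods.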

PQL gets older, faster and stronger

In the latest round of updates, we also made PQL faster and more robust. It was already way ahead of query tools based on SQL for process-specific queries, but we’ve now taken it to the next level. By streamlining queries and looking at edge cases, creating faster aggregations and upgrading queries and joins, we’ve made very large queries much faster. It’s even more resilient now too, with zero-downtime deployment (no outages) thanks to even higher availability of the data models. And by taking it all to the cloud, we can now continuously monitor for issues and optimize any bottlenecks.

Want to learn more about PQL?

Finally, no birthday celebration is complete without a look to the future, and we’ve made massive efforts to make it even easier for you to learn about PQL and to get up and running quickly. Please sign up for our training platform (it’s free) if you would like to take part in our two new first-step training courses: the 30-minute “PQL and the Celonis PQL Engine - An Introduction” and “Basic coding with PQL”, which covers basic process functions and aggregations in about 2.5 hours. And if you want to learn more about Celonis PQL, you can read “Celonis PQL: A Query Language for Process Mining”, try PQL for free in Celonis Snap, or contact us.

Martin Klenk and Team Celonis
Co-Founder and CTO // Engineering & Product

The best team is a diverse team, and the best team wins. That is #TeamCelonis.
