Process mining is a discipline that sits between data mining and business process management. Its starting point is an event log, which is analyzed to extract useful insights and recurrent patterns about how processes are executed within organizations. However, its concrete application is often hampered by the considerable preparation effort that human experts must invest to collect the data required for building a suitable event log. Indeed, event logs need to be extracted from different and heterogeneous data sources, often using customized extraction scripts whose implementation requires both technical and domain expertise. While this is recognized as a relevant issue in the process mining community, the solutions proposed in the literature tend to be ad hoc for particular application contexts, or insufficiently structured to be easily applied in practice. In this paper, we tackle this issue by proposing an interactive and general-purpose approach to support organizations in generating simulated event logs that can be employed to discover the structure of the data pipelines executed within a business process. A data pipeline is a composite workflow for processing data that is enacted as part of process execution. To assess the practical applicability of the approach, we show the results of a preliminary evaluation performed in a digital marketing scenario within the scope of the recently funded H2020 DataCloud project.
2022, 2022 IEEE 46th Annual Computers, Software, and Applications Conference (COMPSAC), Pages 1172-1177
An Interactive Approach to Support Event Log Generation for Data Pipeline Discovery (04b Conference paper in volume)
Benvenuti D., Falleroni L., Marrella A., Perales F.
Research group: Processes, Services and Software Engineering