PM: Methods in pm4py
Revision as of 15:16, 13 September 2025 by Onnowpurbo
The tables below compare the main process mining methods available in PM4Py, grouped by task, so that their differences can be seen at a glance.
Process Discovery Methods
| Method | Output Model | Pros | Cons | Best Use Case | 
|---|---|---|---|---|
| Alpha Miner | Petri Net | Simple, foundational, easy to explain | Very sensitive to noise/incomplete logs | Educational/demo purposes, very clean logs | 
| Heuristics Miner | Heuristics Net / Petri Net | Handles noise, considers frequency | May oversimplify rare behavior | Real-life logs with noise and high variability | 
| Inductive Miner | Petri Net / Process Tree / BPMN | Always produces sound models, block-structured | May abstract away some detail | General-purpose discovery, recommended default | 
| ILP Miner | Petri Net | Precise, mathematically grounded | Heavy computational cost | Small/medium logs where precision is critical | 
| DFG Discovery | Directly-Follows Graph | Very fast, intuitive visualization | Lacks formal semantics, not executable | Quick insights, dashboards | 
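To make the last row of the table concrete, the following is a minimal sketch of directly-follows graph discovery in plain Python. It is not PM4Py's implementation: the function `discover_dfg` and the `toy_log` data are hypothetical names written for this illustration, and the log is assumed to be a list of traces already ordered by timestamp. PM4Py provides its own function for this (`pm4py.discover_dfg` in recent releases; check the documentation for your version).

```python
from collections import Counter


def discover_dfg(traces):
    """Count directly-follows pairs, start activities, and end activities."""
    dfg = Counter()
    starts = Counter()
    ends = Counter()
    for trace in traces:
        if not trace:
            continue  # an empty trace contributes nothing
        starts[trace[0]] += 1
        ends[trace[-1]] += 1
        # Each pair of consecutive events is one directly-follows relation.
        for a, b in zip(trace, trace[1:]):
            dfg[(a, b)] += 1
    return dfg, starts, ends


# Hypothetical toy log: one inner list per case, ordered by timestamp.
toy_log = [
    ["register", "check", "approve"],
    ["register", "check", "reject"],
    ["register", "check", "approve"],
]

dfg, starts, ends = discover_dfg(toy_log)
for (a, b), count in sorted(dfg.items()):
    print(f"{a} -> {b}: {count}")
```

The sketch shows why DFG discovery is fast (a single pass over the log) and also why it lacks formal semantics: the result is only a set of frequency-annotated arcs, with no notion of concurrency or choice.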
Conformance Checking Methods
| Method | Pros | Cons | Best Use Case | 
|---|---|---|---|
| Token-Based Replay | Fast, intuitive, easy to compute | Less precise, may misrepresent deviations | Quick conformance estimation | 
| Alignment-Based Checking | Very precise, finds optimal matches | Computationally expensive for large logs | Audit scenarios, compliance checking | 
| Log Skeleton | Lightweight, structural conformance | Not as expressive as Petri net alignments | Quick structural validation | 
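The idea behind token-based replay can be sketched on a tiny Petri net. The code below is a simplified illustration, not PM4Py's algorithm: it assumes every activity maps to exactly one visible transition, has no silent transitions, and uses a single start and end place. The names `NET` and `token_replay_fitness` are hypothetical. Fitness is computed with the usual formula 0.5 · (1 − missing/consumed) + 0.5 · (1 − remaining/produced).

```python
from collections import Counter

# Hypothetical sequential net: start -a-> p1 -b-> p2 -c-> end.
# Each transition maps to (input places, output places).
NET = {
    "a": (["start"], ["p1"]),
    "b": (["p1"], ["p2"]),
    "c": (["p2"], ["end"]),
}


def token_replay_fitness(trace, net, initial="start", final="end"):
    """Replay one trace and return its token-based fitness in [0, 1].

    Assumes every activity in the trace is a key of `net`.
    """
    marking = Counter({initial: 1})
    produced, consumed, missing = 1, 0, 0  # the initial token counts as produced
    for activity in trace:
        inputs, outputs = net[activity]
        for place in inputs:
            if marking[place] == 0:
                missing += 1          # force the transition by adding a token
                marking[place] += 1
            marking[place] -= 1
            consumed += 1
        for place in outputs:
            marking[place] += 1
            produced += 1
    # The environment consumes the token from the final place.
    if marking[final] == 0:
        missing += 1
        marking[final] += 1
    marking[final] -= 1
    consumed += 1
    remaining = sum(marking.values())
    return 0.5 * (1 - missing / consumed) + 0.5 * (1 - remaining / produced)


fit_ok = token_replay_fitness(["a", "b", "c"], NET)
fit_skip = token_replay_fitness(["a", "c"], NET)  # activity "b" is skipped
print(fit_ok, fit_skip)
```

Skipping `b` leaves one token stranded in `p1` and requires one missing token in `p2`, which lowers the fitness. An alignment would instead report the exact deviation (a model move on `b`), which is why alignments are more precise but costlier to compute.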
Performance Analysis
| Technique | Pros | Cons | Best Use Case | 
|---|---|---|---|
| Sojourn / throughput times | Easy to interpret, highlights bottlenecks | Needs reliable timestamp data | Detecting slow activities | 
| Time annotations on arcs | Visual enrichment of models | Only as good as the log quality | Identifying bottlenecks in process paths | 
| Case duration analysis | Summarizes case lifetimes | Doesn’t explain internal causes | SLA monitoring | 
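Two of the techniques above, case duration analysis and time annotations on arcs, reduce to simple timestamp arithmetic. The following sketch uses only the standard library. The event data and the function names (`group_by_case`, `case_durations_hours`, `mean_arc_hours`) are hypothetical, and each event is assumed to carry a single completion timestamp.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical event log: (case id, activity, completion timestamp).
EVENTS = [
    ("c1", "register", "2025-01-01 09:00"),
    ("c1", "check", "2025-01-01 10:00"),
    ("c1", "approve", "2025-01-01 13:00"),
    ("c2", "register", "2025-01-02 09:00"),
    ("c2", "check", "2025-01-02 11:00"),
    ("c2", "approve", "2025-01-02 12:00"),
]


def group_by_case(events):
    """Group events per case and sort each case by timestamp."""
    cases = defaultdict(list)
    for case_id, activity, ts in events:
        cases[case_id].append((datetime.strptime(ts, "%Y-%m-%d %H:%M"), activity))
    for trace in cases.values():
        trace.sort()
    return cases


def case_durations_hours(cases):
    """Case duration: time between the first and last event of each case."""
    return {
        case_id: (trace[-1][0] - trace[0][0]).total_seconds() / 3600
        for case_id, trace in cases.items()
    }


def mean_arc_hours(cases):
    """Mean elapsed time on each directly-follows arc."""
    waits = defaultdict(list)
    for trace in cases.values():
        for (t1, a), (t2, b) in zip(trace, trace[1:]):
            waits[(a, b)].append((t2 - t1).total_seconds() / 3600)
    return {arc: mean(values) for arc, values in waits.items()}


cases = group_by_case(EVENTS)
durations = case_durations_hours(cases)
arc_times = mean_arc_hours(cases)
print(durations)
print(arc_times)
```

With only one timestamp per event, the arc time mixes waiting time and service time of the target activity. Separating the two requires both start and completion timestamps, which is the "reliable timestamp data" caveat in the table.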
Other Techniques
| Method | Pros | Cons | Best Use Case | 
|---|---|---|---|
| Trace Variants Analysis | Simple, shows different execution paths | Can explode with many variants | Exploratory analysis | 
| Trace Clustering | Groups similar behaviors | Choice of clustering algorithm impacts results | Finding behavior patterns | 
| Predictive Monitoring (via ML) | Anticipates outcomes, remaining time | Needs feature engineering, external ML models | Predictive SLA, early-warning systems | 
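Trace variants analysis amounts to counting distinct activity sequences. The sketch below is a plain-Python illustration with hypothetical names (`trace_variants`, `coverage`, `toy_log`). It also computes how many cases the most frequent variants cover, which is a common way to cope with variant explosion.

```python
from collections import Counter


def trace_variants(traces):
    """Count how many cases follow each distinct activity sequence."""
    return Counter(tuple(trace) for trace in traces)


def coverage(variants, top_k):
    """Share of all cases covered by the top_k most frequent variants."""
    total = sum(variants.values())
    covered = sum(count for _, count in variants.most_common(top_k))
    return covered / total if total else 0.0


# Hypothetical toy log: one inner list per case, ordered by timestamp.
toy_log = [
    ["register", "check", "approve"],
    ["register", "check", "reject"],
    ["register", "check", "approve"],
]

variants = trace_variants(toy_log)
for variant, count in variants.most_common():
    print(count, " -> ".join(variant))
print("top-1 coverage:", coverage(variants, 1))
```

In real logs a handful of variants often covers most cases, so filtering to the top variants before discovery is a typical exploratory step.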
Key Takeaways:
- If you want robust discovery → use Inductive Miner.
- If you need fast visualization → use DFG Discovery.
- For compliance checks → prefer Alignment-Based Checking.
- For real-life noisy data → use Heuristics Miner.