I am a Professor of Data Science at the University of Eastern Finland. Before that, I was a senior researcher and head of the Data Mining area at the Databases and Information Systems department. I am also an Adjunct Professor (docent) of computer science at the University of Helsinki.
My main research interest is in Algorithmic Data Analysis. In particular, I am interested in applying matrix and tensor factorizations over non-standard algebras – for example, Boolean or Tropical algebras – to data mining problems. Modelling data mining problems, such as subgraph discovery, as matrix factorization problems allows us to utilize existing work from these seemingly unrelated fields and gives novel insights when developing new methods.
My other main branch of research is redescription mining. I am particularly interested in the applications of redescription mining to other fields of science, such as biology, materials science, and political science. Increasing the applicability of redescription mining or matrix and tensor methods requires advances in interactive and visual data mining; my research on interaction and visualisation naturally connects to the above topics.
Saskia Metzler, Pauli Miettinen. Random Graph Generators for Hyperbolic Community Structures.
Proc. 7th International Conference on Complex Networks and Their Applications, 680–693.
10.1007/978-3-030-05411-3_54
[pdf (Springer) | manuscript | source code]
Sanjar Karaev, Saskia Metzler, Pauli Miettinen. Logistic-Tropical Decompositions and Nested Subgraphs.
Proc. 14th International Workshop on Mining and Learning with Graphs (MLG '18).
[manuscript | pdf (workshop)]
Esther Galbrun, Pauli Miettinen. Mining Redescriptions with Siren.
ACM Transactions on Knowledge Discovery from Data, 12(1), 6.
10.1145/3007212
[pdf (ACM) | manuscript | source code]
Stefan Neumann, Pauli Miettinen. Reductions for Frequency-Based Data Mining Problems.
Proc. 2017 IEEE International Conference on Data Mining (ICDM '17), 997–1002.
10.1109/ICDM.2017.128
[manuscript | tech. rep. | source code | slides]
Sergey Paramonov, Daria Stepanova, Pauli Miettinen. Hybrid Approach to Constraint-based Pattern Mining.
Proc. 11th International Conference on Rules and Reasoning (RR Rule-ML), 199–214.
10.1007/978-3-319-61252-2_14
[manuscript | pdf (Springer)]
Esther Galbrun, Pauli Miettinen. Analysing Political Opinions Using Redescription Mining.
Proc. 2016 IEEE International Conference on Data Mining Workshops (ICDMW '16), Data Mining in Politics (DMiP) Workshop, 2016, 422–427.
10.1109/ICDMW.2016.0066
[manuscript | pdf (IEEE) | Workshop | more information]
Janis Kalofolias, Esther Galbrun, Pauli Miettinen. From Sets of Good Redescriptions to Good Sets of Redescriptions.
Proc. 2016 IEEE International Conference on Data Mining (ICDM '16), 2016, 211–220.
10.1109/ICDM.2016.0032
[manuscript | pdf (IEEE) | source code]
Saskia Metzler, Stephan Günnemann, Pauli Miettinen. Hyperbolae Are No Hyperbole: Modelling Communities That Are Not Cliques.
Proc. 2016 IEEE International Conference on Data Mining (ICDM '16), 2016, 330–339.
10.1109/ICDM.2016.0044
[tech. rep. | manuscript | pdf (IEEE) | source code]
Stefan Neumann, Rainer Gemulla, Pauli Miettinen. What You Will Gain By Rounding: Theory and Algorithms for Rounding Rank.
Proc. 2016 IEEE International Conference on Data Mining (ICDM '16), 2016, 380–389.
10.1109/ICDM.2016.0049
[tech. rep. | manuscript | pdf (IEEE) | source code]
Nelson Mukuze, Pauli Miettinen. Interactive Constrained Boolean Matrix Factorization.
Proc. KDD 2016 Workshop on Interactive Data Exploration and Analytics (IDEA '16).
[manuscript | Workshop]
Sanjar Karaev, Pauli Miettinen. Cancer: Another Algorithm for Subtropical Matrix Factorization.
Proc. 2016 European Conference on Machine Learning and Principles and Practice of Knowledge Discovery (ECML PKDD '16), LNCS vol. 9852, 576–592.
10.1007/978-3-319-46227-1_36
[manuscript | pdf (Springer) | source code]
Sanjar Karaev, Pauli Miettinen. Capricorn: An Algorithm for Subtropical Matrix Factorization.
Proc. 2016 SIAM International Conference on Data Mining (SDM '16), 702–710.
10.1137/1.9781611974348.79
[manuscript | pdf (SIAM) | source code]
Tetiana Zinchenko, Esther Galbrun, Pauli Miettinen. Mining Predictive Redescriptions with Trees.
Proc. 2015 IEEE International Conference on Data Mining Workshop (ICDMW '15), 1672–1675.
10.1109/ICDMW.2015.123
[manuscript | pdf (IEEE) | more information]
Pauli Miettinen. Generalized Matrix Factorizations as a Unifying Framework for Pattern Set Mining: Complexity Beyond Blocks.
Proc. 2015 European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD '15), LNCS vol. 9285, 36–52.
10.1007/978-3-319-23525-7_3
[manuscript | pdf (Springer) | slides]
Sanjar Karaev, Pauli Miettinen, Jilles Vreeken. Getting to Know the Unknown Unknowns: Destructive-Noise Resistant Boolean Matrix Factorization.
Proc. 2015 SIAM International Conference on Data Mining (SDM '15), 325–333.
10.1137/1.9781611974010.37
[manuscript | pdf (SIAM) | source code | presentation]
Pauli Miettinen. Interactive Data Mining Considered Harmful (If Done Wrong).
Proc. KDD 2014 Workshop on Interactive Data Exploration and Analytics (IDEA), 85–87.
[pdf]
Dóra Erdős, Pauli Miettinen. Walk'n'Merge: A Scalable Algorithm for Boolean Tensor Factorization.
Proc. 13th IEEE International Conference on Data Mining (ICDM '13), 1037–1042.
10.1109/ICDM.2013.141
[pdf (IEEE) | manuscript | tech. rep. | source code]
Dóra Erdős, Pauli Miettinen. Discovering Facts with Boolean Tensor Tucker Decomposition.
Proc. 2013 ACM International Conference on Information and Knowledge Management (CIKM '13), 1569–1572.
10.1145/2505515.2507846
[pdf (ACM) | manuscript | data]
Ervina Cergani, Pauli Miettinen. Discovering Relations using Matrix Factorization Methods.
Proc. 2013 ACM International Conference on Information and Knowledge Management (CIKM '13), 1549–1552.
10.1145/2505515.2507841
[pdf (ACM) | manuscript]
Jan Ramon, Pauli Miettinen, Jilles Vreeken. Detecting Bicliques in GF[q].
Proc. 2013 European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD '13), 509–524.
[pdf (Springer) | manuscript]
Pauli Miettinen. Fully Dynamic Quasi-Biclique Edge Covers via Boolean Matrix Factorizations.
Proc. 1st ACM SIGMOD Workshop on Dynamic Networks Management and Mining (DyNetMM '13).
10.1145/2489247.2489250
[pdf (ACM) | manuscript]
Esther Galbrun, Pauli Miettinen. A Case of Visual and Interactive Data Analysis: Geospatial Redescription Mining.
ECML PKDD '12 Workshop on Instant Interactive Data Mining (IID '12).
[manuscript | Workshop]
Esther Galbrun, Pauli Miettinen. Siren: An Interactive Tool for Mining and Visualizing Geospatial Redescriptions – Demo.
Proc. 18th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD-2012), 1544–1547.
10.1145/2339530.2339776
[pdf (ACM) | manuscript | more information]
Esther Galbrun, Pauli Miettinen. From Black and White to Full Colour: Extending Redescription Mining Outside the Boolean World.
Proc. 2011 SIAM International Conference on Data Mining (SDM2011), 546–557.
[journal version | pdf (SIAM) | manuscript | source code]
Pauli Miettinen. Sparse Boolean Matrix Factorizations.
Proc. 10th IEEE International Conference on Data Mining (ICDM2010), 935–940.
10.1109/ICDM.2010.93
[pdf (IEEE) | slides]
Fabrizio Grandoni, Anupam Gupta, Stefano Leonardi, Pauli Miettinen, Piotr Sankowski, Mohit Singh. Set Covering with Our Eyes Closed.
Proc. 49th Annual IEEE Symposium on Foundations of Computer Science (FOCS), 347–356.
[pdf (IEEE)]
Proc. European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD), Part I, Lecture Notes in Artificial Intelligence 5211, Springer, 17.
10.1007/978-3-540-87479-9_15
[pdf (Springer)]
Saara Hyvönen, Pauli Miettinen, Evimaria Terzi. Interpretable Nonnegative Matrix Decompositions.
Proc. 14th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (KDD), 345–353.
10.1145/1401890.1401935
[pdf (ACM) | slides]
Arianna Gallo, Pauli Miettinen, Heikki Mannila. Finding Subgroups having Several Descriptions: Algorithms for Redescription Mining.
Proc. SIAM International Conference on Data Mining (SDM), 334–345.
[pdf (SIAM)]
Pauli Miettinen, Taneli Mielikäinen, Aristides Gionis, Gautam Das, Heikki Mannila. The Discrete Basis Problem.
Knowledge discovery in databases: PKDD 2006 – 10th European conference on principles and practice of knowledge discovery in databases, Berlin, Germany, September 2006, Lecture Notes in Artificial Intelligence 4213, Springer, 335–346.
(PKDD Best Paper)
[journal version | manuscript | source code]
Theses
Pauli Miettinen. Matrix Decomposition Methods for Data Mining: Computational Complexity and Algorithms.
Publications of Department of Computer Science, A-2009-4, Department of Computer Science, University of Helsinki, 2009 (Ph.D. thesis, monograph).
Certificate of Recognition, ACM SIGKDD Doctoral Dissertation Award, 2010.
[pdf]
Pauli Miettinen. The Discrete Basis Problem.
Report C-2006-010, Department of Computer Science, University of Helsinki, 2006 (M.Sc. thesis).
[pdf | source code]
Other writings
Sanjar Karaev, James Hook, Pauli Miettinen. Latitude: A Model for Mixed Linear–Tropical Matrix Factorization.
arXiv:1801.06136 [cs.LG].
[pdf (arXiv) | source code]
Stefan Neumann, Pauli Miettinen. Reductions for Frequency-Based Data Mining Problems.
arXiv:1709.00900 [cs.CC].
[pdf (arXiv) | source code]
Sanjar Karaev, Pauli Miettinen. Algorithms for Approximate Subtropical Matrix Factorization.
arXiv:1707.08872 [cs.LG].
[pdf (arXiv) | source code]
Stefan Neumann, Rainer Gemulla, Pauli Miettinen. What You Will Gain By Rounding: Theory and Algorithms for Rounding Rank.
arXiv:1609.05034 [cs.DM].
[pdf (arXiv)]
Saskia Metzler, Stephan Günnemann, Pauli Miettinen. Hyperbolae Are No Hyperbole: Modelling Communities That Are Not Cliques.
arXiv:1602.04650 [cs.SI].
[pdf (arXiv)]
Pauli Miettinen, Jilles Vreeken. MDL4BMF: Minimum Description Length for Boolean Matrix Factorization.
Research Report MPI-I-2012-5-001, Max-Planck-Institut für Informatik.
[pdf | source code]
Pauli Miettinen. A review of Mathematical Tools for Data Mining: Set Theory, Partial Orders, Combinatorics by Dan A. Simovici and Chabane Djeraba.
SIGACT News 42(2), 43–46.
10.1145/1998037.1998049
[pdf (ACM)]