Motivation and General Description
A major challenge for third generation data mining and knowledge discovery systems is the integration of distributed data and knowledge resources, which are highly diverse in representation and data format, with the computer systems that process them (tools for data integration, data mining and knowledge discovery). First generation data mining systems supported a single algorithm or a small collection of algorithms designed to mine attribute-valued data. Today's second generation systems support high performance interfaces to databases and data warehouses, and provide increased scalability and functionality; for example, second generation systems can mine larger and more complex data sets and provide increased flexibility by supporting a data mining schema and a data mining query language.
The emerging third generation data mining and knowledge discovery systems should be able to mine distributed and highly heterogeneous data found on intranets, extranets, grids or clouds, and integrate efficiently with operational data/knowledge management and data mining systems. Another important feature is advanced support for the user. The key technologies that will make third generation data mining and knowledge discovery possible are:
- slim, composable and interoperable implementations of data mining and knowledge discovery tools available as services on the Web or grid,
- meta-data (semantic annotations) repositories of all the different ingredients (data, data mining and application specific services, application areas, workflows, results, available expertise) of successful data mining applications. This includes human-coded knowledge as well as cases of previous data mining applications and machine-induced (meta-learning) knowledge,
- intelligent discovery assistance to facilitate and (partially) automate the construction of knowledge discovery workflows (of service compositions), resulting in faster, cheaper and improved usage of data mining,
- KD-specific social network, marketplace and cloud computing solutions to offer, find, share, sell, execute, and distribute all the above ingredients and results of data mining applications.
Lines of research covered by the workshop
Of particular interest are methods and proposals that address the following issues:
- Theoretical framework for third generation data mining and knowledge discovery
- Inductive databases, constraint-based data mining and inductive queries
- Service-oriented approaches to data mining
- Meta-level annotations and search for data mining services
- Multiple-source learning or learning from heterogeneous data including text & images
- Integrating prior knowledge (probabilities, ontologies) into data mining
- Data mining ontologies, in particular novel ontological representation schemes for handling quantitative data and data streams
- Data mining workflows/scenarios
- Data mining on the grid and cloud computing
- KDD-Process Modeling and Business Process Modeling
- Intelligent Discovery Assistance, e.g. planning to construct KD-workflows
- Generic and application-specific design patterns for KD-workflows
- Exploitation of ontologies of tasks and methods
- Representation and usage of learning goals and states in machine learning
- Retrieval, adaptation and reuse of KD-workflows
- KD control knowledge, i.e. when to (not) use which service
- Meta-learning and exploitation of meta-knowledge
- Ontologies/Semantic Repositories for KD-services, -workflows and -experiments
Other areas may be covered, provided they are relevant to the overall aims of the workshop.
Workshop Date and Proceedings
The Workshop will take place during ECML/PKDD 2011 in Athens on Friday, 9th September 2011 from 10:30 till 13:45.
Goals and Target Audience
This workshop intends to gather contributions supporting third generation data mining and knowledge discovery, elaborating a service-oriented approach to information fusion, for the needs of exploratory data analysis in the framework of inductive databases, enriched with ontology information available from the Web.
Given the growing amount of information available on the net, this workshop should be of interest to knowledge engineers, as well as students, researchers and practitioners interested in data mining and advanced methods for knowledge discovery. The workshop will also concern researchers in databases, automated planning and software engineering, for whom data mining is an “application area”. Finally, it should attract researchers and practitioners in Semantic Web technologies, as the ultimate fulfillment of a truly Semantic Web resides in the possibility of extracting not only readily available information but also deep knowledge in the form of underlying patterns and regularities.
The aim of the workshop is to explore the possibilities of this new area, offer a forum for exchanging ideas and experience concerning the state of the art, bring in knowledge gathered in different but related areas, and outline new directions for research. It is expected that the workshop will help to create a sub-community of DM researchers and practitioners interested in exploring these new avenues for DM problems, and thus help to advance the research on and potential of this new type of KD system.
History of this workshop and related events
This workshop merges two successful workshop series, the PlanLearn workshops and the SoKD workshops:
- The first PlanLearn-07 Workshop (proceedings) was organized at ECML/PKDD-2007 in September 2007 in Warsaw, Poland.
- The second PlanLearn-08 Workshop (proceedings) was organized at ICML/COLT/UAI 2008 in Helsinki, Finland.
- The third PlanLearn-10 Workshop (proceedings) was organized at ECAI 2010 in Lisbon, Portugal.
- The first SoKD 2008 Workshop (proceedings) was organized at ECML/PKDD-2008 in September 2008 in Antwerp, Belgium.
- The second SoKD 2009 Workshop (proceedings) was organized at ECML/PKDD-2009 in September 2009 in Bled, Slovenia.
- The third SoKD 2010 Workshop (proceedings) was organized at ECML/PKDD-2010 in September 2010 in Barcelona, Spain.
The PlanLearn workshops can be seen as a continuation of the series of workshops on Meta-learning (e.g. at ICML-05, ECML-00, ICML-99 etc.).
The organization of this workshop is partially supported by the European Community 7th framework program ICT-2007.4.4 under grant number 231519 "e-Lico: An e-Laboratory for Interdisciplinary Collaborative Research in Data Mining and Data-Intensive Science".
Workshop organizers (Program Chair / Co-Chairs)
- Jörg-Uwe Kietz, University of Zurich, Switzerland
- Simon Fischer, Rapid-I, Germany
- Nada Lavrac, Jozef Stefan Institute, Slovenia
- Vid Podpecan, Jozef Stefan Institute, Slovenia
Program Committee
- Abraham Bernstein, University of Zurich (Switzerland)
- Alexandros Kalousis, University of Geneva (Switzerland)
- Carlos Soares, LIAAD, University of Porto (Portugal)
- Christophe Giraud-Carrier, Brigham Young University (USA)
- Hendrik Blockeel, Leuven University (Belgium)
- Joaquin Vanschoren, Leiden University (Netherlands)
- Jörg-Uwe Kietz, University of Zurich (Switzerland)
- Katharina Morik, University of Dortmund (Germany)
- Michael Berthold, Konstanz University (Germany)
- Nada Lavrac, Jozef Stefan Institute (Slovenia)
- Pavel Brazdil, University of Porto (Portugal)
- Saso Dzeroski, Jozef Stefan Institute (Slovenia)
- Simon Fischer, Rapid-I GmbH (Germany)
- Stefan Rüping, FhG-IAIS (Germany)
- Filip Zelezny, Czech Technical University (Czech Republic)