Monitoring Business Activity

Knowledge Discovery for Activity Monitoring

We study how knowledge discovery, data mining, and machine learning technologies can be used to find patterns in data for the purpose of monitoring activity.  While this project focuses on intelligence applications (e.g., business intelligence, competitive intelligence, government intelligence), the technologies apply much more broadly: to fraud detection, stock monitoring, customer relationship management, and so on.  In particular, activity monitoring often involves complicated and incompletely understood relationships between entities, and the entities and relationships themselves can be described at varying degrees of specificity.  In this work, we concentrate on the problem of learning patterns from related entities in a time-varying environment, so that monitoring can alert users to important events.

Typical flat-file, accuracy-based pattern learning is ill suited to finding important patterns in these domains.  Pattern learning must be able to capitalize on the relationships between entities, on the attributes of related entities, and on changes over time, both in the data streams and in the web of related entities.  Furthermore, these problems share other characteristics that make them difficult for traditional pattern learning.  The volume of data is huge, but the number of interesting training examples (positive examples) may be small, and traditional algorithms have trouble in such situations.  Being able to analyze explicitly the tradeoff between false alarms and misses is crucial.  Unlike many seemingly similar applications such as document classification, but similar to situations like fraud detection, it may be important to have a very low miss rate, even if that means analysts have to deal with large numbers of false alarms.  In applications such as these, producing good rankings of cases can be more useful than straight classification.
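To make the tradeoff between false alarms and misses concrete, the following is a minimal sketch, not a method taken from this project.  It assumes each case already has a numeric score from some learned model, ranks the cases by that score, and reports the false-alarm rate and miss rate at every possible alert cutoff.  The function names `tradeoff_curve` and `cheapest_cutoff`, and the convention that label 1 marks a positive example, are hypothetical choices made for illustration.

```python
from typing import List, Tuple

# Each curve point is (score cutoff, false-alarm rate, miss rate).
CurvePoint = Tuple[float, float, float]


def tradeoff_curve(scores: List[float], labels: List[int]) -> List[CurvePoint]:
    """Rank cases by score and report error rates at each alert cutoff.

    Alerting on every case with score >= cutoff yields the false-alarm
    rate (negatives alerted / all negatives) and the miss rate
    (positives not alerted / all positives) stored for that cutoff.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative example")

    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    curve: List[CurvePoint] = []
    true_alarms = 0
    false_alarms = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            true_alarms += 1
        else:
            false_alarms += 1
        # Emit a point only once all cases tied at this score are counted.
        last_of_tie = i + 1 == len(ranked) or ranked[i + 1][0] != score
        if last_of_tie:
            curve.append((score, false_alarms / n_neg, 1 - true_alarms / n_pos))
    return curve


def cheapest_cutoff(curve: List[CurvePoint], max_miss_rate: float) -> CurvePoint:
    """Return the point with the fewest false alarms among those that
    keep the miss rate at or below max_miss_rate."""
    feasible = [point for point in curve if point[2] <= max_miss_rate]
    return min(feasible, key=lambda point: point[1])
```

With a ranking in hand, an analyst can choose the operating point after the fact: asking `cheapest_cutoff` for a miss rate of zero returns the cutoff that catches every known positive, along with the false-alarm rate that must be accepted to do so.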

Finally, especially because of the small number of positive training examples, it is essential to involve human experts in the process.  Experts can inject background knowledge in several ways, and different ways require pattern-learning algorithms to accept background knowledge to different degrees.  For example, active learning techniques allow experts to label particularly useful data points without having to understand the internal workings of the learned model.  On the other hand, comprehensible models can make it easier to involve domain experts in interactive learning.
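As an illustration of how active learning can direct an expert's labeling effort, here is a minimal sketch of uncertainty sampling, one common active learning strategy and not necessarily the one used in this project.  It assumes a model has already assigned each unlabeled case an estimated probability of being positive.  The function name `select_for_labeling` and the case identifiers are hypothetical.

```python
from typing import Dict, List


def select_for_labeling(probabilities: Dict[str, float], budget: int) -> List[str]:
    """Pick the cases the current model is least sure about.

    probabilities maps a case identifier to the model's estimated
    probability that the case is positive.  Cases whose estimate is
    closest to 0.5 are returned first, up to `budget` cases.
    """
    by_uncertainty = sorted(
        probabilities,
        key=lambda case_id: abs(probabilities[case_id] - 0.5),
    )
    return by_uncertainty[:budget]
```

The expert sees only the selected cases and supplies labels for them; nothing about the model's internals needs to be exposed.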

Relevant Publications

  • Activity Monitoring
  • Probability estimation and ranking for classification tasks
  • The knowledge discovery process
  • Learning with relational knowledge
     

Activity Monitoring

Probability estimation and ranking for classification tasks

The knowledge discovery process

Learning with relational knowledge