
Projects and Research Groups

NOTE: This list is under development and will be expanded in due course. If you (or your group) would like to be added to this list, please send us a message.

People

Groups

  • BioMinT is an EU-funded research project (2003-2005)
    • Aim: developing tools for content-based and knowledge-intensive information retrieval and extraction.
    • Applications: annotation of Swiss-Prot and PRINTS proteomics databases
    • Methods:
      1. IR: Query expansion + Ranking
        1. query is a protein or gene name
        2. expand it with synonyms drawn from 14 different databases
        3. generate and execute PubMed query
        4. retrieve documents, filter and rank by relevance
      2. Named Entity Recognition (recognition of Biological Entities), and IE
        1. evaluation of external tools: Yapex, KeX, GAPSCORE
        2. learning approaches for species classification
        3. plan to train a generic shallow parser over GENIA
      3. providing results as database slot fillers
  • TextPresso
    • IR, IE and QA
    • interface based on simple IR queries, or a category-based interface
      • works on text that has been pre-annotated (how?)
    • IE planned, not yet available
    • not using learning (markup done manually?)
    • one simple domain (C. elegans)
    • Corpus of 2700 papers and 16000 abstracts
    • open-source, freely available
  • PASTA: result of an EPSRC project (1998-2001), described recently in Bioinformatics
    • IE system (MUC style)
    • focusing on the role of amino acid residues in protein active sites
    • pipeline: tokenization, POS tagging, NE recognition, parsing, discourse interpretation and template extraction; the templates are then used to fill a relational database
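
To make the BioMinT IR steps listed above (expand a protein or gene name, generate a PubMed query, filter and rank the retrieved documents) more concrete, here is a minimal sketch. It is an illustration, not BioMinT's implementation. The synonym table is a toy stand-in for the 14 databases, the function names are hypothetical, ranking by raw term count is an assumed placeholder for whatever relevance measure the project used, and no query is actually sent to PubMed.

```python
from typing import Dict, List

# Toy synonym table (an assumption): BioMinT drew on 14 databases instead.
SYNONYMS: Dict[str, List[str]] = {
    "TP53": ["p53", "tumor protein p53"],
}


def expand_query(name: str, synonyms: Dict[str, List[str]]) -> List[str]:
    """Return the name followed by its known synonyms, without duplicates."""
    terms = [name]
    for synonym in synonyms.get(name, []):
        if synonym.lower() not in (t.lower() for t in terms):
            terms.append(synonym)
    return terms


def build_pubmed_query(terms: List[str]) -> str:
    """OR the quoted terms together, restricted to title and abstract."""
    return " OR ".join(f'"{term}"[Title/Abstract]' for term in terms)


def rank_by_term_count(docs: List[str], terms: List[str]) -> List[str]:
    """Drop documents that mention no term; sort the rest by total mentions.

    Mentions are counted as case-insensitive substring matches, so a
    term contained in another term is counted for both.
    """
    def score(doc: str) -> int:
        text = doc.lower()
        return sum(text.count(term.lower()) for term in terms)

    matching = [doc for doc in docs if score(doc) > 0]
    return sorted(matching, key=score, reverse=True)


if __name__ == "__main__":
    terms = expand_query("TP53", SYNONYMS)
    print(build_pubmed_query(terms))
    abstracts = ["TP53 mutation", "unrelated text", "p53 binds DNA; p53 is mutated; p53"]
    print(rank_by_term_count(abstracts, terms))
```

The generated string uses PubMed's `[Title/Abstract]` field tag; in a real pipeline it would be submitted to PubMed and the returned abstracts passed to the ranking step.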