A short introduction to "text mining"

The study of a given domain of interest can be divided into two phases:

  1. acquisition,
  2. then analysis of the collected information.

Acquisition usually relies on various software tools and techniques (search engines, smart agents, push, ...), on either a one-off or a systematic basis.

The next step consists of sorting, classifying and archiving the information, for immediate or later use. Then comes its analysis. One objective of the whole process is to monitor how knowledge evolves. Acquisition techniques are constantly improving, and the resulting volume of information keeps growing. As a result, while it is still possible to present large amounts of text to decision-makers, efficient computer-based support for decision-making is still in its infancy: data can be sorted and tagged to make it easier for people to perceive and understand, but weak signals must also be detected. In recent years, several text mining tools have explored this direction; all of them attempt to replace the reading of text with graphical representations: thematic maps, knowledge trees, dendritic maps, ...

Once the novelty effect has vanished, many users realize that:

It is at this very point that the Calliope© text mining tool shines: it adds a dynamic dimension to the thematic maps, helping the User to interpret the maps that represent the successive time periods of a domain of interest. The ultimate goal is to sort the noteworthy terms of the many documents into three categories reflecting how their importance within the text evolves: emerging, stable or declining. This dynamic analysis certainly does not replace the User's own appreciation and judgment; it simply suggests possible considerations and paths to explore. The mere question "why is this term actually gaining ground?" is a good stimulus for thought, and an improvement over simply viewing a static map. Calliope supports the User in the search for answers through the interactivity between its maps, trend curves and document retrieval.
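The three-way sorting of terms described above can be illustrated with a minimal sketch. It is not Calliope's actual method, which this text does not describe; it simply assumes that a term's "importance" in a period is its share of all words in that period, and that the trend is the least-squares slope of that share across periods. The names `term_shares`, `slope`, `classify_terms` and the `threshold` parameter are all hypothetical.

```python
from collections import Counter


def term_shares(periods):
    """Return {term: [share of all words in each period]}.

    `periods` is a list of time periods, each a list of document strings.
    """
    counts = [
        Counter(word for doc in docs for word in doc.lower().split())
        for docs in periods
    ]
    totals = [sum(c.values()) or 1 for c in counts]  # avoid division by zero
    vocabulary = set().union(*counts)
    return {t: [c[t] / n for c, n in zip(counts, totals)] for t in vocabulary}


def slope(values):
    """Least-squares slope of `values` against the period index 0, 1, 2, ..."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    den = sum((x - mean_x) ** 2 for x in range(n))
    return num / den if den else 0.0


def classify_terms(periods, threshold=0.01):
    """Label each term as emerging, stable or declining (assumed criterion)."""
    labels = {}
    for term, shares in term_shares(periods).items():
        s = slope(shares)
        if s > threshold:
            labels[term] = "emerging"
        elif s < -threshold:
            labels[term] = "declining"
        else:
            labels[term] = "stable"
    return labels


# Three toy periods with one short document each.
periods = [
    ["coal coal coal wind"],
    ["coal coal wind solar"],
    ["coal wind solar solar"],
]
print(classify_terms(periods))
```

On this toy corpus, "coal" loses share from period to period, "wind" keeps the same share and "solar" gains share, so they fall into the declining, stable and emerging categories respectively. A real tool would also need to select the noteworthy terms first (for instance by discarding common words), a step this sketch leaves out.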