Named Entities in the Cognitive Workbench

The Cognitive Workbench database goes beyond storing real-world entities and their meta-information. Using natural language processing technologies, the Cognitive Workbench extracts references from the unstructured text itself: times and people, geographic names such as cities, company names, diseases, drugs, genes, and potentially many other categories. These too are stored in the Cognitive Workbench, with relations to the documents from which they were extracted and to each other if they appear together in the text.
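As a rough illustration of this kind of storage, the sketch below keeps extracted entities in a small in-memory graph, linking each entity to the documents it came from and to the entities it co-occurs with. The class name `EntityGraph`, its methods, and the sample data are hypothetical and are not part of the Cognitive Workbench API. The entities are assumed to have been extracted already by some NLP step.

```python
from collections import defaultdict
from itertools import combinations


class EntityGraph:
    """Hypothetical in-memory stand-in for an entity graph database."""

    def __init__(self):
        self.mentions = defaultdict(set)   # doc_id -> entities found in it
        self.documents = defaultdict(set)  # entity -> doc_ids mentioning it
        self.cooccurs = defaultdict(set)   # entity -> entities seen alongside it

    def add_document(self, doc_id, entities):
        """Store already-extracted (category, name) entities for one document."""
        entities = set(entities)
        for entity in entities:
            self.mentions[doc_id].add(entity)
            self.documents[entity].add(doc_id)
        # Entities that appear in the same text are related to each other.
        for a, b in combinations(sorted(entities), 2):
            self.cooccurs[a].add(b)
            self.cooccurs[b].add(a)


# Illustrative data only; the entity extraction itself is assumed.
graph = EntityGraph()
graph.add_document("mail-1", [("person", "Alice"), ("city", "Berlin")])
graph.add_document("mail-2", [("person", "Alice"), ("drug", "aspirin")])

alice = ("person", "Alice")
print(sorted(graph.documents[alice]))  # documents that mention Alice
print(sorted(graph.cooccurs[alice]))   # entities that appear together with Alice
```

A real graph database would persist these relations and index them for querying. The dictionaries here only show the shape of the data.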

Using this information, the graph database can serve as a basis for working with the knowledge itself, rather than with documents. This can be used either to take intelligent preemptive actions on behalf of users, such as suggesting a calendar entry in response to an email confirming an appointment, or to produce new knowledge by observing unexpected patterns or making possible deductions.

For example, the small graph database in the figure below shows how the Cognitive Workbench can use data on users’ mobile devices to make deductions. Treating the graph as a set of linked propositions, the Cognitive Workbench can use generic graph-based reasoning algorithms to infer that the user is going on vacation on June 1st, and suggest a calendar entry and email notification to that effect:

Figure: Named Entities in the Cognitive Workbench
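To make the idea of reasoning over linked propositions concrete, here is a minimal sketch that stores facts as subject-predicate-object triples and applies one hand-written rule to them. The facts, predicate names, the year, and the function `suggest_vacation` are all invented for illustration. The actual contents of the graph and the reasoning algorithms used by the Cognitive Workbench are not described here.

```python
from datetime import date

# Hypothetical propositions, as they might be extracted from a user's email.
# The year is arbitrary; only "June 1st" comes from the example in the text.
triples = {
    ("email-17", "mentions_event", "flight-booking"),
    ("flight-booking", "departs_on", date(2024, 6, 1)),
    ("flight-booking", "destination", "Mallorca"),
    ("Mallorca", "is_a", "holiday destination"),
}


def objects(triples, subject, predicate):
    """Return all objects linked to `subject` by `predicate`."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def suggest_vacation(triples):
    """Rule: a trip with a departure date and a holiday destination is a vacation."""
    suggestions = []
    for subject, predicate, departure in triples:
        if predicate != "departs_on":
            continue
        for destination in objects(triples, subject, "destination"):
            if "holiday destination" in objects(triples, destination, "is_a"):
                suggestions.append({
                    "action": "add_calendar_entry",
                    "title": f"Vacation in {destination}",
                    "date": departure,
                })
    return suggestions


suggestions = suggest_vacation(triples)
for suggestion in suggestions:
    print(suggestion)
```

A generic reasoner would not hard-code the rule in a function like this, but the chaining of propositions across shared nodes is the same.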

The linguistic processing performed in the Cognitive Workbench can take place either dynamically, only when the user takes some action that triggers it, or in advance, when data is imported into the Cognitive Workbench. Choosing between the two involves trade-offs:

  • The Cognitive Workbench is more responsive when linguistic processing has been done in advance.
  • Reasoning over an already-constructed graph tends to give better results, since algorithms have the time to consider many more possible relations between entities in the graph.
  • Preprocessing data at import time makes the Cognitive Workbench graph database much larger, and therefore it consumes more of the resources of a mobile device.

In many cases, compromises between the two approaches are possible. A preprocessed database of news items, for example, makes it possible to link immediately from any news story mentioning a public figure to all stored news items related to him or her. Processing a large news archive on the fly every time the user reads a single article would be impractical. On the other hand, an archive of news articles will contain many references to time, few if any of which are relevant to the task of finding related news stories. An application for news stories might therefore store all named entities referring to persons in the graph database, but find time references on the fly.
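The compromise described for news stories could be sketched as follows: person mentions are indexed once at import time, while time references are located only when a particular article is opened. Everything here is an assumption made for illustration: the `NewsArchive` class, the fixed `KNOWN_PERSONS` list (a crude stand-in for real named entity recognition), and the simple regular expression for dates.

```python
import re
from collections import defaultdict

# Stand-in for a real person recognizer: a fixed list of names (assumption).
KNOWN_PERSONS = {"Angela Merkel", "Barack Obama"}

# Deliberately simple pattern for dates such as "June 1st" or "May 3".
DATE_PATTERN = re.compile(
    r"\b(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December)\s+\d{1,2}(?:st|nd|rd|th)?\b"
)


class NewsArchive:
    def __init__(self):
        self.articles = {}                    # article_id -> text
        self.person_index = defaultdict(set)  # person -> article_ids

    def import_article(self, article_id, text):
        """Preprocessing step: index person mentions when data is imported."""
        self.articles[article_id] = text
        for person in KNOWN_PERSONS:
            if person in text:
                self.person_index[person].add(article_id)

    def related(self, article_id):
        """Fast lookup: other articles that share a person with this one."""
        related = set()
        for ids in self.person_index.values():
            if article_id in ids:
                related |= ids
        related.discard(article_id)
        return related

    def time_references(self, article_id):
        """Dynamic step: find time references only when the article is read."""
        return DATE_PATTERN.findall(self.articles[article_id])


archive = NewsArchive()
archive.import_article("a1", "Angela Merkel met Barack Obama on June 1st.")
archive.import_article("a2", "Barack Obama spoke in Chicago on May 3.")
archive.import_article("a3", "Weather report for July 4th.")

print(archive.related("a1"))
print(archive.time_references("a1"))
```

The person index grows with the archive, which is the storage cost mentioned above, whereas the time references cost nothing until an article is actually read.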
