How To Tell Stuff To A Computer


Making Knowledge Representation Easier

One rather surprising thing about knowledge representation is that one of the simplest ways to represent knowledge, first order logic, can also be the most powerful. (There are some systems that are better for dealing with uncertain or "fuzzy" information, but apart from that, first order logic is pretty powerful as far as KR systems go.)

Unfortunately, this power comes at a cost. In certain ways it can make knowledge representation unnecessarily hard:

Efficiency: As we explored in the last chapter, performing resolution in first order logic is slow, and once the data becomes too complex it is not guaranteed to produce an answer at all.
Psychology: When humans organize knowledge in their heads, they like to bunch objects into groups and hierarchies in order to make the information more usable. First order logic is a very "flat" system: anything can have any relationship to anything else. This can turn a database of knowledge into "spaghetti" that humans find difficult to work with.
Fudgeability: First order logic can be very unforgiving. Even storing simple facts, like "all dogs have teeth", can be impossible because you just know that somewhere, somehow, you're bound to run across a dog without teeth. This isn't really a theoretical problem with first order logic, because you can always break dogs down into toothed_dogs and toothless_dogs, but it would be impractical to do that as a general rule. It would be nicer to have some way of saying "dogs typically have teeth", and then, if we run across an exception, simply state that the unfortunate canine is an exception to the rule.

Many systems have been developed that address one or more of these difficulties by giving up some of the power of first order logic in exchange for other advantages. One restriction that is important for A.I. programmers is to require that any or statement have no more than one item in it that does not have a not in front of it. For esoteric reasons, it is dramatically more efficient to reason with these types of statements, called Horn clauses. The computer language Prolog is based on the use of Horn clauses.
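To make the idea of a Horn clause a little more concrete, here is a minimal sketch in Python. It assumes a simplified setting with no variables, where each rule has the form "if all of these facts hold, then this one fact holds". The facts and rule names (dog, mammal, can_chew and so on) are invented for illustration, and the simple loop below is not how Prolog itself searches for answers (Prolog works backward from a question); it only shows why rules with a single conclusion are easy to reason with.

```python
# Each rule is a Horn clause written as (body, head):
# "if every fact in body holds, then head holds".
# Written as an "or" statement this is: not b1 or not b2 or ... or head,
# so the head is the only item without a "not" in front of it.
RULES = [
    (("dog",), "mammal"),
    (("mammal",), "animal"),
    (("dog", "has_teeth"), "can_chew"),
]


def forward_chain(facts, rules):
    """Return every fact that can be derived from `facts` using `rules`."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            # Fire the rule once all of its conditions are already known.
            if head not in known and all(b in known for b in body):
                known.add(head)
                changed = True
    return known


derived = forward_chain({"dog", "has_teeth"}, RULES)
print(sorted(derived))
# → ['animal', 'can_chew', 'dog', 'has_teeth', 'mammal']
```

Because every rule concludes exactly one fact, the loop never has to consider alternatives such as "either this holds or that holds"; it only ever adds facts until nothing new can be added.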

Another approach is to arrange all the data using objects. Roughly speaking, by forcing our data into a hierarchy of objects, we make it easier for the computer to perform resolution on it and far easier for humans to understand it, since humans are very adept at thinking about information that has been structured in such a way. These approaches also retain enough of the power of first order logic to remain useful. In the next section we'll explore the use of objects in representing data in greater detail.

Representing Knowledge With Objects

Using objects to represent data is a natural process for humans. The basic steps of this process are the following:

Take all individuals that we need to keep track of and place them into different buckets based on how similar they are to each other. Each bucket is given a descriptive name based on what objects it contains.
Since the individuals in a given bucket are at least somewhat similar, we can avoid needing to describe every inconsequential detail about each individual. Instead, properties that are common to all individuals in a bucket can just be assigned to the entire bucket at once. Properties are typically either primitive values (such as numbers or text strings) or references to other buckets.
Some buckets will be more similar to each other than others and we can arrange the buckets into a hierarchy based on the similarity.
If all buckets in a branch in the tree of buckets share a property, the information can be further simplified by assigning the property only to the parent bucket. Other buckets (and individuals) are said to inherit that property.
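The steps above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the Bucket class, its get method, and the example animals and properties are all hypothetical, and for simplicity an individual (fido) is treated as a bucket with no children.

```python
class Bucket:
    """A named bucket with its own properties and an optional parent bucket."""

    def __init__(self, name, parent=None, **properties):
        self.name = name
        self.parent = parent
        self.properties = properties

    def get(self, key):
        """Look up a property here, then in each ancestor (inheritance)."""
        bucket = self
        while bucket is not None:
            if key in bucket.properties:
                return bucket.properties[key]
            bucket = bucket.parent
        raise KeyError(key)


# A small hierarchy: each property is stored once, on the highest
# bucket whose whole branch shares it.
animal = Bucket("animal", alive=True)
mammal = Bucket("mammal", parent=animal, has_fur=True)
dog = Bucket("dog", parent=mammal, sound="woof")
fido = Bucket("fido", parent=dog, owner="Alice")  # an individual

print(fido.get("owner"))    # stored on fido itself
print(fido.get("sound"))    # inherited from dog
print(fido.get("alive"))    # inherited from animal
```

Nothing about fur or being alive is written down for fido directly; the lookup simply walks up the tree until it finds a bucket that has the property.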

Depending on the type of object technology we are dealing with, buckets may go by different names: Classes, Frames, and Nodes are some common terms for this idea.

The earliest form of this idea was the network diagram, originally a mostly informal drawing of nodes (our buckets) with arrows representing the relationships between them:

A Network Diagram

This is clearly a very intuitive way to represent information. However, it is not a very rigorous way to represent data. For instance, the diagram above says that "A nose is an organ" and that "A dog nose is a nose". Even though both involve an "is a" relationship, these are, intuitively, two different types of "is a" relationship (since the first statement is a new piece of information, whereas the other can be roughly deduced from other parts of the picture). Network diagrams do not, in their basic form, offer a mechanism for representing these more subtle details of relationships.
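One way to see the limitation is to write the diagram down as data. The sketch below is a hypothetical encoding, assuming each arrow is stored as a (source, label, target) triple; only the two "is a" statements quoted above are included, and the function name is_a_chain is invented for this example. It also assumes each node has at most one "is a" arrow and that the arrows contain no cycles.

```python
# Each arrow in the diagram becomes one (source, label, target) triple.
EDGES = [
    ("nose", "is a", "organ"),
    ("dog nose", "is a", "nose"),
]


def is_a_chain(node, edges):
    """Follow "is a" arrows upward from `node` and return the nodes reached."""
    chain = []
    current = node
    while True:
        parents = [t for s, label, t in edges
                   if s == current and label == "is a"]
        if not parents:
            return chain
        current = parents[0]  # assumes at most one "is a" arrow per node
        chain.append(current)


print(is_a_chain("dog nose", EDGES))
# → ['nose', 'organ']
```

Both triples carry exactly the same label, so nothing in the data lets a program treat one "is a" arrow differently from the other. Any distinction between the two has to live in the head of the person reading the diagram.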

Eventually, these ideas spread in many directions and are now most commonly found in three different incarnations. First of all, computer language designers used these ideas to help structure computer algorithms, building computer languages such as Simula, Smalltalk, and Java. These languages are called object-oriented languages. The remaining two incarnations of this idea, which we will discuss next, evolved out of the A.I. community and are of more interest to us in terms of their potential for representing knowledge.

Neat and Scruffy Objects

The most important problem all A.I. researchers face is beyond dispute: A.I. is really, really hard. When faced with such difficult problems, researchers can take one of two approaches. One is to keep the design of the software stupid simple, simple enough that any difficulties remain manageable. The other is to use mathematical techniques to overcome problems through intellectual rigor.

Because of this, the A.I. community has had two approaches to objects. The first is the pragmatic approach, concerned with practical, imperfect methods for performing useful tasks such as recognizing things in pictures, scheduling delivery routes, and other jobs where a solution doesn't need to be perfect, merely good enough. The basic object approach these pragmatists developed that is of interest to knowledge researchers is frame-based reasoning.

The other approach is more theoretical, focused on strict logical and symbolic methods for solving problems. This approach led to the concept of description logics. Both of these object methodologies are based on the idea of using abstract objects to categorize our information. Since these two ideas are still being refined, they continue to evolve towards each other: frame-based systems become more rigorous as ways are found to formalize them mathematically, whereas description logics become more pragmatic as they incorporate more ideas from frame-based systems.
