BOOLEAN LOGIC
Judith Seeley
LIS 4311 Information Access and Retrieval
November 9, 2000
Many of our computer databases use Boolean logic as the basis for querying the database. Boolean logic has a much
older history than most computer users imagine. It is helpful to understand the background and theory behind this
concept, because this theory is the foundation on which contemporary computer science and information technology
have been built.
George Boole was an English mathematician. Born in 1815, he had no formal higher education, but had a natural gift
for mathematics. He studied the works of Newton and of other mathematicians of the eighteenth and nineteenth centuries.
He began submitting papers to mathematical journals and in 1844 was awarded a medal for a paper discussing ways in
which algebra and calculus could be combined and applied to several other disciplines. In 1847, Boole wrote a paper
entitled "The Mathematical Analysis of Logic" (Smith, 1993). His premise in this paper was the relationship of logic and
mathematics; he thought logic was more properly associated with mathematics than with philosophy.
Boole received a professorship in mathematics at Queen’s College, Cork, in Ireland, based on his writings. In his most
famous work, "An Investigation of the Laws of Thought, on Which Are Founded the Mathematical Theories of Logic and
Probabilities," published in 1854, Boole wanted to separate logic from philosophy and combine it with
algebra as a science unto itself (Smith, 1993).
Boole analyzed the mechanics of human reasoning, and the result of this analysis became the underlying principle of
information retrieval, whether manual, mechanical, or electronic. Boole believed that reasoning involved either the
addition of different concepts to form more complex concepts or the separation of complex concepts into smaller,
simpler concepts (Smith, 1993). Various parts of speech, such as nouns and adjectives, were classes of objects;
additions and deletions of concepts were represented by plus (+), minus (-), and times (x). He believed language was
the important tool in thinking and reasoning. Boole related all the elements of language to the process of reasoning and
represented them using algebraic symbols (Smith, 1993).
Boole’s algebra is different from conventional algebra in two significant ways because he changed basic algebraic
concepts to fit his analysis of human reasoning. The first major difference is that Boole’s algebra is a binary system,
where there are only two possible values for any algebraic symbol. Any symbol will only have a value of 1 or 0. The
second difference is the use of the signs +, -, and x. In the Boolean system, these signify the combining or excluding of
concepts. The process of adding simpler ideas to form more complex ideas was described as the "operation of
aggregating parts into a whole where each part or concept is different from the other, joined together by the
conjunctions ‘or’ or ‘and,’ both of which are compared to the plus sign in algebra" (Smith, 1993).
The most interesting thing about this theory is that "or" and "and" both seem to mean the opposite of what they would
normally mean in daily English usage. In ordinary usage, when words are connected with "or," the result is reduced, as in
the example, apples or oranges, meaning one or the other, but not both. In Boolean logic, apples or oranges would mean
either one or the other or both; the result is enlarged. In daily usage, "and" increases the result when it connects words.
In Boolean logic, "and" reduces the result, because only what satisfies all of the connected words is included.
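The contrast can be made concrete with a small truth table. The sketch below is illustrative only; the function names are assumptions of this example, not Boole's notation. It compares the everyday "one or the other, but not both" reading of "or" with the inclusive Boolean "or," alongside the Boolean "and."

```python
# Illustrative sketch: the function names are hypothetical, chosen for this example.

def everyday_or(a: int, b: int) -> int:
    """'One or the other, but not both' (exclusive or)."""
    return a ^ b

def boolean_or(a: int, b: int) -> int:
    """Boolean 'or': one, the other, or both (inclusive or)."""
    return a | b

def boolean_and(a: int, b: int) -> int:
    """Boolean 'and': 1 only when both values are 1."""
    return a & b

# One row per combination of the two binary values:
# (a, b, everyday or, Boolean or, Boolean and)
rows = [(a, b, everyday_or(a, b), boolean_or(a, b), boolean_and(a, b))
        for a in (0, 1) for b in (0, 1)]

for row in rows:
    print(row)
```

The two readings of "or" differ only in the last row, where both values are 1.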
Another Boolean idea is that of excluding concepts. Complementation, as it is called, separates a part from the whole,
and Boole used the minus sign to indicate this operation. He used the term "except" when discussing this concept; an
object may be excluded from the whole with the use of this term. In Boolean logic today, the word "not" is used to
indicate this operation.
Boole also discussed the idea of a collection of things or classes of objects (Smith, 1993). This led to the
algebra of sets. Sets may have a finite or infinite number of items; overlapping sets have items common to more than
one set. The Boolean operators "or," "and," and "not" operate on sets to define which items belong to the resulting set.
"Or" represents the union (addition) of sets using the plus sign. With two sets, A and B, the defined set will be all items
in set A and all items in set B, without repeating any items. When some items are members of both sets, the number of
items in the resulting set is less than the sum of the number of items in set A and the number of items in set B.
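The counting rule for a union can be checked with a short sketch. The sets and their items below are made up for illustration; the example only shows that the union is smaller than the sum of the two sets by exactly the number of shared items.

```python
# Hypothetical example sets; the items are made up for illustration.
set_a = {"apples", "oranges", "pears"}
set_b = {"oranges", "pears", "plums", "grapes"}

union = set_a | set_b      # "or": every item in A or B, listed once
overlap = set_a & set_b    # items that are members of both sets

print(len(set_a) + len(set_b))  # sum of the two set sizes: 7
print(len(union))               # size of the union: 5

# The union is smaller than the sum by exactly the number of shared items.
assert len(union) == len(set_a) + len(set_b) - len(overlap)
```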
"And" represents the intersection of sets. This set is defined as only items that belong to both sets. "Not" represents the
exclusion of some of the items in a set.
In computer searching, "or" can result in retrieving a number of citations that is less than the sum of the sets, since
some of the citations may be common to both sets. "And" used in the computer search requires that all concepts joined
by "and" be part of the citations retrieved. "Not" removes from the results any citations that contain the excluded
concept.
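A minimal sketch of how the three operators behave in a search follows. The citation numbers, the concepts, and the helper `citations_for` are all hypothetical; no real retrieval system is being described.

```python
from typing import Dict, Set

# Hypothetical citations: each number maps to the concepts it is indexed under.
CITATIONS: Dict[int, Set[str]] = {
    1: {"boolean", "logic"},
    2: {"boolean", "retrieval"},
    3: {"logic", "philosophy"},
    4: {"boolean", "logic", "retrieval"},
}

def citations_for(concept: str) -> Set[int]:
    """Return the set of citation numbers indexed under a concept."""
    return {num for num, concepts in CITATIONS.items() if concept in concepts}

boolean_set = citations_for("boolean")      # {1, 2, 4}
logic_set = citations_for("logic")          # {1, 3, 4}
retrieval_set = citations_for("retrieval")  # {2, 4}

or_result = boolean_set | logic_set        # boolean OR logic
and_result = boolean_set & logic_set       # boolean AND logic
not_result = and_result - retrieval_set    # (boolean AND logic) NOT retrieval

print(sorted(or_result))   # [1, 2, 3, 4]
print(sorted(and_result))  # [1, 4]
print(sorted(not_result))  # [1]
```

Here the "or" search retrieves four citations, not six, because citations 1 and 4 belong to both sets.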
George Boole’s work brought him recognition in his lifetime, but a practical application for his theories did not come
about until 1938. Claude E. Shannon, a research assistant at the Massachusetts Institute of Technology, discovered the
similarity between telephone switching circuits and Boole’s algebra. In Boole’s binary system, the values are either 1 or
0, similar to the telephone circuits, which are either on or off. Shannon found the Boolean algebra useful in explaining
the electrical circuits; he presented a paper at the American Institute of Electrical Engineers (AIEE) in 1938. Shannon’s
work with Boolean logic and electrical circuits became the foundation for the computer technologies that came later
(Smith, 1993).
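The correspondence between switches and Boolean values can be sketched as follows. This assumes the convention used in this paper, that 1 stands for a switch that is on (closed) and 0 for one that is off (open); the function names are hypothetical. Under that assumption, two switches wired in series behave like "and," and two switches wired in parallel behave like "or."

```python
# Illustrative model: a switch is 1 when on (closed) and 0 when off (open).

def series(a: int, b: int) -> int:
    """Current passes two switches in series only if both are on."""
    return a & b

def parallel(a: int, b: int) -> int:
    """Current passes two switches in parallel if either one is on."""
    return a | b

# Print every combination: switch a, switch b, series result, parallel result.
for a in (0, 1):
    for b in (0, 1):
        print(a, b, series(a, b), parallel(a, b))
```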
Boolean algebra was applied to the indexing and storing of information in the late 1940s and early 1950s; scientists
were addressing the problems of indexing the huge amount of information and research that had been produced during
World War II. Mortimer Taube and his associates from Documentation Incorporated were contracted by the U.S. Armed
Services Technical Information Agency; their task was to develop new methods of storing and retrieving this vast
amount of scientific information. Taube’s work was known as coordinate indexing (Smith, 1993).
Coordinate indexing is based on the premise that pieces of information are represented by descriptors (index terms)
that describe the content of the information; each significant word is indexed as a descriptor. This leads to a Boolean
binary system, where the descriptor is present or is not present in a document. This binary system uses the principles
of Boolean algebra. Selecting the appropriate descriptors will retrieve the documents that contain the descriptors.
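The present-or-absent idea can be illustrated with a small table of 1s and 0s. The descriptors, the document names, and the function `coordinate` are invented for this sketch and are not taken from Taube's system; the sketch only shows how selecting several descriptors retrieves the documents in which all of them are present.

```python
from typing import List

# Hypothetical descriptors and documents, made up for illustration.
# Each descriptor has one binary value per document:
# 1 = descriptor present, 0 = descriptor absent.
DOCUMENTS = ["doc1", "doc2", "doc3", "doc4"]
PRESENCE = {
    "radar":     [1, 0, 1, 1],
    "antennas":  [1, 1, 0, 1],
    "shipboard": [0, 0, 0, 1],
}

def coordinate(*descriptors: str) -> List[str]:
    """Return the documents in which every selected descriptor is present."""
    hits = []
    for position, name in enumerate(DOCUMENTS):
        if all(PRESENCE[d][position] == 1 for d in descriptors):
            hits.append(name)
    return hits

print(coordinate("radar", "antennas"))               # ['doc1', 'doc4']
print(coordinate("radar", "antennas", "shipboard"))  # ['doc4']
```

Adding a descriptor to the selection can only keep the result the same or make it smaller, which is the narrowing effect of "and."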
Taube developed the "term index," where the search is done using specific terms, and the "item index," where the
search is done using specific documents. Mechanized indexing and retrieval of information was a natural outgrowth of
coordinate indexing. IBM adding machines, Rapid Selectors and Univacs were used for these early mechanized efforts;
these machines were sensitive to punched holes, dots, or magnetic impulses. All these machines operated on the theory
of Boolean algebra when retrieving desired information (Smith, 1993).
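One plausible reading of the "term index" and the "item index" is sketched below: an item index lists the descriptors of each document, and a term index lists the documents for each descriptor. This layout is an assumption made for illustration, since the physical arrangement of Taube's indexes is not described here, and the report names and descriptors are invented.

```python
from collections import defaultdict

# Hypothetical "item index": each document lists its descriptors.
item_index = {
    "report-1": ["radar", "antennas"],
    "report-2": ["antennas", "ships"],
    "report-3": ["radar", "ships"],
}

# Derive a "term index": each descriptor lists the documents that carry it.
term_index = defaultdict(list)
for document, descriptors in item_index.items():
    for descriptor in descriptors:
        term_index[descriptor].append(document)

print(term_index["radar"])     # search by term: ['report-1', 'report-3']
print(item_index["report-2"])  # search by document: ['antennas', 'ships']
```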
The Science Citation Index and the Social Sciences Citation Index are early examples of published printed
indexes. Other printed indexes, such as Biological Abstracts and the Monthly Catalog of United States Government
Publications, followed. In the 1960s, computers were introduced; the U.S. government was the first to use computers for
information retrieval. Dialog, the first commercial information retrieval system, started in 1972 (Smith, 1993).
Much of the confusion in setting up a Boolean query may stem from the fact that Boole’s "and" and "or" operators are
the opposite of the way we use these words in our daily life. People who have not been exposed to the theory of
Boolean logic may not use the terms properly, and the system may not be able to interpret their queries
(Korfhage, 1997). A study done in Hawaii by Diane Nahl and Violet Harada (Tenopir, 1997) found that students often
confused the operators, completely neglected to use them, omitted necessary concepts when using "and," and added
unnecessary items to their queries. They believe, however, that Boolean searching may be more precise if beginners
learn to use the system to their advantage. Nahl and Harada recommend teaching "Boolean thinking" and encouraging
students to understand how search engines apply Boolean logic (Tenopir, 1997).
Part of the reason I chose this topic to research was that I sometimes became confused as to what my Boolean query
should be in order to maximize my search results. I realize my uncertainty was due to my lack of training in the
Boolean logic process. This has been a valuable learning experience for me.
REFERENCES
Adler, I. (1961). Thinking Machines: A Layman’s Introduction to Logic, Boolean Algebra and Computers. New York: The
New American Library.
Korfhage, R. (1997). Information Storage and Retrieval. New York: John Wiley & Sons, Inc.
Shannon, C. (1938). A symbolic analysis of relay and switching circuits. Transactions of the American Institute of
Electrical Engineers, 57, 713-723.
Smith, E. (1993, June). On the shoulders of giants: From Boole to Shannon to Taube: The origins and development of
computerized information from the mid-19th century to the present. Information Technology and Libraries, 12,
217-226.
Tenopir, C. (1997, May). Common end user errors. Library Journal, 122, 31-32.