Classification Theory and the Internet: A Move Toward Multidimensional Classification
Susan Irwin
University of Denver
March 6, 2001
Human beings seem to have an
innate need to organize (Taylor, 1999). The need to organize large amounts of
knowledge and information led to the development of classification schemes and
other organizational tools. Rapid changes in the way information is generated
and accessed lead to changes in the way information is organized for retrieval
(Chan, 1995).
The first edition of the Dewey Decimal
Classification (DDC) was published in 1876, when knowledge was expanding at a
fast rate and the public library system was becoming a more open system.
The DDC provided a new organizational scheme for dealing with these
changes. Today, the Internet allows information to be generated and packaged in
digital format. Can library classification improve access and retrieval of
Internet resources? This paper
examines the purpose of library classification and explores how classification
schemes can add value to Internet collections.
“Regardless
of the nature of the information resource, the need to express its content,
describe its format, facilitate its access, and enable its use remains
constant” (Dillon & Jul, 1996 p.212-13). Library classification schemes
have four main purposes. First, they order the fields of knowledge in a
systematic way. Second, they bring related items together in a helpful sequence.
Third, they provide orderly access to the shelves either for browsing or via the
catalog. Finally, they provide an exact location for an item on the shelves (Dittmann
& Hardy, 2000 p.8).
A
library is primarily a store of information packages (Taylor, 1999).
Taylor defines an “information package” as an instance of recorded
information such as a book, article, video, Internet document or electronic
journal. Classification schemes bring systematic order and control to the
collection so that an information package can be retrieved according to a
particular aspect of its character. Library classification schemes also improve
retrieval through enhanced subject access via Online Public Access Catalogs. A
successful classification scheme is one that saves the time of the user by
creating an order convenient to the user (Mortimer, 2000).
Like
a library, the Internet is a store of information packages, but without
systematic order. Lynch (1997) claims that the Internet is merely a huge amount
of disorganized data, a "chaotic repository for the collective output of
the world's digital printing presses." Search engines furnish access and
retrieval points for Internet resources but not all of them allow users to
navigate using subject categories. Since each search engine has its own
vocabulary and organizational scheme, a search using one search engine may turn
up too many hits while the same search using a different search engine may
result in no hits at all. The
advantage of library classification in this situation is that it can create
cohesion across diverse information stores by establishing a shared conceptual
context (Albrechtsen & Jacob, 1998 p. 310).
Classification systems increase access by grouping, or
collocating, information packages together according to their subject content.
The major classification systems are founded on General Classification
Theory (Marcella & Newton, 1994). General
Classification Theory is based on four principles:
(1) Every thing, object, etc. has to have a distinct and unambiguous description of its unique qualities.
(2) Principles involving likeness and distinctness must be used in creating classes.
(3) Hierarchies and other relational methods are necessary in order to group fundamental characteristics and to identify fundamental differences clearly.
(4) The final system should appear as a logical progression from general to particular (Richmond, 1990).
As a result, classification
schemes define subject content through schedules of classes, divisions and
sections that move from the general to the specific (Mortimer, 2000).
By establishing relationships between information packages,
classification systems supply context and furnish end-users with a way to
discover potentially helpful items.
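The general-to-specific progression can be illustrated with DDC notation, in which each added digit narrows the subject. The following is a minimal sketch, not part of any cited source: the function name ddc_hierarchy is hypothetical, and the code assumes that every digit corresponds to exactly one level of the hierarchy, which simplifies the real schedules.

```python
def ddc_hierarchy(number: str) -> list[str]:
    """Return the chain of DDC-style notations from main class down to `number`.

    Assumption: each digit adds exactly one hierarchical level. Real DDC
    schedules contain exceptions to this notational hierarchy.
    """
    base, _, decimals = number.partition(".")
    base = base.zfill(3)  # DDC notation has at least three digits
    chain = [
        base[0] + "00",   # main class
        base[:2] + "0",   # division
        base,             # section
    ]
    # Every further decimal digit narrows the subject by one more step.
    for i in range(1, len(decimals) + 1):
        chain.append(f"{base}.{decimals[:i]}")
    # Drop repeats (e.g. when the number is itself a main class), keeping order.
    return list(dict.fromkeys(chain))


print(ddc_hierarchy("025.431"))
```

Read from left to right, the resulting list moves from the broadest class to the most specific one, which is the progression that principle (4) describes.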
Not all Internet search engines use the same controlled vocabulary or
organizational scheme. Some of the organizational schemes have foundations in
library classification schemes such as DDC, while others "have adopted or
invented rough-and-ready subject classification schemes" (Dillon & Jul,
1996 p. 199). The Online Computer Library Center compared 50 subject categories
from Yahoo to the DDC and to the Library of Congress Classification (LCC) to
determine how well library classification schemes compare to Internet
classifications in terms of general topic coverage (Vizine-Goetz, n.d.). Both
the DDC and the LCC have “sufficiently wide topic coverage for classifying
Internet resources” (MacLennan, 2000).
Once
the relationships between the classes, subclasses, divisions and sections are
established, library classification systems assign notation to each. Notation is
used to assign call numbers to individual information packages.
The call number assigned works as a collocation device.
For a library, this means that information packages are ordered according
to call number. Information packages that are related in some manner are shelved
together using this number. This
allows patrons to browse related items in a specific area of the stacks.
Although browsing is a popular way to use a library, there are limits to the
effectiveness of browsing a library's shelves.
In a library, non-book items that are shelved separately, as well as
items in circulation and Internet resources, would not be brought to the
attention of the browser. Instead,
the patron would have to browse each collection's location. Patrons may miss
relevant information because they only browsed one section.
In a catalog's online environment, browsing becomes more powerful because
it shows the whole collection (Chan, 1995) and allows browsing the
classification numbers of all information packages. “Classification browsing
in the online catalogue [sic] is …the only way that users can bring together
all of the various formats now available in our libraries” (Elrod, 2000 p.23).
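Classification browsing in an online catalog can be sketched as a sort over class numbers that ignores format and shelf status. The example below is illustrative only: the CatalogRecord type, the browse function, and all sample records are hypothetical, and the class numbers are assumed to be plain DDC-style decimals.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CatalogRecord:
    class_number: str  # DDC-style notation, e.g. "025.431"
    title: str
    fmt: str           # "book", "video", "web", ...
    on_shelf: bool     # False when the item is in circulation


def browse(records, start, end):
    """Return every record with start <= class number < end, in shelf order."""
    lo, hi = Decimal(start), Decimal(end)
    hits = [r for r in records if lo <= Decimal(r.class_number) < hi]
    return sorted(hits, key=lambda r: (Decimal(r.class_number), r.title))


# Invented sample records, for illustration only.
CATALOG = [
    CatalogRecord("025.431", "Hypothetical classification workbook", "book", True),
    CatalogRecord("004.678", "Hypothetical Internet guide", "web", True),
    CatalogRecord("025.04", "Hypothetical retrieval lecture", "video", False),
    CatalogRecord("510", "Hypothetical mathematics text", "book", True),
]

for record in browse(CATALOG, "025", "026"):
    print(record.class_number, record.fmt, record.title)
```

The video that is checked out and the book on the shelf appear side by side in the result, which a patron walking the stacks would not see.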
The call number assigned to an information package also works as a
location device. “Ultimately, in a physical world, each object must rest in
its place-both absolutely and, in a classified collection” (Dillon & Jul,
1996 p. 204). The assignment of an
exact location allows for easier retrieval and re-shelving. However, the current
practice of assigning a single location for an information package is somewhat
limiting. If the subject of an information package falls into two different
classes, it must nevertheless be located in a single place by class.
In a library, materials can only be organized in a sequential or linear
arrangement.
Internet
resources do not require a physical location in the stacks and are freed from
the need for a linear arrangement. A
resource can be stored in a place remote from the place of access. Physical
collocation is not necessary because the resources can be grouped
"on-the-fly" as a result of searching or browsing, regardless of their
actual storage place (Dillon & Jul, 1996).
Without
the need for linear arrangements, subject classification can be taken a step
further. Instead of making the
difficult choice between two classes, multidimensional classification can be
contemplated. Metadata may be one
solution to providing multidimensional classification.
A defined set of metadata elements, such as the Dublin Core, with
repeatable elements would allow for simultaneous subject categories.
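A repeatable subject element can be sketched as a list of class numbers attached to one record, with the groupings built at query time. This is a minimal illustration under stated assumptions: the records, titles, addresses, and class numbers are invented, the collocate function is hypothetical, and only the element names (title, subject, identifier) come from the Dublin Core.

```python
from collections import defaultdict

# Hypothetical records. Keys follow Dublin Core element names, and each
# element holds a list because Dublin Core elements are repeatable.
RECORDS = [
    {
        "title": ["Hypothetical guide to medical statistics"],
        "subject": ["519.5", "610"],  # two classes at once
        "identifier": ["http://example.org/medstats"],
    },
    {
        "title": ["Hypothetical anatomy atlas"],
        "subject": ["611"],
        "identifier": ["http://example.org/atlas"],
    },
]


def collocate(records):
    """Group titles under every class number they carry, built on the fly."""
    index = defaultdict(list)
    for record in records:
        for class_number in record.get("subject", []):
            index[class_number].extend(record["title"])
    return dict(index)


index = collocate(RECORDS)
print(index["519.5"])
print(index["610"])
```

The first record is found under both class numbers, so no choice between the two classes has to be made.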
A
commonly cited problem regarding the classification of Internet resources is
that these resources are not stored in a physical setting. Rather, they are
stored in cyberspace and their locations may change. Uniform Resource Locators
(URLs) provide the address for the place of the resources on the Internet. The
longevity of an average URL can be measured in weeks (Younger, 1997). Again,
library classification offers a solution. Weinberg (1998) suggests that changes
in location are similar to the reclassification of a book in a library.
Repeatable metadata elements also would allow for multiple Internet addresses.
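The idea of multiple addresses per resource can be sketched as a repeated identifier element that is tried in order. The function first_live_address, the sample record, and the stand-in link check below are all hypothetical; a real system would need an actual reachability test.

```python
def first_live_address(record, is_reachable):
    """Return the first address in the record that the checker accepts, else None."""
    for url in record.get("identifier", []):
        if is_reachable(url):
            return url
    return None


# Hypothetical record with an old and a new address for the same resource.
RECORD = {
    "title": ["Hypothetical online handbook"],
    "identifier": [
        "http://old.example.org/handbook",
        "http://new.example.org/handbook",
    ],
}

# Stand-in for a real link check, so the sketch runs without network access.
LIVE = {"http://new.example.org/handbook"}

print(first_live_address(RECORD, lambda url: url in LIVE))
```

When the resource moves, a new address is appended to the record rather than replacing the description, much as a reclassified book keeps its catalog entry.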
The
task of evaluating Internet resources for classification and cataloguing may
seem overwhelming. Over 7 million sites already exist on the Internet and the
number grows daily (OCLC, n.d.). The number of sites seems relatively small when
compared to the more than 43 million unique bibliographic records of OCLC's WorldCat
database (OCLC, 2001). However, if the Internet is to thrive as a new means of
information communication, something like library classification will be needed
to organize and retrieve information (Lynch, 1997).
“A good library does not provide haphazard access to an
information jungle” (Toub, 1997 p.148). Classification schemes are used to
provide systematic order, bring related items together, provide access through
browsing and provide an exact location for each information package.
Library classification schemes also add value to Internet collections.
They supply a shared context and create cohesion across diverse
information technologies through established systems.
The notation devices, combined with online technology, allow browsing of
an entire collection regardless of format.
In addition, freed from the need to provide a single physical location,
multidimensional classification can be approached through the use of metadata.
Multiple subject classifications create more points of intellectual
access for the end-user.
Albrechtsen, Hanne & Jacob, Elin.
(1998). The Dynamics of Classification Systems as Boundary
Objects for
Cooperation in the Electronic Library. Library Trends, 47 (2), 293 –
312.
Chan, Lois M. (1995). Classification,
Present and Future. Cataloging and Classification
Quarterly, 21(2),
5-17.
Dillon, Martin & Jul, Erik. (1996).
Cataloging Internet Resources: The Convergence of Libraries
and Internet
Resources. Cataloging and Classification Quarterly, 22 (3/4), 197 –
238.
Dittmann, Helena & Hardy, Jane.
(2000). Learn Library of Congress Classification. Lanham,
Maryland: The
Scarecrow Press.
Elrod, J. McRee. (2000). Classification
of Internet Resources: An AUTOCAT Discussion.
Cataloging
and Classification Quarterly, 29 (4), 19 –
38.
Lynch, Clifford. (1997). Searching the
Internet. Scientific American, 276 (3), 49-56.
MacLennan, Alan. (2000). Classification
and the Internet. In Rita Marcella & Arthur Maltby
(Eds.) The
Future of Classification, (pp. 59-68).
Hampshire, England: Gower.
Marcella, Rita & Newton, Robert.
(1994). A New Manual of Classification. Hampshire, England:
Gower.
Mortimer, Mary. (2000). Learn Dewey
Decimal Classification (Edition 21). Lanham, Maryland:
The Scarecrow
Press.
OCLC. (n.d.) Web Statistics. Retrieved
March 5, 2001,
http://www.oclc.org/news/research/webstatistics.shtm.
OCLC. (2001). Product Statistics.
Retrieved March 5, 2001,
http://www.oclc.org/news/product/statistics.shtm.
Richmond, Phyllis. (1990). General
Theory of Classification. In Betty G. Bengston & Janet S.
Hill (Eds.), Classification
of Library Materials. (pp.16-26). New York: Neal-Schuman Publishers, Inc.
Taylor, Arlene G. (1999). The
Organization of Information. Englewood, Colorado: Libraries
Unlimited.
Toub, Stephen E. (1997). Adding Value
to Internet Collections. Library Hi Tech, 15 (3/4),
Retrieved February 3, 2001, from Emerald database,
http://dandini.emerald-library.com.
Vizine-Goetz, Diane. (n.d.) Using Library Classification Schemes for Internet
Resources. OCLC
Internet
Cataloging Project Colloquium, Position Paper.
Retrieved February 5, 2001, from the World Wide Web: http://www.oclc.org/home.
Vizine-Goetz, Diane. (1997). From book
classification to knowledge organization: Improving
Internet
resource description and discovery. American Society for Information Science,
24 (1) 24-27. Retrieved from OCLC First Search database, http://newfirstsearch.oclc.org.
Weinberg, Bella Hass. (1998). Improved
Internet access: Guidance from research on indexing and
classification.
American Society for Information Science, 25 (2) 26-29. Retrieved from
OCLC First Search database, http://newfirstsearch.oclc.org.
Younger, Jennifer A. (1997). Resources
Description in the Digital Age. Library Trends, 45 (3)
462-487.