Introducing SKOS
June 22, 2005
SKOS (Simple Knowledge Organization System), recently introduced by the W3C, is a model for expressing knowledge organization systems in a machine-understandable way, within the framework of the Semantic Web. The SKOS Core Vocabulary is an RDF (Resource Description Framework) application. Using RDF allows data to be linked and merged with other RDF data by Semantic Web applications. SKOS Core provides a model for expressing the basic structure and content of concept schemes, including thesauri, classification schemes, subject heading lists, taxonomies, terminologies, glossaries, and other types of controlled vocabulary. This article will provide some examples for using SKOS and discuss the general principles of building such knowledge bases.
Introduction
The Semantic Web is a vision for the future of the Web in which information is given explicit meaning, making it easier for machines to process and integrate information available on the Web. The Semantic Web relies on XML's ability to define schemes and RDF's flexible approach to representing data. The next element required for the Semantic Web is OWL, the Web Ontology Language, which can formally describe—using, most commonly, a logical formalism known as Description Logic—the semantics of classes and properties used in Web documents.
OWL adds a layer of expressive power to RDF and provides powerful tools for defining complex conceptual structures, which can be used to generate, among other things, rich metadata. However, the class-oriented, logically precise modeling required to construct useful web ontologies is demanding in terms of expertise, effort, and therefore cost. In many cases this type of modeling may be unnecessary or unsuited to requirements. So there is a need for a language to express vocabularies of concepts for use in semantically rich metadata, which is powerful enough to support semantically enhanced search, but simple enough to be undemanding in terms of the cost and expertise required to use it.
The SKOS Core Vocabulary is a set of RDF properties and RDFS classes that can be used to express the content and structure of a concept scheme as an RDF graph.
To see the kind of structure SKOS was designed to represent, consider the entry for the word "canals" in the Alexandria Digital Library Thesaurus:
canals » A feature type category for places such as the Erie Canal.
Used for: » The category canals is used instead of any of the following.
- canal bends
- canalized streams
- ditch mouths
- ditches
- drainage canals
- drainage ditches
- … more …
Broader Terms: hydrographic structures » Canals is a sub-type of "hydrographic structures."
Related Terms: » The following is a list of other categories related to canals (non-hierarchical relationships)
- channels
- locks
- transportation features
- tunnels
Scope Note: Manmade waterway used by watercraft or for drainage, irrigation, mining, or water power.
Now let's represent this complex structure using the SKOS Core Vocabulary:
The corresponding machine-readable representation in RDF-XML (source code):
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <skos:definition>A feature type category for places such as the Erie Canal</skos:definition>
    <skos:prefLabel>canals</skos:prefLabel>
    <skos:altLabel>canal bends</skos:altLabel>
    <skos:altLabel>canalized streams</skos:altLabel>
    <skos:altLabel>ditch mouths</skos:altLabel>
    <skos:altLabel>ditches</skos:altLabel>
    <skos:altLabel>drainage canals</skos:altLabel>
    <skos:altLabel>drainage ditches</skos:altLabel>
    <skos:broader rdf:resource="http://www.my.com/#hydrographic%20structures"/>
    <skos:related rdf:resource="http://www.my.com/#channels"/>
    <skos:related rdf:resource="http://www.my.com/#locks"/>
    <skos:related rdf:resource="http://www.my.com/#transportation%20features"/>
    <skos:related rdf:resource="http://www.my.com/#tunnels"/>
    <skos:scopeNote>Manmade waterway used by watercraft or for drainage, irrigation, mining, or water power</skos:scopeNote>
  </skos:Concept>
</rdf:RDF>
SKOS Concept Modeling and Labeling
The current edition of the SKOS Vocabulary replaces the earlier SKOS Core 1.0 Guide published by the SWAD-Europe Thesaurus Activity. The origins and background of the technologies preceding SKOS are well described in an XTech 2005 Proceedings SKOS report, so let's skip the history and go directly to the language definition.
Let's look at the RDF-XML more closely. The skos:Concept class asserts that a resource is a conceptual resource. This sounds vague, but the RDF Semantics standard supplies the terms: an assertion is any expression that is claimed to be true; a class is a general concept, category, or classification; a resource is an entity, or anything in the universe. In practice, skos:Concept is used to define an atomic conceptual resource. In the example above, the SKOS document defines a thesaurus entry for the entity "canals".
skos:Concept is not the only class available in SKOS. There are also other top-level classes:
- skos:Collection is a meaningful collection of concepts. Labelled collections can be used with collectable semantic relation properties (skos:narrower), where you would like a set of concepts to be displayed under a node label in the hierarchy;
- skos:CollectableProperty is a property which can be used with a skos:Collection;
- skos:ConceptScheme is a set of concepts, optionally including statements about semantic relationships between those concepts. Thesauri, classification schemes, subject-heading lists, taxonomies, terminologies, glossaries and other types of controlled vocabulary are all examples of concept schemes;
- skos:OrderedCollection is an ordered collection of concepts, where both the grouping and the ordering are meaningful.
SKOS Core uses labeling properties to assign tokens to a resource, where the token is intended to denote the resource in natural language or other representations intended for human consumption. The skos:prefLabel and skos:altLabel properties allow you to assign preferred and alternative lexical labels to a resource. Under normal circumstances prefLabel and altLabel values can be considered synonyms. However, when labeling resources of type skos:Concept, it is not necessary to restrict preferred and alternative lexical labels to precise synonyms. For example, the following is valid:
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#good">
    <skos:prefLabel>good</skos:prefLabel>
    <skos:altLabel>bad</skos:altLabel>
  </skos:Concept>
</rdf:RDF>
Abbreviations and acronyms may also be used to label concepts, and the choice of whether to use them as preferred or alternative terms is unconstrained. However, misspelled words are normally included among the hidden labels. A hidden lexical label is a lexical label for a resource, where you would like that character string to be accessible to applications performing text-based indexing and search operations, but you would not like that label to be visible otherwise. To assign a hidden lexical label to a resource, use the skos:hiddenLabel property. The most common use of hidden labels is to include misspelled variants of other lexical labels.
The value of the properties skos:prefLabel and skos:altLabel should be a plain literal. A plain literal is a character string with an optional language tag, and the language tag may be used to restrict the scope of a lexical label to a particular language. The values permissible as language tags are given by RFC 3066. Here's an example:
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#good">
    <skos:prefLabel xml:lang="en">good</skos:prefLabel>
    <skos:altLabel xml:lang="en">bad</skos:altLabel>
    <skos:prefLabel xml:lang="fr">bon</skos:prefLabel>
    <skos:altLabel xml:lang="fr">mauvais</skos:altLabel>
  </skos:Concept>
</rdf:RDF>
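A hidden label, as described above, could be attached in the same way. The following is only a minimal sketch: it reuses the article's placeholder URI for "canals", and the misspelling "cannals" is an illustrative assumption, not an entry from the Alexandria Digital Library Thesaurus.

```xml
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <skos:prefLabel xml:lang="en">canals</skos:prefLabel>
    <skos:altLabel xml:lang="en">drainage canals</skos:altLabel>
    <!-- Hypothetical misspelling: available to search, not meant for display -->
    <skos:hiddenLabel xml:lang="en">cannals</skos:hiddenLabel>
  </skos:Concept>
</rdf:RDF>
```

An application indexing this concept could then match a query for "cannals" while still showing only "canals" and "drainage canals" to the user.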
Symbolic labeling means labeling a concept with an image. To assign preferred and alternative symbolic labels to a concept, use the skos:prefSymbol and skos:altSymbol properties.
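As a rough illustration of symbolic labeling, the sketch below assumes each symbol is an image identified by a URI and referenced with rdf:resource. Both image URIs are hypothetical placeholders.

```xml
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <skos:prefLabel>canals</skos:prefLabel>
    <!-- Hypothetical image URIs, used only for illustration -->
    <skos:prefSymbol rdf:resource="http://www.my.com/symbols/canal.png"/>
    <skos:altSymbol rdf:resource="http://www.my.com/symbols/canal-small.png"/>
  </skos:Concept>
</rdf:RDF>
```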
Adding Description of a Concept
There are eight properties you can use to add human-readable documentation to the description of a concept: skos:publicNote, skos:privateNote, skos:definition, skos:scopeNote, skos:example, skos:historyNote, skos:editorialNote, and skos:changeNote. Descriptive notes for a concept can be public or private (skos:publicNote, skos:privateNote). Only skos:editorialNote and skos:changeNote are private notes; the others are public. Thus a skos:definition is also a skos:publicNote, a skos:editorialNote is also a skos:privateNote, and so on.
To clarify the difference between skos:definition and skos:scopeNote: a definition should be an attempt to completely explain the meaning of a concept, whereas a scope note may consist of partial information about what is or is not included within the meaning (or scope) of a concept. Similarly, a skos:historyNote is a piece of information intended for users of the scheme, documenting significant changes to the meaning, form, or state of a concept, whereas a skos:changeNote is intended for documenting fine-grained changes to a concept for the purposes of administration and management.
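The sketch below shows how several of these documentation properties might sit side by side on one concept. The definition and scope note are taken from the canals example; the texts of the example, history note, and change note are invented for illustration.

```xml
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <skos:prefLabel>canals</skos:prefLabel>
    <!-- Public notes, intended for users of the scheme -->
    <skos:definition>A feature type category for places such as the Erie Canal</skos:definition>
    <skos:scopeNote>Manmade waterway used by watercraft or for drainage, irrigation, mining, or water power</skos:scopeNote>
    <skos:example>Erie Canal</skos:example>
    <!-- Hypothetical history note: a significant change, visible to users -->
    <skos:historyNote>Scope widened to include drainage and irrigation waterways.</skos:historyNote>
    <!-- Hypothetical change note: fine-grained, for scheme maintainers -->
    <skos:changeNote>Corrected capitalization of the preferred label.</skos:changeNote>
  </skos:Concept>
</rdf:RDF>
```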
There are three recommended usage patterns for the SKOS Core documentation properties:
- Documentation as an RDF Literal
- Documentation as a Related Resource Description
- Documentation as a Document Reference
An RDF Literal is the simplest pattern for using the SKOS Core documentation properties, where the property value (i.e. the object of the triple) is an RDF literal. This is the way we used it in our example SKOS document:
<skos:scopeNote>Manmade waterway used by watercraft or for drainage, irrigation, mining, or water power</skos:scopeNote>
Actually this is a simplified example; presented in core RDF it will look a bit more complicated (using rdf:value tags). Related Resource Description allows you to structure documentation as a related resource description. Document Reference is a pattern that allows you to refer to documentation that is itself a document, via the URI of that document. For example,
<skos:scopeNote rdf:resource="http://www.my.com/note.txt"/>
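The Related Resource Description pattern is not illustrated above, so here is one possible sketch. It assumes the note is modeled as a blank node whose text is carried by rdf:value, with Dublin Core properties (dc:creator, dc:date) attached as extra metadata. The note text, creator, and date are hypothetical.

```xml
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <skos:prefLabel>canals</skos:prefLabel>
    <!-- The note is itself a resource with its own properties -->
    <skos:changeNote rdf:parseType="Resource">
      <!-- Hypothetical note text, creator, and date -->
      <rdf:value>Added "drainage ditches" as an alternative label.</rdf:value>
      <dc:creator>Thesaurus editor</dc:creator>
      <dc:date>2005-06-22</dc:date>
    </skos:changeNote>
  </skos:Concept>
</rdf:RDF>
```

Compared with a plain literal, this form costs a few extra elements but lets you record who wrote a note and when.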
Adding Relationships
The SKOS Core Vocabulary includes the following properties for asserting semantic relationships between concepts: skos:semanticRelation, skos:broader, skos:narrower and skos:related. In the property hierarchy, skos:semanticRelation is the top-level semantic relationship and the others are its sub-properties. To assert that one concept is broader in meaning (i.e. more general) than another, where the scope (meaning) of one falls completely within the scope of the other, use the skos:broader property. To assert the inverse, that one concept is narrower in meaning (i.e. more specific) than another, use the skos:narrower property. This is how we used skos:broader in our example document:
<skos:Concept rdf:about="http://www.my.com/#canals">
  <skos:broader rdf:resource="http://www.my.com/#hydrographic%20structures"/>
</skos:Concept>
The properties skos:broader and skos:narrower are each other's inverse. Both the properties skos:broader and skos:narrower are transitive properties. To assert an associative relationship between two concepts, use the skos:related property:
<skos:Concept rdf:about="http://www.my.com/#canals">
  <skos:related rdf:resource="http://www.my.com/#channels"/>
  <skos:related rdf:resource="http://www.my.com/#locks"/>
</skos:Concept>
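Because skos:broader and skos:narrower are inverses, the hierarchical statement from the canals example could equally be written from the other side. A minimal sketch, reusing the article's placeholder URIs:

```xml
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <!-- Equivalent to: canals skos:broader hydrographic structures -->
  <skos:Concept rdf:about="http://www.my.com/#hydrographic%20structures">
    <skos:prefLabel>hydrographic structures</skos:prefLabel>
    <skos:narrower rdf:resource="http://www.my.com/#canals"/>
  </skos:Concept>
</rdf:RDF>
```

Whether an application infers the inverse statement automatically depends on the reasoning it applies, so some publishers may prefer to state both directions explicitly.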
Collecting Concepts Together
You can create and define a meaningful group of concepts. However, the SKOS support for meaningful collections of concepts is still unstable and may change in the future. SKOS Core has its own vocabulary for handling collections, even though RDF already has some generic vocabulary (rdf:Bag and rdf:Seq) for unordered and ordered groups of resources. While the W3C Working Draft was being prepared, there was extended discussion on the mailing lists as to whether the RDF vocabulary should be used; the choice has been made provisionally not to use rdf:Bag and rdf:Seq for this purpose. (See the explanation if you're curious.)
To define a meaningful collection of concepts, use the skos:Collection class and the skos:member property. To assign a lexical label to a collection, use the rdfs:label property. The most common use of a labelled collection is to enhance a hierarchical display. You can describe narrower and broader relationships between a concept and a collection. The class skos:CollectableProperty supports a generic mechanism by which collections can be involved in semantic relationships (and other sorts of statements). To define an ordered collection of concepts, use the skos:OrderedCollection class with the skos:memberList property. An ordered collection may also have a label (use rdfs:label). Ordered collections can be used with semantic relation properties in the same way as unordered collections (skos:OrderedCollection is a subclass of skos:Collection).
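The two kinds of collection described above might be written as follows. This is a sketch under assumptions: the collection URIs, the labels, and the choice of members are hypothetical, and the ordered list is assumed to be expressed with rdf:parseType="Collection".

```xml
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <!-- Unordered, labelled collection (hypothetical grouping) -->
  <skos:Collection rdf:about="http://www.my.com/#waterways">
    <rdfs:label>waterways</rdfs:label>
    <skos:member rdf:resource="http://www.my.com/#canals"/>
    <skos:member rdf:resource="http://www.my.com/#channels"/>
  </skos:Collection>
  <!-- Ordered collection: the order of the member list is meaningful -->
  <skos:OrderedCollection rdf:about="http://www.my.com/#displayOrder">
    <rdfs:label>canal features in display order</rdfs:label>
    <skos:memberList rdf:parseType="Collection">
      <skos:Concept rdf:about="http://www.my.com/#canals"/>
      <skos:Concept rdf:about="http://www.my.com/#locks"/>
      <skos:Concept rdf:about="http://www.my.com/#tunnels"/>
    </skos:memberList>
  </skos:OrderedCollection>
</rdf:RDF>
```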
Usually concepts are defined in relation to other concepts, as part of an internally coherent concept scheme. As mentioned in the introduction, a concept scheme is defined here as a set of concepts, optionally including statements about semantic relationships between those concepts. The skos:ConceptScheme class allows you to assert that a resource is a concept scheme.
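A concept scheme and one of its concepts could be declared roughly as in the sketch below. It relies on two SKOS Core properties not discussed above, skos:inScheme and skos:hasTopConcept, plus dc:title for the scheme's name. The scheme URI, its title, and the choice of "hydrographic structures" as a top concept are assumptions made for illustration.

```xml
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <!-- Hypothetical scheme URI and title -->
  <skos:ConceptScheme rdf:about="http://www.my.com/thesaurus">
    <dc:title>Example feature type thesaurus</dc:title>
    <skos:hasTopConcept rdf:resource="http://www.my.com/#hydrographic%20structures"/>
  </skos:ConceptScheme>
  <!-- A concept stating which scheme it belongs to -->
  <skos:Concept rdf:about="http://www.my.com/#canals">
    <skos:prefLabel>canals</skos:prefLabel>
    <skos:inScheme rdf:resource="http://www.my.com/thesaurus"/>
    <skos:broader rdf:resource="http://www.my.com/#hydrographic%20structures"/>
  </skos:Concept>
</rdf:RDF>
```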
Some Open Issues
There are still some open issues, where no firm consensus has been reached and where readers can potentially help to improve future SKOS W3C Recommendations.
- Relationship to RDFS and OWL ontologies. There is a subtle difference between SKOS Core and other RDF applications like FOAF. SKOS Core allows you to model a set of concepts as an RDF graph. Other RDF applications, such as FOAF, allow you to model things like people, organizations, places etc. as an RDF graph. Technically, SKOS Core introduces a layer of indirection into the modeling.
- Mapping Concept. You can assert a mapping relationship between any two conceptual resources. The property owl:sameAs implies that two resources are identical in every way, and should not be used to express the fact that two conceptual resources share the same meaning.
- Concept Scheme versioning. You can use different URIs for different namespaces handling different vocabularies. However, when using the Dublin Core specification, special tags and attributes can be used for versioning.