3 Metadata Standardization
OK, so what is a metadata “standard”?
This is a bit confusing. Let’s start with what an element is. Remember that card catalog card?
Each item on the card that helps describe the book is an element. The elements on the card above include, for instance, author (Macphail, James Robert Nicolson, 1858-1933), date (1944), and document dimensions (23 cm). In general, such things are called access points: the points of reference by which people find, retrieve, and use documents. Other access points might include the publisher, the series, or a general description. When access points become formalized, that is, when you use the same access points to describe every document in a collection, we call them elements.
An element set, then, is a list of elements that we use to describe all the documents in a collection. If I wanted to organize my personal music collection, I might create an element set that looks like this:
Artist
Album Title
Label
Year of Release
Length
Format
Track Listings
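To make this concrete, here is a minimal sketch of that element set written as a Python dataclass. The class name AlbumRecord, the field types, and the sample values are assumptions made for illustration; the point is only that every record in the collection gets described with the same list of elements.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class AlbumRecord:
    """One record in a personal music collection (hypothetical element set)."""
    artist: str
    album_title: str
    label: str
    year_of_release: int
    length: str          # running time, kept as free text here, e.g. "42:10"
    format: str          # e.g. "LP", "CD", "digital"
    track_listings: List[str] = field(default_factory=list)


def element_names(record_type):
    """Return the names of the elements in the element set, in declared order."""
    return [f.name for f in fields(record_type)]


# Sample values are invented placeholders, not a real album.
record = AlbumRecord(
    artist="Example Artist",
    album_title="Example Album",
    label="Example Label",
    year_of_release=1989,
    length="42:10",
    format="LP",
    track_listings=["Track One", "Track Two"],
)

print(element_names(AlbumRecord))
```

Because the element set is declared once, a record that leaves out a required element (say, the label) fails when you try to create it, which is exactly the kind of consistency an element set is meant to enforce.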
Roughly, when an element set is commonly adopted by a community of practice, we call it a standard. And when an agreed-upon standard is encoded in a machine-readable format, we call it a schema.
This LibGuide shows some common general-purpose standards. The most common of these is Dublin Core, whose basic element set consists of fifteen elements: Title, Creator, Subject, Description, Publisher, Contributor, Coverage, Date, Type, Format, Rights, Source, Language, Relation, and Identifier. In addition to Dublin Core as a general-purpose standard for describing digital documents, there are also discipline-specific standards, such as Darwin Core for the life sciences, or the Text Encoding Initiative (TEI) for describing literary documents in the humanities.
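One practical use of a standard element set is checking whether a record sticks to it. The sketch below lists the fifteen basic Dublin Core elements named above and reports any field in a record that falls outside the set. The function name unknown_elements, the lowercase spelling of the element names, and the sample record are assumptions made for illustration.

```python
# The fifteen basic Dublin Core elements, written in lowercase for this sketch.
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "coverage", "date", "type", "format",
    "rights", "source", "language", "relation", "identifier",
)


def unknown_elements(record):
    """Return the keys of `record` that are not in the element set, sorted."""
    return sorted(set(record) - set(DUBLIN_CORE_ELEMENTS))


# A hypothetical record; "dimensions" is not a Dublin Core element.
record = {
    "title": "Example Title",
    "creator": "Example Creator",
    "date": "1944",
    "dimensions": "23 cm",
}

print(unknown_elements(record))  # prints ['dimensions']
```

In Dublin Core itself, a physical dimension like “23 cm” would usually be recorded under Format, so the fix here is to move the value rather than to invent a new element.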
But there are different types of standards, aren’t there?
Yes, there are principally four types: structure, content, value, and exchange.
Structure: Above, I’ve talked about structural standards, which generally means the element set or schema.
Content: There are also standards for metadata content. That is, once you have a set of elements, you still don’t necessarily know how to fill them out. So content standards like Cataloging Cultural Objects (CCO) or Describing Archives: A Content Standard (DACS) can help you figure out exactly how to input values for each element.
Value: Value standards refer to the authority files, controlled vocabularies, and controlled formats that can be used to input values into element fields. For instance, if my element is “Author,” should I use the Library of Congress Name Authority File to input “Whitman, Walt, 1819-1892” rather than simply typing “Walt Whitman”? Should I use Library of Congress Subject Headings, or the Dewey Decimal Classification, or the Sears List to fill in a “Subject” field? These are examples of using, respectively, authority files and controlled vocabularies as value standards. But you also need to decide on controlled formats. In the “Date” field, for instance, should you write the date as “January 1, 1989”, as “1989-01-01” (the ISO 8601 form), or as “1 JAN 89”?
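Here is a minimal sketch of what applying value standards can look like in practice. It assumes a tiny, hand-made lookup table standing in for an authority file (it is not the actual Library of Congress data), and it assumes that ISO 8601 (YYYY-MM-DD) has been chosen as the controlled date format. The names NAME_AUTHORITY, authorized_name, and normalize_date are hypothetical.

```python
from datetime import datetime

# Toy stand-in for an authority file: variant spellings map to one authorized form.
NAME_AUTHORITY = {
    "walt whitman": "Whitman, Walt, 1819-1892",
    "whitman, walt": "Whitman, Walt, 1819-1892",
}

# Input date formats we are willing to accept (month names assume an English locale).
DATE_FORMATS = ["%B %d, %Y", "%Y-%m-%d", "%d %b %y"]


def authorized_name(name):
    """Return the authorized form of a name, or the input unchanged if unknown."""
    return NAME_AUTHORITY.get(name.strip().lower(), name)


def normalize_date(text):
    """Convert a date in any accepted format to ISO 8601 (YYYY-MM-DD)."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # try the next accepted format
    raise ValueError(f"Unrecognized date format: {text!r}")


print(authorized_name("Walt Whitman"))
for raw in ["January 1, 1989", "1989-01-01", "1 JAN 89"]:
    print(raw, "->", normalize_date(raw))
```

All three date strings from the paragraph above come out as 1989-01-01. Note that a two-digit year like “89” forces the program to guess the century, which is one good reason to settle on a controlled format in the first place.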
Exchange: Finally, there are exchange standards. Essentially, what file format do you use so that your metadata records are interoperable with other systems? XML? MARC? HTML? A spreadsheet? How is your metadata housed, preserved, and transferred? Typically with cataloging records, we use MARC, and with metadata records, we use XML. Don’t worry, there are programs that do the encoding for you!
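As a rough illustration of what such a program does, the sketch below encodes a small record as XML using Python’s standard library. The element names are drawn from Dublin Core and the namespace is the one Dublin Core uses for its basic elements, but the wrapper element named record, the function name to_dublin_core_xml, and the sample values are assumptions made for illustration, not part of any particular exchange standard.

```python
import xml.etree.ElementTree as ET

# Namespace for the basic Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def to_dublin_core_xml(record):
    """Encode a dict of Dublin Core element names and values as an XML string."""
    root = ET.Element("record")  # hypothetical wrapper element
    for element, value in record.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Sample values are placeholders chosen for this sketch.
record = {
    "title": "Example Title",
    "creator": "Whitman, Walt, 1819-1892",
    "date": "1989-01-01",
}

xml_text = to_dublin_core_xml(record)
print(xml_text)
```

Any other system that understands Dublin Core in XML can now read the record back, which is the whole point of an exchange standard.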