The magazine of the Melbourne PC User Group
A Little Bit About Metadata
Jennifer Gawne
The Net is horribly big and not at all organised, and don't we "resource discoverers" know it! There are few navigation aids, and search engines access only about 20% of the resources in the surface Web. The "hidden" Web, made up of resources in formats such as PDF, databases, etc., is hardly touched by the engines at all. The large search engines tend to access only North American commercial sites that are in English (Lawrence and Giles). Scary, eh? Even some individual Web sites are so large that it is difficult, if not impossible, to find one's way through them to the required resource!
A solution, in the form of "metadata", is available to assist Web resource discoverers and creators in overcoming these problems. The governments of Australia, New Zealand, the United States and the United Kingdom are particularly active in encouraging the addition of metadata tags to resources.
So What Is It?
Metadata literally means "data about data", but that isn't terribly useful. Even Tim Berners-Lee's definition "machine-understandable information about Web resources or other things" is pretty broad, but he did at least narrow it to "machine-understandable".
Metadata is information that is added to a resource to:
- enhance retrieval by providing structured elements, for example, to describe the subject
- assist authentication by telling resource discoverers who created the resource, when it was created, and so on
- assist the resource creator and/or owner in managing their information by providing elements such as the dates when the resource was modified or after which the information is no longer valid
- express rights management, that is, specify the ownership of the resource and any conditions that apply to its use. This aspect is particularly important in e-commerce
- provide content rating, that is, specify the audience for whom the resource is intended
- enhance collaboration by bringing together access to resources that are actually distributed through different areas or different organisations
As I have a background in librarianship, the metadata I usually deal with is descriptive metadata, which is very like the cataloguing that libraries have traditionally used to describe their holdings and to provide access points such as consistent headings for names and subjects. As the Internet is not so much a library as a publication medium, it is important that creators of the resources take responsibility for describing them before releasing them into the great big www. There are a lot of feral resources out there, and search engines are struggling to return a reasonable number of search results, let alone provide any context for those results.
If you would like to see an example of descriptive metadata, open a page in an Internet browser and go to View, Page Source (or View, Source) to bring up a new window filled with HTML in a structure that looks more than a bit like a library's cataloguing record. For example, go to http://www.pictureaustralia.org/ and look at the metadata elements that have been added to define the creator, contributors, subject, etc. of the site.
Actually, just go to PictureAustralia and enjoy it - it is a beautiful site (see Figures 1 and 2). It exemplifies the benefits of adding metadata to resources, as it allows the picture collections of the National Libraries of Australia and New Zealand, the National Archives, the War Memorial, the University of Queensland, State Libraries and other participants to be accessed simultaneously. While the National Library of Australia maintains the XML datastore that makes this access possible, the organisations retain ownership of, and rights to, the resources.
Figure 1. The start of a long list of "Picture Trails" offered at www.pictureaustralia.org

Figure 2. Walking down the Federation Trail
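The page-source inspection described above can also be done programmatically. The following is a minimal Python sketch, using only the standard library, that collects the name/content pairs from a page's meta elements. The sample HTML and its values are invented for illustration and are not copied from PictureAustralia; the DC.* element names assume the common convention of embedding Dublin Core-style elements in HTML meta tags.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect (name, content) pairs from <meta> elements."""

    def __init__(self):
        super().__init__()
        self.records = []

    def handle_starttag(self, tag, attrs):
        # The parser lower-cases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        content = attributes.get("content")
        # Skip meta elements that lack either a name or a content value.
        if name and content:
            self.records.append((name, content))


# Hypothetical page source, for illustration only.
SAMPLE_HTML = """
<html>
<head>
<title>Example picture collection</title>
<meta name="DC.Creator" content="Example Library">
<meta name="DC.Subject" content="Photographs; Pictures">
<meta name="DC.Date.modified" content="2001-11-01">
</head>
<body></body>
</html>
"""


def extract_metadata(html_text):
    """Return the metadata elements found in html_text, in document order."""
    collector = MetaTagCollector()
    collector.feed(html_text)
    return collector.records


if __name__ == "__main__":
    for name, content in extract_metadata(SAMPLE_HTML):
        print(f"{name}: {content}")
```

Run against a saved copy of a real page, the same function would list whatever metadata elements that page's creator chose to add, which is a quick way to see how thoroughly a site has been described.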
If you did look at the PictureAustralia metadata you may have noticed that the "scheme" is AGLS, which stands for Australian Government Locator Service. This is a structure for metadata maintained by the National Archives of Australia and is fundamental to the Australian Government's online strategy to provide all possible government information and services via the Internet by the end of this year.
AGLS is only one of the metadata standards in use. Dublin Core (DC), the scheme maintained by the Dublin Core Metadata Initiative (DCMI), is intended to be adapted and customised for different communities - AGLS is an extension of DC developed for Australian government usage. The Law & Justice Foundation of New South Wales has further adapted DC to develop Justice Sector Metadata Standards, which are also compliant with AGLS. Many state governments in the United States apply another schema, the Global Information Locator Service (GILS), which was developed under the aegis of the US government in 1994. Some groups require great precision in their metadata description. For example, the Australia-New Zealand Land Information Council (ANZLIC) has developed a set of metadata elements for describing geospatial data sets, and this scheme has been further refined for the Marine and Coastal Data Directory of Australia. A list of Web references follows for those who are interested in delving further.
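To make the idea of "extending" Dublin Core a little more concrete, here is a small Python sketch that turns one descriptive record into HTML meta tags. The record and its values are made up, and the prefixed names (DC.Title, AGLS.Function and so on) are an assumption about how such elements are commonly written in page source; the published DC and AGLS element sets remain the authority on which elements exist and how they should be encoded.

```python
from html import escape

# Hypothetical record. The "DC." entries stand for core Dublin Core
# elements; the "AGLS." entries stand for elements added by the
# Australian government extension. Names and values are illustrative.
RECORD = {
    "DC.Title": "Applying for a fishing licence",
    "DC.Creator": "Example Department",
    "DC.Date.modified": "2001-11-01",
    "AGLS.Function": "Licensing",
    "AGLS.Availability": "Contact the Example Department",
}


def to_meta_tags(record):
    """Render each element of the record as one HTML <meta> tag."""
    lines = []
    for name, value in record.items():
        # Escape quotes and angle brackets so the values cannot break the tag.
        lines.append(
            '<meta name="{}" content="{}">'.format(
                escape(name, quote=True), escape(value, quote=True)
            )
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_meta_tags(RECORD))
```

The point of the sketch is that an extension adds elements alongside the core set rather than replacing it, so software that only understands the DC elements can still read those and ignore the rest.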
The consistent application of metadata standards makes it possible to bring access to resources grouped by origin or source together into portals such as the imaginatively named australia.gov.au (http://www.australia.gov.au/) and fed.gov.au (http://www.fed.gov.au/). Gateways such as HealthInsite (http://www.healthinsite.gov.au/) and Business Entry Point (http://www.business.gov.au/) provide access to resources grouped by subject area. The provision of such Web sites is a boon to the information discoverer who is able to search across several sites and to know the origins and subject context of the search results.
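The difference between a portal and a gateway can be sketched in a few lines of Python: the same pile of consistently described records is simply grouped by a different metadata element. The records, agency names and the group_by helper below are all hypothetical and are not drawn from any of the sites mentioned above.

```python
from collections import defaultdict

# Hypothetical metadata records harvested from several sites.
RECORDS = [
    {"title": "Immunisation schedule",
     "publisher": "Example Health Department", "subject": "Health"},
    {"title": "Starting a small business",
     "publisher": "Example Business Agency", "subject": "Business"},
    {"title": "Workplace health and safety",
     "publisher": "Example Business Agency", "subject": "Health"},
]


def group_by(records, element):
    """Group resource titles by the value of one metadata element."""
    groups = defaultdict(list)
    for record in records:
        # Records missing the element are kept together rather than lost.
        key = record.get(element, "(unspecified)")
        groups[key].append(record["title"])
    return dict(groups)


if __name__ == "__main__":
    # Portal-style view: resources grouped by origin.
    print(group_by(RECORDS, "publisher"))
    # Gateway-style view: resources grouped by subject area.
    print(group_by(RECORDS, "subject"))
```

Grouping like this only works when every contributor fills in the same elements in the same way, which is why the consistent application of a standard matters so much.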
So Does It Work?
Despite the enthusiasm of many, especially governments, descriptive metadata is not being widely applied. The reason many site creators give for this reluctance is that search engines do not recognise it. The search engines do not recognise metadata partly because they fear that it can contain spam, and mostly because few sites, especially commercial ones, actually apply it. So around and around we go.
Where metadata is applied properly, it certainly can provide benefits to resource creators and owners by providing a structure and information that aids in managing their resources. It assists resource discovery by aiding the development of portals and gateways, and providing elements that aid in access and authentication, such as subject headings and dates.
Metadata is probably the best chance we have of deriving some consistency in access to and management of information on the Internet. Certainly, site creators need to dedicate appreciable resources to metadata description, but there are side benefits in information management - hopefully, there will be fewer resources that are put on the Net and then forgotten!
About The Author
Jennifer Gawne (jennig@caval.edu.au) is a librarian who works at CAVAL Collaborative Solutions, where she coordinates a variety of information and library projects including metadata and foreign language cataloguing. She also trains in metadata and Web resources.
Reprinted from the December 2001 issue of PC Update, the magazine of Melbourne PC User Group, Australia