Monday, October 14

12:00 District Architecture for Networked Editions: Technical Model and Metadata

Valdo Pasqui and Antonella Farsetti

Abstract: District Architecture for Networked Editions (DAFNE) is a research project funded by the Italian Ministry of Education, University and Research that aims to develop a prototype of the national infrastructure for electronic publishing in Italy. The project's initial target is scientific and scholarly production in the human and social sciences. The organizational, legal, technical and business aspects of the entire digital publishing pipeline have been analysed. The DAFNE system will support the request-offer chain by promoting integration between the digital library and electronic publishing districts. In this paper we present the main results of the project's first year of activity. First, a quick overview of the actors, objects and services is presented. Then the functional model is examined, bringing out the distinction between information content and digital objects. Afterwards the technical model is described. The system has a distributed architecture, which includes three categories of subsystems: Data Providers (i.e. the publishers), Service Providers and External Services. Data and Service Providers interact according to the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH). Finally, DAFNE metadata is discussed. Metadata permeates the whole publishing chain, and the DAFNE metadata set is based on already defined domain-specific metadata vocabularies. The Dublin Core Metadata Initiative (DCMI) and Publishing Requirements for Industry Standard Metadata (PRISM) are the main reference standards. The Open Digital Rights Language (ODRL) and the Open Archival Information System (OAIS) are the two other relevant models that complete the DAFNE metadata specification.
The authors are thankful for support from Parco scientifico e tecnologico Galileo
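The Data Provider / Service Provider interaction the abstract describes runs over OAI-PMH. A minimal sketch of the harvesting side is given below; the `verb` and `metadataPrefix` parameters are standard OAI-PMH, but this is an illustration, not DAFNE's actual code, and the endpoint URL is invented.

```python
# Minimal sketch of OAI-PMH harvesting: build a ListRecords request URL,
# then pull (identifier, dc:title) pairs out of the XML response.
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"

def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """URL a Service Provider would fetch to harvest one batch of records."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)

def parse_list_records(xml_text):
    """Extract (OAI identifier, dc:title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(OAI + "record"):
        ident = rec.findtext(OAI + "header/" + OAI + "identifier")
        title = rec.find(".//" + DC + "title")
        records.append((ident, title.text if title is not None else None))
    return records

# A real harvester loops: fetch, parse, then follow the resumptionToken
# until the Data Provider reports no more records.
url = list_records_url("http://publisher.example.org/oai")  # hypothetical endpoint
```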

12:30 Metadata in the Context of The European Library Project

Theo van Veen and Robina Clayphan

Abstract: The European Library Project (TEL), sponsored by the European Commission, brings together 10 major European national libraries and library organisations to investigate the technical and policy issues involved in sharing digital resources. The objective of TEL is to set up a co-operative framework which will lead to a system for access to the major national and deposit collections in European national libraries. The scope of the project encompasses publisher relations and business models but this paper focuses on aspects of the more technical work in metadata development and the interoperability testbeds. The use of distributed Z39.50 searching in conjunction with HTTP/XML search functionality based on OAI protocol harvesting is outlined. The metadata development activity, which will result in a TEL application profile based on the Dublin Core Library Application Profile together with collection level description, is discussed. The concept of a metadata registry to allow the controlled evolution of the application profile to be inclusive of other cultural heritage institutions is also introduced.

14:00 ARCHON - A Digital Library that Federates Physics Collections

K. Maly, M. Zubair, M. Nelson, X. Liu, H. Anan, J. Gao, J. Tang and Y. Zhao

Abstract: Archon is a federation of physics collections with varying degrees of metadata richness. Archon uses the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) to harvest metadata from distributed archives. The architecture of Archon is largely based on another OAI-PMH digital library: Arc, a cross archive search service. However, Archon provides some new services that are specifically tailored for the physics community. Of these services we will discuss approaches we used to search and browse equations and formulae and a citation linking service for arXiv and American Physical Society (APS) archives.

14:30 Linking Collection Management Policy to Metadata for Preservation

Maria Luisa Calanag, Koichi Tabata and Shigeo Sugimoto

Abstract: In an environment of rapid technological change, collection managers face the challenge of ensuring that valuable resources remain accessible when the technological context in which those resources are embedded changes. In this context of requiring “accessibility over time”, digital preservation initiatives also demand interoperability, or what Hedstrom calls temporal interoperability. But first, libraries, especially in the academic world, need some general guidelines to assist in selecting the digital resources they most need to collect and preserve. This paper attempts to provide some structure for the concepts and ideas behind a general collection management decision guide, in the form of a requirements analysis framework that may assist in determining the metadata granularity required for digital resource management within an archive. The objective is for metadata and mechanisms to be shared among digital archives, while policies can be tailored to the requirements of each organization.

15:00 Semantic Web Construction: An Inquiry of Authors' Views on Collaborative Metadata Generation

Jane Greenberg, W. Davenport Robertson

Abstract: A robust increase in both the amount and quality of metadata is integral to realizing the Semantic Web. The research reported in this article addresses this topic by investigating the most effective means of harnessing resource authors' and metadata experts' knowledge and skills for generating metadata. Resource authors, working as scientists at the National Institute of Environmental Health Sciences (NIEHS), were surveyed about collaborating with metadata experts (catalogers) during the metadata creation process. The majority of authors surveyed recognized that cataloger expertise is important for organizing and indexing web resources, and supported the development of a collaborative metadata production operation. Authors discovered that, as creators of web resource intellectual content, they too have knowledge valuable for cataloging. This paper presents the study’s framework and results, and discusses the value of collaborative metadata generation for realizing the Semantic Web.

15:30 Preliminary Results from the FILTER Image Categorisation and Description Exercise

Jill Evans and Paul Shabajee

Abstract: Although there are now vast numbers of digital images available via the Web, it is still the case that not enough is known or understood about how humans perceive and recognise image content, and use human language terms as a basis for retrieving and selecting images. Consequently, we cannot be sure that the image resources we are creating within the constraints of orthodox description models are actually being found, accessed and used by our target audiences. There is an increasing belief that the difficulties of image management and description should be led and defined by the needs of users and by their information-seeking behaviours. The Focusing Images for Learning and Teaching – an Enriched Resource (FILTER) project is investigating, through an online image description exercise, the ways that users describe different types of images. Forty-one images of varying original types and subject content were placed online. Individuals were invited to participate in the exercise by describing both the subject content and ‘type’ of each image. Through analysis of the exercise results, FILTER hopes to obtain an understanding of the ways in which people describe images and the factors that influence their approaches to image description; FILTER will also examine the resulting implications for visual resources metadata. Preliminary analysis of data indicates that there is little consensus on the use of terms for image description or on categorisation of images into ‘types’. Initial findings also indicate that certain image types may be easier to categorise than others. Analysis will continue over the next few months with the aim of identifying common themes or patterns amongst participants.

Tuesday, October 15

12:00 Building Educational Metadata Application Profiles

Norm Friesen, Jon Mason and Nigel Ward

Abstract: Metadata schemas relevant to online education and training have recently achieved the milestone of formal standardization. Efforts are currently underway to bring these abstract models and theoretical constructs to concrete realization in the context of communities of practice. One of the primary challenges of these efforts has been to balance or reconcile local requirements with those presented by domain-specific and cross-domain interoperability. This paper describes these and other issues associated with developing and implementing metadata application profiles. In particular, it provides an overview of metadata implementations for managing and distributing Learning Objects and the practical issues that have emerged so far in this domain. The discussion is informed by examples from two national education and training communities – Australia and Canada.

12:30 Exposing Cross-Domain Resources for Researchers and Learners

Ann Apps, Ross MacIntyre and Leigh Morris

Abstract: MIMAS is a national UK data centre which provides networked access to resources to support learning and research across a wide range of disciplines. There was no consistent way of discovering information within this cross-domain, heterogeneous collection of resources, some of which are access restricted. Further these resources must provide the interoperable interfaces required within the UK higher and further education 'information environment'. To address both of these problems, consistent, high quality metadata records for the MIMAS services and collections have been created, based on Dublin Core, XML and standard classification schemes. The XML metadata repository, or 'metadatabase', provides World Wide Web, Z39.50 and Open Archives Initiative interfaces. In addition, a collection level database has been created with records based on the RSLP Collection Level Description schema. The MIMAS Metadatabase, which is freely available, provides a single point of access into the disparate, cross-domain MIMAS datasets and services.
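A record of the kind held in such a metadatabase can be sketched as a small DCMES 1.1 XML fragment. The element names below are real Dublin Core; the values, the `record` wrapper element, and the helper function are illustrative assumptions, not MIMAS's actual schema.

```python
# Sketch: build a simple Dublin Core XML record from element/value pairs.
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC)  # emit dc: prefixes on serialisation

def dc_record(fields):
    """Build a <record> element from {dc element name: value} pairs."""
    rec = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields.items():
        el = ET.SubElement(rec, f"{{{DC}}}{name}")
        el.text = value
    return rec

rec = dc_record({
    "title": "Example dataset",            # illustrative values only
    "subject": "Social sciences",
    "identifier": "http://example.org/ds/1",
})
xml_text = ET.tostring(rec, encoding="unicode")
```

A repository exposing such records can then serve them unchanged over its Web, Z39.50 and OAI interfaces.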

14:00 Integrating Schema-specific Native XML Repositories into a RDF-based E-learning P2P Network

Changtao Qu, Wolfgang Nejdl and Holger Schinzel

Abstract: As its name implies, a native XML repository supports storage and management of XML in its original hierarchical form rather than in some other representation. In this paper we present our approach for integrating native XML repositories into Edutella, an RDF-based E-Learning P2P network, by mapping native XML database schemas onto the Edutella Common Data Model (ECDM) and then translating ECDM's internal query language, Datalog, into XPath, the local query language of native XML repositories. Due to the considerable incompatibility between the ECDM and the XML data model, a generic integration approach for schema-agnostic native XML repositories was found to be unrealistic. Our investigations therefore focus on three schema-specific native XML repositories, based respectively on the DCMES, LOM/IMS, and SCORM XML binding data schemas. Since these three metadata sets are the most widely applied learning resource metadata specifications in E-Learning, and native XML repositories containing their XML binding metadata have been developed and applied for several years, our integration approach satisfactorily addresses the current usage of Edutella in E-Learning, even though a generic integration approach for schema-agnostic native XML repositories has not been implemented.
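The flavour of the query translation can be pictured with a toy example. This is our own sketch, not Edutella's translator: a Datalog-style goal over Dublin Core metadata rewritten (by hand) as one XPath expression and evaluated against a DCMES-like XML binding; the document and element layout are assumptions.

```python
# Sketch: a Datalog-ish goal, answer(Id) :- subject(R, S), identifier(R, Id),
# expressed as an XPath expression over a DCMES XML binding.
import xml.etree.ElementTree as ET

NS = {"dc": "http://purl.org/dc/elements/1.1/"}

def subject_query_to_xpath(subject):
    """Find identifiers of records whose dc:subject equals the given term."""
    return f".//record[dc:subject='{subject}']/dc:identifier"

doc = ET.fromstring("""\
<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record><dc:identifier>r1</dc:identifier><dc:subject>Physics</dc:subject></record>
  <record><dc:identifier>r2</dc:identifier><dc:subject>History</dc:subject></record>
</records>""")

ids = [e.text for e in doc.findall(subject_query_to_xpath("Physics"), NS)]
```

The real difficulty the abstract points to is that a general Datalog program (joins, recursion) has no such one-line XPath counterpart, which is why the authors restrict themselves to known schemas.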

14:30 Building Digital Books with Dublin Core and IMS Content Packaging

Michael Magee, Netera Alliance; D'Arcy Norman, University of Calgary; Julian Wood, University of Calgary; Rob Purdy, University of Calgary; Graeme Irwin, University of Calgary

Abstract: The "Our Roots, Nos Racines" project is designed to provide online access to Canadian cultural heritage materials. The initial phase of the project is digitizing books and placing them online. The University of Calgary was retained to work with the CAREO and BELLE projects to modify their existing educational object repository to meet these needs. The resulting solution creates XML-based IMS content packages containing Dublin Core metadata and manifests of all the components used to create the online digital books.
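The package structure described might look like the following `imsmanifest.xml` fragment. This is a hypothetical illustration, not the project's actual manifest: the identifiers, file names, sample values and the exact schema version are all assumptions; only the general shape (Dublin Core elements inside the manifest metadata, plus an organization and resource manifest) follows IMS Content Packaging.

```xml
<!-- Hypothetical sketch of an IMS content package for a digitised book -->
<manifest identifier="BOOK-0001"
          xmlns="http://www.imsglobal.org/xsd/imscp_v1p1"
          xmlns:dc="http://purl.org/dc/elements/1.1/">
  <metadata>
    <dc:title>Sample Digitised Book</dc:title>
    <dc:language>en</dc:language>
    <dc:publisher>University of Calgary</dc:publisher>
  </metadata>
  <organizations>
    <organization identifier="TOC">
      <item identifier="p1" identifierref="page1">
        <title>Page 1</title>
      </item>
    </organization>
  </organizations>
  <resources>
    <resource identifier="page1" type="webcontent" href="pages/page001.html">
      <file href="pages/page001.html"/>
    </resource>
  </resources>
</manifest>
```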

15:00 The Virtual Image in Streaming Video Indexing

Piera Palma, Luca Petraglia and Gennaro Petraglia

Abstract: Multimedia technology has been applied to many types of applications, and the resulting large amount of multimedia data needs to be indexed. The use of digital video data in particular is very popular today, and video browsing is a necessary activity in many knowledge domains. Effective and interactive exploration of large digital video archives requires indexing the videos by their visual, audio and textual data. In this paper, we focus on the visual and textual content of video for indexing. For the former we use the Virtual Image; for the latter we use Dublin Core metadata, suitably extended and multilayered for video browsing and indexing. Before concentrating on the visual content, we survey the main methods for video segmentation and annotation, in order to introduce the steps for video key-feature extraction and video description generation.

15:30 The Use of the Dublin Core in Web Annotation Programs

D. Grant Campbell

Abstract: This paper examines the implications of annotation programs, such as Annotea, for the development of the Dublin Core. Annotation programs enable multiple users, situated far apart, to comment on a Web-mounted document, even when they lack write access, through the use of annotation servers. Early indications suggest that the Dublin Core can significantly enhance the collaborative authoring process, especially if the full set of elements is used in a project that involves large numbers of users. However, the task of adapting DC elements and qualifiers for use in annotation threatens to increase the complexity of the scheme, and takes the Dublin Core far from its connections to traditional library cataloguing.

Wednesday, October 16

12:00 A Comprehensive Framework for Building Multilingual Domain Ontologies: Creating a Prototype Biosecurity Ontology

Boris Lauser, Tanja Wildemann, Allison Poulos, Frehiwot Fisseha, Johannes Keizer, Stephen Katz

Abstract: This paper presents our ongoing work in establishing a multilingual domain ontology for the biosecurity portal. As a prototypical approach, this project is embedded in the bigger context of the Agricultural Ontology Service (AOS) project of the Food and Agriculture Organization (FAO) of the UN. The AOS will act as a reference tool for ontology creation assistance and thereby enable the transition of the agricultural domain towards the Semantic Web. The paper focuses on introducing a comprehensive, reusable framework for the process of semi-automatically supported ontology evolution, which is intended for use in follow-up projects and can eventually be applied to any other domain. Within the multinational context of the FAO, multilingual aspects play a crucial role, and therefore an extensible layered ontology modelling approach is described within the framework. The paper presents the project milestones achieved so far: the creation of a core ontology, the semi-automatic extension of this ontology using a heuristic toolset, and the representation of the resulting ontology in a multilingual web portal. The reader is provided with a practical example of the creation of a specific domain ontology, an approach that can be applied to any possible domain. Future projects, including automatic text classification and ontology-facilitated search, are addressed at the end of the paper. These Semantic Web application scenarios provide the motivation for creating domain ontologies.

12:30 The MEG Registry and SCART: Complementary Tools for Creation, Discovery and Re-use of Metadata Schemas

Rachel Heery, UKOLN; Pete Johnston, UKOLN; Dave Beckett, ILRT, University of Bristol; Damian Steer, ILRT, University of Bristol

Abstract: SCART is an RDF schema creation tool designed for use by implementers working within digital library and learning environments. This schema creation and registration tool is being developed to work in conjunction with registry software. SCART will provide implementers with a simple tool to declare their schemas, including local usage and adaptations, in a machine-understandable way based on the RDF Schema specification. This tool is optimised for use by the Metadata for Education Group (MEG), a group of projects and services within the UK providing resource discovery in the domain of education. By providing a complementary creation tool and registry, the aim is to facilitate easy discovery of schemas already registered in a schemas registry, and to enable implementers to re-use these existing schemas where appropriate.
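The kind of declaration such a tool targets can be sketched in RDF Schema. The property, its URI and its labels below are invented for illustration; only the mechanism (a locally defined property declared machine-readably as a refinement of a Dublin Core element, ready for registration) reflects what the abstract describes.

```xml
<!-- Hypothetical sketch: a local refinement of dc:description declared
     in RDF Schema for registration in a schemas registry -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdf:Property rdf:about="http://example.org/terms/learningOutcome">
    <rdfs:label>Learning Outcome</rdfs:label>
    <rdfs:comment>What a learner is expected to achieve.</rdfs:comment>
    <rdfs:subPropertyOf
        rdf:resource="http://purl.org/dc/elements/1.1/description"/>
  </rdf:Property>
</rdf:RDF>
```

Because the refinement relationship is explicit, a registry can let a harvester that only understands DCMES fall back to `dc:description` when it meets the local term.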

14:00 Does Metadata Count? A Webometric Investigation

Alastair G. Smith

Abstract: This study investigates the effectiveness of metadata on websites. Specifically, it investigated whether the extent of metadata use by a site influences the Web Impact Factor (WIF) of the site. The WIF is a webometric measure of the recognition that a site has on the Web. WIFs were calculated for two classes of sites: electronic journals and New Zealand university websites. The most positive correlation was found between the substantive WIF of the electronic journal sites and the extent of Dublin Core metadata use. The study also indicates a higher level of metadata use than previous studies, but this may be due to the nature of the sites investigated.
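The WIF itself is a simple ratio, commonly defined in webometrics as the number of pages linking into a site divided by the number of pages in the site. The function below is a sketch of that definition; the counts in the example are invented.

```python
# Sketch of the webometric Web Impact Factor ratio.
def web_impact_factor(inlinking_pages: int, site_pages: int) -> float:
    """WIF = pages linking to the site / pages in the site."""
    if site_pages == 0:
        raise ValueError("site has no pages")
    return inlinking_pages / site_pages

wif = web_impact_factor(inlinking_pages=420, site_pages=350)  # illustrative counts
```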

14:30 Using Dublin Core to Build a Common Data Architecture

Sandra Fricker Hostetter

Abstract: The corporate world is drowning in disparate data. Data elements, field names, column names, row names, labels, metatags, etc. seem to reproduce at whim. Librarians have been battling data disparity for over a century with tools like controlled vocabularies and classification schemes. Data Administrators have been waging their own war using data dictionaries and naming conventions. Both camps have had limited success. A common data architecture bridges the gap between the worlds of tabular (structured) and non-tabular (unstructured) data to provide a total solution and clear understanding of all data. Using the Dublin Core Metadata Element Set Version 1.1 and its Information Resource concept as building blocks, the Rohm and Haas Company has created a common data architecture for use in the implementation of an electronic document management system (EDMS). This platform independent framework, when fully implemented, will provide the ability to slice and dice data across the enterprise, enable interoperability with other internal or external systems, and reduce cycle time when migrating to the next generation tool.

15:00 Using Web Services to Interoperate Data at the FAO

Andrea Zisman, John Chelsom, Niki Dinsey, Stephen Katz and Fernando Servan

Abstract: In this paper we present our experience of using Web services to support interoperability of data sources at the Food and Agriculture Organization of the United Nations. We describe an information bus architecture based on Web services that assists with multilingual access to data stored in various data sources and with dynamic report generation. The architecture preserves the autonomy of the participating data sources and allows the system to evolve by adding and removing data sources. In addition, because Web services hide the implementation details of the services they expose, and can therefore be used independently of the hardware or software platform on which they are implemented, the proposed architecture accommodates the many different technologies already widespread in the FAO and alleviates the difficulty of imposing a single technology throughout the organization. We discuss the benefits and drawbacks of our approach and the experience gained during the development of our architecture.

15:30 Design of a federation service for digital libraries: the case of historical archives in the PORTA EUROPA Portal (PEP) Pilot Project

Marco Pirri, Maria Chiara Pettenati and Dino Giuli

Abstract: Access to distributed and heterogeneous Internet resources is emerging as one of the major problems for the future development of the next generation of Digital Libraries. Available data sources vary in their data representations and access interfaces, so a system for federating heterogeneous resources accessible via the Web is considered a crucial aspect of digital library research and development. Libraries, as well as institutions and enterprises, are struggling to find solutions that offer the end user an easy and automatic way to rapidly find the relevant resources they need among heterogeneous ones. This paper reports the conception of a federated service for three different digital historical archives maintained by the European University Institute (EUI) in Florence. This situation requires careful consideration of interoperability issues related to uniform naming, metadata formats, document models and access protocols for the different data sources. We present the design approach for the digital archive federation services to be developed in the Porta Europa Portal (PEP) Pilot Project. The PEP specialised portal should provide high-quality information, selected according to the criteria of originality, accuracy and credibility, together with the cultural and political pluralism derived from the EUI's profile. The information in Porta Europa will be relevant, reliable, searchable and retrievable.

Thursday, October 17

14:00 Describing Services for a Metadata-driven Portal

John Roberts

Abstract: This paper describes New Zealand e-government activities supporting the discovery of services through the use of Dublin Core-based New Zealand Government Locator Service (NZGLS) metadata. It notes the issues faced in collecting service metadata from agencies to populate a new whole-of-government portal. The paper then considers the adequacy of the metadata schema for service description, and identifies a difficulty in applying definitions that refer to the content of a resource to a process-like resource such as a service. Three approaches to this challenge are suggested: creating a surrogate description to provide a source of content; treating the information exchanged in conducting the service as the content; and using additional contextual metadata. The adequacy of the schema for covering all the users’ needs in discovering and using a service is examined, and the need for metadata about specific service delivery points and conditions is noted. Finally, it is observed that future stages of e-government will require more sophisticated descriptions of services to support processes beyond discovery.

14:30 New Zealand Government Implementation of a DC-based Standard: Lessons Learned, Future Issues

Sara Barham

Abstract: This paper summarises key implementation issues encountered with the New Zealand Government's discovery level Dublin Core-based metadata standard, NZGLS. In particular, it discusses the processes used to create and manage NZGLS-compliant metadata throughout New Zealand’s core public service agencies. This metadata is being used to support the New Zealand government’s new service-focussed portal.

15:00 Visualising Interoperability: ARH, Aggregation, Rationalisation and Harmonisation

Michael Currie, Meigan Geileskey, Liddy Nevile and Richard Woodman

Abstract: This paper proposes a visualisation of interoperability. For some time, resource managers in many organisations have been acting on faith, creating ‘standards compliant’ metadata with the aim of exposing their resources in discovery activities. In some cases, their faith has led them to miss the very essence of the work they are doing, and they have not got what they worked for. The authors report a case study in which significant work has been done over a number of years by government agencies in Victoria, Australia. A number of agencies have implemented, more or less, the DC-based Australian Government Locator Service application profile, at least for their local use. They have always intended to do this with as much precision as possible, with the long-term aim of developing a fully interoperable system. In the case study, typical would-be records for seven government departments were studied, and it was shown that even the tiniest, and typical, variations in use of the standard can be expected to thwart the aims of interoperability in significant ways. The authors make visible how this creep leads away from interoperability and how it might be contained in the future. To do this, they use a three-step approach of ‘aggregation, rationalisation and harmonisation’ to expose the problems with ‘nearly good enough’ interoperability and the benefits of good interoperability.
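The harmonisation step can be illustrated with a toy sketch: small per-agency variations in one element value (here a creator/agency name) are mapped to a single agreed form so that aggregated records line up. The agency names and the mapping table below are invented for the example; they are not from the Victorian case study.

```python
# Sketch of harmonisation: normalise variant spellings of an agency
# name to one canonical form before records are aggregated.
def harmonise_creator(value: str) -> str:
    """Map known variants of a (hypothetical) department name to one form."""
    canonical = "Department of Premier and Cabinet (Vic)"
    variants = {
        "dept. of premier & cabinet": canonical,
        "department of premier and cabinet": canonical,
        "dpc victoria": canonical,
    }
    key = " ".join(value.lower().split())  # case/whitespace-insensitive lookup
    return variants.get(key, value)       # unknown values pass through

records = ["Dept. of Premier & Cabinet", "DPC Victoria"]
harmonised = {harmonise_creator(r) for r in records}  # collapses to one value
```

The point of the case study is that without such an agreed mapping, each of these variants behaves as a different agency in any cross-department search.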

15:30 Metadata Pilot at the Department for Education and Skills, UK

Julie Robinson

Abstract: This paper describes the Department for Education and Skills' (DfES) practical approach to tackling metadata and surrounding issues. A metadata pilot project was set up by the Library and Information Services Team to develop a metadata scheme for departmental use. Using the Dublin Core-based e-Government Metadata Standard (e-GMS), Library staff developed a draft metadata standard for departmental web pages and applied it by metatagging pages on a test web site. The metatagged pages were then tested against the search engine. Work on the pilot started in September 2001 and was successfully completed in November 2001. Further developments are ongoing.