Each descriptor is defined by normative and non-normative parts. The normative parts consist of the descriptor's syntax, semantics and binary representations of these.

Chapter 2: MPEG-7: The Multimedia Content Description Standard

The optional, non-normative parts are the recommended extraction and similarity matching methods [6]. Many low-level features can be extracted from the content in fully automatic ways.

Recommended feature extraction algorithms are included in the non-normative parts of some descriptors. To allow for industry competition and to take advantage of expected improvements in technology, they are not a mandatory part of the standard. The same approach applies to similarity-based querying of descriptor values in which results are ranked in order of degree of similarity with the query. A recommended similarity matching method may be specified within a descriptor's non-normative component but it is not required for interoperability.
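As an illustration of similarity-based querying, the following minimal Python sketch ranks stored descriptor values by their distance to a query value. It is not a matching method taken from the standard: the histogram values, the file names, and the choice of an L1 (sum of absolute differences) distance are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def l1_distance(a: List[float], b: List[float]) -> float:
    """Sum of absolute bin differences between two equal-length histograms."""
    if len(a) != len(b):
        raise ValueError("histograms must have the same number of bins")
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_similarity(
    query: List[float], database: Dict[str, List[float]]
) -> List[Tuple[str, float]]:
    """Return (name, distance) pairs, most similar (smallest distance) first."""
    scored = [(name, l1_distance(query, hist)) for name, hist in database.items()]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical three-bin color histograms; names and values are illustrative.
database = {
    "sunset.jpg": [0.7, 0.2, 0.1],
    "forest.jpg": [0.1, 0.8, 0.1],
    "ocean.jpg": [0.1, 0.2, 0.7],
}
query = [0.6, 0.3, 0.1]

ranking = rank_by_similarity(query, database)
for name, distance in ranking:
    print(f"{name}: {distance:.2f}")
```

Because the matching method is non-normative, a different distance function could be substituted here without affecting how the descriptor values themselves are stored or exchanged.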

Is it possible to standardize certain descriptors?
How can one compare the performance of descriptors with overlapping functionality in the CEs?
How can one link procedural code?
How can one define complex composite descriptors such as parameterized arrays in the Description Definition Language (DDL)?

A Description Scheme (DS) specifies the structure and semantics of the relationships between its components, which may be both Descriptors and Description Schemes. The following concepts are used within the DS group to describe audiovisual content:

Syntactic structure - the physical and logical structure of audiovisual content.
Semantic structure - breakdown based on semantic meaning.
Syntactic-semantic links - the associations between syntactic elements and semantic elements.

At the top level, the Generic Audiovisual Description Scheme consists of:

A collection of Syntactic structure DSs;
A collection of Semantic structure DSs;
Syntactic-semantic links DSs - which relate the syntactic elements to the semantic elements;
MetaInfo DS - this contains descriptors carrying author or publisher-generated information;
MediaInfo DS - this contains descriptors related to the storage media;
Model DS - this provides a way to describe the classification methods for audiovisual data or the correspondence between the current audiovisual content and other content through different models.

Figure 1. The Generic Audiovisual Description Scheme.

There is a certain amount of redundancy and overlapping functionality between the different DS proposals which have been included.
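The containment relationship described above, in which a Description Scheme may hold both Descriptors and other Description Schemes, can be sketched as a small recursive data structure. This is only an illustrative Python model, not the MPEG-7 definition: the class names, the `descriptor_names` helper, and the example descriptor names and values (`DominantColor`, `Author`, `StorageFormat`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Descriptor:
    """A single named feature value (hypothetical, simplified)."""
    name: str
    value: object


@dataclass
class DescriptionScheme:
    """A named container whose components are Descriptors or nested schemes."""
    name: str
    components: List[Union["Descriptor", "DescriptionScheme"]] = field(
        default_factory=list
    )

    def descriptor_names(self) -> List[str]:
        """Collect descriptor names depth-first across all nested schemes."""
        names: List[str] = []
        for component in self.components:
            if isinstance(component, Descriptor):
                names.append(component.name)
            else:
                names.extend(component.descriptor_names())
        return names


# A toy top-level scheme with three of the component groups named in the text.
generic = DescriptionScheme("GenericAudiovisual", [
    DescriptionScheme("SyntacticStructure", [Descriptor("DominantColor", [255, 0, 0])]),
    DescriptionScheme("MetaInfo", [Descriptor("Author", "unknown")]),
    DescriptionScheme("MediaInfo", [Descriptor("StorageFormat", "example-format")]),
])

print(generic.descriptor_names())
```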

Some of the DS proposals which have been integrated are extremely complex and of dubious applicability.

Unless a library of basic simple DSs is provided, many potential users who want simple bi-level multimedia metadata structures will find the MPEG-7 standard too bewildering or intimidating to use.

The DDL has to be able to express spatial, temporal, structural, and conceptual relationships between the elements of a DS, and between DSs.

It must provide a rich model for links and references between one or more descriptions and the data that they describe. It also has to be capable of validating descriptor data types, both primitive (integer, text, date, time) and composite (histograms, enumerated types). In addition, it must be platform and application independent and human- and machine-readable. After evaluating the DDL proposals, the recommendation was that, although none of the proposals satisfied all of the requirements, the proposal from DSTC [9] provided the best starting point for further DDL development.
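The validation requirement can be illustrated with a small sketch that checks values against primitive and composite types. This is not the DDL, which is a definition language rather than Python code; the `VALIDATORS` table, the `validate` function, the rule that a histogram is a non-empty list of non-negative numbers, and the `genre` enumeration are all assumptions made for the example.

```python
from datetime import date
from typing import Any, Callable, Dict


def is_histogram(value: Any) -> bool:
    """Composite type (assumed rule): a non-empty list of non-negative numbers."""
    return (
        isinstance(value, list)
        and len(value) > 0
        and all(
            isinstance(b, (int, float)) and not isinstance(b, bool) and b >= 0
            for b in value
        )
    )


def enumerated(*allowed: Any) -> Callable[[Any], bool]:
    """Build a validator that accepts only the listed values."""
    return lambda value: value in allowed


# Hypothetical type table: primitive types first, then composite ones.
VALIDATORS: Dict[str, Callable[[Any], bool]] = {
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "text": lambda v: isinstance(v, str),
    "date": lambda v: isinstance(v, date),
    "histogram": is_histogram,
    "genre": enumerated("news", "sport", "drama"),
}


def validate(type_name: str, value: Any) -> bool:
    """Return True if the value conforms to the named descriptor data type."""
    try:
        check = VALIDATORS[type_name]
    except KeyError:
        raise ValueError(f"unknown descriptor type: {type_name}") from None
    return check(value)


print(validate("histogram", [0.2, 0.8]))
print(validate("genre", "opera"))
```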

The MPEG-7 technology covers the most recent developments in multimedia search and retrieval. This book presents a comprehensive overview of the principles and concepts involved in a complete chain of AV material indexing, metadata description (based on the MPEG-7 standard), information retrieval and browsing.

The book offers a practical step-by-step walk-through of the components, from systems to schemas to audio-visual descriptors. It addresses the selection of the multimedia features to be described, the organization and structuring of the description, the language to instantiate the description, as well as the major processing tools used for indexing and retrieval of images and video sequences.

The accompanying electronic documentation will include numerous examples and working demonstrations of many of these components. Researchers and students interested in multimedia database technology will find this book a valuable resource covering a broad overview of the current state of the art in search and retrieval. Practicing engineers in industry will find this book useful in building MPEG-7 compliant systems, as the only resource outside of the MPEG community available to the public at the time of publication.

Chapter 2, by Pereira and Koenen, outlines the various activities within MPEG-7 that gained momentum towards the end of [...], culminating in the final standard in [...]. This format has been designed to efficiently compress and transport MPEG-7 descriptions.

The organization of this section is based on the functionality provided by the various Description Schemes. Chapter 6 provides an overview of the entire section. Chapter 7 discusses elementary Description Schemes or Descriptors that are used as building blocks for more complex Description Schemes.

The tools available for description of a single multimedia document are reviewed in Chapter 8. The most important features related to content management and description, including low-level as well as high-level features, are analyzed. A detailed presentation of the corresponding set of tools is given in Section IV (visual features) and Section V (audio features).

The main functionalities supported by the tools of Chapter 8 include search, retrieval and filtering. Navigation and browsing are supported by a specific set of tools described in Chapter 9. Furthermore, the description of collections of documents or of descriptions is presented in Chapter [...]. Finally, for some applications, it has been recognized that it is necessary to define in a normative way the user preferences and the usage history pertaining to the consumption of the multimedia material.

This allows, for example, matching between user preferences and MPEG-7 content descriptions in order to facilitate personalization of the processing. These tools are described in Chapter [...].

Section IV: Visual Descriptors

This section begins with an overview in Chapter 12. Chapter 13 describes color descriptors that represent different aspects of color distribution in images and video.

These include descriptors for a color histogram of a single image as well as a collection of images, color structure, dominant color, and color layout. Chapter 14 presents three texture descriptors: a homogeneous texture descriptor, a coarse level browsing descriptor and an edge histogram descriptor.

Chapter 15 presents descriptors that represent contour shape, region shape and 3-D shapes. The section concludes with motion descriptors in Chapter [...].

Chapter 18 describes the spoken content technology in more detail. Sound recognition and sound similarity tools are outlined in Chapter [...].

The applications are broadly classified into search and browsing applications and mobile applications. Chapter 20 covers some interesting search and browsing applications that include real-time video retrieval, browsing of TV news broadcasts using MPEG-7 tools, and audio and music retrieval.

Chapter 21 discusses two interesting mobile applications. The demonstrations on the DVD include video browsing and shot retrieval, and search and browsing of images using texture. We hope that researchers and graduate students will find this useful in their work.

Our special thanks to Leonardo Chiariglione, the convenor of MPEG, for his encouragement and support throughout the course of this project. We would also like to thank Dr. Lutz Ihlenburg of Heinrich-Hertz-Institut, Germany, for assisting on editorial issues and for providing many valuable comments and suggestions. We extend our thanks to the many reviewers who helped edit individual chapters.

Special thanks to Dr. Hyundoo Shin and Dr. Yanglim Choi for their support during the past three years. He would like to thank all the members of the vision research laboratory at UCSB for their help in putting together this manuscript.

MPEG is intimately connected to digital audio and video. However, when MPEG drew its first breath, bits were already abundant, but they were rather 'heavy' bits. No one thought of storing or moving a song around in digital form when this meant storing or moving 50 MB, unless this was done in a special environment like a studio.

The only known way of moving audio and video was in the shape of analog waveforms. Audio files became manageable, the more so if the user was willing to accept some artifacts in the music in exchange for a reduced file size or reduced transmission time. The number of television programs started multiplying by orders of magnitude: first, because more television programs in digital form could be packed into the bandwidth that used to carry one television program, and second, because of the ability to make new offerings, thanks to the new economies of scale made possible by audio and video in digital form.

Compact discs could be used to store movies, and new types of compact discs were even invented to store movies in new forms.