  • Expert Oracle Practices

    This book is an anthology of effective database management techniques representing the collective wisdom of the OakTable Network. With an emphasis upon performance—but also branching into security, national language, and other issues—the book helps you deliver the most value for your company’s investment in Oracle Database technologies. You’ll learn to effectively plan for and monitor performance, to troubleshoot systematically when things go wrong, and to manage your database rather than letting it manage you.

  • Database Support for Queries by Image Content
  • No image available

    Abstract: "The R*-Tree index is a popular multidimensional index used in several extensible and GIS-oriented database systems. In this paper, we show that a simple refinement of the search algorithm of the R*-Tree -- which is common to all variants of the R-Tree -- offers significant speedups in most cases, with little or no worst-case performance penalty. The idea is essentially to use a conjunction of linear constraints (rather than a minimum bounding rectangle) to approximate the query and to use this tighter bounding envelope to determine when the query overlaps with an R*-Tree node. This raises an important question: How can we efficiently check whether the query envelope overlaps the minimum bounding box for a tree node? Linear Programming (LP) offers one solution, but it is susceptible to numeric approximation errors. One of the contributions of this paper is a new algorithm for performing this check that is more efficient than LP and free from numeric errors. We also present several theoretical results characterizing this algorithm. From a practical standpoint, adding the proposed constraint query refinement to existing R*-Tree implementations is straightforward. Using implementations of R*-Trees on top of the SHORE storage manager, we present experimental results (using TIGER census data for California and Wisconsin, and the Sequoia 2000 benchmark data set) that provide strong evidence in support of the proposed refinement. Our results demonstrate that the CPU overhead of our more complex overlap test, vis-a-vis the traditional minimum bounding box intersection test, is minor. On the other hand, the (CPU and I/O) gains can be considerable, especially for queries that are asymmetrically oriented with respect to the R*-Tree axes. The results are especially surprising given that we make no changes at all to the insertion and deletion algorithms.
    Finally, linear constraints are a very powerful tool for formulating queries, and a very important application of our results is that we can efficiently support a broad new class of such queries on multidimensional datasets drawn from a variety of domains that go well beyond GIS and spatial applications. We illustrate this, with experimental results, using queries over a five-dimensional projection of the widely used Compustat financial database (which contains over 300 dimensions!)."
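
The abstract does not spell out the paper's overlap algorithm, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes the query is given as a conjunction of half-spaces a·x <= b and applies a simple per-constraint test: a tree node's box is pruned only when a single constraint excludes the entire box. The names `min_over_box`, `may_overlap`, `boxes_intersect`, and `STRIP` are hypothetical. The test is conservative: it never prunes a box that truly overlaps the query, but it can keep a box that an exact test (such as LP) would reject.

```python
from typing import Sequence, Tuple

# A constraint (a, b) stands for the half-space a . x <= b (assumed encoding).
Constraint = Tuple[Sequence[float], float]


def min_over_box(a: Sequence[float], lo: Sequence[float], hi: Sequence[float]) -> float:
    """Smallest value of a . x over the axis-aligned box lo <= x <= hi."""
    # Per coordinate, a positive coefficient is minimized at lo, a negative one at hi.
    return sum(ai * (l if ai >= 0 else h) for ai, l, h in zip(a, lo, hi))


def may_overlap(constraints: Sequence[Constraint],
                lo: Sequence[float], hi: Sequence[float]) -> bool:
    """False only if some single constraint excludes the whole box."""
    return all(min_over_box(a, lo, hi) <= b for a, b in constraints)


def boxes_intersect(qlo, qhi, lo, hi) -> bool:
    """Traditional minimum-bounding-box intersection test, for comparison."""
    return all(ql <= h and l <= qh for ql, qh, l, h in zip(qlo, qhi, lo, hi))


# A thin diagonal strip |x - y| <= 1 clipped to the square [0, 10] x [0, 10].
STRIP = [
    ((1, -1), 1), ((-1, 1), 1),    # x - y <= 1 and y - x <= 1
    ((1, 0), 10), ((-1, 0), 0),    # 0 <= x <= 10
    ((0, 1), 10), ((0, -1), 0),    # 0 <= y <= 10
]

corner_node = ((8, 0), (10, 1))    # node box far from the diagonal
diagonal_node = ((4, 4), (5, 5))   # node box sitting on the diagonal

# The strip's bounding box is the full square, so the box test cannot prune the corner node.
print(boxes_intersect((0, 0), (10, 10), *corner_node))
# The constraint test prunes the corner node and keeps the diagonal node.
print(may_overlap(STRIP, *corner_node), may_overlap(STRIP, *diagonal_node))
```

The diagonal strip is an instance of a query that is asymmetrically oriented with respect to the index axes, which is the case where the abstract reports the largest gains.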

  • No image available

    Abstract: "Current image retrieval systems have many important limitations. Many are specialized for a particular class of images and/or queries. The more general systems support relatively weak querying by content (e.g., by color, texture or shape, but with no deeper understanding of the structure of the image). Few (if any) have addressed the issue of truly large collections of images, and how the underlying techniques scale. There are many aspects to a DBMS supporting image retrieval by content. In this paper, we focus on a data model and give an example of a data definition language (DDL) for image data, and demonstrate the gains to be had by incorporating such a DDL in a general-purpose image DBMS. Specifically, we make contributions in five areas: (1) A proposal for a data model for images. (2) The use of DDL definitions for guiding automatic feature extraction, using a constraint-based scheduling algorithm that calls upon a library of standard and specialized image analysis routines. (3) The use of extracted features (based on the data model and DDL definitions) in representing and indexing large sets of images, and in query formulation and evaluation. (4) A system architecture that supports the use of specialized feature extraction algorithms, which may be independently developed in various important application domains, and may rely upon domain-specific image analysis techniques. To our knowledge, this is the first proposal for the use of a non-trivial data model (coupled with an image description language) for processing large sets of images in a DBMS. We discuss the impact of the data model and the DDL on various aspects of the system, and experimentally demonstrate some major benefits of this approach. 
    In particular, we show how very large image sets can be effectively queried -- using meaningful, domain-specific restrictions on the attributes and relationships of objects contained in images -- with users providing input only on a per-collection, rather than a per-image, basis. We show that the approach is scalable, and demonstrate that content-based querying of very large collections of images using a domain-independent image DBMS is a viable goal."