Document name: puting of iceberg queries using quantiling参考文献.pdf

Iceberg queries have recently been identified as important queries for many applications
belonging to this category. These applications can be found in data mining [3, 20, 26],
information retrieval [15, 18, 24, 25], decision support and data warehousing [7], web
mining [9] and top-k queries [10, 11]. Iceberg queries were formally introduced by
Fang et al. [12], who also present detailed application examples. These
queries have been extended to data cubes in [7]. Moreover, they are covered in database
textbooks, e.g., [23]. These queries are characterized by a huge input and a small output:
the iceberg refers to the input, and its tip refers to the output. Typical applications
4748 Khaled AlSabti
of iceberg queries can involve very large databases, e.g., several gigabytes or more [13].
Below, we give a formal definition of the iceberg queries that we consider in this work.
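
As an informal illustration of the "huge input, small output" shape, ahead of the formal definition that follows, here is a minimal Python sketch of a naive iceberg query. It is not the quantiling method of this paper: it simply counts every value combination in memory, which is exactly what becomes infeasible when the relation does not fit into main memory. The function name `iceberg_query`, its parameters, the toy relation `ROWS`, and the reading of the threshold as a fraction of the tuples are assumptions made for illustration only.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Sequence, Tuple


def iceberg_query(
    rows: Iterable[Sequence[Hashable]],
    attr_indices: Sequence[int],
    threshold_fraction: float,
) -> Dict[Tuple[Hashable, ...], int]:
    """Naive baseline: return the value combinations of the chosen
    attributes that occur in more than `threshold_fraction` of the rows.

    Assumption: the threshold is a fraction of the total tuple count.
    All distinct combinations are counted in memory, so this only works
    for small inputs.
    """
    counts: Counter = Counter()
    total = 0
    for row in rows:
        # Project the row onto the selected attributes.
        key = tuple(row[i] for i in attr_indices)
        counts[key] += 1
        total += 1
    min_count = threshold_fraction * total
    # Keep only the "tip of the iceberg".
    return {key: c for key, c in counts.items() if c > min_count}


# Hypothetical toy relation with attributes (customer, product, region).
ROWS = [
    ("ann", "pen", "east"),
    ("bob", "pen", "east"),
    ("cat", "ink", "west"),
    ("dan", "pen", "east"),
    ("eve", "cap", "west"),
]

result = iceberg_query(ROWS, attr_indices=(1, 2), threshold_fraction=0.5)
print(result)  # {('pen', 'east'): 3}
```

In SQL terms this corresponds to a GROUP BY over the selected attributes with a HAVING COUNT(*) filter. The sketch keeps one counter per distinct combination, so its memory use grows with the number of distinct values rather than with the size of the output.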

Problem statement: Iceberg queries are characterized as queries with a huge input and
small output. In this paper, we consider an important class of these queries, which
returns frequently occurring values from a set of attributes. Below, we present a formal
definition of these queries. Given a relation R that consists of n tuples each with m
attributes and a set of attributes ai1, ai2, ..., aik, find the values of the tuples (i.e., the tip of
the iceberg) over attributes ai1, ai2, ..., aik that are replicated more than a pre-specified
threshold f. The assumptions are that (1) relation R cannot fit into main memory
and (2) f is a relatively large percentage, so that the output of the query is very small
compared to the input. The type of the specified attribute(s) affects the computation
requirement of the problem. For categorical attribut