When dealing with spatial data, depending on the particular characteristics of the type of information, it may be more appropriate to represent that information (at the logical level) using either a raster or a vector data model. The advance of the digital society is driving continuous growth in the amount of available vector data, and the appearance of cheap devices equipped with GPS, like smartphones, is responsible for a big data explosion, mainly of trajectories of moving objects. The same phenomenon can be found in raster datasets, where advances in hardware are responsible for a significant increase in the size and the amount of available data. Taking into account only the images acquired by satellites, several terabytes of data are generated each day, and it has been estimated that the archived amount of raster data will soon reach the zettabyte scale.

This big increase in the variety, richness, and amount of spatial data has also led to new information demands. Nowadays, many application areas require the combination of data stored in different formats to run complex analyses. Obviously, combining different data models becomes more difficult when dealing with large amounts of data.

Although there is a large body of research regarding the size, the analysis, and the heterogeneity of data, in the case of spatial data that research is mostly focused either on the vector model or on the raster model separately. The two models are rarely handled together. For instance, the usual solution for queries that involve both raster and vector datasets is to transform the vector dataset into a raster dataset, and then to use a raster algorithm to solve the query. This is the solution for the zonal statistics operation of Map Algebra in, at least, ArcGIS and GRASS.

However, some previous research has addressed the problem using a joint approach. One such work proposes a single data model and language to represent and query both vector and raster data at the logical level. Even a Join operator is suggested, which allows combining, transparently and interchangeably, vector datasets, raster datasets, or both. As an example, the authors propose the query “return the coordinates of the trajectory of an aircraft when it was over a ground with altitude over 1,000”. Unfortunately, no implementation details are given. Other previous contributions deal with the implementation of query operators that are explicitly defined for querying datasets in different formats. Some of them tackled the Join, or a closely related query, but these works suffer from limitations (data structures not functional enough, too restrictive join operations, size problems) that will be explained in more detail in the next section.

On the other hand, compression has traditionally been used with the sole aim of reducing the size of datasets on disk and during network transmissions. However, it has recently begun to be used as a way to obtain improvements in other dimensions, such as processing time or scalability. In the last few years, several authors have proposed the use of modern compact data structures to represent raster datasets. Compact data structures use compression to reduce the size of the stored dataset, but with the novelty that the data structure can be managed directly in compressed form, even in main memory. By saving main memory, we obtain a more scalable system and, at the same time, make better use of the memory hierarchy, and thus obtain better running times. This strategy is sometimes called “in-memory” data management. In addition, many compact data structures are equipped with an index that, in the same compressed space, speeds up queries. This feature is known as “self-indexation”. One example of these compact data structures designed for raster data, and the one achieving the best space/time trade-offs, is the k²-raster, which will be used in this work, thus extending its functionality.

In this work, we propose a new framework to store and manage raster and vector datasets. The vector dataset is stored and indexed in a traditional way, using an R-tree. For the raster data, instead, we propose to use a modern compact data structure, the k²-raster, which improves the performance of traditional methods.
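To make the kind of combined raster/vector query discussed in this text more concrete, the aircraft example (“return the coordinates of the trajectory of an aircraft when it was over a ground with altitude over 1,000”) can be sketched in a few lines of Python. This is only a naive illustration under stated assumptions, not the framework proposed here: the raster is a plain uncompressed list of lists rather than a k²-raster, the trajectory is a plain list of points rather than an R-tree, and the function name `points_over_threshold`, the `cell_size` parameter, and the sample data are all hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y), in the same units as the raster grid


def points_over_threshold(
    trajectory: List[Point],
    raster: List[List[float]],
    cell_size: float,
    threshold: float,
) -> List[Point]:
    """Return the trajectory points that fall on raster cells whose value
    is strictly greater than `threshold`.

    Assumption: the raster origin is at (0, 0), rows grow with y and
    columns grow with x. Points outside the raster are ignored.
    """
    rows, cols = len(raster), len(raster[0])
    hits: List[Point] = []
    for x, y in trajectory:
        # Map the point to the cell that contains it.
        col = int(x // cell_size)
        row = int(y // cell_size)
        if 0 <= row < rows and 0 <= col < cols and raster[row][col] > threshold:
            hits.append((x, y))
    return hits


# Hypothetical 3x3 altitude raster and a four-point flight trajectory.
altitude = [
    [200.0, 800.0, 1500.0],
    [300.0, 1200.0, 1800.0],
    [100.0, 400.0, 900.0],
]
flight = [(0.5, 0.5), (1.5, 1.5), (2.5, 0.5), (2.5, 2.5)]

result = points_over_threshold(flight, altitude, cell_size=1.0, threshold=1000.0)
print(result)
```

The sketch checks every point against the raster directly, without first converting the vector data into a raster. It does no indexing or compression, so it says nothing about the performance of an R-tree and k²-raster combination.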