This is a continuation of the previous article. For better understanding, we recommend reading the earlier blog posts first.
Specific traits and methods of spatial data visualization
Most spatial datasets suitable for energy source monitoring come in textual format. Because of their sheer volume, they cannot be processed directly by a reader. The main goal of visualization is therefore to make the data as easy to perceive as possible.
Spatial data visualization is one of the newer problems in big data analysis; it deals with data whose parameters are tied to geographical coordinates. Typically, a dataset contains geographical coordinates in one of the existing map projections and describes a certain energy source or event at a particular point on the map.
Looking up the data that describes a particular site from its coordinates is known as reverse geocoding (geocoding, in contrast, is the process of assigning geographical identifiers, such as coordinates in the form of latitude and longitude, to data records). Data visualization can represent physical objects as well as abstract information. One example is a map of Internet availability in 2005, a so-called broadband map.
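The difference between geocoding and reverse geocoding can be illustrated with a minimal sketch. It assumes a tiny hypothetical gazetteer (the `GAZETTEER` dictionary, with approximate city coordinates) instead of a real geocoding service, and treats reverse geocoding as a nearest-place lookup by great-circle distance. The function names are ours and do not belong to any particular library.

```python
import math

# Hypothetical mini-gazetteer: place name -> (latitude, longitude) in degrees.
# Coordinates are approximate and serve only as an illustration.
GAZETTEER = {
    "London": (51.5074, -0.1278),
    "Paris": (48.8566, 2.3522),
    "Berlin": (52.5200, 13.4050),
}

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def geocode(name):
    """Geocoding: place name -> (latitude, longitude)."""
    return GAZETTEER[name]


def reverse_geocode(lat, lon):
    """Reverse geocoding: coordinates -> nearest known place name."""
    return min(GAZETTEER, key=lambda n: haversine_km(lat, lon, *GAZETTEER[n]))
```

A real system would query a much larger place database, but the direction of the lookup is the same: `geocode("Paris")` goes from a name to coordinates, while `reverse_geocode(48.9, 2.3)` goes from coordinates back to a name.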
Since geographical data is mainly visualized by placing it on maps, it is necessary to know the specific features of the two main map types: vector and raster. Vector map is a common term for a vector-based collection of data about the Earth, taken from various geographic information systems (GIS) and provided at various levels of detail. Low-resolution coverage (Level 0) is global and in the public domain. Global coverage at medium resolution (Level 1), by contrast, is only partially available.
Data layers used in such maps are:
major road networks
hydrologic drainage systems
utility networks (cross-country pipelines and communication lines)
index of geographical names
GIS programs that use raster maps encode geographic data in a grid of pixels, the so-called raster: both the pixel locations and the pixel values carry information. Unlike vector graphics, which scale easily to the resolution of the rendering device, raster maps cannot be scaled up to an arbitrary resolution without loss of quality; in other words, they are resolution dependent. The advantage of raster maps over vector graphics is the ability to handle photographs and photo-realistic images, whereas vector maps are more efficient for typesetting or graphic design.
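The contrast between the two map types can be sketched in a few lines. The example below is only an illustration: the grid values, the origin, and the pixel size are hypothetical, and the helper names are ours. It shows how a pixel index maps to coordinates, why enlarging a raster only repeats existing pixels, and how a vector shape scales exactly.

```python
# A raster: a grid of values plus a rule tying pixel indices to coordinates.
# The values, origin and pixel size are hypothetical.
RASTER = [
    [1, 2],
    [3, 4],
]
ORIGIN_LON, ORIGIN_LAT = 10.0, 50.0  # upper-left corner of the grid
PIXEL_SIZE = 0.5                     # degrees per pixel


def pixel_center(row, col):
    """Geographic coordinates (lon, lat) of the center of a pixel."""
    lon = ORIGIN_LON + (col + 0.5) * PIXEL_SIZE
    lat = ORIGIN_LAT - (row + 0.5) * PIXEL_SIZE
    return lon, lat


def upscale_nearest(grid, factor):
    """Enlarge a raster by repeating each pixel; no new detail appears."""
    return [
        [value for value in row for _ in range(factor)]
        for row in grid
        for _ in range(factor)
    ]


def scale_polygon(points, factor):
    """Scale a vector shape exactly by multiplying its vertex coordinates."""
    return [(x * factor, y * factor) for x, y in points]
```

Upscaling the 2x2 grid by a factor of 2 gives a 4x4 grid that still holds only four distinct cells, whereas the scaled polygon is described just as precisely as the original.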
With raster maps, the main tool is color. With vector graphics, developers have a wider range of tools, because data can be represented not only by pixels but by whole structures such as points, lines, and polygons. Besides adjusting color, it is then possible to control object size and line type and to apply object clustering. At present, some of the most powerful data visualization tools for vector maps are CartoCSS and MapCSS. These are style preprocessors modeled on Cascading Style Sheets (CSS), the language used to define web page styles. In combination with the GeoJSON data format, they make it possible to build proper visualizations in web applications.
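To give an idea of what GeoJSON data looks like, here is a minimal sketch that builds a feature collection with the standard `json` module. The energy sources, their names, and the `capacity_mw` property are made up for illustration. Note that GeoJSON stores coordinates in longitude, latitude order.

```python
import json


def make_point_feature(lon, lat, properties):
    """Build a GeoJSON Feature with Point geometry (order: lon, lat)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": properties,
    }


# Hypothetical energy sources; names and numbers are invented.
features = [
    make_point_feature(13.40, 52.52, {"name": "Plant A", "capacity_mw": 120}),
    make_point_feature(2.35, 48.86, {"name": "Plant B", "capacity_mw": 45}),
]
collection = {"type": "FeatureCollection", "features": features}

# Serialized text that a web map could load as a data layer.
geojson_text = json.dumps(collection, indent=2)
```

A style sheet can then refer to properties such as `capacity_mw` to decide how each point is drawn.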
Two ways of representing numeric data are considered the most popular: color and size. In most cases both are applied together. Choosing the right visualization tool takes judgment. For instance, the map in the figure below visualizes total household income with bar charts. The problem is that a long bar may overlap neighboring charts.
In this case a color gradient is much more convenient. Such a visualization may lack detail (it shows only a single numeric parameter at a time), but it is much faster for a person to perceive.
As mentioned above, numeric data may also be expressed by markers of different sizes.
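A color gradient of this kind can be sketched as a linear interpolation between two colors. The function below is an assumption of ours, not part of any mapping library, and the default pale-yellow-to-dark-red palette is an arbitrary choice.

```python
def lerp_color(value, vmin, vmax, low=(255, 255, 204), high=(189, 0, 38)):
    """Map a value in [vmin, vmax] to a hex color between low and high."""
    if vmax == vmin:
        t = 0.0  # degenerate range: always use the low color
    else:
        t = (value - vmin) / (vmax - vmin)
    t = max(0.0, min(1.0, t))  # clamp out-of-range values
    r, g, b = (round(lo + t * (hi - lo)) for lo, hi in zip(low, high))
    return f"#{r:02x}{g:02x}{b:02x}"
```

For example, `lerp_color(0, 0, 100)` yields the low end of the palette and `lerp_color(100, 0, 100)` the high end; values in between receive intermediate shades.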
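One common way to size markers, shown here as a sketch rather than a prescribed method, is to make the marker area (not the radius) proportional to the value, so that a value four times larger looks four times bigger. The function name and the default maximum radius are our assumptions.

```python
import math


def marker_radius(value, max_value, max_radius=20.0):
    """Radius chosen so that the marker area is proportional to value."""
    if max_value <= 0 or value <= 0:
        return 0.0
    # Area grows with radius squared, hence the square root.
    return max_radius * math.sqrt(min(value, max_value) / max_value)
```

With these defaults, a value equal to a quarter of the maximum gets half the maximum radius, that is, a quarter of the maximum area.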
One example of how visualizing geographically distributed data helps analysis is the map John Snow made of the 1854 cholera outbreak in London. All the cholera cases were plotted on the map. Analysis of the map showed that the cases clustered around a water pump at its center, which pointed to that pump as the source of the outbreak.
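The same kind of reasoning can be sketched in code: assign every case to its nearest water source and see which source collects the most cases. The coordinates below are invented planar values, not Snow's actual data, and the pump names are placeholders.

```python
import math
from collections import Counter

# Hypothetical planar coordinates in arbitrary units, not historical data.
PUMPS = {"Pump A": (0.0, 0.0), "Pump B": (10.0, 0.0), "Pump C": (0.0, 10.0)}
CASES = [(1.0, 1.0), (2.0, -1.0), (-1.0, 0.5), (9.0, 1.0), (0.5, 2.0)]


def nearest_pump(case):
    """Name of the pump closest to a case (Euclidean distance)."""
    return min(PUMPS, key=lambda name: math.dist(case, PUMPS[name]))


# Tally the cases per nearest pump and pick the pump with the most cases.
cases_per_pump = Counter(nearest_pump(c) for c in CASES)
suspect, count = cases_per_pump.most_common(1)[0]
```

Such a tally is only a first hint; on a map, the same pattern is visible at a glance, which is exactly the strength of the visualization.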
Thus, geographically distributed data is commonly visualized on maps of the Earth's surface. For data analysis, the most convenient representation is an automated spatial system tied to geographical coordinates; in other words, the most efficient tool here is a geographic information system (GIS). Such systems are currently available as a wide range of products, including general-purpose GIS packages.
To be continued.