Over the past two decades, the use and diffusion of information technologies have changed vastly and rapidly. The introduction and growing use of the internet has had a substantial impact on everyday life, changing the way people interact, consume information and conduct their daily activities. However, the adoption of information technologies has not been uniform across all members of society, resulting in gaps in access, usage and the type of online content consumed across socio-demographic, economic and spatial landscapes. The study of this phenomenon, known as the digital divide, has become increasingly important for policy purposes in recent years.
Commonly used methods and tools for evaluating the digital divide include surveys, structured interviews, open questionnaires and indicator analysis. These “self-report” methods, while important and useful, have several weaknesses. They are obtrusive, costly and difficult to replicate, offer little granularity for regional analyses and are subject to sampling bias.
This project presents a novel approach to identifying, collecting, analyzing and visualizing the digital divide using unobtrusive methods. The main goal of the research is to supply the theoretical and practical underpinning for measuring and evaluating the digital divide using digital trace data.
In this study, six digital trace data sources, parsed with reference to socio-demographic and spatial attributes, were used to analyze online user behavior, with the specific aim of studying digital gaps. The raw datasets were cleaned, processed, coded and analyzed, both individually and in combination (triangulation). The triangulation approach combined several methods and tools to facilitate understanding of the digital divide phenomenon. The methodology was demonstrated through a case study that analyzed digital gaps in the rights realization domain and used data stories to supply systematic guidance for researching and understanding these divides. The data-driven stories were subsequently portrayed through data visualization. The design space for visualizing trace data in the digital divide context was discussed, highlighting its multi-dimensional, time-oriented and multi-source characteristics. The research findings were presented using a wide range of descriptive and quantitative statistical methods as well as qualitative tools, including textual analysis of online discussions.
The results provide both a proof of concept and important insights into the use of digital trace data in the study of the digital divide and into the ability of unobtrusive tools to replace self-report methods in this task. The findings point to the existence of digital gaps in usage volume (number and distribution of visits), variety (the number of different website categories a user visits) and content usage (type of online activities), with content usage showing the most significant gaps of the three.
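To make the three measures concrete, the following is a minimal sketch of how volume, variety and content usage might be derived per user from a visit log. The record layout (user identifier, website category), the sample values and the function name `usage_metrics` are assumptions made for illustration; they are not taken from the study's data or implementation.

```python
from collections import Counter, defaultdict

# Hypothetical visit log: (user_id, website_category). Illustrative values only.
visits = [
    ("u1", "news"), ("u1", "news"), ("u1", "government"),
    ("u2", "social"), ("u2", "social"), ("u2", "social"), ("u2", "news"),
]


def usage_metrics(visits):
    """Return per-user volume, variety and content share (assumed definitions)."""
    # Count visits per category for each user.
    per_user = defaultdict(Counter)
    for user_id, category in visits:
        per_user[user_id][category] += 1

    metrics = {}
    for user_id, counts in per_user.items():
        volume = sum(counts.values())  # total number of visits
        metrics[user_id] = {
            "volume": volume,
            "variety": len(counts),  # number of distinct categories visited
            # Share of the user's visits that fall in each category.
            "content_share": {c: n / volume for c, n in counts.items()},
        }
    return metrics


metrics = usage_metrics(visits)
```

Averaging these per-user values within socio-demographic or spatial groups would then allow groups to be compared on each measure; how the study itself aggregates them is not specified here.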