Introduction


This research paper defines the concept of Big Data and identifies the ways
in which it violates the current legal frameworks surrounding privacy
regulation. First, some background information is presented to convey the
scale of the issue. Next, the direct violations are discussed, followed by a
conclusion and a discussion of further action.




The Luxembourg Data Protection Act
defines Big Data as “stemming from the collection of large structured or
unstructured datasets, the possible merger of such datasets, as well as the
analysis of these data through computer algorithms. It usually refers to
datasets which cannot be stored, managed, and analysed with average technical
means due to their size” (Van der Sloot, 2017). It is the idea of ‘quantity over quality’: a
colossal amount of data about an individual, spanning their demographic
information, internet activity, social interactions and so forth, is amassed by
firms and agencies and is subject to further dissemination. The
conclusions and correlations resulting from its analysis are not applied to those
individuals specifically but on a more general level (Machanavajjhala and
Reiter, 2012). To prevent
the revelation of identities of individuals, certain characteristics or
personal information might be altered or fabricated to make it an anonymous
practice. This activity is employed by all major internet companies in some
form or other, for example to offer data-driven services such as Google’s
‘Flights and Prices Tracker’ or Amazon’s ‘Customers Who Bought This Item Also
Bought’, and the data is treated as an asset and a source of value creation
(Rubinstein, 2012). It extends beyond internet corporations to any company, or
even any government agency, that relies on statistical methods and data mining
algorithms to analyse large datasets and uncover unexpected correlations. While this
may sound economically and socially beneficial in terms of enhancing
efficiency, improving decision-making, and increasing productivity, Big Data
raises serious informational privacy concerns. It challenges the underlying
principles of the European Union Data Protection Directive 95/46/EC (DPD),
perhaps the most significant privacy law in the world (Cate, 1994).

The Data Protection Directive was adopted in 1995 by the European Union to
regulate the processing of personal data, that is, information relating to an
identified or identifiable person. Its core principles aim to permit the
processing of personal data only for legitimate purposes. These include
principles relating to data quality (purpose limitation, transparency, data
minimisation, and accuracy), consent, access and rectification,
confidentiality, and security (Rubinstein, 2012). The Directive also addresses
the flow of data within and outside the EU, administrative matters, and how the
DPD is to be enforced. However, a rapidly changing technological environment
has brought dramatic changes in internet usage, such as the surge of social
networking sites where people willingly share their personal data, the growth
of cloud computing, the ubiquity of mobile devices and the transmission of
geo-location information through sensors, and the popularity of data mining
technologies that aggregate and analyse data from multiple sources (Van der
Sloot, 2017). These changes directly challenge the principles laid out by the
DPD.


Violations of the Data Protection Directive Principles

Under the DPD, personal data must only be processed for specified, explicit
and legitimate purposes and may not be processed further in a way incompatible
with those purposes. Big Data, by contrast, enables the indiscriminate
gathering of personal data. This data is collected through mobile devices that
contain location-tracking features or apps sharing information with third
parties; through interactions with smart environments or physical monitoring
systems; or, most commonly, through social media, where users voluntarily
upload significant amounts of personal data about themselves or others
(Rubinstein, 2012). While users may share their data for personal purposes,
organisations collect and store it in order to profit from it in the future.
There is no transparency in this process: a user is unaware of what
information has been stored, where it has been stored, or why it has been
stored.

The current data protection regime focuses on data minimisation, the idea that
the processing of personal data must be restricted to the minimum amount
necessary. However, the trend with Big Data technologies such as data mining is
to gather as much information as possible and store it indefinitely. Complex
algorithms are then applied to draw conclusions and form correlations,
uncovering new information (Van der Sloot, 2017). New computational frameworks,
for example Apache Hadoop, are also being developed to allow for distributed
processing and storage of large data sets across clusters of computers. Data
protection authorities have rarely forced technology-based firms to redesign
their software or processes to minimise data processing (Rubinstein, 2012).

One of the main weaknesses of the DPD is its heavy reliance on the notion of
‘informed choice’. Individuals almost never read through or understand the
privacy policies of organisations; these policies use ambiguous language and
are easily modified by firms. With the emergence of data mining and
registration techniques, people are unaware that their behaviour is even being
monitored (Rubinstein, 2012). Data subjects have the right to request
information about their personal data and how it has been processed (Cate,
1994). However, with the rise of Big Data, individuals are often unaware that
they are subjects of data use and are thus unlikely to invoke their right to
information. Furthermore, because data collection is so widespread, it is
almost impossible for individuals to assess each data process to determine
whether it includes their personal data; and where it does, and the data is
unlawfully processed, individuals would not know how to counteract it because
of the vague legal framework (Machanavajjhala and Reiter, 2012).

Moreover, because increasingly large data sets are collected and analysed,
conclusions and correlations are formulated at a general or group level. These
statistical correlations or group profiles do not qualify as personal data, but
can be used to considerably affect the society and environment we live in
(Rubinstein, 2012). An individual may not be directly identifiable, but is
nonetheless affected by the data processing. In effect, since this data
transcends the individual level, it is becoming less and less of a priority for
data processors to keep accurate and correct data about individuals, something
the DPD explicitly requires (Van der Sloot, 2017). Confidentiality and security
of personal data are also essential to data processing under the legal
framework, but they are challenged by growing open data networks and increased
sharing of information amongst organisations. This threatens the notion of
‘anonymisation’ that Big Data promotes: users can be re-identified from
non-personal data by comparing and compiling information across multiple
existing databases and building user profiles. For example, Netflix released
supposedly de-identified data on the viewing habits of more than 480,000
people. However, computer scientists were able to identify several customers
by linking the data to an online movie ratings website, which further led them
to those users’ apparent political preferences and other sensitive information
(Machanavajjhala and Reiter, 2012).
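
The re-identification risk described above can be illustrated with a minimal
sketch of a linkage attack. It assumes toy data and a deliberately simple
matching rule; the names, titles, ratings, the function link_records and its
min_overlap threshold are all hypothetical, and this is not the method the
researchers actually used on the Netflix data.

```python
# Hypothetical "de-identified" release: pseudonymous id -> set of
# (title, rating, date) records. No names are included.
released = {
    "user_017": {
        ("Film A", 5, "2005-03-01"),
        ("Film B", 2, "2005-03-04"),
        ("Film C", 4, "2005-04-11"),
    },
    "user_042": {
        ("Film A", 3, "2005-02-10"),
        ("Film D", 5, "2005-02-12"),
    },
}

# Hypothetical public reviews posted under real names on another website.
public = {
    "Alice Example": {
        ("Film A", 5, "2005-03-01"),
        ("Film C", 4, "2005-04-11"),
    },
    "Bob Example": {
        ("Film D", 1, "2005-06-30"),
    },
}


def link_records(released, public, min_overlap=2):
    """Link each named reviewer to the pseudonymous profile that shares the
    most identical records, if the overlap is large enough and unambiguous."""
    matches = {}
    for name, reviews in public.items():
        # Count identical (title, rating, date) records per released profile.
        scores = {uid: len(reviews & ratings) for uid, ratings in released.items()}
        best_uid, best_score = max(scores.items(), key=lambda item: item[1])
        is_unique = list(scores.values()).count(best_score) == 1
        if best_score >= min_overlap and is_unique:
            matches[name] = best_uid
    return matches


matches = link_records(released, public)
print(matches)
```

In this toy example only two shared records are needed to tie a named reviewer
to a pseudonymous profile, after which every other record in that profile is
exposed. Real attacks have to tolerate noisy or approximate matches, which this
sketch does not attempt.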


Conclusion & Further Action

Evidently, as argued in this paper, the current legal framework surrounding
privacy, the Data Protection Directive, fails to keep pace with rapid
technological advancements such as Big Data that pose systemic threats to how
informational privacy is protected. The DPD principles of a specified, explicit
and legitimate purpose for data usage, transparency, data minimisation, consent
(informed choice), accuracy, access and rectification, and security and
confidentiality of data are violated. Citizens’ personal data is compromised at
the hands of Big Data processors, and these breaches of informational privacy
only seem to escalate with time. The European Commission recognised the
shortcomings of the DPD and its diminishing relevance in today’s age of
technology; in January 2012, it proposed to reform and replace the DPD with the
new General Data Protection Regulation (GDPR), which will take effect in May
2018. The GDPR addresses the deficiencies of the DPD and aims to combat the
serious threats to privacy posed by technology. How it will be enforced, and
whether it will be effective, can only be determined once compliance procedures
have started.








Cate, F.H., 1994. The EU data protection directive, information privacy, and
the public interest. Iowa L. Rev.


Machanavajjhala, A. and Reiter, J.P., 2012. Big privacy: protecting
confidentiality in big data. XRDS: Crossroads, The ACM Magazine for Students,
19(1).


Rubinstein, I., 2012. Big data: the end of privacy or a new beginning?


Van der Sloot, B., 2017. Privacy from a legal perspective.


Van der Sloot, B., 2017. Big Data and privacy.


Van der Sloot, B., 2017. How to assess privacy violations in the age of Big
Data? Analysing the three different tests developed by the ECtHR and adding for
a fourth one. Information & Communication Technology Law, pp. 74-103.