2019 was the year the data market surpassed oil in valuation, as we move ever deeper into an era in which our rights are rendered meaningless. Soon the default will shift from interacting directly with our devices to interacting with devices we have no control over, often without even knowing that we are generating data. Here are 10 ways in which this exploitation and manipulation are already happening.
1. The Myth of Free Wi-Fi
Invisible and insecure infrastructure is facilitating data exploitation
Many technologies, including those that are critical to our day-to-day lives, do not protect our privacy or security. One reason for this is that the standards which govern our modern internet infrastructure do not prioritise security, which is imperative to protect privacy.
An example of this is Wi-Fi, which is now on its sixth major revision (802.11ad). Wi-Fi was always designed to be verbose in the way it signals to other Wi-Fi devices; what made connectivity realisable in the early days, however, has become a hindrance in our connected world.
Wi-Fi infrastructure was built to prioritise responsiveness and connectivity, rather than ensuring that users are able to communicate privately. And overall, people have not minded: consumers expect Wi-Fi technology to just work, and to work quickly. To facilitate quick connectivity, many devices broadcast unique signals openly in order to locate Wi-Fi access points. The information broadcast publicly includes the names of other Wi-Fi networks the device has connected to before. This wouldn’t be a problem if network names were gibberish, but they tend to reveal personal information, since network and device names are often highly descriptive, from John’s iPhone to Starbucks, Vodafone_Staff or 62 Britton St. As a result, from a single broadcast by John’s iPhone alone we can learn a large amount of information, all of which can be highly valuable to advertisers, social engineers, or nefarious hackers.
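To make the risk concrete, here is a minimal sketch of how a passive observer might turn a single device’s broadcast probe history into a profile. The SSIDs and the crude categorisation rules below are invented for illustration; capturing real probe requests would require a wireless card in monitor mode and a capture tool such as Wireshark.

```python
# Sketch: what a passive observer can infer from the network names a
# device broadcasts while searching for known Wi-Fi access points.
# The SSIDs and heuristics are illustrative, not a real capture.

def profile_from_probes(probed_ssids):
    """Group the networks a device has asked for into rough categories."""
    hints = {"home": [], "work": [], "places_visited": []}
    for ssid in probed_ssids:
        lowered = ssid.lower()
        if any(word in lowered for word in ("staff", "corp", "office")):
            hints["work"].append(ssid)
        elif any(ch.isdigit() for ch in ssid.split()[0]) and "st" in lowered:
            # Street-address-style names ("62 Britton St") suggest a home
            hints["home"].append(ssid)
        else:
            hints["places_visited"].append(ssid)
    return hints

# One burst of probes from a single device, as in the example above
probes = ["62 Britton St", "Vodafone_Staff", "Starbucks"]
print(profile_from_probes(probes))
```

Even this toy heuristic recovers a plausible home address, employer, and habits from one broadcast, which is the core of the problem the section describes.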
The insecurity of Wi-Fi is partly a result of how the technology has expanded, in ways its creators did not foresee. However, insecurity in technologies fundamental to our modern infrastructure is not limited to old technologies: despite the passage of time, similar mistakes continue to be made. A good example is the recent intervention by the government of the United Kingdom in the rollout of smart meters.
In 2016, the British signals intelligence agency GCHQ took the unprecedented step of halting the deployment of smart meters, just as they were being rolled out to 53 million homes, when it discovered that the devices all shared the same decryption key. A Whitehall official warned that someone breaching the key could start blowing things up.
As Dr Ian Levy, technical director of the United Kingdom’s National Cyber Security Centre, said later: “The guys making the meters are really good at making the meters, but they might not know a lot about making them secure. The guys making head-end systems know a lot about making them secure, but not about what vulnerabilities might be built into them”. This statement typifies the disconnect between manufacturers, vendors, implementers, and customers, a gap which can only be bridged with independent standardisation and regulation.
As both Wi-Fi and the more recent case of smart meter standards in the UK illustrate, security and privacy are often an afterthought or bolt-on, rather than built in by design. Such issues may stem from the unforeseen ways in which these technologies have expanded; regardless of the cause, our devices, networks and infrastructures should be designed in a way that protects people.
It is critical that civil society engage with regulators and standardisation bodies to produce technology that provides strong security, and thereby strong privacy, for users. Civil society and industry must also make a much more coordinated effort to educate the public about the virtues of failing well. By this we mean that when security risks arise, the user is notified in a way which is understandable to them, and that where the danger is severe the product should fail rather than continuing in an unsafe state. Individuals understand the importance of fuses in their electronics, yet the implementation of fuses in software remains mostly the purview of the technically literate rather than of the masses. Devices and services should also minimise the cone of data they create. By this we mean that the verbosity of communications and logging should be kept to a minimum during production use, superfluous data and metadata should be discarded, and where possible communication should be as privacy-preserving as possible, neither uniquely identifiable nor creating profileable traits.
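A minimal sketch of the software “fuse” idea, assuming a hypothetical smart-meter reporting function: when a security check fails, the operation halts loudly instead of continuing in an unsafe state. `SecurityFuseError` and `send_reading` are illustrative names, not a real API.

```python
# Sketch of "failing well": a software fuse that halts on a security
# error rather than silently continuing in an unsafe state.

class SecurityFuseError(Exception):
    """Raised when a security check fails; the operation must not continue."""

def send_reading(payload, certificate_valid):
    # Fail closed: an unverifiable server identity blows the fuse
    # instead of letting the data travel over an unverified channel.
    if not certificate_valid:
        raise SecurityFuseError(
            "Server identity could not be verified; reading not sent.")
    return {"sent": True, "payload": payload}

print(send_reading({"kwh": 3.2}, certificate_valid=True))
try:
    send_reading({"kwh": 3.2}, certificate_valid=False)
except SecurityFuseError as err:
    # The user is told, in plain language, why the device stopped.
    print("Fuse blew:", err)
```

The design choice is the point: the insecure path is unrepresentable rather than merely discouraged, mirroring how a physical fuse works.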
Privacy International has begun to engage with standards bodies to open a dialogue about ensuring that future standards can handle the ongoing boom in connectivity in a security- and privacy-preserving way. Privacy International’s forthcoming Data Exploitation Principles are intended as a set of guidelines that standards bodies and implementers should follow when developing new technologies. Specifically, Privacy International believes that technology such as devices, networks, services and platforms should not betray, or be capable of betraying, their users, and should not leak data.
5. The Myth of Device Control and the Reality of Data Exploitation
Our connected devices carry and communicate vast amounts of personal information, both visible and invisible.
What three things would you grab if your house was on fire? It’s a sure bet your mobile would rank pretty high. It is our identity, saying more about us than we perhaps realise. It contains our photos, calendar, internet browsing, the locations of where we go and where we’ve been, our emails and social media. It holds our online banking, notes with half-written poems and shopping lists; it shows our taste in music and podcasts, our health and fitness data. It not only reveals who we speak to, but holds communications data and content, such as messages or photos, relating to friends, family and business contacts.
You could search a person, or search a house, and never find as much information as you can in one device. If it were lost or broken, we would feel the frustration and irritation of setting up anew. But what if someone else just took all that data, without your knowledge and without your permission? What if they could get more from your phone than you knew existed? What if that someone else was the police?
In January 2017 Privacy International reported on an investigation by the Bristol Cable into the unauthorised use of mobile phone examination tools by police, a practice which had undermined investigations into serious crimes. We followed up with Freedom of Information requests asking every police force in the UK whether they carry out mobile phone data extraction in low-level crime cases, and which companies provided the tools.
Here’s what we found out: police in the UK use sophisticated technology to extract data from people’s phones, contracting with companies including Cellebrite, Aceso, Radio Tactics, and MSAB (formerly Micro Systemation, maker of XRY).
The companies aren’t shy about the benefits of their products. At a time when ‘the sheer amount of state stored [in mobile phones] is significantly greater today than ever before’ they claim ‘If you’ve got access to a sim card, you’ve got access to the whole of a person’s life’.
This includes access to information about you and your contacts, as highlighted above. But it doesn’t stop there. Products enable access to data beyond our knowledge and beyond our control, such as:
‘entered locations, GPS fixes, favourite locations, GPS info’;
‘system and deleted data’;
‘…inaccessible partitions of the device’;
‘the acquisition of intact data, but also data that is hidden or has been deleted’;
data from ‘beyond the mobile’, i.e. from cloud storage;
‘…copy of the entire flash memory’.
What’s the problem?
We think that we own our phones, but what does this mean when there is data on our devices that we cannot access, that we cannot delete, that we cannot check for accuracy, and that is available only to those with sophisticated tools which are not accessible to everyone?
We are in a situation where our devices can betray us, but we have little understanding of how, or of what we can do about it. Only recently did we learn that Uber tagged iPhones with persistent IDs that allowed it to identify devices uniquely even after a phone had been wiped and configured from scratch.
At Privacy International we refer to this hidden data, the data beyond your view, as ‘data in the wings’. Our concern about data in the wings does not just apply to mobile phones. As we discussed in our 2017 presentation at re:publica, a US criminal investigation involving an Amazon Echo saw Amazon protesting that no voice recordings are stored on the device; the police were nonetheless keen to examine it, and extracted data.
When law enforcement – with the power to arrest us, to charge us with offences, to remove our liberty – are able to purchase and use powerful extraction software to read data from our mobiles, from connected devices in our homes, and from the growing internet of things in public places, without the consent of the owner or generator of the data being deemed necessary, the lack of formal public debate, consultation, or legislative scrutiny is unacceptable.
This is not only because the police are obtaining vast quantities of data without consent, for indefinite periods, without clear oversight, guidance, or legislation, but because time and again they have proven that they cannot be trusted with our data.
In the UK, this was shown not only in the lax attitude towards encryption of mobile phone data and in the use of databases for ‘non-work-related’ reasons, but also by serious failings to protect highly sensitive information and a disdainful attitude towards data, both of which have been regularly reported over the years.
In May 2017, for example, Greater Manchester Police was fined £150,000 after interviews with victims of violent and sexual crimes, stored unencrypted on DVDs, were lost in the post. The Information Commissioner’s Office said that GMP ‘was cavalier in its attitude to this data and showed scant regard for the consequences that could arise by failing to keep the information secure.’
6. Super-Apps and the Exploitative Potential of Mobile Applications
For those concerned by reporting of Facebook’s exploitation of user data to generate sensitive insights into its users, it is worth taking note of WeChat, a Chinese super-app whose success has made it the envy of Western technology giants, including Facebook. WeChat has more than 900 million users and serves as a portal for nearly every variety of connected activity in China. Approximately 30% of all the time Chinese users spend on the mobile internet centres on WeChat, and over a third of WeChat users spend more than four hours a day on the service. WeChat’s multifunctional indispensability for many Chinese users makes deep, integrated stores of personal data available for analysis and exploitation – by WeChat itself, by third parties, or by the Chinese government.
What Is WeChat?
WeChat – known as Weixin (微信) in China – is a mobile application developed by Tencent, a Chinese technology company. WeChat first emerged as a chat service, permitting users to send messages using their mobile phones over the internet. Over time, however, Tencent has seamlessly integrated many other features into WeChat, transforming the application into the dominant gateway to the internet for its users. For instance, WeChat is also a social media platform, enabling users to follow each other and post updates, as well as to “subscribe” to other accounts, including media outlets, who use the WeChat platform to deliver content.
WeChat also has a “Wallet” feature, which serves as the hub for virtually any financial transaction. Using the Wallet, users can pay utility and credit card bills, transfer money to friends, and book and pay for taxis, food deliveries, movie tickets, hotels, flights and even hospital appointments, all without ever leaving WeChat. For example:
Didi Chuxing, China’s largest ride-hailing company, has a button embedded within the “Wallet,” which directs users to its service to book and pay for rides.
WePiao is a WeChat movie app, through which users select their city, theatre location and movie, and then pay for tickets.
Some hospitals have set up WeChat accounts, allowing users to use the platform to make appointments and pay for registration and medicine.
Part of WeChat’s success resides in its constant innovation of how users interact with the application. Take, for example, its deployment of the quick-response (“QR”) code. WeChat assigns each user a QR code, which serves as a digital ID, and also integrates a QR scanner into the application. WeChat users rely on QR codes to exchange contact information, make or receive payments (including in-store) or access web links, all without ever having to type anything into their mobile device.
In 2014, WeChat introduced a “Red Packet” feature, based on the Chinese tradition of exchanging red envelopes filled with cash on holidays or special occasions, such as weddings or birthdays. A “Red Packet” is a digital red envelope, which users can fill with predetermined amounts and send to other users. Over the 2016 Chinese New Year, 516 million users sent 32.1 billion red packets. The “Red Packet” feature, in turn, has reinforced user engagement with the WeChat “Wallet” and mobile payment system as well as boosted the growth of chat groups.
Tencent is also eyeing the growing integration of the “Internet of Things” and artificial intelligence (“AI”) into WeChat. In 2014, Tencent unveiled an API for smart devices, which permits hardware manufacturers to develop WeChat applications for those devices. And in 2015, it introduced an operating system for internet-connected devices. More recently, Tencent has begun dedicating resources to AI research, opening labs in both China and the US. WeChat offers a rich resource for training AI, including, for example, the wealth of text and voice “conversational data” it collects.
WeChat’s dominance among Chinese mobile applications has caught the attention of Western technology giants, including Facebook. David Marcus, the Head of Facebook Messenger, has described WeChat as “inspiring” and openly discussed his plan for Messenger to incorporate WeChat-like features and services into its platform. Indeed, Messenger’s product roadmap demonstrates this trajectory with plans, for example, to allow users to send money to one another or to purchase certain products or services directly on the platform.
And yet, rather than merely marveling at WeChat’s success at embedding itself into nearly every facet of everyday life, we should also think critically about its implications for individual rights and society as a whole. Consider first the information that WeChat can collect at the individual level:
Biometric information, such as voice data when logging in via “Voiceprint”;
Contact lists shared with WeChat to connect with “Recommended Friends”;
Location data – i.e. the location of the device, IP address, and other information relating to location (e.g. geotagged photos) – while using WeChat;
Log data, which includes web search terms, social media profiles visited, and details relating to other content accessed or requested using WeChat;
Communications metadata (i.e. who, when, where) related to every chat, call and video;
Social media posts (text and photographic) and their metadata;
Bank and credit card details;
Financial transactions and their metadata, including payments of utility and credit card bills, transfers to other users, and payments for everyday services such as meals, transportation, entertainment, travel, and even health.
As WeChat strives to embed its platform into the Internet of Things, the sensors on those devices – cars, toys, refrigerators – will also begin to generate data about user behavior. The result is that nearly every facet of our daily lives may soon be expressed as data and collected by WeChat. Pieced together, those bits of data can produce a rich, multi-layered profile of an individual, including his or her religious affiliation, political views and personality traits.
Consider further that WeChat can generate and collect this wealth of data at near-population scale. WeChat boasts over 937 million users, 768 million of whom are active daily. The vast majority of these users reside in China. In China’s Tier 1 cities, such as Beijing, Shanghai and Guangzhou, approximately 93% of individuals are registered WeChat users.
The data generated and collected by WeChat can yield inferences – accurate or inaccurate – about individuals as well as a broader population. Inferences can then be used to make decisions that affect people’s lives and even have a societal impact. At the same time, automated processing techniques like artificial intelligence are increasingly drawing the inferences and making the decisions. The use of AI – a complex computational system – makes both inferences and decisions inherently difficult to understand and challenge.
Today, WeChat sends BMW advertisements only to a select group of users. Tomorrow, will it block services such as taxi-hailing or health appointment booking for certain categories of users? Today, WeChat manipulates timelines by censoring posts with certain keywords, depending on the location of the user. Tomorrow, will it manipulate timelines of users to determine their reactions to different political events? Today, WeChat user data may go towards a determination of creditworthiness by a peer-to-peer lending site. Tomorrow, will WeChat user data become integrated not only into credit scoring, but also into other records, such as health and employment histories?
Finally, and no less important, the collection of such vast stores of data makes that data a honeypot for a variety of third parties. The most visible third party is the Chinese government, which has announced a plan to roll out a “social credit” system nationwide by 2020. That system – currently being piloted by various public and private bodies, but not Tencent – would seek to produce credit scores on the basis of an individual’s social and financial behaviour, including internet activity. A person’s credit score could then be used to determine eligibility for a range of public and private services, such as school admissions, travel abroad and financial products. The depth and breadth of user data generated and collected by WeChat – as well as its own inferences drawn from that data – make it a rich vein to tap for the social credit system.
China is also home to a pernicious ecosystem of private data brokers. Last year, Guangzhou’s Southern Metropolis Daily published an exposé revealing how the sum of approximately $100 can unlock access to an astonishing amount of information about a person – including bank accounts, driving records, apartment rentals, hotel stays, and airline flights. The user data generated and collected by WeChat, which would encompass virtually all of this information, makes it a particularly attractive target for data brokers.
4. Fintech and the Financial Exploitation of Customer Data
Financial services are collecting and exploiting increasing amounts of data about our behaviour, interests, networks, and personalities to make financial judgements about us, like our creditworthiness.
Increasingly, financial services – insurers, lenders, banks, and financial mobile app startups – are collecting and exploiting an ever broader range of data to make decisions about people. This particularly affects the poorest and most excluded in societies.
For example, the decisions surrounding whether to grant someone a loan can now be dependent upon:
A person’s social network: Lenddo, described by a journalist as “PageRank for people”, provides credit scoring based on a person’s social network;
The contents of a person’s smartphone – who you call and message and when, which apps are on the device, location data, and more: Tala, a California-based startup, uses such data to offer loans in countries including Kenya;
How you use a website, and from where: the British credit-scoring company Big Data Scoring analyses the way you fill in a form (in addition to what you say in the form), how you use a website, on what kind of device, and in what location;
A person’s social media posts: the car insurer Admiral attempted to use information from young drivers’ Facebook posts to develop a psychological profile and offer them discounts on car insurance.
Financial services have begun using the vast amount of data available about individuals to make judgements about them, and using opaque artificial intelligence to rank and score people on their creditworthiness. As the founder and CEO of ZestFinance, Douglas Merrill, put it: “We believe all data should be credit data”. ZestFinance uses its machine learning platform to work with financial institutions’ own data, as well as data from data brokers, to provide credit scoring. Its analysis revealed, for example, that borrowers who write in all caps are less likely to repay their loans. In 2016, ZestFinance announced a partnership with the Chinese search giant Baidu, to use Baidu’s search data as the basis for credit scoring.
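As a toy illustration of how an “alternative” signal like the all-caps finding could be computed, consider the feature below. The applications, names and any predictive value are invented; a real scoring model would combine thousands of such features inside an opaque statistical pipeline.

```python
# Toy "alternative credit" feature: the share of an application's text
# written in capital letters, of the kind ZestFinance reportedly found
# correlated with repayment. Data and names are invented.

def caps_ratio(text):
    """Fraction of alphabetic characters that are upper-case."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    return sum(ch.isupper() for ch in letters) / len(letters)

applications = {
    "applicant_a": "I NEED THIS LOAN URGENTLY PLEASE APPROVE",
    "applicant_b": "Requesting a loan to cover vehicle repairs.",
}
for applicant, text in applications.items():
    print(applicant, round(caps_ratio(text), 2))
```

The unsettling part is not the arithmetic but the premise: a stylistic quirk, never disclosed as credit-relevant, quietly becomes an input to a lending decision.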
More ‘traditional’ credit reports contain data on issues like the status of your bank and credit accounts, and the positive and negative factors affecting your ability to get credit. In many countries, the data which can – or must not – be included in credit reports is regulated by law. Yet it is no longer the case that financial services consider only these finite sources of data relevant for credit-scoring or loan decisions: everything from what you post on social media to the type of phone you use has come to be considered relevant to financial decision-making. The use of these sources of data, rather than traditional credit files, is known as “alternative credit scoring”. Further, as this growing amount of data is analysed, the decision-making process – often assisted by proprietary artificial intelligence – becomes even more opaque; it is becoming harder and harder for an individual to understand, or even query, why they have been rejected for a loan or given a low credit limit.
Much of the discourse surrounding the use of alternative credit scoring, for instance, focuses on the notion of “inclusion”, and bringing in those groups who previously had no access to credit or financial services. However, there has been little consideration of the risk of exclusion emerging from the use of new forms of data by credit scoring companies.
For example, different groups of people use their phones and social media in different ways. Some gay men in Kenya, for example, maintain multiple Facebook accounts for their own safety and to control who knows what about aspects of their lives. But as social media profiles are used to authenticate individuals’ identities, what impact does having multiple accounts have on the credit-scoring decisions made about them? It has also been reported that if lenders in India see political activity on someone’s Twitter account, they will consider repayment more difficult and decline to lend to that individual.
Concerns about data, and about the varying degrees of opacity in often-automated decision-making, extend beyond the credit sector and across the financial services space. For example, in 2016 Admiral, a large car insurance provider, explored using the Facebook profiles of young drivers to offer discounts on car insurance. When prevented from doing this by Facebook, it turned to quizzes to try to profile the personality of young drivers.
The amount of data the financial sector gathers about our lives is increasing, while people are given limited options to opt out of having their data collected and exploited in this way. An example of this expansion can be seen in the car insurance industry. Vehicles are increasingly becoming “connected” – meaning they use the internet in some way – and are essentially drivable computers. Within the vehicle, telematics units collect data about how it is driven and how its internal components are functioning. This type of information is considered highly valuable by car insurers, who analyse how a person drives, as well as the locations they visit and when; this could be extended to how loudly music is played in the vehicle, and more. Privacy International is conducting research into what data is held by telematics units, and what data is transmitted back to car companies and insurance providers.
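A sketch of the kind of judgement telematics data enables, assuming a hypothetical stream of speed samples from a vehicle’s unit: a crude count of harsh-braking events of the sort an insurer might fold into a driving score. The threshold and trip data are invented for illustration.

```python
# Sketch: deriving a "driving behaviour" signal from telematics speed
# samples. The 25 km/h drop threshold and the trip are invented.

def harsh_braking_events(speeds_kmh, drop_threshold=25):
    """Count sample-to-sample speed drops at or above the threshold."""
    return sum(
        1
        for before, after in zip(speeds_kmh, speeds_kmh[1:])
        if before - after >= drop_threshold
    )

# One short trip, sampled at regular intervals
trip = [50, 52, 55, 28, 30, 60, 33]
print("harsh braking events:", harsh_braking_events(trip))
```

Even this trivial derivation shows how raw sensor readings become behavioural judgements: the driver never sees the threshold, yet it may shape their premium.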
More and more data are used to make or shape decisions that determine access to financial services, drawn from sources far beyond what people might think of as ‘financial’. There is more intrusion as more aspects of our lives are examined by the financial world to make judgements about who we are and what we might do. And because ever more data is used for credit scoring, the generation of ever more data is incentivised. The default payment option was previously (relatively) private – cash – but there are now many alternatives, and multiple actors are suddenly involved in your payment: depending on how you pay, it could involve your bank, the merchant’s bank, the credit card company, a mobile wallet provider, Apple or Google.
One example of this increasing, yet often hidden, loss of control is financial companies’ collection of information about how people fill out an online form, in addition to what they fill out within it. The scoring company Big Data Scoring, for example, does this through a cookie on a lender’s website that can gather data including how quickly you type in answers, what type of device you use, and your location. Most people would not consider how valuable this data is to the decision-making; the information you enter on a form becomes perhaps less important than how you fill it in.
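The form-timing signal described above can be sketched as a simple computation over keystroke timestamps. The field names and timestamps are invented, and a real tracker would collect the raw events in the browser via JavaScript; only the derivation is shown here.

```python
# Sketch: turning raw keystroke timestamps into a per-field
# "hesitation" feature of the kind a form-tracking cookie enables.
# Field names and timestamps are invented for illustration.

def completion_seconds(keystroke_times):
    """Seconds between the first and last keystroke in a field."""
    if len(keystroke_times) < 2:
        return 0.0
    return keystroke_times[-1] - keystroke_times[0]

# Timestamps (seconds) of keystrokes in two form fields
form_events = {
    "monthly_income": [10.0, 10.4, 10.9, 18.2],  # long pause mid-field
    "name": [2.0, 2.2, 2.4, 2.6],
}
features = {field: completion_seconds(ts) for field, ts in form_events.items()}
print(features)
```

A long pause on the income field, relative to the name field, is exactly the sort of derived signal an applicant would never suspect was being scored.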
There are potentially broad consequences for society stemming from financial data exploitation. How can people know, given the opacity of the decision-making, that the decisions being made are fair? And what about individuals who attempt to ‘game the system’ – can we be sure that the comments on someone’s Twitter profile are their real thoughts, or are they an attempt to get a better rate on a loan?
5. Profiling and Elections — How Political Campaigns Know Our Deepest Secrets
Political campaigns around the world have turned into sophisticated data operations. In the US, Evangelical Christian candidates reach out to unregistered Christians and use a scoring system to predict how seriously millions of these voters take their faith. As early as 2008, the Obama campaign ran a data operation which assigned every voter in the US a pair of scores predicting how likely they were to cast a ballot, and whether or not they supported him. The campaign was so confident in its predictions that Obama consultant Ken Strasma was quoted boasting: “[w]e knew who … people were going to vote for before they decided.” Has voter targeting gone too far?
Political campaigns rely on data to facilitate a number of decisions: where to hold rallies, which states or constituencies to focus resources on, and how to personalise communication and advertisement with supporters, undecided voters, and non-supporters.
Data-driven campaigning is nothing new. For decades, campaigns have been using and refining targeting, looking at past voting histories, religious affiliation, demographics, magazine subscriptions, and buying habits to understand which issues and values are driving which voters.
What is new, however, is the granularity of data that is available to campaigns. The US Republican National Committee, for instance, provides all Republican candidates up and down the ballot with free access to “Data Center”, a query and data management tool that interfaces with more than 300 terabytes of data on 200 million voters, including 7,700,545,385 microtargeting data points. Similarly, the data analytics company and voter contact platform i360, financed by the Koch brothers, maintains a database of over 250 million US citizens and voters, containing thousands of pieces of information that provide “the full picture of who they are, where they live, what they do and what is happening around them”. Together with millions of email addresses, phone numbers, and other personal data gathered through donations, at rallies, and through merchandise, political campaigns have come to know our deepest secrets.
It is in this context that Cambridge Analytica, the UK-based data analytics firm, exploded onto the scene in 2016, following revelations that it might have played a role in both the US election and the UK Brexit referendum. The company claims to possess 5,000 data points on 220 million Americans, made up of psychological data culled from Facebook paired with vast amounts of consumer information purchased from data-mining companies.
Data brokers and other data mining companies all collect or obtain data about individuals (your browsing history, your location data, who your friends are, or how frequently you charge your battery etc.), and then use these data to infer additional, unknown information about you (what you’re going to buy next, your likelihood of being female, the chances that you are conservative, your current emotional state, how reliable you are, or whether you are heterosexual etc.).
Voter tracking doesn’t end online, either. Shortly after the Iowa caucus in early 2016, the CEO of Dstillery, a “big data intelligence company”, told the public radio program Marketplace that the company had tracked 16,000 caucus-goers via their phones and matched them with their online profiles. Dstillery was able to identify curious behavioural patterns, such as that people who loved to grill or work on their lawns overwhelmingly voted for Trump in Iowa.
Essentially, companies like Cambridge Analytica and i360 do three things: gather massive amounts of data about individuals; use these data to profile and predict even more intimate details about them; and use these profiles and predictions to personalise political messaging, such as ads on social media, and to inform strategic campaign decisions.
At a time when political campaigns can harvest troves of data, the media landscape itself is undergoing a radical transformation. Even though television, newspapers, and radio still play an important role, more and more people get their news and learn about political candidates through social networks. More importantly, these networks give us a sense of what others might think about issues and candidates.
Social media sites go to great lengths to paint themselves as neutral platforms, decidedly different from traditional media that set agendas and make the news, yet they inevitably come with their own biases. They are often highly personalised spaces, optimised for engagement and built around advertising. From newsfeed algorithms to content moderation, social media sites shape what kind of content we see and how frequently we are exposed to information or people we disagree with.
In the midst of this changing environment, political campaigns are microtargeting voters – whether through ads online, or in personalised phone calls and visits – to change their behaviour. How people feel about a political candidate, who they ultimately vote for, and whether they decide to vote at all, is shaped by a number of factors. Research has shown that even the weather can affect voting behaviour. However, our most conscious decisions are routinely influenced by unconscious thought-processes, emotions, and prejudices. This is precisely why commercial and political public relations efforts are so focused on creating ever more intimate profiles about us. Take it from the Director of YouTube Advertiser Marketing at Google: “voter decisions used to be made in living rooms, in front of televisions. Today, they’re increasingly made in micro-moments, on mobile devices. Election micro-moments happen when voters turn to a device to learn about a candidate, event, or issue”.
The way in which data is used in elections and political campaigns is highly privacy invasive, raises important security questions, and has the potential to undermine faith in the democratic process.
Campaigns frequently rely on data that individuals might not have consented to disclosing in the first place. Large voter databases also frequently rely on commercially available data from data brokers, or on publicly available records and data accessible online, to build highly intimate profiles. This is especially concerning when sensitive information, such as political beliefs or personality traits, is inferred from completely unrelated data using profiling. Personality traits, for instance, can be predicted from likes on social media. The fact that commercial data, public records, and all sorts of derived information are used for political campaigning would come as a surprise to most people.
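The mechanics of such inference can be sketched in miniature. The pages and weights below are invented for illustration; real systems learn weights like these from millions of labelled profiles rather than hand-coding them.

```python
# Toy sketch of inferring a personality trait from social media likes.
# Page names and weights are hypothetical, not from any real model.
LIKE_WEIGHTS = {
    # page liked: assumed contribution to an "openness" trait score
    "philosophy_page": 0.75,
    "travel_page": 0.5,
    "monster_trucks": -0.25,
}

def openness_score(likes):
    """Sum weighted evidence from seemingly unrelated likes."""
    return sum(LIKE_WEIGHTS.get(page, 0.0) for page in likes)

print(openness_score(["philosophy_page", "travel_page"]))  # 1.25
```

Each like is individually innocuous, but in aggregate they produce a score the user never disclosed and cannot see.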
There is also a security component to the amassing of sensitive data. Databases compiled for strategic political communication often contain highly sensitive information, and such databases frequently get hacked, leaked, stolen, or simply shared, as the 2017 accidental exposure of a database holding personal data on almost 200 million US citizens demonstrates. Data security is further threatened by the fact that databases are frequently shared once a candidate drops out or the election is over. Nearly every single 2016 US presidential candidate has either sold, rented, or loaned their supporters’ personal information to other candidates, marketing companies, charities, or private firms.
Voter profiling also raises more fundamental issues. The use of data in elections is merely the tip of the iceberg in an economy where individuals are increasingly becoming the product rather than the customer. The entire (online) advertising ecosystem is organised around real-time flows of data about people’s lives and has been built to modify people’s behaviour for profit. Companies that have accumulated years of sensitive data on billions of people around the world may be able to change people’s actual behaviour at scale.
6. Connected Cars and the Future of Car Travel
As society heads toward an ever more connected world, it becomes increasingly difficult for individuals to protect and manage the invisible data that companies and third parties hold about them. This is further complicated by events like data breaches, hacks, and covert information-gathering techniques, which are hard, if not impossible, to consent to. One area where this is most pressing is transportation, and by extension the so-called ‘connected car’.
It is important to define what is meant by a connected car: any vehicle that is able to use sensor data to take affirmative actions, and which stores or retains that data for future processing or aggregation.
Cars have become inaccessible computers which collect increasingly granular data, not just about the car itself, but also about the behaviour of drivers. For example, Tesla, Inc. collects driving data from its customers’ cars to improve its own artificial intelligence systems, a practice which will only increase as more car companies introduce autonomous cars. One domain in which this can have potentially harmful implications is car insurance.
For years, car insurance companies have been relying on annual mileage data to determine insurance rates. However, now that cars are becoming increasingly connected, insurers have an unprecedented ability to access and judge (what they deem to be) bad driving habits.
Insurance companies are already asking customers to install proprietary telematics units into their cars. These units can monitor many of the sensors, controllers, and actuators in the car, giving insurance companies unprecedented access to customers’ interactions with their cars. Such systems, in addition to the cars’ own internal diagnostic and monitoring systems, can track actions of the driver and score them. Some examples include:
how often and the amount of force applied to the brakes;
what time of the day a person is driving;
and, in cars which use satellite navigation, the type of road a person is travelling on, which can also be inferred from driving style, such as over-revving, aggressive acceleration, or erratic steering inputs.
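A minimal sketch of how such scoring might work is below. The telemetry samples, thresholds, and weights are all invented for illustration and do not reflect any insurer’s actual model.

```python
# Hypothetical telemetry samples: (hour_of_day, brake_force_0_to_1).
# Thresholds and weights are illustrative, not any insurer's real model.
HARSH_BRAKE = 0.7                  # brake force above this counts as a harsh stop
NIGHT_HOURS = {23, 0, 1, 2, 3, 4}  # assumed late-night driving window

def risk_score(samples):
    """Return a 0-100 risk score: higher means riskier driving."""
    n = len(samples)
    harsh = sum(1 for _, force in samples if force > HARSH_BRAKE)
    night = sum(1 for hour, _ in samples if hour in NIGHT_HOURS)
    # Weighted shares of harsh braking and night-time driving.
    return round(100 * (0.7 * harsh / n + 0.3 * night / n))

trips = [(8, 0.2), (23, 0.9), (14, 0.1), (2, 0.8), (9, 0.3)]
print(risk_score(trips))  # 40
```

Even this toy version shows the asymmetry: the driver sees only a final number, while the insurer sees every braking event behind it.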
A telematics system combines the function of the black box flight recorder found in aircraft with the functionality of an embedded system similar to that inside a mobile phone. In practice, this means real-time monitoring and collection of data about core parameters of a car, along with communication with external entities, commonly over CAN (the car’s internal network, similar to the Local Area Networks used in homes and offices but scaled down and simplified for the interconnection of systems in a car), USB, or wireless technologies such as mobile data, Bluetooth or Wi-Fi.
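To make this concrete, the sketch below decodes a raw frame the way a telematics unit might. The frame layout and signal scaling are invented for illustration; real cars use proprietary signal definitions that vary by manufacturer.

```python
import struct

# Hypothetical decoding of a raw frame as a telematics unit might do it.
# Layout assumed here: 4-byte message ID, 1-byte payload length, payload.
def decode_frame(frame: bytes):
    can_id, length = struct.unpack_from(">IB", frame)
    payload = frame[5:5 + length]
    # Assume bytes 0-1 of the payload carry wheel speed in 0.01 km/h units.
    speed = int.from_bytes(payload[:2], "big") / 100
    return can_id, speed

# Build a sample frame: ID 0x123, 8-byte payload, speed field = 6540 (65.4 km/h).
frame = struct.pack(">IB", 0x123, 8) + (6540).to_bytes(2, "big") + bytes(6)
print(decode_frame(frame))  # (291, 65.4)
```

The point is not the format itself but the asymmetry it creates: whoever holds the signal definitions can turn opaque bytes into detailed behavioural data.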
Tesla uses a combination of telematics and sensor data to assist with its machine learning and autonomous driving systems. Tesla aggregates this data and processes it to improve how its AutoPilot system works for customers who have chosen to have it fitted. Customers who are not benefiting from the AutoPilot system still have their data collected by Tesla, to improve the overall functionality of the system.
Machine learning is the ability of an algorithm (a selection of operations carried out in a defined order or way) to use the outcome of one or more operations to infer a response to similar stimuli in future, in a system of continuous self-improvement. Aggregation is the collection of data to identify trends and recognise patterns – this is often referred to as big data analytics.
Although Tesla is currently collecting data primarily to improve its own products and services, other manufacturers are already showing an interest in similar data collection, along with third parties like Google. Based on past behaviour, it is feasible to foresee a future where driving data could be readily shared with advertisers, insurance companies, local and municipal government, and law enforcement for a myriad of reasons.
7. Smart Cities and Our Brave New World
Cities around the world are collecting increasing amounts of data, and the public is not part of deciding if and how such systems are deployed.
Smart cities represent a market expected to reach almost $760 billion by 2020. All over the world, deals are signed between local governments and private companies, often behind closed doors. The public has been left out of this debate while the current reality of smart cities redefines people’s right to privacy and creates new issues of exclusion.
The definition of a smart city varies – for IBM it is about finding “new ways for the city to work”, for Alphabet company SideWalk Labs it is about “building innovation to help cities meet their biggest challenges”, a research paper commissioned by the British Department for Business Innovation and Skills describes smart city solutions as “applying digital technologies to address social, environmental and economic goals”. Beyond the marketing discourse, it is important to focus on what the essence of smart cities is: the use of data collection and technology to provide services in a city.
Smart cities range from the all-encompassing backbone infrastructure offered by companies like IBM and Oracle – whose services run from security to transport and energy, often connecting various government departments to facilitate the flow of data – to apps created by start-ups that allow you to report potholes to your local authorities.
Smart cities raise a series of problems. First of all, the right to privacy is entirely redefined in a smart city: they create an environment where we are no longer expected to consent to the collection, processing, and sharing of our data; instead, the minute we step into the street we are exposed to both government and corporate surveillance. And not only is there no opting out, but more likely than not you will not even know that data about you is being collected.
In fact, you may not realise how much data is already being collected about you. In the City of London, for example, smart garbage bins were installed in 2012. The bins collected data from the phones of passers-by to provide them with targeted advertising. People did not know their data was being collected as they walked by the bins, nor did they realise the advertisements they were seeing were specifically targeted at them, until journalists exposed the bins’ existence a year later.
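The tracking technique behind systems like these is simple: each passing phone broadcasts a hardware identifier, and repeated sightings of the same identifier across locations become a journey. The log and identifiers below are made up for illustration.

```python
# Hypothetical sighting log: (bin_id, device_identifier) pairs, as a
# smart bin might record them from passing phones. All values invented.
sightings = [
    ("bin-01", "aa:bb:cc:01"), ("bin-02", "aa:bb:cc:01"),
    ("bin-03", "aa:bb:cc:01"), ("bin-01", "dd:ee:ff:02"),
]

# Group sightings per device: repeated observations of the same
# identifier across bins turn anonymous passers-by into trackable paths.
journeys = {}
for bin_id, device in sightings:
    journeys.setdefault(device, []).append(bin_id)

print(journeys["aa:bb:cc:01"])  # ['bin-01', 'bin-02', 'bin-03']
```

No camera or registration is needed; a handful of passive receivers and a dictionary are enough to reconstruct movement through the city.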
In a smart city, there is no longer such a thing as wandering around while no one knows you are there. In fact, a simple walk in the park is now enough for you to be tracked. In another example from London, Hyde Park aggregated data on age, gender, and location – based on data that was handed to them by network provider EE – to review the number of visitors and their typical journey through the park.
In the city of Singapore, sensors and cameras have been placed all over the city which has allowed the government to follow citizens step-by-step through an interface called Virtual Singapore. The government can even use the interface to run experiments and see how people would react to epidemics or earthquakes – based on the data that is being constantly collected about them.
While the discourse around smart cities has been largely focused on rewards and the positive changes smart cities bring about, once data is collected it can also be used for punishment. In Hong Kong, for example, an anti-littering campaign group has been using the DNA found on discarded cigarette butts, gum, and condoms to reconstruct the faces of the people who left them in the streets. They then display the portraits of the “culprits” on billboards across the city.
In Beijing, to tackle the issue of people stealing toilet paper from public toilets, the busiest public toilets have been equipped with a facial recognition system. To obtain toilet paper, one needs to stand in front of a camera. And those who come too often are denied.
What does it mean to live in a city where one is constantly tracked? We first have to bear in mind who will be the most exposed. Saudi Arabia, for example already has a system in place to text male guardians when women leave the country – what will happen to women in a smart city? We know that groups as varied as the security staff of malls, social media intelligence companies, and the Department of Homeland Security have all been monitoring members of the Black Lives Matter movement. So where will be a safe space for minorities and civil rights group to gather, meet, organise, and protest when each step is monitored in the city?
A world where smart city data is collected and used to prevent protests is not science fiction – China is already building one. US intelligence firm Stratfor has been reporting on the creation of a “grid management system” that is effectively a spying apparatus merging data collected by the state, as well as CCTV footage and data from internet monitoring, to pre-empt social unrest.
We therefore know how smart cities – and Singapore is a perfect example of this – are becoming systems of real-time control. There may be some positive aspects of this – a smart transport system could help redirect traffic in the event of an earthquake – but we expose ourselves to the risk the city will be used against us, as we have seen in China. What if a ‘smart’ transport system was used to redirect traffic to keep people away from protests?
People are already starting to resist. In London, the smart bins had to be removed after an outcry over privacy concerns. In Singapore, where the government had created smart flats for the elderly with sensors to detect their movement, people have started to use towels to cover the sensors and protect their privacy.
Who do we build these cities for?
Beyond privacy, it is worth asking who smart cities are being built for. Placing technology at the heart of services creates issues of exclusion for those who do not have the same access to technology. This is of particular concern for less-abled people or people who cannot afford access to technology. A worrying example from New York showed that the city’s free Wi-Fi was only secure for Apple users and not for Android users.
The issue of gender is also important to bear in mind, as studies have documented how women experience cities differently from men. While men tend to describe their journeys as going from home to work and back home, women tend to have much more complex journeys, which include picking up their children, buying groceries, and visiting family members. Street harassment is also central to understanding why women perceive cities differently from men. In India, for example, the government plans to build 100 smart cities – which has meant that the land of the most disenfranchised has been forcibly acquired in order to build smart cities that, we suspect, will be largely built for the wealthiest. We also know that India’s effort to “clean up” cities – which came about at the same time as smart city projects started developing – has pushed away street vendors, whose very presence helps make cities safer for women: they mean streets do not end up deserted, and they can act as a point of contact.
Building smart cities for the wealthiest also results in failing to improve the lives of the majority of the population. In 2013, the city of Rio de Janeiro was awarded a World Smart City Award, as it prepared for the Olympics and the Football World Cup by setting up an Operations Centre and an Integrated Centre of Command and Control designed by IBM. The system aimed to improve public services and make the city safer and more efficient. However, 2016 research published by Christopher Gaffney (University of Zurich) and Cerianne Robertson showed Rio has not kept its promises and remains for most people a “dumb city”: one of the key issues highlighted in the research was that the smart city effort has focused on the wealthier areas, leaving the majority of the city without improvement.
The next step for smart cities will be relying on algorithmic decision-making to make the city more efficient and decide, for instance, how energy should be allocated, when and where lampposts should be turned on, how frequently the trains should run, and so on. However, we already know that decisions made on the basis of biased data reflect the biases in our society. Boston, for instance, set up an app to detect potholes in the streets – but if potholes are only fixed when the city is notified through this app, what about areas where people are less likely to carry smartphones or download the app? We know that in the US the police have more data on crimes in areas where people of colour live, and that predictive policing algorithms have been biased for that very reason. So how will using algorithms to allocate services affect the lives of city inhabitants? Will we risk denying people who are already struggling to afford energy the gas, electricity, and water they need to stay warm in the winter? Will we risk leaving some streets dark at night, making those streets less safe for women? Will we risk reducing transport and access for people whose areas are already underserved?
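The reporting-bias problem can be shown with a toy model. All the numbers below are invented: the point is only that report volume tracks app adoption, not actual need.

```python
# Toy model of reporting bias: potholes are fixed where app reports come
# in, but app adoption differs by neighbourhood. Numbers are illustrative.
neighbourhoods = {
    # name: (actual_potholes, share_of_residents_with_the_app)
    "wealthy":  (20, 0.50),
    "deprived": (40, 0.125),
}

def expected_reports(potholes, adoption, report_rate=0.5):
    """Reports scale with app adoption, not with actual need."""
    return potholes * adoption * report_rate

for name, (potholes, adoption) in neighbourhoods.items():
    print(name, expected_reports(potholes, adoption))
```

In this toy run the deprived area has twice the potholes but generates half the reports, so an allocation driven purely by report counts would send repair crews to the wrong place.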
Finally, smart cities present serious security concerns: as more and more hacks and cyberattacks demonstrate, we are not yet doing a decent job of securing our databases and online systems. And yet we are sleep-walking into an Internet of Things world where everything in our cities will soon become vulnerable to hacking.
In December 2015, Ukraine’s power grid was hacked, leaving more than 80,000 people in the dark for at least three hours. The hackers also targeted the energy companies’ monitoring stations, which meant that the energy company did not notice anything unusual. Moreover, the phone helpline was targeted with a TDoS (telephony denial-of-service) attack: the phone line received a flood of fake calls from outside Ukraine that prevented legitimate customers from calling and reporting the power cut.
Research has also demonstrated how vulnerable street lamps can be. What if the sensors that track us, our public transport systems, or our energy grids fall into the wrong hands? What are governments and companies really doing to ensure this does not happen?
8. Invisible Discrimination and Poverty
Online, and increasingly offline, companies gather data about us that determines which advertisements we see; this, in turn, affects the opportunities in our lives. The ads we see online, whether we are invited to a job interview, and whether we qualify for benefits are decided by opaque systems that rely on highly granular data. More often than not, such exploitation of data facilitates and exacerbates already existing inequalities in societies – without us knowing that it occurs. As a result, data exploitation disproportionately affects the poorest and most vulnerable in society.
The gathering of the data that determines which advertisements we see is usually invisible, with users knowing very little about what data is gathered and how it is analysed. What we do know, however, is that the results can be discriminatory: a study of the ads displayed by Google on news articles, for instance, found that users are far more likely to be shown adverts for “executive-level career coaching” if Google believes the user is a man. In 2013, Harvard University’s Latanya Sweeney found that when searching for names online, black-identifying names are much more likely to trigger an ad suggestive of an arrest record.
Such discrimination can have real world consequences, most evidently perhaps in housing. Neighbourhoods can entrench some populations in poverty. For example, in the US, black people are far more likely than white people to live in areas of concentrated poverty, which has an impact on other aspects of their lives from access to jobs, education, and even happiness. Targeted housing adverts online can become an important part of this, since they decide which neighbourhoods are available to certain populations.
Employment is another field where automated decision-making by opaque systems raises concerns. To an extent, such systems were developed to prevent the kind of “old boys network” that leads people to give jobs only to people they know. But we cannot assume that algorithms are fairer than human beings; systems are often trained on historical data and thereby replicate the existing biases of an employer. In the US, 72% of CVs are never read by a person, but rather are sifted and rejected by computer programs. A job applicant with knowledge of these systems will attempt to game them, or at least influence the results, by mentioning the keywords in their CV that they think the system will be looking for. Personality tests are another tool used in job screening, where a questionnaire filled in by the applicant is used to reject candidates. The risk of discrimination is high here too – these are essentially questions about a candidate’s mental health, and can prevent people with mental health conditions from entering the job market.
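The sifting logic can be caricatured in a few lines. The keywords and threshold below are assumptions, not any vendor’s actual system, but they show both the bluntness of the filter and why keyword-stuffing works.

```python
# Minimal sketch of keyword-based CV sifting (assumed, not any vendor's
# real product): a CV passes only if it matches enough required keywords.
REQUIRED_KEYWORDS = {"python", "sql", "stakeholder"}
THRESHOLD = 2  # minimum number of keyword hits to pass the sift

def passes_sift(cv_text):
    words = set(cv_text.lower().split())
    return len(REQUIRED_KEYWORDS & words) >= THRESHOLD

# A strong candidate using different vocabulary is rejected...
print(passes_sift("Experienced data engineer, built pipelines"))  # False
# ...while a candidate who games the keywords passes.
print(passes_sift("python sql stakeholder"))  # True
```

The filter never evaluates competence, only vocabulary, which is exactly why it both rejects good candidates and rewards those who learn to game it.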
There are also ways in which the tech we use deepens existing socio-economic inequalities. For example, while the smartphone market in India is expanding quickly, most people buy phones that run older versions of the Android operating system; 72% do not include the latest version at the time of purchase. For many of these phones, an update to a newer, more secure version of Android is simply not made available. As a result, the poorest are put in a position where their devices are less secure and at risk of attack. The poor are denied the opportunity to be secure online.
From targeted advertising to housing and employment, opaque and often unaccountable systems have the power to reinforce inequalities in society. They can do this by excluding certain groups of people from information or the job market, or by limiting their access to the benefits system.
The problem is that such discrimination can be invisible and even unintended. In the case of advertising, for instance, specifying “Whites Only” is illegal in many countries around the world. But there are other characteristics that can be used to determine, with some degree of likelihood, a person’s race: musical tastes, for example, or certain “hipster” characteristics that are more likely to apply to one race than another. As a result, an apartment ad that is specifically targeted at people with certain tastes and interests can inadvertently exclude entire sections of the population. Those affected will not realise that they have been harmed at all. In the words of Michael Tschantz, a Carnegie Mellon researcher: “When the nature of the discrimination comes about through personalization, it’s hard to know you’re being discriminated against”. As a result, it’s difficult to challenge decisions that are unfair, unjust, or simply unlawfully discriminatory.
Another dimension to this problem is that those who are already marginalised are much more likely to be subjected to opaque and automated systems in the first place. While executives are headhunted, the CVs of those who work in low-paying jobs with high turnover are subject to automated sorting. Another area where this is most evident is social security and government benefits. In 2014, Poland introduced the profiling of unemployed people: based on a computerised test, people are placed into one of three categories. These categories affect the level of support that benefits recipients receive from the state. However, the algorithm – and thus the decision-making that affects the lives of benefits recipients – is kept confidential. This lack of transparency, and the risk of discrimination, is problematic for this vulnerable population.
The fact that opaque systems are making consequential decisions about people’s lives means that it is essential that the nature of the decision-making is made explicit to the individual. Similarly, researchers, regulators, and watchdog organisations should be able to audit such systems for systematic biases. Given that such tools are widely used in employment, for example – particularly for entry-level jobs – it is essential that people are able to understand how these decisions about them are made, in order to prevent a cycle of rejection and to contest illegal discrimination and unfair treatment.
9. Data and Policing – Your Tweet Can and Will Be Used Against You
Police and security services are increasingly outsourcing intelligence collection to third-party companies which are assigning threat scores and making predictions about who we are.
The rapid expansion of social media, connected devices, street cameras, autonomous cars, and other new technologies has resulted in a parallel boom of tools and software which aim to make sense of the vast amount of data generated by our increased connection. Police and security services see this data as an untapped goldmine of information which gives intimate access to the mind of an individual, a group, or a population.
As a result, the police have the ability to enter and monitor our lives on an unprecedented scale. As our online lives increasingly blend with our lives offline, the police are able to enter our personal lives and monitor our social interactions on social media, monitor public and private places with drones, collect our license plates using ANPR, capture our images on CCTV and body worn cameras, and use facial recognition technology. Much of this is invisible to the human eye, undermining comprehension of the seriousness of this intrusion. If we could physically see what is happening, there would be outrage, loud and clear.
The police aren’t doing this alone. Throughout the world, the police not only purchase software and hardware that enable highly intrusive means of gaining access into our lives; in a world where police, security agencies, and private companies have an ever-increasing ability to collect data about us, the police also outsource data collection and analysis to third parties.
Information that we knowingly share on social media, such as posts, photos, and birthdays, as well as data that we unknowingly share, such as the time of day we are active on the platforms, our location, and other signals (which can reveal our mood, such as whether we write in all upper or lower case), is all of value to the police and security services. The police and security services gain access to this data by collecting information in the public domain, as well as by accessing commercially available data in public and private databases via third-party data brokers.
A slew of third party companies offer police and security services tools to pull information from social media, data broker databases, and public records, into a centralised hub, use software to organise and analyse the data, and turn it into actionable intelligence. This intelligence could be information about the likelihood of a suspect becoming violent, the movement of activists, or the likelihood someone is a terrorist.
In 2015, the US Justice Department and Federal Bureau of Investigation admitted that the watch-listing system and no-fly lists in the US were based on “predictive assessments about potential threats”. People who had never been charged or convicted of a violent crime were being flagged by the system, with little ability or opportunity to understand or question what had prompted the system to flag them. Because the machine outputs the decision without giving insight into why the decision was made, the system’s human operators are not able to offer further explanation. Such a decision affects a person’s ability to move freely. Individuals are treated as guilty, reversing the burden of proof.
The Fresno Police Department in Fresno, California uses a programme which looks at billions of data points from social media, arrest reports, property records, police body-worn camera footage, and more, to calculate threat scores for suspects. The police consulted the system after receiving an emergency call about potential domestic violence. Similar systems, which allow police to consult real-time information, are appearing across the US, including in New York and Texas.
Seeing opportunity in the migration crisis, IBM is developing software it says will be able to predict if a migrant is an innocent refugee or a terrorist. The company is creating a system which brings together several data sources to analyse the probability of a passport’s authenticity and more. Some of the sources the system pulls data from include “the Dark Web […] and data related to the black market for passports”, as well as social media and phone metadata. The implications of such a system could be catastrophic for those who are already vulnerable.
Privacy invasion and chilling effects
As the surveillance of our lives by the police becomes more widespread, as well as more publicly understood, the way we use and interact on the web – and, as a consequence, the way we interact offline – will be undermined. As our online movements, as well as how we move around offline, are swept up and analysed by governments, police, and companies, we lose the ability to autonomously explore, interact, and organise.
Fostering a sense of omnipresent surveillance harkens back to 1970, when FBI headquarters sent out a memo urging agents to increase their interviews with activists to “enhance the paranoia endemic in these circles and will further serve to get the point across there is an F.B.I. agent behind every mailbox”.
The wide and varied forms of data collected by police, governments, and shadowy third-party data brokers, combined with an increasing reliance on opaque software for collection and analysis, and the increasing use of secret algorithms and/or complex machine learning, have resulted in powerful decision-making systems that are nearly impossible to challenge. As a result, we, as a society and as individuals, are left unable to understand how or why decisions about us are being made. Even knowing when a decision was made is becoming difficult.
Data that is used in these systems can never perfectly represent reality, and is therefore always partial, meaning that the decisions machines make are necessarily imperfect. Police and governments increasing their reliance on these systems while simultaneously lacking the technical expertise to understand the consequences of depending on decisions made by machines using flawed, inaccurate, partial, and biased data, is problematic and dangerous. Furthermore, without public understanding of what data is put into systems or how the machines make a decision, police, governments, individuals, and societies will have limited ability to understand the consequences of large-scale dependence on these systems.
The role of corporations
Despite the clear risks associated with the use of existing, potentially biased datasets, companies such as IBM, Microsoft, Cisco, Oracle, and Palantir offer platforms which allow police to navigate through large datasets to facilitate their investigations and responses. Sold as a solution to the problem of analysing massive amounts of available data, such software has come to be trusted and depended upon by police and security agencies for the decisions it outputs. Yet we still lack a public debate about what it means to use this data, what it means to challenge it, and how the most vulnerable in society are disproportionately impacted by it.
10. The Gig Economy and Exploitation
Gig economy jobs that depend on mobile applications allow workers’ movements to be monitored, evaluated, and exploited by their employers.
The so-called “gig economy” has brought to light employers’ increasing ability and willingness to monitor employee performance, efficiency, and overall on-the-job conduct. Workplace surveillance of gig economy workers often happens without employees’ awareness or consent. This is especially evident in the app-based gig economy, where apps act both as an important tool for employees to do their job, while also being a means for employers to conduct active surveillance of their workers.
Historically, employers, especially in the US, UK, and Europe, have possessed the right to record employee activities and other details related to experience and performance. Examples of such surveillance include requiring employees’ criminal histories, continuously acquiring employees’ health records, clock-card systems that record employees’ working time, and monitoring of employees’ office email. Generally speaking, in the US, UK, and Europe, where employees are made aware of such surveillance, the monitoring has become a normal practice. Workplace surveillance is evolving, however, and in the context of the gig economy – and especially the app-based gig economy – methods of surveillance have become both cheaper and more readily available, while surveillance increasingly occurs without the awareness of the workers.
As technology evolves and changes the way we work, it also blurs the line between our private and work life. It also enables excessive datafication of employees by employers. Technologies used to supervise employees at work can include smartphones and wearable devices, both of which could be used to supervise employees in their private life. The consequences of such monitoring are becoming visible in the gig economy.
Our phones leak traces of our activities all the time, such as our location, usage patterns, and habits, and the apps we install gain access to vast amounts of data on our devices, including contacts, photos, and more. Employers who require employees to interface with the company via a mobile application are able to access more information about their workers than workers would initially expect.
For example, to work for the delivery company Deliveroo, workers are required to download the Deliveroo app. The app constantly records information about the device and worker, such as what kinds of routes the worker takes to a location, how often they use the application, how long they wait in a restaurant, and how long they wait outside the customer’s house.
Similar practices occur across app-based gig economy jobs, including at Uber, where the company even records how sharply drivers accelerate and brake.
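To illustrate how little raw data such inferences require, the sketch below shows, under invented assumptions, how waiting time at a location and harsh-braking events could be derived from nothing more than timestamped location and speed samples. The `Ping` format and the thresholds are illustrative only; they are not Deliveroo’s or Uber’s actual telemetry schema.

```python
from dataclasses import dataclass
from math import hypot

# Hypothetical telemetry record; all field names are illustrative.
@dataclass
class Ping:
    t: float      # seconds since trip start
    x: float      # metres east of an arbitrary origin
    y: float      # metres north of that origin
    speed: float  # metres per second

def dwell_time(pings, poi, radius=50.0):
    """Total seconds spent within `radius` metres of a point of interest,
    e.g. a restaurant or a customer's door."""
    total = 0.0
    for a, b in zip(pings, pings[1:]):
        if hypot(a.x - poi[0], a.y - poi[1]) <= radius:
            total += b.t - a.t
    return total

def harsh_brakes(pings, threshold=-3.0):
    """Indices of sample pairs where deceleration exceeds `threshold` m/s^2,
    computed as the speed change between consecutive pings."""
    events = []
    for i, (a, b) in enumerate(zip(pings, pings[1:])):
        accel = (b.speed - a.speed) / (b.t - a.t)
        if accel < threshold:
            events.append(i)
    return events
```

With only these two functions, an employer holding a stream of pings could score how long a courier “loitered” at a pickup point and how aggressively a driver brakes, without the worker ever being told such metrics exist.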
Companies benefit greatly from collecting such detailed records: it allows businesses to respond quickly to changes in their environment (raising the price of an Uber ride when demand is high) and to demand strict efficiency from their workers (monitoring how long it takes a Deliveroo cyclist to get from point A to point B). Beyond the gig companies’ more obvious exploitative practices, the use of mobile applications to monitor workers is especially problematic because companies not only collect work-related data but inevitably also trace the movements and behaviour of their drivers whenever they use their phones.
For example, a Deliveroo driver once called the company during a delivery to report that his battery was running low, only to find that the company was already aware of his battery level. The driver had not been told that such personal information from his phone was accessible to his employer, nor had he given explicit consent to its collection. The amount of data collected by employers is massive, and it is clearly not the kind of data usually collected from a ‘contractor’. This example shows that the Deliveroo app, however essential to the job, gives the company access to other information on the worker’s phone (often the worker’s personal device) without the owner’s awareness.
Requiring workers to use an application as their primary interface with the company means they are constantly carrying a device that potentially lets the employer know their location and activity. Workplace surveillance thus goes beyond monitoring productivity in the workplace: it allows employers to learn deeply personal information about their workers, such as their behaviour and characteristics outside of work. This highly granular data can be used in performance assessments and hiring decisions.
Such monitoring is typically done without the consent, and sometimes without the awareness, of employees, which makes it especially problematic. How the data collected about employees is used tends to be opaque, leaving employees no room to challenge the process.
The difficulty of challenging such surveillance and non-consensual data collection is made worse by the precarious status of workers in the gig economy. Because there is no formal employment agreement between workers and the company, workers have fewer rights and protections with which to challenge their employers. The risk of losing their jobs further discourages workers from challenging these surveillance practices. Companies may argue that workers grant permission to collect and use the data when they begin working for the company.
Furthermore, the privacy of users and workers is at risk when this data is shared with third parties, such as governments. When Uber shares data with local authorities about how its cars move around a city for the sake of traffic analysis, that data includes the commuting patterns of drivers as well as passengers, and can potentially reveal detailed profiles of both.
There are various similar examples. A recent version of the Uber app enabled the company to access users’ locations for up to five minutes after a trip ended, even when the app was only running in the background. Another notable case involving Uber came when Apple discovered that Uber continued to identify users’ devices after the app had been deleted from an iPhone. This practice is known as ‘fingerprinting’: identifying and tracing a phone even after its owner has deleted the application. Apple forbids developers from using fingerprinting because of its potential for privacy intrusion, and Uber stopped the practice only after Apple threatened to remove Uber from its App Store.
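Why does deleting an app not break the link? A minimal sketch of the general technique (not Uber’s actual implementation) shows the idea: fingerprinting derives an identifier from hardware and system attributes that survive an uninstall, so a freshly reinstalled app produces the same ID as before. The attribute names below are invented for illustration.

```python
import hashlib

def device_fingerprint(attributes: dict) -> str:
    # Serialise the attributes in a canonical (sorted) order so the same
    # device always yields the same string, then hash it into a compact ID.
    canonical = "|".join(f"{k}={v}" for k, v in sorted(attributes.items()))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

# Hardware model, OS version, screen size and similar values persist across
# install/uninstall cycles; a server that stores this ID will recognise the
# "new" install as the same device. Attribute names are illustrative only.
fp = device_fingerprint({"model": "iPhone7,2", "os": "10.3.1", "screen": "750x1334"})
```

Because no per-install token is involved, the only way to defeat such tracking is to deny apps access to the stable attributes themselves, which is why Apple polices the practice at the platform level.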
The delivery service Deliveroo also provides examples. In 2016, a Deliveroo driver was terminated because a restaurant reported him for picking up food with his helmet on. Without further discussion, the company instantly terminated his contract. This shows that surveillance is performed not only by the employer but also by its partners (in this case restaurants and merchants), the public, and customers. Detailed information about drivers’ activities is used by the employer to assess them, but it also puts drivers at greater risk of losing their jobs. As contractors, drivers are given no appropriate room for discussion when such cases arise.
What other data the company may be collecting and generating about users, and for what purposes, remains unknown. The opacity and secrecy of such surveillance invites privacy intrusion, as employees and users are often unaware that their personal data is being collected, let alone processed for a specific purpose. It shows that our devices and apps are fundamentally capable of betraying us.