The past five years have seen a steady stream of books identifying what is wrong with Big Tech. I list four of them, with the proposals they make to repair Big Tech, at the end of this blog. Between them, they present a confusing number of diagnoses of, and solutions to, the problems of Big Tech.
Some of the solutions, such as breaking up Big Tech, would make the problem bigger. Others, such as prohibiting surveillance capitalism, are wishful thinking. In this blog, I focus on the root cause of the problem and present a minimal set of proposals that deals with the root problem. These proposals are still difficult to implement, but I think that any way forward should at least include these solutions.
The proposals that I will present below are (1) to increase transparency about what Big Tech does with our data, and (2) to create competition among content matching platforms by introducing interoperability. These proposals will not make all problems go away. But they are a good start.
First, what are we talking about?
What is Big Tech?
We are not talking about big tech companies like Siemens, Intel or Samsung, who manufacture technical products. Big Tech in this blog, and in the books listed at the end of this blog, refers to digital matching platforms, such as Google, Amazon, Twitter, Meta, Uber, AirBnB and others. These companies match supply and demand, often but not always with a third side, advertisements.
We need to distinguish two kinds of matching. Content matching platforms match the supply of web pages, videos, music, posts, tweets and other digital content with demand for digital content. Examples are Facebook, TikTok, YouTube, Twitter, Amazon Music, WeChat, etc.
Product matching platforms are marketplaces for products or services that are bought and sold on the platform. Examples are Amazon marketplace, Etsy, eBay, TaoBao, FlipKart, Uber, AirBnB, etc.
Both kinds of platform get information from publishers and forward it to users who the platform thinks need it. Google forwards web pages to people searching for information. Facebook forwards posts to users. Amazon Marketplace forwards product information to potential buyers. Uber forwards driver information to riders.
When supply and demand match, some commercial transaction takes place. Here, the business models diverge. On Amazon Marketplace, a sale takes place. On Google Search and most social networks, a user sees the content they want, and also sees an advertisement. On Uber, a driver sells a ride. On the free version of Spotify, a user listens to an audio track, and also hears advertisements.
Is there a problem?
What is the problem with this? Not so long ago these companies were hailed as harbingers of a new economy that would benefit users, shareholders and the general public.
Today, Amazon, Apple, Facebook and Google are subject to anti-trust litigation for a variety of reasons, such as monopolizing online advertising, suppressing competition, and competing unfairly with their own customers. This blog is not about anti-trust, so I will not pursue these issues here.
The core problem with matching platforms that I want to focus on in this blog is that the business model uses people as resources from which to extract data to generate revenue. Data is the new oil, but unlike oil, data is extracted from people, not rocks. That creates a responsibility of platforms to the people they extract data from, that oil companies do not have with respect to the rocks they extract oil from. People must be treated differently from rocks.
Let’s discuss this responsibility before we discuss the way platforms generate revenue from the extracted data.
Extracting data from people to generate revenue is called surveillance capitalism. Some critics have suggested prohibiting this. No data collection, no “data voodoo dolls”, no microtargeting, no corporate surveillance.
However, that won’t work. I do not see how we can erase a part of the economy, of considerable size, that consists of companies that earn money with data collected about us. What we as data sources should do instead is demand transparency from these companies. We should demand that platform companies
- Be transparent about the data they collect about their users;
- Be transparent about the beneficiaries of the data;
- Allow independent auditors to check whether the company’s description of the data collected about people is correct and complete.
When researchers collect data about people in order to generate knowledge, they must obtain informed consent from the subjects. I see no reason why companies that collect data about people to generate money should not have the same obligations: acquire informed consent, prevent harm to the subjects, and be transparent about what you do with the data.
Platform companies are able to provide clear, crisp and explicit descriptions of the benefit they bring to the world. At the same time, they can be incomprehensible in their descriptions of the data they collect about us. Being transparent about the collected data does not mean providing a 15,000-word document in incomprehensible language. Providing clear descriptions requires a lot of effort.
So there is a fourth requirement here:
- Each government should create an agency with the power to force platforms to produce understandable descriptions of the data collected about people, in the language of those people.
Let’s call the descriptions of what a platform does with your data a transparency report.
Platforms extract data from people (1) to sell the data, (2) to construct profiles for targeted advertising, or (3) to give product or content recommendations to people.
For example, Twitter and LinkedIn sell user data by giving third parties paid access to it. Facebook and Google use data to build user profiles, used to sell targeted advertising services. And all matching platforms construct user profiles to recommend new content or products to users.
These revenue models should be described in the transparency report mentioned above, so that users know how the platform is financed with their data.
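To make the idea concrete: a transparency report could also exist in machine-readable form, so that auditors and researchers can process it. The sketch below is purely illustrative; all field names and the `TransparencyReport` structure are my own invention, not an existing standard.

```python
from dataclasses import dataclass, field, asdict
import json

@dataclass
class DataCategory:
    """One category of data a platform collects about its users."""
    name: str            # e.g. "browsing history"
    purpose: str         # "sell", "targeted advertising", or "recommendations"
    beneficiaries: list  # who gets (access to) the data
    retention_days: int  # how long the data is kept

@dataclass
class TransparencyReport:
    platform: str
    language: str        # the report must be in the user's own language
    categories: list = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the report so auditors can process it mechanically."""
        return json.dumps(asdict(self), indent=2)

report = TransparencyReport(
    platform="ExamplePlatform",
    language="en",
    categories=[
        DataCategory("browsing history", "targeted advertising",
                     ["advertisers (aggregated)"], 180),
        DataCategory("follower graph", "recommendations",
                     ["the platform itself"], 3650),
    ],
)
print(report.to_json())
```

A structured report like this is what an independent auditor could check for completeness, while the government agency checks that the human-readable version says the same thing understandably.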
Advertising revenue models
There is an additional problem that content matching platforms have with advertising revenue models. Platforms like Facebook, which generate revenue from advertising, try to maximize “user engagement”, the time a user spends on the platform. This maximizes the revenue the platform can generate by showing advertisements to the user.
But it turns out that hate speech, misinformation and fake news engage users more than less extreme content does. This has turned social networks into amplifiers of extreme content that creates outrage, polarization and misunderstanding. Social networks have become a playing field for trolls and political manipulators.
To break the spell of advertising-based revenue models, we need competition between different revenue models. If users do not want to be exposed to advertising, they should have the choice to move to other platforms that are not financed by advertising. If users do not like what a platform does with their data, they should be able to move to other platforms that don’t exploit their data. If they don’t like the fact that a platform amplifies extreme content, they should be able to move to a platform that does not amplify extreme content.
But for content matching platforms, users have no choice: If they leave a platform, they leave all their posts and friend lists behind. And from another platform they cannot communicate with their old friends on their old platform.
To create competition between content platforms, we need interoperability of platforms.
If users of different social networks could communicate just as users of different telco providers can, then the barrier to switch networks would be lower. And if they could take their data with them, the burden of switching cost is on the network rather than on the user. Social networks would then actually have to compete on the value they create for their users rather than for their shareholders only.
Networks could then attract users with advertising-free services paid for by a subscription, or by a small pay-per-message fee. They could compete on the level of moderation they offer. Once the network effect is taken away from a single company, its monopoly on content matching and digital advertising is gone.
Part of the solution to the problems created by content-based engagement is then to demand at least these two things:
- Interoperability of social networks and
- Portability of user data by users.
This is easier said than done. Interoperability would require global standards for sharing text, images, videos and music, similar to the way email and web pages are governed by common standards.
And even if that were done, this would not turn social networks into digital clones of telcos. It is clear what it means to be able to send a message from one network to another. But what does it mean for matching to be interoperable? Where telcos can interconnect point-to-point communications, social networks match a supply of content with a demand for content. The destination of content is decided by the platform, not by the user. It is not yet clear what the interoperability requirement means in this case.
And then there are hard problems for data portability to solve. The social graph that a network company constructs of a user is not a simple data structure to port. A few companies, including Google, Facebook and Twitter, work on data transfer standards, but transferring a copy is different from porting, that is, moving, data from one network to another.
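To see why a social graph is hard to port, note that a friend list refers to account identifiers that only exist on the old network. Porting needs a mapping between identifiers, and friends without a counterpart on the new network are simply lost. A minimal sketch of that problem, with made-up network names and identifiers:

```python
# A user's social graph on the old network: account IDs of friends.
old_graph = {
    "user": "alice@oldnet",
    "friends": ["bob@oldnet", "carol@oldnet", "dave@oldnet"],
}

# Mapping from old-network IDs to new-network IDs, known only for
# users who also have an account on the new network.
id_map = {"alice@oldnet": "alice@newnet", "bob@oldnet": "bob@newnet"}

def port_graph(graph: dict, id_map: dict) -> dict:
    """Translate a social graph to the new network's identifiers.
    Friends without a known counterpart cannot be ported and are dropped."""
    ported = [f for f in graph["friends"] if f in id_map]
    return {
        "user": id_map[graph["user"]],
        "friends": [id_map[f] for f in ported],
        "unported": [f for f in graph["friends"] if f not in id_map],
    }

new_graph = port_graph(old_graph, id_map)
print(new_graph)
```

The `unported` friends remain unreachable from the new network, which is exactly the switching barrier described above; only interoperability between the networks would remove it.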
And what about free networks? Telegram, Signal, Parler and other free networks would love the opportunity to let their users send messages to any user of any social network. With sufficient interoperability, we can expect many users of Facebook to migrate to free networks without advertising or subscription fees. This may cause the free networks to collapse, but another scenario is that they may generate even more extremist content.
Despite the uncertainties about the complexity and the effects of interoperability and data portability, we know that without them, networks will not compete. In the absence of competition, we would have near-monopolists that dominate social networking globally, and that can avoid their responsibilities to the users from which they extract data. This would be the least desirable of all outcomes.
Interoperability is also one of the recommendations of the U.S. House of Representatives’ study on competition in digital markets. So interoperability and data portability remain on the list of demands to be realized.
And to force content matching platforms to take their responsibility to the public they feed on, we need some additional transparency.
- Provide meta-information about content in an understandable way. For example, show the source of posts, including the country where the sender is located. Show the likelihood that a sender is a bot.
- Support researchers who study social networks and investigate the spread of (mis)information.
- Be transparent about what the company does about combating misinformation.
- Allow independent auditors to check whether the company’s description of the way it combats misinformation corresponds with reality.
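The first of these demands, understandable meta-information, could amount to attaching a small provenance record to every post and rendering it as a one-line label. The sketch below is illustrative only; the field names, threshold and bot score are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class PostMetaInfo:
    source_account: str    # the original sender, not a re-sharer
    source_country: str    # country where the sender is located
    bot_likelihood: float  # platform's estimate that the sender is a bot, 0..1

    def summary(self) -> str:
        """Render the meta-information as a short, human-readable label."""
        flag = "likely a bot" if self.bot_likelihood >= 0.5 else "likely human"
        return f"Posted by {self.source_account} from {self.source_country} ({flag})"

meta = PostMetaInfo("@example_sender", "US", 0.82)
print(meta.summary())  # Posted by @example_sender from US (likely a bot)
```

The point is not the particular fields but that the information is shown to the reader in plain language, next to the content itself.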
Providing and using social networks may be more complex than providing and using telecommunications. This also puts more burden on users, who should develop media awareness and wisdom. To deal with this,
- Create a public fund, financed by social network platforms, to educate the public in the use of social media.
The fund could be used to develop teaching material for schools, and to develop simple and clear ways to present meta-information about social media content.
Before I summarize, I want to point out a few proposals that I think cannot be implemented or may have undesirable effects. Making social media responsible for all content they distribute is not going to work: no organization can take responsibility for the billions of messages published daily. And apart from feasibility, it is not a good idea to let a commercial company censor the distribution of messages.
Another proposal has been to require users to identify themselves. This may make users watch their words, but not only in a positive way. Dictatorial governments would love to be able to identify anyone who publishes a dissenting opinion.
To sum up, I think that matching platforms can be forced to take their responsibility to their data sources —their users— by forcing them to be transparent about the data they collect about their users, about the beneficiaries of the data, and about the way platforms generate revenue from this. They should provide comprehensible meta-information about posts, be transparent about what they do to combat misinformation, and allow researchers to analyze their networks.
All of this is no different from the reporting responsibility of companies to their shareholders. The responsibility of platforms to the people from whom they extract valuable data should be at least as great as their responsibility to shareholders.
All platform companies should accept auditors who check the platform transparency reports. And a government agency should check the comprehensibility of the reports.
To enable competition, and give users a choice, content matching platforms should be interoperable and implement data portability.
For too long now, the platforms have been hiding what data they collect about their users and how they earn money with this. They have too often rejected any responsibility for the effects of their business models on citizens. An increasing number of users seem to treat social networks as a poison they cannot avoid, a drug they cannot do without. They distrust the networks and assume they are misled by them, but keep using them because they are part of today’s online infrastructure and there is no alternative. They would love to be treated as adults. It is time that the platforms accept responsibility to the users with whose data they generate so much revenue.
Appendix: Some proposals for repairing Big Tech
Below are some proposals to repair Big Tech. I included some of these proposals in my own.
Sinan Aral. The Hype Machine. How Social Media Disrupts Our Elections, Our Economy and Our Health – and How We Must Adapt. HarperCollins, 2020.
Discusses interoperability, data portability and identity portability as ways to introduce competition among social networks. Makes very clear that simply breaking up social media into smaller networks is not a solution. Proposes to combat misinformation by providing meta-information about content, reducing financial incentives to spread misinformation, increasing media literacy, applying machine learning to filter content, giving researchers access to social network data, and improving platform policies.
Rana Foroohar. Don’t Be Evil. The Case Against Big Tech. Currency. 2019, 2021.
Makes numerous antitrust proposals, which I do not summarize here. About matching platforms, she proposes to pay every user for the data extracted from them, to store all data in a user workspace, preferably on a user device, to create a digital consumer protection bureau to protect users against discrimination, to open the black box of matching algorithms, and to put part of the revenue of matching platforms in a fund for public education.
Philip Howard. Lie Machines. How to Save Democracy from Troll Armies, Deceitful Robots, Junk News Operations, and Political Operatives. Yale University Press, 2020.
Proposes mandatory reporting about the ultimate beneficiaries of data, allowing citizens to donate their data for beneficial purposes, reserving a percentage of ads for public purposes, expanding the prohibition on earning profit from data about citizens, and introducing regular algorithmic audits.
Roger McNamee. Zucked. Waking Up to the Facebook Catastrophe. HarperCollins, 2019.
Makes a bunch of antitrust proposals, which I skip here. Makes a number of radical proposals to forbid surveillance capitalism: no user profiles, no web tracking, no third-party market for data, no scanning of emails or other private documents, no corporate surveillance in public places or in homes. Plus a number of less radical proposals: prevent anonymity on social networks, identify bots, make social networks responsible for content (remove the protection of Section 230), introduce high penalties for data breaches, require an FDA-like approval process for bringing AI to market, and inform users who is using their data.