Personalization paranoia, or how I was stalked by Daniel Tiger

The thing about personalized ads and content on Facebook is that you never know exactly why the content you see ends up in your News Feed. While this algorithmic black box is well known to many and probably ignored by most, academic analyses of behavioural advertising rarely take a closer look at what personalized ads do to a person’s psyche.

[Image: Daniel Tiger poster. Credit: The Fred Rogers Company]

The other day I was taking a daily scroll through my News Feed when I noticed an article from the Atlantic titled “Daniel Tiger Is Secretly Teaching Kids to Love Uber”. For those of you without toddlers or a peculiar interest in kids’ TV shows, Daniel Tiger is a friendly four-year-old tiger who teaches children how to cope with failure through happy-go-lucky songs.

Was the article served to me because I subscribe to the Atlantic’s Facebook page? I read several of their articles a week, so seeing an article from the Atlantic isn’t too strange. However, I don’t see all of their articles, and the ones I do see tend to focus on topics related to the Internet economy (for obvious reasons).

Was it, in fact, the article’s reference to Uber, not Daniel Tiger, that made Facebook present this particular article to me? Or was it because Facebook had identified me as a parent and tended to suggest similar content to parents? Or did Facebook register that I googled the show at some point, and if I did, had I been signed into my Facebook account at the time or used private browsing? Or did Netflix share some of their viewing data with Facebook?

In this targeted online environment, consenting to terms and conditions and privacy notices makes little sense. It is impossible to keep track of the myriad ways companies collect and share data, and a carte blanche is usually required to even begin using a service. While the goal might be efficient targeting to keep advertisers happy, the result is personalization paranoia. Calling Facebook’s targeting a black box is therefore not an entirely accurate metaphor. I would prefer to call it a one-way mirror: everything we do is monitored, we’re vaguely aware of it, but we have no idea who’s watching.


The mystification of algorithms

Whenever I read stories on big data, it strikes me that journalists hardly ever know or care to explain what algorithms are or what they do. Take this example from the Economist’s recent special report on big data and politics:

Campaigners are hoovering up more and more digital information about every voting-age citizen and stashing it away in enormous databases. With the aid of complex algorithms, these data allow campaigners to decide, say, who needs to be reminded to make the trip to the polling station and who may be persuaded to vote for a particular candidate.

The Economist, March 26th, Special Report p.4

First, few seem bothered to make a distinction between collected data and inferred intelligence. The quote above is an example of inferring information from existing databases: trying to figure out what kind of behaviour correlates with voting for a specific party. Since most databases are commercial in nature, I am guessing that campaigners are trying to figure out whether certain consumer behaviour, like buying organic milk, correlates with voting Democrat.
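To make that guiding principle concrete, here is a toy sketch of the kind of inference involved. The data is entirely invented, and real campaign analytics would use far larger voter files and proper statistical models; the point is only that the underlying idea is a conditional probability, not magic.

```python
# Invented example: estimating whether a consumer signal ("buys organic
# milk") correlates with a political label ("voted Democrat") in a
# matched commercial/voter database. All rows are made up.
voters = [
    # (buys_organic_milk, voted_democrat)
    (True, True), (True, True), (True, False), (True, True),
    (False, False), (False, True), (False, False), (False, False),
    (True, True), (False, False),
]

def conditional_rate(rows, given, outcome):
    """Estimate P(outcome | given) from the rows."""
    matching = [r for r in rows if r[0] == given]
    return sum(1 for r in matching if r[1] == outcome) / len(matching)

p_dem_given_organic = conditional_rate(voters, True, True)
p_dem_given_not = conditional_rate(voters, False, True)

print(f"P(Democrat | organic milk)    = {p_dem_given_organic:.2f}")
print(f"P(Democrat | no organic milk) = {p_dem_given_not:.2f}")
```

If the first rate is clearly higher than the second, the organic-milk purchase becomes a targeting signal, which is all the “complex algorithms” in the quote ultimately deliver.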

In the case of protest movements, the waves of collective action leave a big digital footprint. Using ever more sophisticated algorithms, governments can mine these data.

The Economist, March 26th, Special Report p.4

The second example is about mining social media for data on dissidents and revolutionary action. There the data itself can be a source of “actionable intelligence” as Oscar Gandy would put it. There is nothing inherently sophisticated in looking for evidence of people participating in protest events on Facebook or finding protest movement chatter on Twitter.

Second, while the algorithms might be complex, they are usually employed in programmes that have relatively clear user interfaces. The Citizen Lab at the University of Toronto demonstrated that “net nannying” tools that are employed in schools, homes or businesses are also frequently used by authoritarian states for monitoring a whole nation’s communications.

While these reports give some insight into how data science is used to gain an advantage in politics or law enforcement, they tend to mystify the technologies and techniques involved. We are left confounded by this data magic that somehow changes the playing field. But the guiding principles are not that hard to understand, and using the programmes does not require a degree in computer science. We might not know exactly how the algorithms work, but we know what sources of information they use and what their purposes are.

[Image: slide illustrating how to search PRISM’s counterterrorism database]


The shrinking long tail of online media

For many media companies, online distribution has been seen as a practical solution to audience fragmentation. Those who are not interested in primetime content can satisfy their needs with shows that are available online, on demand. The problem with this “long tail” solution is finding the right content for these fragmented audiences. Going through an extensive catalogue of different TV and radio shows won’t bring you any closer to satisfaction than simply succumbing to the alluring yet numbing power of American Idol or Big Brother.

The solution to this particular problem is, naturally, personalization. In an interview for Wired, Netflix’s Neil Hunt stated that in the future, Netflix’s recommendation algorithm will be so accurate that it will be able to give users “one or two suggestions that perfectly fit what they want to watch now.”

Obviously, Netflix is not there yet:

[Image, source: Huffington Post]

Snide remarks aside, Hunt’s vision will probably come true, not because Netflix is about to find the golden piece of code that makes such predictions a reality, but simply because media consumption is very, very predictable. In a Harvard Business School study from 2008, Anita Elberse found that the top 10 % of songs on the music streaming service Rhapsody accounted for 78 % of all plays, and that the top 1 % accounted for nearly one-third of all plays (cited in Misunderstanding the Internet, 2012). The tail had gotten longer, sure, but the big profits were still made where the tail was thickest. A quick glance at YouTube’s statistics would confirm this.
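The head-heavy concentration Elberse describes falls straight out of the power-law distributions that popularity tends to follow. The sketch below assumes a Zipf distribution with exponent 1 and an arbitrary catalogue size; it does not reproduce the Rhapsody figures exactly, but it shows how a long tail and a dominant head coexist by construction.

```python
# Back-of-the-envelope check: under a Zipf-like popularity distribution
# (play counts proportional to 1/rank), how much listening does the top
# of the catalogue capture? Catalogue size and exponent are assumptions,
# not Rhapsody's actual data.
N = 100_000                                      # hypothetical catalogue size
plays = [1 / rank for rank in range(1, N + 1)]   # Zipf weights, exponent 1
total = sum(plays)

top_10_pct = sum(plays[: N // 10]) / total       # share held by top 10% of songs
top_1_pct = sum(plays[: N // 100]) / total       # share held by top 1% of songs

print(f"Top 10% of songs: {top_10_pct:.0%} of plays")
print(f"Top 1% of songs:  {top_1_pct:.0%} of plays")
```

With these assumptions the top 10 % of titles capture roughly four-fifths of all plays, in the same ballpark as Elberse’s 78 %; tweaking the exponent shifts the exact shares, but the head always dominates.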

“Predicting” that people will want to see Game of Thrones after watching The Walking Dead isn’t difficult, it’s just … probable. Personal preferences play a part, of course, but I don’t think I’m going out on a limb when I say that 10 viewing profiles with appropriate standard recommendations would fulfil 90 % of all viewers’ needs.

The thing with predictions is that they effectively make the tip of the long tail obsolete. Primetime shows are more likely to be recommended, since it is quite probable that a viewer will be content with what’s offered. Suggesting less popular shows is riskier, as the prediction is more likely to go wrong. Instead of watching one primetime show, we’ll watch nothing but primetime, as recommended by algorithms. At least with Netflix’s failed recommendations, it’s possible to find something completely unexpected.

Big Data Dystopia pt 2: Newspapers and web shops join forces

In an earlier post, I discussed the possible implications of banks and insurance companies converging. This post will focus on the convergence of newspapers and web shops.

In a nutshell, a daily newspaper’s greatest assets have usually been its reach and its credibility.

For the past 20 years or so, newspaper subscriptions have been declining in most countries. Other media outlets are just as popular as newspapers’ websites, and newspapers’ reach is no longer as dominant as it used to be.

Credibility, however, works differently. Increased competition does not affect credibility negatively. A good review in the New York Times can lift something or someone fairly unknown from the margins to the mainstream.

It’s not news that many newspapers are struggling in the online ad market, even though that market is growing. Google and Facebook dominate, and little suggests that newspapers will be able to compete with the two online ad powerhouses. However, neither of the two has been that successful in sealing the deal; that is, getting people to actually buy products online.

One of Amazon’s greatest feats is doing exactly that. With the help of an elaborate recommendation system, Amazon suggests products based on previous purchases and browsing history. Amazon’s algorithm can even identify you (and help you on your way) as a potential drug dealer if you choose to buy a certain scale.
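The scale anecdote follows from a simple mechanism: items that co-occur in shopping baskets get recommended together, regardless of what the combination implies. Here is a minimal co-occurrence sketch with invented orders; Amazon’s actual item-to-item collaborative filtering is far more elaborate, but the principle is the same.

```python
# "Customers who bought this also bought": count how often item pairs
# appear in the same order, then recommend the most frequent partners.
# All orders are invented for illustration.
from collections import Counter
from itertools import combinations

orders = [
    {"digital scale", "small plastic bags"},
    {"digital scale", "small plastic bags", "rubber bands"},
    {"digital scale", "kitchen knife"},
    {"novel", "bookmark"},
]

co_counts = Counter()
for order in orders:
    for a, b in combinations(sorted(order), 2):
        co_counts[(a, b)] += 1
        co_counts[(b, a)] += 1

def also_bought(item, k=2):
    """Return the top-k items most often bought together with `item`."""
    pairs = [(other, n) for (i, other), n in co_counts.items() if i == item]
    return [other for other, _ in sorted(pairs, key=lambda p: -p[1])[:k]]

print(also_bought("digital scale"))
```

The recommender has no notion of why scales and small plastic bags go together; it just surfaces the statistical pattern, which is precisely how a purchase can flag you as a “potential drug dealer”.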

What Amazon tries to achieve is increased credibility through crowdsourced customer reviews. Still, an anonymous, non-professional customer review is nothing like an article in The Guardian.

In 2013, Amazon founder Jeff Bezos bought the Washington Post. Bezos’ editorial aspirations aside, the move is likely to spur innovative cross-ownership business models. Similarly, the Finnish newspaper Helsingin Sanomat has launched its own web shop, Mitä Saisi Olla. Although significantly smaller in scale, the message is clear: if online ads fail, online shops might be the answer.

Now, thanks to innovations in behavioural targeting and automated tracking of online reading patterns, newspapers have more information on their readers than ever. Not all newspapers track their users, of course, but those wishing to remain attractive to advertisers in this day and age should at least consider doing so. A third asset for newspapers has emerged: deep knowledge of reading patterns can tell as much as, or even more than, a person’s Google search history. The articles we read, how much time we spend reading them and whether we recommend them to our peers are essential for understanding not only who we are but also who we strive to be.

This could lead to at least two outcomes. First, reviews and product benchmarks might be published alongside convenient links to the web store. A great book review can be the catalyst for a spontaneous one-click-buy.

Second, data on reading patterns can be compared with consumption history, creating an even clearer picture of consumer interests. The web shop is no longer fully dependent on browsing history but can also rely on actual information about consumers’ interests. Similarly, the newspaper can not only speculate about its readers’ consumption patterns but actually convince advertisers that it knows exactly what products its readers will buy.
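A hypothetical sketch of that reading-pattern/purchase-history join might look like this. Every name and number is invented; the point is only that combining time-on-article with past purchases yields a ranked list of product interests an ad sales team could pitch.

```python
# Invented example: rank a reader's topics by time spent reading,
# boosting topics they have already bought products from.
reading_log = {            # reader id -> seconds spent per article topic
    "reader_42": {"cycling": 310, "cooking": 45, "politics": 120},
}
purchases = {              # reader id -> product categories bought before
    "reader_42": {"cycling"},
}

def ranked_interests(reader):
    """Rank topics by reading time, doubling the score of topics
    the reader has already purchased from."""
    scores = {}
    for topic, seconds in reading_log[reader].items():
        boost = 2.0 if topic in purchases.get(reader, set()) else 1.0
        scores[topic] = seconds * boost
    return sorted(scores, key=scores.get, reverse=True)

print(ranked_interests("reader_42"))
```

Even this crude combination says more about likely purchases than either data source alone, which is exactly the pitch a converged newspaper/web shop could make to advertisers.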

The crux is that such arrangements might damage the newspaper’s reputation. Let’s hope that newspapers won’t be reduced to mere barkers for web shops.