There are bots. Look around.
“There are idiots. Look around.”
So said economist Larry Summers in a paper challenging the idea of efficiency in financial markets, a cornerstone of American capitalism. We’ve hit a point where the same can be said of efficiency in a cornerstone of American democracy, the marketplace of ideas:
“There are bots. Look around.”
The marketplace of ideas is now struggling with the increasing incidence of algorithmic manipulation and disinformation campaigns.
Something very similar happened in finance with the advent of high-frequency trading (the world I came from as a trader at Jane Street): technology was used to distort information flows and access in much the same way it is now being used to distort and game the marketplace of ideas.
The future arrived a lot earlier for finance than for politics. There are lessons we can take from that about what’s happening right now with bots and disinformation campaigns. Including, potentially, a way forward.
Efficiency, Technology and Manipulation in Finance
The technological transformation of financial markets began way back in the 1970s. The first efforts focused on streamlining market access, facilitating orders with routing and matching programs. Algorithmic trading began to take off in the 1980s, and then, in the 1990s, came the internet.

When we talk about financial market efficiency, we’re really talking about information and access. If information flows freely and people can act on it via a relatively frictionless trading platform, then the price of goods, stocks, commodities, etc. is a meaningful reflection of what’s known about the world.

The internet fundamentally transformed both information flows and access. News was incorporated into the market faster than ever before. Anyone with a modem could trade. Technology eliminated the gatekeepers: human order-routers (brokers) and human matching engines (known as ‘specialists’ in finance parlance) were no longer needed. The transition from “pits to bits” led to exchange consolidation; the storied NYSE acquired an electronic upstart to remain competitive.

Facilitation turned into automation, and now computers monitor the market and decide what to trade. They route orders, globally, with no need for human involvement beyond initial configuration and occasional check-ins. News breaks everywhere, all at once and in machine-readable formats, and vast quantities of price and tick data are instantly accessible. The result is that spreads are tighter, and prices are consistent even across exchanges and geographical boundaries.

Technology transformed financial markets, increasing efficiency and making things better for everyone. Except when it didn’t. For decades we’ve known that algorithmic trading can result in things going spectacularly off the rails.
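The matching-engine role that automation took over from human specialists can be made concrete with a toy limit order book. This is a minimal sketch, not any exchange’s actual engine; the `OrderBook` class and its methods are invented for illustration:

```python
import heapq

class OrderBook:
    """Toy limit order book: an automated stand-in for the human
    'specialist' who used to match buyers with sellers."""

    def __init__(self):
        self.bids = []  # max-heap of resting buys, stored as (-price, qty)
        self.asks = []  # min-heap of resting sells, stored as (price, qty)

    def add(self, side, price, qty):
        """Cross the incoming order against the opposite side, then
        rest any unfilled remainder. Returns the executed trades."""
        trades = []
        if side == "buy":
            while qty and self.asks and self.asks[0][0] <= price:
                ask_price, ask_qty = heapq.heappop(self.asks)
                filled = min(qty, ask_qty)
                trades.append((ask_price, filled))
                qty -= filled
                if ask_qty > filled:
                    heapq.heappush(self.asks, (ask_price, ask_qty - filled))
            if qty:
                heapq.heappush(self.bids, (-price, qty))
        else:
            while qty and self.bids and -self.bids[0][0] >= price:
                neg_bid, bid_qty = heapq.heappop(self.bids)
                filled = min(qty, bid_qty)
                trades.append((-neg_bid, filled))
                qty -= filled
                if bid_qty > filled:
                    heapq.heappush(self.bids, (neg_bid, bid_qty - filled))
            if qty:
                heapq.heappush(self.asks, (price, qty))
        return trades

    def spread(self):
        """Best ask minus best bid -- the gap that efficiency narrows."""
        if self.bids and self.asks:
            return self.asks[0][0] - (-self.bids[0][0])
        return None

book = OrderBook()
book.add("buy", 99, 10)         # rests: no sellers yet
book.add("sell", 101, 10)       # rests: doesn't cross the 99 bid
print(book.spread())            # 2
print(book.add("buy", 101, 5))  # crosses the resting ask: [(101, 5)]
```

Tighter spreads fall out of more participants resting competitive quotes; nothing here needs a human in the loop after configuration.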
Black Monday, in 1987, is perhaps the most famous example: program-trading sell orders triggered other programmatic sell orders, which triggered still more sell orders, leading to a 20% drop in the market — and that happened in the pre-Internet era.

Since then, we’ve seen unanticipated feedback loops, bad code, and strange algorithmic interactions lead to steep dives or spikes in stock prices. The Knight Capital fiasco is one recent example: a stale test strategy was inadvertently pushed live, and it sent crazy orders into the market, resulting in thousands of rapid trades and price swings unreflective of the fundamentals of the underlying companies. Crashes (flash crashes, now) send shockwaves through the market globally, impacting all asset types across all exchanges; the entire system is thrown into chaos while people try to sort out what’s going on.

So, while automation has been a net positive for the market, that side effect, fragility, erodes trust in the health of the entire system. Regular people read the news, or look at their E-trade account, and begin to feel like financial markets are dangerous or rigged, which makes them both wary and angry.

Media and analysts, meanwhile, simplify the story to make a very complex issue more accessible, creating a boogeyman in doing so: high-frequency trading (HFT). The trouble is that “high-frequency trading” is about as precise as “fake news.” HFT is a catch-all for a collection of strategies that share several traits: extremely rapid orders, a high quantity of orders, and very short holding periods. Some HFT strategies, such as market making and arbitrage, are net beneficial because they increase liquidity and improve price discovery. But others are very harmful. The nefarious ones involve intentional, deliberate, and brazen market manipulation, carried out by bad actors gaming the system for profit.
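The Black Monday dynamic, sell orders mechanically triggering further sell orders, can be caricatured in a few lines. This is a toy feedback loop with made-up numbers, not a model of the actual 1987 crash:

```python
def cascade(price, stops, impact_per_sale=1.0):
    """Each algorithmic seller dumps its position when price falls to
    its stop level, and the sale itself pushes the price down further,
    which can trip the next stop. Returns the resulting price path."""
    remaining = sorted(stops, reverse=True)  # highest stop trips first
    path = [price]
    while remaining and price <= remaining[0]:
        remaining.pop(0)            # this seller's stop is hit: it sells
        price -= impact_per_sale    # the sale moves the market down
        path.append(price)
    return path

# One tick at the first stop level unwinds every position in turn,
# even though nothing fundamental about the companies changed.
print(cascade(100, [100, 99, 98, 97]))
```

With no stop within reach (`cascade(100, [95])` does nothing), there is no cascade; the fragility comes from stops stacked close enough that each sale trips the next.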
One example is quote stuffing, which involves flooding specific instruments (like a particular stock) with thousands and thousands of orders and cancellations at rates that exceed bandwidth capabilities. The goal is to increase latency and cause confusion among other participants in the market. Another example is spoofing: placing bids and offers with the intent to cancel rather than execute. Its advanced form, layering, does this at several pricing tiers to create the illusion of a fuller order book (in other words, faking supply and/or demand). The goal of these strategies is to entice other market participants — including other algorithms — to respond in a way that benefits the person running the manipulation strategy. People are creative. And in the early days of HFT, slimy people could do bad things with relative ease.

Efficiency, Technology and Manipulation in Ideas
Technology brought us faster information flows and decreased barriers to access. But it also brought us increased fragility. A few bad actors in a gameable system can have a profound negative impact on participant trust, and on overall market resilience. The same thing is now happening with the marketplace of ideas in the era of social networks.

When the internet transformed media, in the late 1990s, flows of information and access changed: it became easier both to consume information and to create and distribute it, through democratized publishing tools. Just as we eliminated gatekeepers in finance, we did so in media. But unlike the somewhat esoteric and rarefied world of high finance, anybody could play in this game. Especially with the advent of social networks in the mid-2000s.

Content creation began to consolidate on a handful of platforms specifically designed to facilitate sharing and engineer virality through network effects. Our social platforms became idea exchanges, and unequal ones at that: popular content, as defined by the “crowd”, rose to the top. If a crowd makes the effort, the systems are phenomenally easy to game; the crowd doesn’t have to be real people, and the content need not be true or accurate.

We’re now in a period that’s strikingly reminiscent of the early days of HFT: the intersection of automation and social networking has given us manipulative bots and an epidemic of “fake news”. Just as HFT was a simplified boogeyman for finance, “fake news” is an imprecise term used to describe a variety of disingenuous content: clickbait, propaganda, misinformation, disinformation, hoaxes, and conspiracy theories. To break this down a bit more:

- Clickbait is the low-hanging fruit; it’s generally profit-driven and more about piquing interest and getting a quick click than convincing someone of something.
- Misinformation is generally spread accidentally. There is a lot of overlap with hoaxes — attempts to make an audience believe that something made-up is real. These run the gamut; some are simply practical jokes, others are darker.
- Conspiracy theories take a scaffolding of real facts about something, and twist it to add intrigue. They are best dealt with simply by depriving them of oxygen. The internet isn’t particularly good at depriving sensational things of oxygen, so we’ve seen a steady increase in the reach of conspiracy theories.
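How little it takes to game a popularity-driven exchange is worth making concrete. Here is a minimal sketch assuming a naive “most shared wins” ranking; the `trending` function and the story names are invented for illustration:

```python
from collections import Counter

def trending(shares, top_n=1):
    """Naive popularity ranking: count shares without asking whether
    the sharer is a person or a sockpuppet -- the gameable part."""
    return [item for item, _ in Counter(shares).most_common(top_n)]

# 300 organic shares spread across three real stories...
organic = ["news_a"] * 120 + ["news_b"] * 100 + ["news_c"] * 80
# ...outranked by a single coordinated push from 150 bot accounts.
botnet = ["hoax_story"] * 150
print(trending(organic + botnet))  # ['hoax_story']
```

The crowd doesn’t have to be real and the content doesn’t have to be true; a share counter can’t tell the difference.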
14 Comments
Is there a proper market of lemons here where bad actors drive out good? Anecdotally, I think people do leave social media due to such manipulation, but I don't know whether it is a full-blown lemon market effect (because there are still more good actors than bad I think, at least the parts I haunt). But it could get there.
Also, it just struck me that the idea of evaporative cooling is a weaker form of full-blown lemon market effects.
Evaporative cooling is a great metaphor. I think Twitter in particular has had something of a market for lemons problem, but probably less due to disinfo bots and more related to simple mass coordinated action; there's no shortage of actual people who are up for seizing the opportunity to be jerks to complete strangers over some partisan issue. Harassment is a silencing tactic that is useful for an overall strategy to own a conversation - unfettered freedom of speech limiting freedom of assembly. As far as how common it is, that's tough to determine outside of the public "I quit Twitter" stories that we hear from celebs....however, Twitter recently rolled out abuse filtering tools that it suggests to users who are suddenly receiving a large quantity of messages with abusive language. So, if they're building tools specifically for it, it's likely not rare.
On Facebook, it's much harder to tell if someone is a bot because you can't see all of their comments in one place. So, it's actually more insidious: https://medium.com/data-for-democracy/sockpuppets-secessionists-and-breitbart-7171b1134cd5 I don't think people are leaving FB because of this, but the constant drumbeat of people calling each other "shills" in the comments under news articles indicates rising suspicion. I saw that kind of language regularly in conspiracist communities but it seems increasingly pervasive in more mainstream forums. (Since FB data is so locked down, it's hard to say that authoritatively.)
Do you really interact with bots on Facebook? I only get to watch distant relatives call each other names because of politics/power concerns...
If you ever get fake news garbage that shows up on your newsfeed, odds are good there was something promoted or pushed by a bot. It may have been shared by a friend, but where did they get it from? Facebook's algorithms for what content to show you and others are being gamed, either directly or indirectly, to promote the content.
Bot post 0.o
"Bots and sockpuppets can be used to manipulate conversations, or to create the illusion of a mass groundswell of grassroots activity, with minimal effort." I liked this.
tbh I think it's more interesting if we just leave humans out of it and let bots do their own thing. Elaborating on this might be breaking some taboo of relevance, but I find it fascinating to stretch the domain of technologies that are currently exploitable (i.e. customer service automation)...
"Multi-Agent Cooperation and the Emergence of (Natural) Language" https://arxiv.org/abs/1612.07182
"Emergence of Grounded Compositional Language in Multi-Agent Populations" https://arxiv.org/abs/1703.04908
"Learning to Communicate with Deep Multi-Agent Reinforcement Learning" https://arxiv.org/abs/1605.06676
making your own experiments: https://github.com/facebookresearch/ParlAI/blob/master/parlai/core/worlds.py#L247
More recently I've been fascinated with the question: "what’s the difference between some learning algorithm controlling blocks of a program, versus blocks of symbols that get grounded in to natural language?"
(1) algorithms that control blocks of a program are becoming known as "neural programmer interpreters" https://arxiv.org/abs/1704.06611 (ICLR best paper award 2017)
(2) algorithms that control blocks of symbols ("compositional symbolic language") are like the papers I initially linked to.
People are beginning to find relationships between (1) and (2):
"Program Synthesis for Character Level Language Modeling"
https://openreview.net/forum?id=ry_sjFqgx&noteId=ry_sjFqgx
"Learning A Natural Language Interface with Neural Programmer"
https://openreview.net/pdf?id=ry2YOrcge
"Translating Neuralese" https://arxiv.org/abs/1704.06960
While these answers are parametrically grounded, they are but toy academic problems. But I believe they point in the direction of how machine learning can encroach on our methods of thought, especially in relation to the end goal of Solomonoff Induction:
http://lesswrong.com/lw/dhg/an_intuitive_explanation_of_solomonoff_induction/
Buyers - the most effective regulators - beware.
What makes you think buyers can fix systemic problems and externalities like this? We can't even filter out information we KNOW is wrong; http://www.danielgilbert.com/Gilbert%20et%20al%20(EVERYTHING%20YOU%20READ).pdf - The study shows that people are influenced by information; even when they are explicitly told the facts written in red were false, those "facts" still influenced their judgment.
Thanks. Interesting piece that provides a good summary of the state of social media. One issue that I think needs to be considered when discussing the reticence of social media organisations to address the issues that you've focussed on is the future of news media. While it's easy to position the Silicon Valley crowd as “the free speech wing of the free speech party”, I wonder how seriously they are thinking about long-term market share when considering acts of moderation.
While the current spambot army is an issue for the social media organisations themselves, the climate that those bots help create has arguably been far more destructive to traditional MSM. Surely some of the decision-making, or lack thereof, on the part of Twitter, Facebook etc is due to having one eye on the news revenue pie in the future?
Are these social media companies capable of moderating for disingenuous content?
If we assume the answer is yes, it follows that the next question asked is: What do they have to gain by not doing so?
Even though it is only the analogy, it seems odd to blame the various stock market crashes on algos, when the most damaging events have been human initiated bubbles and crashes.
However, the argument also falls short on the journalism side. Political parties have been pushing out misinformation since they began. Dramatic headlines to drive attention are now more empirically designed via A/B testing, but they were always a goal. To put Nov. 16 as a point when we noticed this ignores the extreme level to which the mass media itself has been biased prior to the election, and reveals an underlying preference. It had long ago reached the point where supposedly unbiased sources had already lost credibility with large swaths of the electorate, which many Democrats did not notice because the propaganda was usually slanted their way. So many on the left believe that poor people on the right have been duped into supporting economic policies that give them smaller government handouts, when that may be their legitimate policy preference based on their principles being more important than their purses. Those on the right ended up creating their own news sources where they didn't have to hear their views insulted.
As a challenging example of how the government would need to regulate, the NY Times continues to publish the views of Paul Krugman, masquerading as social science when they are nothing but political propaganda, made to mislead. I would suggest that if an algorithm fails to label him in a category to be disregarded as politically motivated speculation, it is probably being designed in a biased fashion. Labeling him as "opinion" is not enough. It's funny that you mention Metcalfe's law, as that is what he is often mocked for getting wrong in 1998, claiming the internet would be about as important as the fax machine. Of course, it is probably lower in impact on the list of things he has gotten horribly wrong, as most of the ones where he had more influence were in the field of economics. His Nobel award would probably lead most algorithms to label him a trustworthy expert in that subject, but even that was awarded for largely political reasons! If something manages to flag the latest Alex Jones nonsense and Krugman, we'll know we're getting somewhere.
Regulation in this area is really scary though- putting the government in charge of approving media is real fascism and authoritarianism. It is more than a matter of labeling things as true or false, even the fact checkers fall under attack when a meme about the AHCA not covering things turns out to not be true.
The media had the role of trying to determine where to focus our attention. However, it has fallen behind social media, which has scaled up the inherent desire in many for gossip, hearsay, and rumor. The search for the breaking news is pitiful. "Breaking" on CNN this afternoon was that they were "waiting for a Comey memo". You break into regular news to wait for something? Well, perhaps if your objective is to keep the focus on damaging the guy you don't like. Meanwhile, on Fox News, they are reporting on a months old court ruling to take the opportunity to drag the guy they don't like (Obama) through the mud. It's not bots that are the problem here, it's us.
http://media.riffsy.com/images/7b98ac3b5ad87c27de53b3c6b5cdef0d/tenor.gif
This is late, and unlikely to get a response, but I can't pass it up.
"If something manages to flag the latest Alex Jones nonsense and Krugman, we’ll know we’re getting somewhere"
I'm very interested in the idea of correlating someones method of arguing or reasoning to their degree of "good faith", and would be very interested in what such a "something" might look like.
P.S. What I'd like to see is any hint of an algorithm, however slight, applied to examples of Alex Jones' and Paul Krugman's writing.
I use bots to find other humans with similar interests because the current social media platforms are obsessed with replicating already existent real world connections to secure identity for accurate ad accounting. Who really wants to hang out with their angry racist family members on the internet? Error.
"Media and analysts, meanwhile, simplify the story to make a very complex issue more accessible, creating a boogeyman in doing so: high-frequency trading (HFT)...about as precise as “fake news.” [meaning]: extremely rapid orders, a high quantity of orders, and very short holding periods. Some HFT strategies, such as market making and arbitrage, are net beneficial because they increase liquidity and improve price discovery. But other...involve intentional, deliberate, and brazen market manipulation [like][ quote stuffing, which involves flooding specific instruments (like a particular stock) with thousands and thousands of orders and cancellations at rates that exceed bandwidth capabilities. The goal is to increase latency and cause confusion among other participants in the market. Another example is spoofing, placing bids and offers with the intent to cancel rather than execute, and its advanced form, layering, where this is done at several pricing tiers to create the illusion of a fuller order book (in other words, faking supply and/or demand). The goals of these strategies is to entice other market participants — including other algorithms — to respond in a way that benefits the person running the manipulation strategy. People are creative. And in the early days of HFT, slimy people could do bad things with relative ease."
I remember this era and the solution to it was entirely obvious. It's quite easy to tell what is an arbitrage trade or market making, if you have any metric at all to describe the liquidity & price spread of the market. That is, you can tell when a gap is being filled and thus the market being made more liquid. You can also tell, based upon the holding time, whether human judgment of any kind was involved - obviously fundamental analysis doesn't change so much in two seconds between buying a stock and then selling it, so if it's not arbitrage or market making, it's speculation such as momentum trading. In other words, not something that deserves any special reward, and maybe something that should be taxed. The media focus on "high frequency" was not entirely wrong but that definition included some beneficial actions that were defensible, so the financial barons gained decades of immunity from new taxes on what is essentially gambling & volatility increasing. Also they focused on "derivatives" ignoring that these are necessary for stabilizing prices received by for instance farmers' co-ops. Depth of derived layers of instrument & frequency of trading are mere indicators of what is speculative gambling (which we tax very heavily in all other contexts) and what is insurance or hedging or liquidity guaranteeing (that should be charged less or not at all). We have avoided the question of how to tell one from the other for at least three decades... and it has to end.
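The holding-time-plus-liquidity heuristic described here can be sketched directly. The categories and the sixty-second threshold below are invented for illustration, not an actual surveillance rule:

```python
def classify_trade(holding_seconds, fills_gap):
    """Toy version of the commenter's test: a trade that fills a gap in
    the book looks like market making or arbitrage (liquidity-adding);
    a sub-minute round trip that fills no gap looks like speculation,
    since fundamentals don't change in seconds."""
    if fills_gap:
        return "liquidity-adding"
    if holding_seconds < 60:
        return "speculative"
    return "investment"

print(classify_trade(2, fills_gap=False))       # 'speculative'
print(classify_trade(2, fills_gap=True))        # 'liquidity-adding'
print(classify_trade(86_400, fills_gap=False))  # 'investment'
```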
Consider this: If every trade was taxed at a rate proportional to how long the instrument had been held, and the tax only exempted or reduced if a risk disclosure was made that revealed a liquidity, market making, hedging fundamental cashflow or other insurance type reason for the transaction... we would at least end up with a very large database of disclosed tax exemption claims directly from financial databases, in the hands of regulators who (even if no one else could ever see them, which I think would maybe be needed) could then at least understand the structure of the marketplace and the systemic risks. Also, those entities that chose not to disclose their reasons for a transaction or the underlying risk they were hedging, would at least be marked out as potentially risky players and perhaps (if say 80% of their transactions were of this nature) denied certain bailout or other protections should something go wrong - or just have to keep paying a higher tax rate than anyone who did disclose. In other words, put a dollar value on the disclosure itself, even if inaccurate, even if forged, even if synthesized to fool regulators, etc., on the sheer faith that the data itself would slowly reveal what is going on, and that the honest players could drive out the dishonest after a few years of being found lying about things *LESS*.
Yes this might imply committing to a common capital asset model for valuating the cashflows, for reporting purposes only, but not necessarily - it's only enough that the traders reveal that they used such a model and perhaps disclose the scenarios they considered (but not the weighting they used necessarily). Then we would again have a very good idea in the regulatory database of scenarios considered versus not, and whether legal requirements to consider tougher stress tests should be considered to stabilize the market.
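The disclosure-discounted transaction tax proposed above might look something like this; every rate and the halving schedule are made up for illustration:

```python
def transaction_tax(notional, holding_seconds, disclosed=False,
                    base_rate=0.005, floor_rate=0.0001):
    """Toy schedule for the proposal above: the rate falls as holding
    time grows (halving per doubling of holding time), and a filed
    risk disclosure (hedging, market making, liquidity provision)
    earns the floor rate outright. All numbers are invented."""
    if disclosed:
        return notional * floor_rate
    rate = base_rate
    t = max(holding_seconds, 1.0)
    while t >= 2 and rate > floor_rate:
        rate /= 2
        t /= 2
    return notional * max(rate, floor_rate)

# A two-second momentum flip pays 25x what a disclosed hedge pays.
print(transaction_tax(1_000_000, 2))                  # 2500.0
print(transaction_tax(1_000_000, 2, disclosed=True))  # 100.0
```

The point isn't the exact schedule; it's that non-disclosure itself carries a price, which over time builds exactly the regulatory database described above.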
Regarding politics, there are analogies that are obvious to me, but until we get the regime in place for finance, it's not likely we'll get any kind of systemic disclosure of bot activity or transparency. A capital asset model for even quite intangible assets is possible but to actually nail down "why" a political actor makes a declaration is impossible. One can't tell sincere advocacy by an idiot, from trolling.