There are bots. Look around.
“There are idiots. Look around.”
So said economist Larry Summers in a paper challenging the idea of efficiency in financial markets, a cornerstone of American capitalism. We’ve hit a point where the same can be said of efficiency in a cornerstone of American democracy, the marketplace of ideas:
“There are bots. Look around.”
The marketplace of ideas is now struggling with the increasing incidence of algorithmic manipulation and disinformation campaigns.
Something very similar happened in finance with the advent of high-frequency trading (the world I came from as a trader at Jane Street): technology was used to distort information flows and access in much the same way it is now being used to distort and game the marketplace of ideas.
The future arrived a lot earlier for finance than for politics. There are lessons we can take from that about what’s happening right now with bots and disinformation campaigns. Including, potentially, a way forward.
Efficiency, Technology and Manipulation in Finance
The technological transformation of financial markets began way back in the 1970s. The first efforts focused on streamlining market access, facilitating orders with routing and matching programs. Algorithmic trading began to take off in the 1980s, and then, in the 1990s, came the internet.

When we talk about financial market efficiency, we’re really talking about information and access. If information flows freely and people can act on it via a relatively frictionless trading platform, then the price of goods, stocks, commodities, etc. is a meaningful reflection of what’s known about the world.

The internet fundamentally transformed both information flows and access. News was incorporated into the market faster than ever before. Anyone with a modem could trade. Technology eliminated the gatekeepers: human order-routers (brokers) and human matching engines (known as ‘specialists’ in finance parlance) were no longer needed. The transition from “pits to bits” led to exchange consolidation; the storied NYSE acquired an electronic upstart to remain competitive.

Facilitation turned into automation, and now computers monitor the market and decide what to trade. They route orders, globally, with no need for human involvement beyond initial configuration and occasional check-ins. News breaks everywhere, all at once and in machine-readable formats, and vast quantities of price and tick data are instantly accessible. The result is that spreads are tighter, and prices are consistent even across exchanges and geographical boundaries.

Technology transformed financial markets, increasing efficiency and making things better for everyone. Except when it didn’t. For decades we’ve known that algorithmic trading can result in things going spectacularly off the rails.
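The matching-engine role that automation took over from human specialists can be made concrete with a toy limit order book. This is a minimal sketch, not any exchange’s actual engine; the `OrderBook` class and its methods are invented for illustration:

```python
import heapq

class OrderBook:
    """Toy limit order book: an automated stand-in for the human
    'specialist' who used to match buyers with sellers."""

    def __init__(self):
        self.bids = []  # max-heap of resting buys, stored as (-price, qty)
        self.asks = []  # min-heap of resting sells, stored as (price, qty)

    def add(self, side, price, qty):
        """Cross the incoming order against the opposite side, then
        rest any unfilled remainder. Returns the executed trades."""
        trades = []
        if side == "buy":
            while qty and self.asks and self.asks[0][0] <= price:
                ask_price, ask_qty = heapq.heappop(self.asks)
                filled = min(qty, ask_qty)
                trades.append((ask_price, filled))
                qty -= filled
                if ask_qty > filled:
                    heapq.heappush(self.asks, (ask_price, ask_qty - filled))
            if qty:
                heapq.heappush(self.bids, (-price, qty))
        else:
            while qty and self.bids and -self.bids[0][0] >= price:
                neg_bid, bid_qty = heapq.heappop(self.bids)
                filled = min(qty, bid_qty)
                trades.append((-neg_bid, filled))
                qty -= filled
                if bid_qty > filled:
                    heapq.heappush(self.bids, (neg_bid, bid_qty - filled))
            if qty:
                heapq.heappush(self.asks, (price, qty))
        return trades

    def spread(self):
        """Best ask minus best bid -- the gap that efficiency narrows."""
        if self.bids and self.asks:
            return self.asks[0][0] - (-self.bids[0][0])
        return None

book = OrderBook()
book.add("buy", 99, 10)         # rests: no sellers yet
book.add("sell", 101, 10)       # rests: doesn't cross the 99 bid
print(book.spread())            # 2
print(book.add("buy", 101, 5))  # crosses the resting ask: [(101, 5)]
```

Tighter spreads fall out of more participants resting competitive quotes; nothing here needs a human in the loop after configuration.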
Black Monday, in 1987, is perhaps the most famous example: program-trading sell orders triggered other programmatic sell orders, which triggered still more sell orders, leading to a 20% drop in the market — and that happened in the pre-Internet era.

Since then, we’ve seen unanticipated feedback loops, bad code, and strange algorithmic interactions lead to steep dives or spikes in stock prices. The Knight Capital fiasco is one recent example: a stale test strategy was inadvertently pushed live, and it sent crazy orders into the market, resulting in thousands of rapid trades and price swings unreflective of the fundamentals of the underlying companies. Crashes (flash crashes, now) send shockwaves through the market globally, impacting all asset types across all exchanges; the entire system is thrown into chaos while people try to sort out what’s going on.

So, while automation has been a net positive for the market, that side effect, fragility, erodes trust in the health of the entire system. Regular people read the news, or look at their E-trade account, and begin to feel like financial markets are dangerous or rigged, which makes them both wary and angry.

Media and analysts, meanwhile, simplify the story to make a very complex issue more accessible, creating a boogeyman in doing so: high-frequency trading (HFT). The trouble is that “high-frequency trading” is about as precise as “fake news.” HFT is a catch-all for a collection of strategies that share several traits: extremely rapid orders, a high quantity of orders, and very short holding periods. Some HFT strategies, such as market making and arbitrage, are net beneficial because they increase liquidity and improve price discovery. But others are very harmful. The nefarious ones involve intentional, deliberate, and brazen market manipulation, carried out by bad actors gaming the system for profit.
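The Black Monday dynamic, sell orders mechanically triggering further sell orders, can be caricatured in a few lines. This is a toy feedback loop with made-up numbers, not a model of the actual 1987 crash:

```python
def cascade(price, stops, impact_per_sale=1.0):
    """Each algorithmic seller dumps its position when price falls to
    its stop level, and the sale itself pushes the price down further,
    which can trip the next stop. Returns the resulting price path."""
    remaining = sorted(stops, reverse=True)  # highest stop trips first
    path = [price]
    while remaining and price <= remaining[0]:
        remaining.pop(0)            # this seller's stop is hit: it sells
        price -= impact_per_sale    # the sale moves the market down
        path.append(price)
    return path

# One tick at the first stop level unwinds every position in turn,
# even though nothing fundamental about the companies changed.
print(cascade(100, [100, 99, 98, 97]))
```

With no stop within reach (`cascade(100, [95])` does nothing), there is no cascade; the fragility comes from stops stacked close enough that each sale trips the next.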
One example is quote stuffing, which involves flooding specific instruments (like a particular stock) with thousands and thousands of orders and cancellations at rates that exceed bandwidth capabilities. The goal is to increase latency and cause confusion among other participants in the market. Another example is spoofing: placing bids and offers with the intent to cancel rather than execute. Its advanced form, layering, does this at several pricing tiers to create the illusion of a fuller order book (in other words, faking supply and/or demand). The goal of these strategies is to entice other market participants — including other algorithms — to respond in a way that benefits the person running the manipulation strategy. People are creative. And in the early days of HFT, slimy people could do bad things with relative ease.

Efficiency, Technology and Manipulation in Ideas
Technology brought us faster information flows and decreased barriers to access. But it also brought us increased fragility. A few bad actors in a gameable system can have a profound negative impact on participant trust, and on overall market resilience. The same thing is now happening with the marketplace of ideas in the era of social networks.

When the internet transformed media, in the late 1990s, flows of information and access changed: it became easier both to consume information and to create and distribute it, through democratized publishing tools. Just as we eliminated gatekeepers in finance, we did so in media. But unlike the somewhat esoteric and rarefied world of high finance, anybody could play in this game. Especially with the advent of social networks in the mid-2000s.

Content creation began to consolidate on a handful of platforms specifically designed to facilitate sharing and engineer virality through network effects. Our social platforms became idea exchanges, and unequal ones at that: popular content, as defined by the “crowd”, rose to the top. If a crowd makes the effort, the systems are phenomenally easy to game; the crowd doesn’t have to be real people, and the content need not be true or accurate.

We’re now in a period that’s strikingly reminiscent of the early days of HFT: the intersection of automation and social networking has given us manipulative bots and an epidemic of “fake news”. Just as HFT was a simplified boogeyman for finance, “fake news” is an imprecise term used to describe a variety of disingenuous content: clickbait, propaganda, misinformation, disinformation, hoaxes, and conspiracy theories. To break this down a bit more:

- Clickbait is the low-hanging fruit; it’s generally profit-driven and more about piquing interest and getting a quick click than convincing someone of something.
- Misinformation is generally spread accidentally. There is a lot of overlap with hoaxes — attempts to make an audience believe that something made-up is real. These run the gamut; some are simply practical jokes, others are darker.
- Conspiracy theories take a scaffolding of real facts about something, and twist it to add intrigue. They are best dealt with simply by depriving them of oxygen. The internet isn’t particularly good at depriving sensational things of oxygen, so we’ve seen a steady increase in the reach of conspiracy theories.
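How little it takes to game a popularity-driven exchange is worth making concrete. Here is a minimal sketch assuming a naive “most shared wins” ranking; the `trending` function and the story names are invented for illustration:

```python
from collections import Counter

def trending(shares, top_n=1):
    """Naive popularity ranking: count shares without asking whether
    the sharer is a person or a sockpuppet -- the gameable part."""
    return [item for item, _ in Counter(shares).most_common(top_n)]

# 300 organic shares spread across three real stories...
organic = ["news_a"] * 120 + ["news_b"] * 100 + ["news_c"] * 80
# ...outranked by a single coordinated push from 150 bot accounts.
botnet = ["hoax_story"] * 150
print(trending(organic + botnet))  # ['hoax_story']
```

The crowd doesn’t have to be real and the content doesn’t have to be true; a share counter can’t tell the difference.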
14 Comments
Is there a proper market of lemons here where bad actors drive out good? Anecdotally, I think people do leave social media due to such manipulation, but I don't know whether it is a full-blown lemon market effect (because there are still more good actors than bad I think, at least the parts I haunt). But it could get there.
Also, it just struck me that the idea of evaporative cooling is a weaker form of full-blown lemon market effects.
Evaporative cooling is a great metaphor. I think Twitter in particular has had something of a market for lemons problem, but probably less due to disinfo bots and more related to simple mass coordinated action; there's no shortage of actual people who are up for seizing the opportunity to be jerks to complete strangers over some partisan issue. Harassment is a silencing tactic that is useful for an overall strategy to own a conversation - unfettered freedom of speech limiting freedom of assembly. As far as how common it is, that's tough to determine outside of the public "I quit Twitter" stories that we hear from celebs....however, Twitter recently rolled out abuse filtering tools that it suggests to users who are suddenly receiving a large quantity of messages with abusive language. So, if they're building tools specifically for it, it's likely not rare.
On Facebook, it's much harder to tell if someone is a bot because you can't see all of their comments in one place. So, it's actually more insidious: https://medium.com/data-for-democracy/sockpuppets-secessionists-and-breitbart-7171b1134cd5 I don't think people are leaving FB because of this, but the constant drumbeat of people calling each other "shills" in the comments under news articles indicates rising suspicion. I saw that kind of language regularly in conspiracist communities but it seems increasingly pervasive in more mainstream forums. (Since FB data is so locked down, it's hard to say that authoritatively.)
Do you really interact with bots on Facebook? I only get to watch distant relatives call each other names because of politics/power concerns...
If you ever get fake news garbage that shows up on your newsfeed, odds are good there was something promoted or pushed by a bot. It may have been shared by a friend, but where did they get it from? Facebook's algorithms for what content to show you and others are being gamed, either directly or indirectly, to promote the content.
Bot post 0.o
"Bots and sockpuppets can be used to manipulate conversations, or to create the illusion of a mass groundswell of grassroots activity, with minimal effort." I liked this.
tbh I think it's more interesting if we just leave humans out of it and let bots do their own thing. Elaborating on this might be breaking some taboo of relevance, but I find it fascinating to stretch the domain of technologies that are currently exploitable (i.e. customer service automation)...
"Multi-Agent Cooperation and the Emergence of (Natural) Language" https://arxiv.org/abs/1612.07182
"Emergence of Grounded Compositional Language in Multi-Agent Populations" https://arxiv.org/abs/1703.04908
"Learning to Communicate with Deep Multi-Agent Reinforcement Learning" https://arxiv.org/abs/1605.06676
making your own experiments: https://github.com/facebookresearch/ParlAI/blob/master/parlai/core/worlds.py#L247
More recently I've been fascinated with the question: "what’s the difference between some learning algorithm controlling blocks of a program, versus blocks of symbols that get grounded in to natural language?"
(1) algorithms that control blocks of a program are becoming known as "neural programmer interpreters" https://arxiv.org/abs/1704.06611 (ICLR best paper award 2017)
(2) algorithms that control blocks of symbols ("compositional symbolic language") are like the papers I initially linked to.
People are beginning to find relationships between (1) and (2):
"Program Synthesis for Character Level Language Modeling"
https://openreview.net/forum?id=ry_sjFqgx&noteId=ry_sjFqgx
"Learning A Natural Language Interface with Neural Programmer"
https://openreview.net/pdf?id=ry2YOrcge
"Translating Neuralese" https://arxiv.org/abs/1704.06960
While these answers are parametrically grounded, they are but toy academic problems. But I believe they point in the direction of how machine learning can encroach on our methods of thought, especially in relation to the end goal of Solomonoff Induction:
http://lesswrong.com/lw/dhg/an_intuitive_explanation_of_solomonoff_induction/
Buyers - the most effective regulators - beware.
What makes you think buyers can fix systemic problems and externalities like this? We can't even filter out information we KNOW is wrong; http://www.danielgilbert.com/Gilbert%20et%20al%20(EVERYTHING%20YOU%20READ).pdf - The study shows that people are influenced by information; even when they are explicitly told the facts written in red were false, those "facts" still influenced their judgment.
Thanks. Interesting piece that provides a good summary of the state of social media. One issue that I think needs to be considered when discussing the reticence of social media organisations to address the issues that you've focussed on is the future of news media. While it's easy to position the Silicon Valley crowd as “the free speech wing of the free speech party”, I wonder how seriously they are thinking about long-term market share when considering acts of moderation.
While the current spambot army is an issue for the social media organisations themselves, the climate that those bots help create has arguably been far more destructive to traditional MSM. Surely some of the decision-making, or lack thereof, on the part of Twitter, Facebook etc is due to having one eye on the news revenue pie in the future?
Are these social media companies capable of moderating for disingenuous content?
If we assume the answer is yes, it follows that the next question asked is: What do they have to gain by not doing so?
Even though it is only the analogy, it seems odd to blame the various stock market crashes on algos, when the most damaging events have been human initiated bubbles and crashes.
However, the argument also falls short on the journalism side. Political parties have been pushing out misinformation since they began. Dramatic headlines to drive attention are now more empirically designed via A/B testing, but they were always a goal. To put Nov. 16 as a point when we noticed this ignores the extreme level to which the mass media itself has been biased prior to the election, and reveals an underlying preference. It had long ago reached the point where supposedly unbiased sources had already lost credibility with large swaths of the electorate, which many Democrats did not notice because the propaganda was usually slanted their way. So many on the left believe that poor people on the right have been duped into supporting economic policies that give them smaller government handouts, when that may be their legitimate policy preference based on their principles being more important than their purses. Those on the right ended up creating their own news sources where they didn't have to hear their views insulted.
As a challenging example of how the government would need to regulate, the NY Times continues to publish the views of Paul Krugman, masquerading as social science when they are nothing but political propaganda, made to mislead. I would suggest that if an algorithm fails to label him in a category to be disregarded as politically motivated speculation, it is probably being designed in a biased fashion. Labeling him as "opinion" is not enough. It's funny that you mention Metcalfe's law, as that is what he is often mocked for getting wrong in 1998, claiming the internet would be about as important as the fax machine. Of course, it is probably lower in impact on the list of things he has gotten horribly wrong, as most of the ones where he had more influence were in the field of economics. His Nobel award would probably lead most algorithms to label him a trustworthy expert in that subject, but even that was awarded for largely political reasons! If something manages to flag the latest Alex Jones nonsense and Krugman, we'll know we're getting somewhere.
Regulation in this area is really scary though- putting the government in charge of approving media is real fascism and authoritarianism. It is more than a matter of labeling things as true or false, even the fact checkers fall under attack when a meme about the AHCA not covering things turns out to not be true.
The media had the role of trying to determine where to focus our attention. However, it has fallen behind social media, which has scaled up the inherent desire in many for gossip, hearsay, and rumor. The search for the breaking news is pitiful. "Breaking" on CNN this afternoon was that they were "waiting for a Comey memo". You break into regular news to wait for something? Well, perhaps if your objective is to keep the focus on damaging the guy you don't like. Meanwhile, on Fox News, they are reporting on a months old court ruling to take the opportunity to drag the guy they don't like (Obama) through the mud. It's not bots that are the problem here, it's us.
http://media.riffsy.com/images/7b98ac3b5ad87c27de53b3c6b5cdef0d/tenor.gif
This is late, and unlikely to get a response, but I can't pass it up.
"If something manages to flag the latest Alex Jones nonsense and Krugman, we’ll know we’re getting somewhere"
I'm very interested in the idea of correlating someones method of arguing or reasoning to their degree of "good faith", and would be very interested in what such a "something" might look like.
P.S. What I'd like to see is any hint of an algorithm, however slight, applied to examples of Alex Jones' and Paul Krugman's writing.
I use bots to find other humans with similar interests because the current social media platforms are obsessed with replicating already existent real world connections to secure identity for accurate ad accounting. Who really wants to hang out with their angry racist family members on the internet? Error.
"Media and analysts, meanwhile, simplify the story to make a very complex issue more accessible, creating a boogeyman in doing so: high-frequency trading (HFT)...about as precise as “fake news.” [meaning]: extremely rapid orders, a high quantity of orders, and very short holding periods. Some HFT strategies, such as market making and arbitrage, are net beneficial because they increase liquidity and improve price discovery. But other...involve intentional, deliberate, and brazen market manipulation [like][ quote stuffing, which involves flooding specific instruments (like a particular stock) with thousands and thousands of orders and cancellations at rates that exceed bandwidth capabilities. The goal is to increase latency and cause confusion among other participants in the market. Another example is spoofing, placing bids and offers with the intent to cancel rather than execute, and its advanced form, layering, where this is done at several pricing tiers to create the illusion of a fuller order book (in other words, faking supply and/or demand). The goals of these strategies is to entice other market participants — including other algorithms — to respond in a way that benefits the person running the manipulation strategy. People are creative. And in the early days of HFT, slimy people could do bad things with relative ease."
I remember this era and the solution to it was entirely obvious. It's quite easy to tell what is an arbitrage trade or market making, if you have any metric at all to describe the liquidity & price spread of the market. That is, you can tell when a gap is being filled and thus the market being made more liquid. You can also tell, based upon the holding time, whether human judgment of any kind was involved - obviously fundamental analysis doesn't change so much in two seconds between buying a stock and then selling it, so if it's not arbitrage or market making, it's speculation such as momentum trading. In other words, not something that deserves any special reward, and maybe something that should be taxed. The media focus on "high frequency" was not entirely wrong but that definition included some beneficial actions that were defensible, so the financial barons gained decades of immunity from new taxes on what is essentially gambling & volatility increasing. Also they focused on "derivatives" ignoring that these are necessary for stabilizing prices received by for instance farmers' co-ops. Depth of derived layers of instrument & frequency of trading are mere indicators of what is speculative gambling (which we tax very heavily in all other contexts) and what is insurance or hedging or liquidity guaranteeing (that should be charged less or not at all). We have avoided the question of how to tell one from the other for at least three decades... and it has to end.
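The holding-time-plus-liquidity heuristic described here can be sketched directly. The categories and the sixty-second threshold below are invented for illustration, not an actual surveillance rule:

```python
def classify_trade(holding_seconds, fills_gap):
    """Toy version of the commenter's test: a trade that fills a gap in
    the book looks like market making or arbitrage (liquidity-adding);
    a sub-minute round trip that fills no gap looks like speculation,
    since fundamentals don't change in seconds."""
    if fills_gap:
        return "liquidity-adding"
    if holding_seconds < 60:
        return "speculative"
    return "investment"

print(classify_trade(2, fills_gap=False))       # 'speculative'
print(classify_trade(2, fills_gap=True))        # 'liquidity-adding'
print(classify_trade(86_400, fills_gap=False))  # 'investment'
```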
Consider this: If every trade was taxed at a rate proportional to how long the instrument had been held, and the tax only exempted or reduced if a risk disclosure was made that revealed a liquidity, market making, hedging fundamental cashflow or other insurance type reason for the transaction... we would at least end up with a very large database of disclosed tax exemption claims directly from financial databases, in the hands of regulators who (even if no one else could ever see them, which I think would maybe be needed) could then at least understand the structure of the marketplace and the systemic risks. Also, those entities that chose not to disclose their reasons for a transaction or the underlying risk they were hedging, would at least be marked out as potentially risky players and perhaps (if say 80% of their transactions were of this nature) denied certain bailout or other protections should something go wrong - or just have to keep paying a higher tax rate than anyone who did disclose. In other words, put a dollar value on the disclosure itself, even if inaccurate, even if forged, even if synthesized to fool regulators, etc., on the sheer faith that the data itself would slowly reveal what is going on, and that the honest players could drive out the dishonest after a few years of being found lying about things *LESS*.
Yes this might imply committing to a common capital asset model for valuating the cashflows, for reporting purposes only, but not necessarily - it's only enough that the traders reveal that they used such a model and perhaps disclose the scenarios they considered (but not the weighting they used necessarily). Then we would again have a very good idea in the regulatory database of scenarios considered versus not, and whether legal requirements to consider tougher stress tests should be considered to stabilize the market.
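The disclosure-discounted transaction tax proposed above might look something like this; every rate and the halving schedule are made up for illustration:

```python
def transaction_tax(notional, holding_seconds, disclosed=False,
                    base_rate=0.005, floor_rate=0.0001):
    """Toy schedule for the proposal above: the rate falls as holding
    time grows (halving per doubling of holding time), and a filed
    risk disclosure (hedging, market making, liquidity provision)
    earns the floor rate outright. All numbers are invented."""
    if disclosed:
        return notional * floor_rate
    rate = base_rate
    t = max(holding_seconds, 1.0)
    while t >= 2 and rate > floor_rate:
        rate /= 2
        t /= 2
    return notional * max(rate, floor_rate)

# A two-second momentum flip pays 25x what a disclosed hedge pays.
print(transaction_tax(1_000_000, 2))                  # 2500.0
print(transaction_tax(1_000_000, 2, disclosed=True))  # 100.0
```

The point isn't the exact schedule; it's that non-disclosure itself carries a price, which over time builds exactly the regulatory database described above.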
Regarding politics, there are analogies that are obvious to me, but until we get the regime in place for finance, it's not likely we'll get any kind of systemic disclosure of bot activity or transparency. A capital asset model for even quite intangible assets is possible but to actually nail down "why" a political actor makes a declaration is impossible. One can't tell sincere advocacy by an idiot, from trolling.