What Should Twitter Do Next? Part 1: Retention
Musk is in a rush to make the $44B acquisition of Twitter profitable, turning around its losses of over $200M in 2021.
His most famous revenue growth measure is Twitter Blue for $8/month.
His most famous cost reduction measure has been cutting human resources costs by at least 90% [1].
Are these going to work? What else should he do?
I’ve built social media and marketplace digital products for over 10 years. I’m currently the Chief Product Officer of Ankorstore, a $2B digital marketplace. I’m also a Twitter creator, with nearly 100k followers. Here’s how I would think about it, and what features I’d build if I were at Twitter today [2].
OK let’s go!
“Improving Twitter” is too broad. You want to break it down into manageable pieces that you can then prioritize and tackle one by one. In social media networks like Twitter, you have three tasks:
User Acquisition: Get new users.
Retention: Keep these new users coming back to the site.
Monetization: Make money out of their time on the site.
Each requires its own strategy. Today’s article is focused on retention, because I think this could have the biggest impact. Parts 2 and 3 will be about monetization and user acquisition. Part 4 will be about how Musk is managing the transition. Subscribe to read them all!
Retention is the core of a social media app. It’s the hardest thing to move, and the biggest determinant of a product’s adoption. If customers love an app, they will stay for a long time and share it with others.
Most of the debate around Twitter retention is around the feed: What to show, and to whom. People discuss freedom of speech, censorship, whether the feed should just show tweets in reverse chronological order, etc.
My guess is that this is the wrong focus.
The feed algorithm is clearly important, but I think it’s improved a lot recently. In the past, it was driven by reverse chronological order: Twitter showed you all the tweets from all the people you follow, from the latest to the oldest. But this has two problems:
It limits the quality of the tweets you see to whomever you follow. Most people don’t follow many other people, so they miss out on the best tweets.
Every person tweets a lot of crap, and only has a few killer tweets. By showing all the tweets from those you follow, you drown the feed in garbage.
This is why the recent Tiktokization of Twitter is great. Twitter started showing tweets from the entire network: If millions like a tweet, odds are you will too.
If you follow somebody, it might also show you viral tweets they liked, or tweets written by people they in turn follow. They also stopped showing you all the tweets from everybody you follow: Too much garbage. The result is a feed that is less specialized than you tailored it to be, but much more entertaining. That doesn’t work for an advanced user, but Twitter has tools for advanced users to tailor their experience.
Personally, I would just keep exploring this Tiktokization of Twitter’s feed. One thing they don’t seem to have explored enough is lookalikes.
Twitter tries to understand the topics you’re interested in. They do that by looking at the tweets you invest in [3]. When recommender systems do that, you end up with a barrage of content on the same topic. Clicked once on an ad for stone-carved lions? Watch these 10,000 ads about stone-carved lions! This gets stale fast.
Another way to find interesting stuff for you is to find lookalikes: People who have a similar behavior to you on the site. Odds are if they like the same specific topics as you, they might be interested in other topics that also interest you that you’ve never encountered before. That would be hugely valuable! You’d discover on Twitter things you didn’t know you wanted. This is the basis of Facebook’s most successful ad recommender system.
Twitter already does some of this. For example, they serve you tweets from people who are followed by those you follow. But if they’re limited to that type of recommendation, they should use more person lookalikes to find you new topics.
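Here is a minimal sketch of the lookalike idea in Python. Everything in it is an assumption for illustration: the `likes` data, the user names, and the choice of Jaccard similarity over liked topics. It is not how Twitter or Facebook actually do it, just the simplest version of "find people who behave like you, then borrow the topics you haven't seen yet".

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two topic sets, from 0.0 (nothing shared) to 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def lookalike_topics(user: str, likes: dict[str, set[str]], k: int = 2) -> list[str]:
    """Suggest topics liked by the k most similar users that `user` has not engaged with."""
    mine = likes[user]
    # Rank every other user by how similar their liked topics are to ours.
    scored = sorted(
        ((jaccard(mine, theirs), other) for other, theirs in likes.items() if other != user),
        reverse=True,
    )
    scores: dict[str, float] = {}
    for similarity, other in scored[:k]:
        if similarity == 0:
            continue  # no shared interests, so not a lookalike
        for topic in likes[other] - mine:
            # A new topic counts for more when it comes from a closer lookalike.
            scores[topic] = scores.get(topic, 0.0) + similarity
    return sorted(scores, key=lambda t: (-scores[t], t))


# Hypothetical data: which topics each user has liked tweets about.
likes = {
    "you": {"geopolitics", "maps", "history"},
    "ana": {"geopolitics", "maps", "history", "urbanism"},
    "bo": {"maps", "history", "sailing"},
    "cy": {"crypto", "football"},
}
suggested = lookalike_topics("you", likes)
print(suggested)
```

In this toy data, "ana" and "bo" are your lookalikes, so their extra topics get suggested, while "cy" shares nothing with you and is ignored. A real system would presumably use much richer behavior than topic likes.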
That said, I wouldn’t focus all my attention on it. Instead, I would focus on replies.
Replies: Twitter’s Main Opportunity
If you think about what makes Twitter good, it’s all the news that you can learn from amazing experts, 24/7. That’s the feed.
What makes it terrible is that debate is impossible. Everything is polarized. It’s a shouting match. That’s the replies.
The core of an intelligent debate is the back-and-forth between two people who learn from each other.
That works so well in person that we take it for granted. Only once we see how debate breaks online do we realize how good we have it in person, and what makes it so much easier. In person:
Conversation is synchronous, which means you go back and forth much faster.
You are in the same physical environment as the other person, so you can relate to them much more easily and respect them more.
The number of people debating is limited by their physical presence.
There are social norms that every person in the room can help enforce, such as stopping hecklers or telling those making noise or problems to stop.
None of these things work online. So if you try to simply port offline debate norms online, they break.
They sort of work on Twitter when a tweet is not very successful. It only gets a few replies, so other viewers can easily see them all, and respond to the intelligent ones.
But that’s not what you should design your system for. You should design it for successful tweets, when millions of people are seeing a tweet asynchronously. Then, you want to pick the very best reaction across all the minds, and you want them to build on top of each other.
Instead, look what Twitter does:
This tweet has probably been seen about 10-20M times. But the first two responses are spammy ads! Then after that, you can’t really see a single valuable reply. This is the fault of the algorithm. Why? How is the algorithm prioritizing the replies?
A cursory look at the replies gives you a sense of the issue. It looks like the main variable to prioritize the replies is by time. The faster a reply, the further up it is.
That makes no sense here: It incentivizes people to respond fast, rather than intelligently. It also means that if you see a tweet after a few days, hours, or even minutes, it’s probably too late for you to share any witty response. So you won’t.
Why should time even matter? Why is chronological order better than reverse chronological order for replies? Couldn’t it be better to see the latest reply, rather than the first?
I believe that time should matter little for the ranking of replies.
What should matter instead? Obviously, the quality of the replies.
If the replies with the most replies, retweets, or likes appeared higher up [4], you’d have several benefits:
People would enjoy the most valuable replies.
Intelligent convos could finally happen: look at a tweet, see the most relevant reply A, open that, see the most relevant reply to reply A, etc.
It would strongly incentivize people to write more valuable replies.
The author would see valuable replies, which would better engage them in debate rather than only voicing their opinion.
The audience could vote on what they want the author to see and respond to.
The most important change that Twitter can make to improve the product is to rank replies by quality.
This is what Imgur does with its comments, and it works quite well.
There would be some downsides of prioritizing replies by quality, but they can be solved.
For example, earlier replies would still have an advantage: The first few would have more time to gather Likes, which would then pull them to the top, which would garner them more responses, which would raise them further.
A way to solve this problem is by ranking replies not by the sheer number of likes or retweets, but by the share of views that get a like or retweet, after a certain low bar. Imagine, for example, that reply A got 100 likes, out of 10,000 people who saw it, for 1% of likes over views. It makes it to the top. Reply B is further down, so it only got 50 views. But it got 10 likes, for a 20% of likes over views. It would be catapulted to the top, get more views, and see if that 20% stands. Issues stemming from this could be solved too.
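To make the arithmetic concrete, here is a minimal Python sketch of that ranking rule. The `Reply` class, the `MIN_VIEWS` value of 30, and the fallback to raw likes below the bar are my own assumptions for illustration, not anything Twitter has published.

```python
from dataclasses import dataclass

MIN_VIEWS = 30  # hypothetical "low bar": below this, the like rate is not trusted


@dataclass
class Reply:
    name: str
    views: int
    likes: int


def like_rate(reply: Reply) -> float:
    """Share of views that turned into a like (0.0 for a reply nobody has seen)."""
    return reply.likes / reply.views if reply.views else 0.0


def rank_replies(replies: list[Reply], min_views: int = MIN_VIEWS) -> list[Reply]:
    """Replies past the bar come first, ordered by like rate; the rest fall back to raw likes."""
    def key(r: Reply) -> tuple[bool, float, int]:
        qualified = r.views >= min_views
        return (qualified, like_rate(r) if qualified else 0.0, r.likes)

    return sorted(replies, key=key, reverse=True)


# Replies A and B are the two from the example above; C has not cleared the bar yet.
replies = [Reply("A", 10_000, 100), Reply("B", 50, 10), Reply("C", 5, 3)]
ranked = [r.name for r in rank_replies(replies)]
print(ranked)
```

Reply B jumps ahead of A because 10/50 = 20% beats 100/10,000 = 1%, while C stays at the bottom until enough people have seen it. A rate computed on 50 views is noisy, so a real system would probably need some smoothing before trusting it; that part is not sketched here.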
A few months ago, Twitter released downvotes to a subset of users exactly as I described: For replies only. Apparently, they were working as intended. I would love to know what their status is, and if there’s no major issue with them, I would release them for everybody, to see how they react. The point of the downvote is that everybody can contribute to downvoting, so if you only test the feature with a few people, you will miss its true potential impact.
Twitter Should Be Your Newspaper
If you think about it, Twitter should be your digital newspaper.
What’s a newspaper? A list of links to the main news events of the day. But news companies can’t do that job well, because:
News outlets don’t have access to all the news of the world.
They don’t know what things people care the most about.
They don’t know what each reader is interested in.
Twitter has all these things.
And yet look at the catastrophic landing page of the #Explore section, the closest thing to a newspaper that Twitter has outside of the feed.
Compare that to a newspaper! This is horrible!
I don’t care about Football. How is that the first link?
I don’t know what Pardon refers to. Why is there no more context?
This is heavy on cryptocurrency. Why? I only follow them a bit.
Why are there no images here?
If you click on any of these trending topics, what Twitter does is literally search for that thing in the search bar, which will expose you to all the flaws of search… In this case, it’s just a random list of tweets, ranked more or less according to recency and some amount of reactions to the tweets… Look at the main article I get when I click on “Die Hard”, the main trending topic from the Trending section.
Utter rubbish. And this is the For you section! Of a company that had over 10,000 employees [5]!
Instead, Twitter should work on a user interface that uses the best practices from every newspaper in the world:
Have headlines, with a hierarchy from the most important one right now, to secondary news.
Add a bit more context below.
Have big images or videos.
And then, Twitter should combine these best practices with its big advantages, the information it has on you, and the huge quantity of its content:
Aggregate these news stories from all the tweets from the entire network.
Personalize better: Create a user profile that is constantly informed by what you read and interact with, the same way Tiktok does. Then only show relevant news to the users.
Explore would become a different sort of feed. That’s good, because today the feed is orders of magnitude better than the Explore section.
Twitter’s search has a lot of potential, but it’s mostly unusable.
One of the issues is what we just saw: Search for a trending topic, and you’re going to land mostly on an irrelevant stream of tweets with that trending topic quoted verbatim. Aren’t there better ways to find tweets relevant to a trending topic? For example, maybe it would be cleverer to look for a cluster of tweets on the trending topic (maybe they quote-tweet each other, with some of these tweets quoting the trending topic or similar terms).
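As a rough illustration of the cluster idea, here is a Python sketch that starts from the tweets mentioning the trending term and then follows quote-tweet links outward. The tweet IDs, texts, and the `quotes` edge list are made-up data, and treating quote-tweets as an undirected graph is my own simplification.

```python
from collections import defaultdict


def topic_cluster(texts: dict[int, str], quotes: list[tuple[int, int]], term: str) -> set[int]:
    """Return the IDs of tweets that mention `term` or are linked to one by quote-tweets."""
    # Build an undirected graph: an edge (a, b) means tweet a quote-tweets tweet b.
    graph: dict[int, set[int]] = defaultdict(set)
    for a, b in quotes:
        graph[a].add(b)
        graph[b].add(a)

    # Seed with the tweets that contain the term verbatim (today's search behavior).
    seeds = [tid for tid, text in texts.items() if term.lower() in text.lower()]

    # Walk the quote-tweet links to pull in related tweets that never use the term.
    seen = set(seeds)
    stack = list(seeds)
    while stack:
        current = stack.pop()
        for neighbor in graph[current]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen


# Hypothetical tweets: only tweet 1 quotes the trending topic verbatim.
texts = {
    1: "Die Hard is a Christmas movie",
    2: "Hard disagree with this take",
    3: "Ranking every 80s action film",
    4: "Football tonight",
}
quotes = [(2, 1), (3, 2)]  # 2 quote-tweets 1, and 3 quote-tweets 2
cluster = topic_cluster(texts, quotes, "Die Hard")
print(sorted(cluster))
```

Tweets 2 and 3 never say "Die Hard", but they belong to the same conversation, so they land in the cluster. Tweet 4 is left out. Ranking inside the cluster would be a separate step.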
This makes me think of Google Maps. Originally, when Google adapted the search results page to location-related searches, they would just show a list of results. But then one day they realized that it made more sense to show them on a map.
Something similar could be done on Twitter. If you’re searching for a specific topic, why should you just see a stream of tweets? Might it not be relevant to see for example a cloud of the tweets and how they relate to each other? Shouldn’t we be able to explore the main tweets that way?
What about the opposite use case, niche tweets? Twitter is equally bad at them. If you want a standard search, Google is likely going to work better. Here’s an example, where I tried to find one of my own tweets. Impossible on Twitter:
And that was just one of many good results that Google presented.
So Twitter should invest in its general search.
It should also make its Advanced Search much more visible and usable. It’s actually quite a good feature, you just need to know where to find it:
It has plenty of options: the author, the words included, whether it was a tweet or a reply, the minimum amount of engagement it got, when it was published…
Twitter should make these options more obvious by showing them as filters on the search results page (SERP). This is a pretty standard feature that many other companies have had for years, including Google. If Google has SERP filters, why not Twitter?
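For what it’s worth, the filtering logic itself is simple. Here is a Python sketch of the Advanced Search options listed above, applied as filters over a result list. The `Tweet` and `Filters` classes, their field names, and the sample data are all hypothetical, and "engagement" is reduced to likes to keep it short.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Tweet:
    author: str
    text: str
    is_reply: bool
    likes: int
    posted: date


@dataclass
class Filters:
    author: Optional[str] = None   # only tweets by this author
    words: tuple[str, ...] = ()    # every word must appear in the text
    include_replies: bool = True   # False keeps original tweets only
    min_likes: int = 0             # minimum engagement
    since: Optional[date] = None   # published on or after this date


def matches(tweet: Tweet, f: Filters) -> bool:
    """True when the tweet passes every active filter."""
    text = tweet.text.lower()
    return (
        (f.author is None or tweet.author == f.author)
        and all(word.lower() in text for word in f.words)
        and (f.include_replies or not tweet.is_reply)
        and tweet.likes >= f.min_likes
        and (f.since is None or tweet.posted >= f.since)
    )


def apply_filters(tweets: list[Tweet], f: Filters) -> list[Tweet]:
    return [t for t in tweets if matches(t, f)]


tweets = [
    Tweet("alice", "Twitter should rank replies", False, 500, date(2022, 11, 20)),
    Tweet("alice", "Twitter search is broken", True, 40, date(2022, 11, 21)),
    Tweet("alice", "Maps are great", False, 900, date(2022, 11, 22)),
    Tweet("bob", "Twitter drafts fail", False, 100, date(2022, 11, 23)),
    Tweet("alice", "Old twitter take", False, 50, date(2021, 1, 1)),
]
filters = Filters(author="alice", words=("twitter",), include_replies=False,
                  min_likes=10, since=date(2022, 11, 1))
results = apply_filters(tweets, filters)
print([t.text for t in results])
```

Each filter defaults to "off", so the same function serves both a plain search and a heavily filtered one.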
Features for the Supply Side
Most of what I’ve described so far is for the demand side of the marketplace: Twitter’s readers. But the supply side (those generating the content that most other people enjoy) also deserves much more attention.
Musk has been very vocal about his bad experience with bots, for example. He is an outlier though, with over 100M followers, so his problems are probably not shared by many. But they’re shared by those who have the most impact on the network, so they matter.
Now that a superstar influencer is at the helm of the company, people like him will get some love, which will include bot reduction. Good. Bots should be chased down.
Another example of supply-side pain is notifications. They’re mostly useless after a certain size of audience. For example, the badge (the blue circle with a number) only goes from “0” to “20+”. That’s not enough: Anybody with enough followers will either get 0 notifications when nothing happens or 20+ as soon as they post something.
Then, the notifications are either the number of people who liked/retweeted your tweet, the number of new followers you got, quote tweets, or responses. You get immediately flooded with these notifications, and can’t tell which ones matter. When I post my most viral tweets, I immediately stop looking at notifications, which is the exact opposite of what I should be doing: Since I’m getting more engagement, that’s when I probably get the most insightful interactions. But I can’t find them in my notifs, so I just don’t pay attention to them.
Filtering notifications by Verified accounts is already a plus. But it shouldn’t be a different feed for these notifications. It should be in the same main feed of notifications, except properly highlighted, like all the other things that should be well highlighted:
Every quote tweet you get with lots of engagement
Every response you get with lots of engagement
Every response you get from somebody you follow, or with whom you’ve interacted before, even if you don’t follow them.
The rest should be further summarized. No need to have a string of notifications with single people liking tweets. In summary, do the same thing with notifications as with the feed and replies: Highlight the relevant ones and better aggregate the ones that matter less.
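Here is a small Python sketch of that rule: highlight the three kinds of notifications listed above and collapse everything else into counts. The `Notification` fields and the `ENGAGEMENT_BAR` of 50 are assumptions I made up for the example.

```python
from collections import Counter
from dataclasses import dataclass

ENGAGEMENT_BAR = 50  # hypothetical threshold for "lots of engagement"


@dataclass
class Notification:
    kind: str                  # "like", "retweet", "follow", "quote", or "reply"
    actor: str
    engagement: int = 0        # likes + retweets on the quote tweet or reply itself
    known_actor: bool = False  # someone you follow or have interacted with before


def is_highlight(n: Notification) -> bool:
    """Quote tweets and replies with lots of engagement, plus replies from people you know."""
    if n.kind not in ("quote", "reply"):
        return False
    return n.engagement >= ENGAGEMENT_BAR or (n.kind == "reply" and n.known_actor)


def triage(notifs: list[Notification]) -> tuple[list[Notification], list[str]]:
    """Split notifications into highlighted items and a one-line-per-kind summary."""
    highlights = [n for n in notifs if is_highlight(n)]
    rest = Counter(n.kind for n in notifs if not is_highlight(n))
    summary = [f"{count} x {kind}" for kind, count in sorted(rest.items())]
    return highlights, summary


notifs = [
    Notification("like", "u1"),
    Notification("like", "u2"),
    Notification("follow", "u3"),
    Notification("reply", "u4", engagement=120),
    Notification("reply", "u5", known_actor=True),
    Notification("quote", "u6", engagement=3),
    Notification("reply", "u7", engagement=1),
]
highlights, summary = triage(notifs)
print([n.actor for n in highlights], summary)
```

Only the replies from u4 and u5 get highlighted here. The other five notifications shrink to four summary lines, which is the whole point when a viral tweet produces thousands of them.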
For me, the biggest pain is drafts. I can’t save them across web and mobile, they frequently fail for threads (especially those with images or videos), and they don’t autosave. I should feel completely safe to write on Twitter, knowing nothing will ever be lost, and that I can schedule my content to go out at any time. Short of that, I’ll use other tools to write my content, and many times I might simply not find the time to post it on Twitter.
Threadbois have figured out ways to consistently hack the algorithm for impressions.
One of the main structures is condensed marketing:
You want X.
Most people fail to get it.
This thread gets you X with very little cost:
There are other formulas for virality, but they’re created by people obsessed with hacking the algorithm, rather than by people who simply have good content.
Twitter could easily invest in AI support to create better tweets. This sounds like a no-brainer premium feature.
Finally, there are lots of limitations with content shared on Twitter. You can only post short videos, for example. Some types of videos don’t work. And most importantly, it appears that adding an external link penalizes your tweets, so people rarely do it. This is a very big problem for advertising, as we’ll see later on.
Takeaways for Retention
In summary, here would be some bets I would make to increase engagement on Twitter, and thus retention, in order of priority:
Rank replies based on users’ reactions.
Optimize the feed based on people’s lookalikes.
Make news exploration more intuitive in the Explore section and for searching topics.
Widely release the downvote button for replies only.
Make search filters accessible on the search engine results page.
Make drafts and scheduling work.
Highlight relevant notifications better and aggregate those that matter less.
Help creators write better tweets with AI support.
Support more content, especially video.
This was Part 1 of a (probably) 4-part series on Twitter’s takeover, focused on retention. Now that users keep coming back to Twitter, it’s time to monetize them! Part 2 will cover Monetization for Twitter. Part 3 will cover User Acquisition. Part 4 will cover the changes in the company. Subscribe to read them!
[2] Everything I say here is based on my observations from outside the company. If I had access to internal data and a history of what the company has tried in the past, these recommendations would surely change. I’m sure people who have worked at Twitter would look at this list and for 75% of the recommendations, they’d say “we tried that and it didn’t work”. The issue then becomes untangling that: What did they actually try? Did that truly embody the bet? How did they measure success? Was it reasonable? Sometimes, retrying something might be the better solution.
[3] Unclear how they do that, but my guess is the algorithm isn’t that great. Compared to Tiktok, Twitter recommends tweets that are frequently underwhelming. My guess is they might use tweet interactions only (clicks, likes, replies, retweets) and not as much more subtle types of interactions (spending more than average time on the tweet, time spent on a video…). There might also be issues with the clustering of the tweets (what topics each tweet belongs to. How many topics do they have? How do they classify them?), and the sources of inputs (are search keywords weighed when assessing whether something is interesting?).
[4] To be clear, Twitter already takes the quality of the replies into account, but not enough.