Providing online news and content for millions of users in China, Bytedance’s flagship app Jinri Toutiao (translated as “Today’s Headlines”) doesn’t require an editor-in-chief to lead its content strategy like other news platforms do, according to company founder and CEO Zhang Yiming. While the news aggregator app with around 115 million daily active users does have an executive editor whose job is to make sure the content on the app complies with China’s internet content regulations, Zhang insists that the best way to manage content is “not to interfere.”
The gap created by the absence of human interference is filled by the company’s artificial intelligence and deep-learning algorithms that deliver a selection of personalized content to its users.
The app shows an endless feed of posts and videos recommended by its algorithms, all based on the user’s age, sex, location, and personal preferences. As you read posts recommended by the platform, it learns what you like and don’t like by tracking your behavior: what you click to read, what you choose to dismiss, how long you spend on an article, which stories you comment on, and which stories you choose to share. The system then uses this recorded behavior to generate the recommendations that populate your feed.
If the news feed seems infinite, it’s because there are nearly 1.5 million publishers on the Jinri Toutiao platform—as of March 2018—comprising not only news organizations, but also individuals and teams of bloggers pushing out content.
Though criticized by the authorities as being “addictive” and “encouraging vulgar content,” the mechanism contributed massively to the early success of Jinri Toutiao, and therefore Bytedance as well.
The company has replicated the recommendation system in other products such as short-video platform Douyin and its virally popular international version, TikTok. Its success speaks for itself.
According to a person who is familiar with Bytedance’s recommendation system, it was initially based on Google’s Wide & Deep Learning, an open-source approach that jointly trains a wide linear model and a deep neural network. The wide component memorizes specific feature combinations seen in past data, while the deep component generalizes to combinations it has not seen before.
The Wide & Deep Learning system is used for recommendations on Google Play, Google’s Android app store with more than 1 billion active users, and has led to “significant improvement” in app downloads, according to a paper by a group of Google researchers.
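To make the idea concrete, here is a minimal sketch of how a Wide & Deep model scores a single item: the output of a linear model and the output of a small neural network are added together and passed through a sigmoid. It illustrates the general technique only and is not Bytedance’s or Google’s production code. The function names, the single hidden layer, and every weight below are assumptions chosen for illustration; a real model learns its weights from data and typically feeds embeddings into the deep part.

```python
import math


def sigmoid(z):
    """Squash a real-valued score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def dense(weights, biases, inputs):
    """One fully connected layer: each row of weights produces one output unit."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def wide_and_deep_score(wide_features, deep_features, params):
    """Combine a linear (wide) score and a neural (deep) score into one probability."""
    # Wide part: a plain linear model over sparse or crossed features.
    wide_logit = sum(w * x for w, x in zip(params["wide_w"], wide_features))

    # Deep part: a single hidden layer with ReLU, then a linear output.
    hidden = [max(0.0, h) for h in dense(params["hidden_w"], params["hidden_b"], deep_features)]
    deep_logit = sum(w * h for w, h in zip(params["deep_out_w"], hidden))

    # The two parts are summed before the sigmoid.
    return sigmoid(wide_logit + deep_logit + params["bias"])


# Hypothetical hand-picked weights, for illustration only.
example_params = {
    "wide_w": [0.5, -0.25],
    "hidden_w": [[1.0, -1.0], [0.5, 0.5]],
    "hidden_b": [0.0, 0.0],
    "deep_out_w": [1.0, 1.0],
    "bias": -0.9,
}

score = wide_and_deep_score([1.0, 0.0], [0.2, 0.6], example_params)
```

Because both parts feed a single output, a model of this kind can be trained jointly, so the linear part only has to correct for what the network gets wrong.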
“The recommendation system is now Bytedance’s core technology that underpins everything from its news app to its short-video apps,” said the person familiar with the matter.
Under the hood
In January 2018, Bytedance held a meeting to disclose how the algorithms work. The move was in response to pressure from internet watchdogs and state media, which had criticized the Jinri Toutiao app for spreading pornography and allowing machines to make content decisions.
At the meeting, Bytedance’s algorithm architect Cao Huanhuan explained the principles of the recommendation system used by Jinri Toutiao and many of the company’s other apps.
The full text of his speech is available in Chinese.
According to Cao, the system’s main inputs consist of three kinds of data: the content profile, the user profile, and the environment profile.
The content profile contains the categories and keywords of each article, as well as the respective relevance values associated with them.
As an example, Cao highlighted an article on the Jinri Toutiao app about the Russian tennis player Maria Sharapova’s 17th consecutive defeat by American player Serena Williams. The article was allocated to the Sports and Tennis channels; its keywords included “Xiaowei” [Williams' nickname], “Sharapova,” “Wimbledon Championships,” and “semi-final.”
The categorizing was performed by natural language processing, a branch of artificial intelligence that deals with the interaction between computers and humans using natural languages, according to Cao.
The system also recorded the values of the relevance of the categories and keywords to the story. For example, the relevance value of the keyword “semi-final” is 0.7198 while that of “Sharapova” is 0.9282, meaning “Sharapova” is more relevant to the story than “semi-final.”
The profile also included when the article was published, which helps the system to decide when to stop recommending this particular story.
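A content profile of this kind could be represented roughly as follows. This is a sketch under stated assumptions, not the schema Jinri Toutiao uses: the class name, field names, and the one-day freshness cutoff are hypothetical, as is the publication date. Only the two relevance values Cao cited are included, since the values for the other keywords were not given.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ContentProfile:
    """Hypothetical container for the article attributes described above."""
    categories: list[str]
    keywords: dict[str, float]  # keyword -> relevance value
    published_at: datetime

    def top_keywords(self, n: int) -> list[str]:
        """Return the n keywords most relevant to the story."""
        return sorted(self.keywords, key=self.keywords.get, reverse=True)[:n]

    def is_fresh(self, now: datetime, max_age: timedelta) -> bool:
        """True while the article is young enough to keep recommending."""
        return now - self.published_at <= max_age


article = ContentProfile(
    categories=["Sports", "Tennis"],
    keywords={"semi-final": 0.7198, "Sharapova": 0.9282},
    published_at=datetime(2018, 1, 1, 9, 0),  # hypothetical timestamp
)

most_relevant = article.top_keywords(1)
still_fresh = article.is_fresh(datetime(2018, 1, 1, 21, 0), timedelta(days=1))
```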
The user profile consists of a series of characteristics for each user, such as browsing history, search history, type of device, sex, age, location, and behavioral traits. While characteristics such as sex, age, and location set the tone for the kind of content that should be recommended to the user, other inputs tell the system the user's preferences on specific subjects and themes.
The environment profile is dependent on when and in which scenario the app is used: at work, commuting, traveling, and so on. That is because “people have different preferences in different situations,” said Cao. Other environmental traits include the weather and the user’s internet connection—for instance, cellular networks or Wi-Fi.
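Before a model can use the user and environment profiles, they have to be turned into numeric features. The sketch below shows one common way to do that, one-hot encoding for categorical traits and raw scores for interests. It is an illustration only: Cao did not describe the encoding, and the class names, field names, and feature-key format are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class UserProfile:
    """Hypothetical subset of the user traits listed above."""
    age: int
    sex: str
    location: str
    device: str
    topic_interest: dict[str, float]  # topic -> interest inferred from history


@dataclass
class EnvironmentProfile:
    """Hypothetical subset of the environmental traits listed above."""
    scenario: str  # e.g. "work", "commuting", "traveling"
    network: str   # e.g. "cellular", "wifi"
    weather: str


def build_features(user: UserProfile, env: EnvironmentProfile) -> dict[str, float]:
    """Flatten both profiles into a single name -> value feature map."""
    features = {"age": float(user.age)}
    # Categorical traits become one-hot features such as "network=wifi".
    categorical = {
        "sex": user.sex,
        "location": user.location,
        "device": user.device,
        "scenario": env.scenario,
        "network": env.network,
        "weather": env.weather,
    }
    for name, value in categorical.items():
        features[f"{name}={value}"] = 1.0
    # Interest scores are passed through as numeric features.
    for topic, interest in user.topic_interest.items():
        features[f"interest:{topic}"] = interest
    return features


user = UserProfile(30, "female", "Beijing", "android", {"tennis": 0.8})
env = EnvironmentProfile("commuting", "wifi", "rain")
features = build_features(user, env)
```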
The distribution process begins with the system giving a recommendation value to a newly published story based on its quality and potential readership. The higher the value, the more users will see the story in their feeds.
Once published, the story’s recommendation value changes as users interact with it. Positive actions such as likes, comments, and shares increase the story’s recommendation value, which brings more exposure. Negative actions such as dislikes and short reading times decrease the value. The recommendation value also decreases over time.
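The dynamics described here can be sketched as a simple update rule. Bytedance has not published its formula, so everything specific below is an assumption made for illustration: the action weights, the 12-hour half-life, the use of exponential decay, and the floor at zero.

```python
# Hypothetical weights: positive actions raise the value, negative ones lower it.
ACTION_WEIGHTS = {
    "like": 1.0,
    "comment": 2.0,
    "share": 3.0,
    "dislike": -2.0,
    "short_read": -0.5,
}

HALF_LIFE_HOURS = 12.0  # hypothetical: the value halves every 12 hours


def decay(value, hours_elapsed, half_life=HALF_LIFE_HOURS):
    """Shrink the value exponentially as the story ages."""
    return value * 0.5 ** (hours_elapsed / half_life)


def update_value(value, actions, hours_elapsed):
    """Apply time decay, then add the effect of each user action."""
    value = decay(value, hours_elapsed)
    for action in actions:
        value += ACTION_WEIGHTS[action]
    return max(value, 0.0)  # never let the value go negative


# A story starting at 100 loses half its value after 12 hours,
# then gains 1.0 for a like and 3.0 for a share.
after_half_day = update_value(100.0, ["like", "share"], hours_elapsed=12.0)
```

Under these assumptions, a story that keeps attracting likes and shares can offset its decay for a while, but one that stops drawing engagement fades from feeds on its own.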
Recently, Bytedance has moved to commercialize its recommendation tool, packaging the algorithm for use across different product lines and platforms.
Named “ByteAir,” the platform can use big data and machine learning—as well as Bytedance’s experience in news, live-streaming, social, and e-commerce—to create customized recommendation services for Bytedance’s partners, according to its website.
The model is “recommendation as a service,” according to the person we spoke with, who believes that recommendation is a universal demand for online service providers.
“I’m sure they’re gonna have tens of thousands of applications and services that will gladly pay them money to use their recommendation service,” the person said.
“And in doing that, Bytedance will be able to make its recommendation better, because I’m sure part of the contract would require these apps to share their user data with [Bytedance].”