How would you build a friend recommendation system for Facebook?

Prashant Jha
Jun 20, 2021

Written in collaboration with Nurul Huda and Shubham Sharma.

Social media are interactive technologies that allow the creation and exchange of information, ideas, career interests, and other forms of expression via virtual communities and networks. In short, they allow people to share content quickly, efficiently, and in real time. While many people access social media through smartphone apps, this communication tool started on computers, and the term can refer to any internet communication tool that allows users to broadly share content and engage with the public. The ability to share photos, opinions, and events in real time has transformed the way we live and the way we do business. Examples of social media include Facebook, Twitter, and Instagram.

Social media platforms like Facebook provide tremendous benefits to users. However, they can also harm users. Some of the advantages and disadvantages of using social media are given below.

Advantages:

  • Connectivity, Information, and Updates: People anywhere can connect with anyone, regardless of location or religion. The beauty of social media is that you can connect with anyone to learn and share your thoughts. Another main advantage is that you can keep up with the latest happenings around the world. Television and print media can be biased and may not convey the full picture; with the help of social media, you can do some research of your own to check the facts.
  • Education and Awareness: Social media creates awareness and changes the way people live. It has helped people discover new and innovative things that can enhance their personal lives. From farmers to teachers, students to lawyers, every member of society can benefit from social media and the awareness it creates. It has many benefits for students and teachers in particular. It is very easy to learn from experts and professionals via social media: you can follow anyone to learn from them and deepen your knowledge of any field. Regardless of your location and educational background, you can educate yourself without paying for it.
  • Help and Noble Causes: You can share your problems with the community to get help and guidance. Whether you need money or advice, you can get it from the community you are connected with. Social media can also be used for noble causes, for example to promote an NGO, social welfare activities, or donations for people in need. It is a really quick way to collect funds for those who need them.
  • Helps Governments and Agencies Fight Crime: It helps governments and security agencies monitor and catch criminals.
  • Promoting Business: Social media makes promoting a business less expensive, and therefore the business more profitable, because a large share of business expenses goes to advertising and promotion. Positive reviews and comments from customers can help a company with sales and goodwill.

Disadvantages:

  • Cyberbullying: According to a report published by PewCenter.org, most children have been victims of cyberbullying at some point. Since anyone can create a fake account and act without being traced, it has become quite easy to bully people on the Internet. Threats, intimidating messages, and rumors can be sent to the masses to create discomfort and chaos in society. According to this research, approximately 1 in 20 students in cyberbullying cases commit suicide.
  • Hacking and Security Issues: Personal data can easily be stolen and shared on the Internet, which can cause financial losses and damage to a person’s private life. Similarly, identity theft is another issue: attackers who hack personal accounts can cause financial losses to anyone. Several personal Facebook accounts have been hacked in the past, and the hackers posted material that affected the account owners’ personal lives.
  • Addiction: The addictive side of social media is very harmful and can disturb personal lives as well. Teenagers are the most affected by addiction to social media. They get deeply involved and are eventually cut off from society. It can also waste time that could have been spent on productive tasks and activities.
  • Reputation: Social media can easily ruin someone’s reputation when a false story is created and spread across it. Similarly, businesses can suffer losses when a bad reputation spreads over social media.

Having discussed the advantages and disadvantages of social media, let’s look at how a social media platform like Facebook works. It starts with a user creating a profile, usually by providing a name and an email address. Once a profile has been created, users can create and share content. In addition to creating content for their own profile, Facebook users can find other users whose content they want to follow or comment on. A user may “follow” another user, add them as a “friend,” or “subscribe” to another user’s page. The platform often uses “feeds” that allow users to scroll through content. Facebook uses algorithms, based on a user’s profile data, to determine which content appears and the order in which it appears.

One of the most important goals of Facebook is to help users find their friends and connect with more people. This is done using a recommendation system built on the friends-of-friends methodology. It can draw on friends in common, similar age, geographic location, school information, workplace, etc. Friends are recommended based on the people a user searches for, the people who searched for the user, the number of mutual friends, group affiliation, etc.

There are many ways to recommend friends to a user. Some of the methods are:

Content-based approach

1. The Heuristic-Based recommender system includes TF-IDF (information retrieval) and Clustering.

2. Model-based recommender systems include Bayesian Classifiers, Clustering, Decision trees, Artificial Neural Network, etc.

Collaborative-based approach

1. The Heuristic-Based recommender system includes Nearest neighbor (cosine, correlation), Graph theory, and Clustering.

2. Model-based recommender systems include Bayesian Network, Clustering, Probabilistic models, Linear regression, Artificial Neural Network, etc.

Hybrid approach

1. The Heuristic-Based recommender system includes a linear combination of predicted ratings, various voting schemes, and incorporating one component as a part of the heuristic for the other.

2. Model-based recommender systems include incorporating one component as a part of the model for the other and building one unifying model.
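To make one entry from this list concrete, here is a minimal sketch of the nearest-neighbor idea from the collaborative, heuristic-based category. It assumes each user is represented only by the set of people they are already friends with; the `graph` dictionary, the user names, and both function names are hypothetical and used only for illustration.

```python
import math

def cosine_similarity(friends_a, friends_b):
    """Cosine similarity between two users' binary friend-membership vectors.

    For 0/1 vectors this reduces to |A ∩ B| / sqrt(|A| * |B|).
    """
    if not friends_a or not friends_b:
        return 0.0
    return len(friends_a & friends_b) / math.sqrt(len(friends_a) * len(friends_b))

def nearest_neighbors(graph, user, k=2):
    """Rank every other user by cosine similarity to `user`, highest first."""
    scores = [
        (cosine_similarity(graph[user], graph[other]), other)
        for other in graph
        if other != user
    ]
    # Sort by descending similarity, breaking ties alphabetically.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scores[:k]]

# Hypothetical data: user -> set of existing friends.
graph = {
    "ana": {"ben", "cara", "dev", "eli"},
    "fay": {"ben"},
    "gus": {"ben", "cara", "dev", "eli"},
    "hal": {"zed"},
}

print(cosine_similarity(graph["ana"], graph["fay"]))  # 0.5
print(nearest_neighbors(graph, "ana"))  # ['gus', 'fay']
```

Users whose friend lists overlap heavily score close to 1, and users with no friends in common score 0.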

Approach to recommend new friends

Now let’s talk about how we can build a friend recommendation system. We will calculate a similarity score for each pair of users. For simplicity’s sake, as a start, we can use a distance metric: the Euclidean distance between two users’ feature vectors serves as the closeness score, where a smaller distance means the two users are more similar.
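As a rough illustration of the closeness score, the sketch below computes the Euclidean distance between two users. The three-element feature vectors are made-up numbers chosen so the arithmetic is easy to follow.

```python
import math

def euclidean_distance(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical users, each described by three numeric features.
alice = [20.0, 10.0, 5.0]
bob = [23.0, 14.0, 5.0]

# Differences are 3, 4 and 0, so the distance is sqrt(9 + 16 + 0).
print(euclidean_distance(alice, bob))  # 5.0
```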

We cannot calculate the distance between every pair of users, because not everyone is a plausible friend for everyone else. We will consider only the friends of a user’s existing friends as candidates for recommendation.
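One way to sketch this candidate step is with a dictionary that maps each user to the set of their friends. The `graph` data and the function name below are hypothetical.

```python
def friend_of_friend_candidates(graph, user):
    """Return friends of the user's friends who are not already friends."""
    friends = graph.get(user, set())
    candidates = set()
    for friend in friends:
        candidates |= graph.get(friend, set())
    # Drop the user themselves and anyone they are already friends with.
    return candidates - friends - {user}

# Hypothetical friendship graph: user -> set of friends.
graph = {
    "ana": {"ben", "cara"},
    "ben": {"ana", "dev"},
    "cara": {"ana", "dev", "eli"},
    "dev": {"ben", "cara"},
    "eli": {"cara"},
    "fay": set(),
}

print(sorted(friend_of_friend_candidates(graph, "ana")))  # ['dev', 'eli']
```

Here "fay" is never considered for "ana" because the two share no friend, which keeps the number of distance computations small.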

But now the question is: how small should the distance between two people be before we recommend them as friends? We have to decide on a threshold distance; if the distance to a candidate is below that threshold, we can recommend them as a friend. The threshold can be set to the average distance between a user and their existing friends, which tells us how close a person is, on average, to the friends they already have.
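The thresholding rule could look roughly like the following. It is a minimal sketch that uses `math.dist` (Python 3.8+) for the Euclidean distance; the feature vectors, user names, and function names are assumptions made for the example.

```python
import math

def average_friend_distance(features, friends, user):
    """Mean distance from `user` to each existing friend (the threshold)."""
    distances = [math.dist(features[user], features[f]) for f in friends]
    return sum(distances) / len(distances)

def recommend(features, friends, candidates, user):
    """Return candidates closer than the average existing friend, nearest first."""
    if not friends:
        return []  # no existing friends, so no threshold can be computed
    threshold = average_friend_distance(features, friends, user)
    scored = sorted((math.dist(features[user], features[c]), c) for c in candidates)
    return [c for distance, c in scored if distance < threshold]

# Hypothetical two-dimensional feature vectors.
features = {
    "ana": [0.0, 0.0],
    "ben": [3.0, 4.0],   # existing friend, distance 5 from ana
    "cara": [0.0, 1.0],  # existing friend, distance 1 from ana
    "dev": [0.0, 2.0],   # candidate, distance 2 from ana
    "eli": [6.0, 8.0],   # candidate, distance 10 from ana
}

# The threshold is (5 + 1) / 2 = 3, so only "dev" qualifies.
print(recommend(features, {"ben", "cara"}, {"dev", "eli"}, "ana"))  # ['dev']
```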

That is fine, we have our algorithm sorted, but we’re not done yet. To calculate the distance between two users, we need some features. What features can we use? Which features indicate whether two people could be friends? Here are some examples:

  • User’s City/Town (We can use geographic coordinates here, so it will act as a continuous variable.)
  • The school the user attends (Here too we can use coordinates.)
  • The ratio of Male and Female friends out of existing friends.
  • User’s age.
  • User’s Grade/Class/Subject/Stream.

These are examples of user-specific features that we can use to represent each user as a vector and to calculate the distance between two users.
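A possible way to turn such a profile into a vector is sketched below. The profile keys are hypothetical, and the min-max scaling step is an added assumption rather than part of the original approach: without some rescaling, a feature with a large numeric range (such as longitude) would dominate the Euclidean distance. A categorical feature such as grade or stream would need its own encoding and is left out here.

```python
def to_vector(profile):
    """Flatten a hypothetical profile dict into a numeric feature vector."""
    return [
        profile["age"],
        profile["city_lat"], profile["city_lon"],
        profile["school_lat"], profile["school_lon"],
        profile["female_friend_ratio"],
    ]

def min_max_scale(vectors):
    """Rescale each column to [0, 1] so no single feature dominates the distance."""
    columns = list(zip(*vectors))
    lows = [min(column) for column in columns]
    highs = [max(column) for column in columns]
    return [
        [(x - lo) / (hi - lo) if hi > lo else 0.0
         for x, lo, hi in zip(vector, lows, highs)]
        for vector in vectors
    ]

# Made-up profiles for three users.
profiles = [
    {"age": 18, "city_lat": 10.0, "city_lon": 70.0,
     "school_lat": 10.0, "school_lon": 70.0, "female_friend_ratio": 0.25},
    {"age": 20, "city_lat": 20.0, "city_lon": 75.0,
     "school_lat": 20.0, "school_lon": 75.0, "female_friend_ratio": 0.5},
    {"age": 22, "city_lat": 30.0, "city_lon": 80.0,
     "school_lat": 30.0, "school_lon": 80.0, "female_friend_ratio": 0.75},
]

scaled = min_max_scale([to_vector(p) for p in profiles])
print(scaled[1])  # [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]
```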

We can improve this system further by also using pairwise features, that is, features that describe the relationship between two users who are not connected yet. For example:

  • The number of mutual friends.
  • How many times both liked a common friend’s post.
  • How many times both commented on a common friend’s post.
  • How many times both were tagged in a single post.
  • Common hobbies/interests.
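Two of these pairwise features can be computed with plain set intersections, as in the sketch below. The `graph` and `interests` dictionaries are hypothetical; the like, comment, and tag counts would come from activity logs that are not modeled here.

```python
def pairwise_features(graph, interests, a, b):
    """Count mutual friends and shared interests for an unconnected pair."""
    mutual_friends = graph[a] & graph[b]
    common_interests = interests[a] & interests[b]
    return {
        "mutual_friends": len(mutual_friends),
        "common_interests": len(common_interests),
    }

# Hypothetical data: user -> friends, and user -> hobbies.
graph = {
    "ana": {"ben", "cara"},
    "dev": {"ben", "cara"},
    "eli": {"cara"},
}
interests = {
    "ana": {"chess", "hiking"},
    "dev": {"chess", "music"},
    "eli": {"cooking"},
}

print(pairwise_features(graph, interests, "ana", "dev"))
# {'mutual_friends': 2, 'common_interests': 1}
```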

We can train a classification model that uses the above pairwise features, along with the output of the recommendation system, as independent features. This will be a binary classification model that predicts whether two users are likely to become friends or not.
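The text does not name a specific model, so the sketch below uses logistic regression trained by plain gradient descent as one simple stand-in for a binary classifier. Everything in it is an assumption made for illustration: the three features (mutual friends, shared likes, and the distance score from the earlier step), the tiny hand-made training set, and the learning rate and epoch count.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(weights, bias, row):
    """Predicted probability that the pair described by `row` become friends."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, row)))

def train_logistic_regression(X, y, lr=0.1, epochs=2000):
    """Fit weights and a bias by batch gradient descent on the log loss."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for row, label in zip(X, y):
            error = predict_proba(weights, bias, row) - label
            for j, x in enumerate(row):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias

# Hypothetical rows: [mutual_friends, shared_likes, distance_score].
X = [
    [8, 5, 0.2], [6, 4, 0.3], [7, 6, 0.1],  # pairs that became friends
    [0, 0, 0.9], [1, 0, 0.8], [0, 1, 0.7],  # pairs that did not
]
y = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic_regression(X, y)
likely = predict_proba(weights, bias, [9, 6, 0.1])
unlikely = predict_proba(weights, bias, [0, 0, 0.9])
print(likely > 0.5, unlikely < 0.5)
```

In practice the labels would come from historical data on which recommended pairs actually connected, and a library implementation would replace this hand-written training loop.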

CONCLUSION

In the end, Facebook’s friend-recommendation system isn’t magic or malice, just really good math.

So a natural question that might pop up is, “Why go through so much trouble instead of letting users search for themselves?” A simple answer is business impact: this lets Facebook advertise to similar users and generate more revenue.

One other major concern is Privacy.

Apart from the city and school that users list on their own profiles, the features we used avoid location tracking, third-party apps, cookies, etc. Data like this has been a major user concern and has eroded trust in the app before. In 2016, when Vox ran an article about People You May Know (PYMK), Facebook said it did not collect text and call data from users. Two years later, in March 2018, the company admitted that it does collect this data from some Android users via the Messenger app. In 2017, sex workers feared for their safety when PYMK recommended that their clients add them on the app. A year earlier, a psychiatrist’s patients were recommended to one another as friends.

So we refrained from using features that may threaten users’ privacy.

After deploying our model, we can test its accuracy by measuring, out of all the friends recommended to a user, how many of them the user actually sent a friend request to.
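This check amounts to an acceptance rate, which could be computed as in the sketch below. The function name and the sample lists are hypothetical.

```python
def acceptance_rate(recommended, requested):
    """Fraction of recommended users to whom the user actually sent a request."""
    recommended = set(recommended)
    if not recommended:
        return 0.0
    return len(recommended & set(requested)) / len(recommended)

# Hypothetical log for one user: four recommendations, three requests sent,
# two of which went to recommended people.
recommended = ["dev", "eli", "fay", "gus"]
requested = ["dev", "gus", "hal"]

print(acceptance_rate(recommended, requested))  # 0.5
```

Averaging this value over many users gives a single number that can be tracked after each change to the model.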
