9.4. Facebook Network Assignment

In this assignment, you will do an analysis of your own Facebook friends graph.

No, you won’t. On April 30, 2015, Facebook removed friendship connections from its APIs altogether. This means no one can download their own friend graph any more. However, Facebook graphs can still be studied. Use the using_networkx IPython notebook zipfile, which includes Facebook data from the Stanford networks pages. You may also wish to visit the Stanford Facebook data page yourself for more information about the data.

For various aspects of this assignment, you should have a look at the using_networkx IPython notebook zipfile to see an example of computing information about networks in networkx. Note that the notebook also includes details about how to display networks in networkx, but displaying the network is not part of the assignment.

  1. The SNAP data includes a collection of 10 ego-network graphs. After you execute the first cell in the using_snap notebook, you will have expanded the Stanford archive in a subdirectory of your notebook directory named “facebook”. Each of the 10 ego graphs in that directory is represented in several files. The code in the using_snap notebook reads one of those ego graphs into Python, along with information about the individuals in that graph. You choose which ego network. You can choose any ego but 0.

  2. Using your networkx tools, compute three things for your chosen ego graph: the Facebook friend with the highest degree centrality, the number of components in the ego network (nx.connected_components), and the average clustering coefficient of the largest component. All of these are illustrated in the using_networkx notebook. For connected components, see the first section of the notebook; for average clustering coefficient, see the section titled The clustering coefficient of a random graph.

  3. Now compute the average clustering coefficient of each of the circles in your network and compare it to the average clustering coefficient of the entire graph.

  4. Structural questions: Is your Facebook graph connected? Judging from the layout that the spring layout algorithm gives you, are there fairly obvious communities in it? Are there people who act as bridges between communities?

  5. Using the Dice_coefficient function defined in the Similarity of friends: Homophily section of the using_snap notebook, find the average similarity of at least one circle to ego.
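The computations in items (2) and (3) can be sketched roughly as follows. This is a minimal sketch, not the notebook's code: it uses networkx's built-in karate club graph as a stand-in for an ego graph, adds an extra edge so that there is more than one component, and uses a made-up `circles` dictionary (circle name mapped to a set of nodes). In your own work, `G` and the circles would come from the SNAP files that the notebook reads.

```python
import networkx as nx

# Stand-in graph (an assumption for illustration); in the assignment,
# G would be the ego graph loaded from the SNAP "facebook" directory.
G = nx.karate_club_graph()
G.add_edge("x", "y")  # extra edge so the graph has a second component

# Item 2a: node with the highest degree centrality.
centrality = nx.degree_centrality(G)
top_friend = max(centrality, key=centrality.get)

# Item 2b: number of connected components.
components = list(nx.connected_components(G))
num_components = len(components)

# Item 2c: average clustering coefficient of the largest component.
largest = G.subgraph(max(components, key=len))
largest_cc = nx.average_clustering(largest)

# Item 3: average clustering coefficient of each circle, compared with
# the whole graph. The circle memberships below are hypothetical.
circles = {"circle0": {0, 1, 2, 3, 7}, "circle1": {8, 30, 32, 33}}
whole_cc = nx.average_clustering(G)
circle_cc = {
    name: nx.average_clustering(G.subgraph(members))
    for name, members in circles.items()
}

print("highest degree centrality:", top_friend)
print("number of components:", num_components)
print("clustering (largest component):", round(largest_cc, 3))
print("clustering (whole graph):", round(whole_cc, 3))
for name, cc in circle_cc.items():
    print(f"clustering ({name}):", round(cc, 3))
```

Taking the subgraph induced by a circle's members means that only edges between members of that circle count toward its clustering coefficient, which is one reasonable reading of item (3).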

To turn in:

  1. Answers to questions (2), (3), (4), and (5) above (in an IPython notebook showing your work).
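For question (5), the idea of averaging a Dice similarity over a circle can be sketched as below. This is an illustration only: the function `dice` here is a generic set-based Dice coefficient, 2|A ∩ B| / (|A| + |B|), and may differ in signature and details from the Dice_coefficient function defined in the notebook. The feature names and circle members are invented.

```python
def dice(a, b):
    """Set-based Dice coefficient: 2|A & B| / (|A| + |B|).

    Hypothetical helper; the notebook's Dice_coefficient may differ.
    """
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty feature sets
    return 2 * len(a & b) / (len(a) + len(b))


# Invented feature sets for ego and for the members of one circle.
ego_features = {"school;50", "locale;127", "gender;77"}
circle_features = {
    "friend1": {"school;50", "locale;127"},
    "friend2": {"gender;77", "school;12", "locale;9"},
    "friend3": set(),
}

# Average similarity of the circle's members to ego.
similarities = [dice(ego_features, f) for f in circle_features.values()]
average_similarity = sum(similarities) / len(similarities)
print(round(average_similarity, 3))
```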

Some further (optional) ideas:

  1. Sort the friends in the graph by degree centrality and by betweenness centrality.

  2. Graph the degree distribution of your ego network using your favorite graphing utility. A degree distribution graph is a bar graph with degree on the x-axis and number of nodes with that degree on the y-axis. So it’s a kind of histogram.

  3. Create a pandas dataframe from an ego network. Each row should be one of the friends in that ego network. As a starting point, each column should be one of the features available in the graph, although many rows will have NA values for some features. There are some good variables to analyze, including gender and circle memberships. The idea is to look for variables that correlate to some degree with centrality.

  4. Draw your graph using the networkx drawing tools. For help with this, once again go to the using_networkx IPython notebook zipfile. For coloring communities, see the section entitled Adding communities. You will need as many colors as there are communities in the output of the algorithm. For help with reducing label clutter, see the Playing with your ego network section, which shows how to display only a subset of your labels, or see the following section on how to make your network mousable. You can turn in one or more saved images of your graph. If you do a mousable version, you can save different versions with different communities turned on.

  5. Compute the clustering coefficient of your network using nx.average_clustering. Also compute the average path length using nx.average_shortest_path_length. Discuss how this bears on your Facebook network being a small world. See the discussion of small worlds in the online text chapter on Social Networks. Note: To do a good job on this question, you will need to look up the original paper by Watts & Strogatz [WATTS1998], which will tell you how to give a quantitative answer. [For a final project, this item can be done instead of item 4.]
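As a starting point for optional items (2) and (5), here is a minimal sketch that tabulates a degree distribution and computes the two small-world quantities. It again uses the karate club graph as a stand-in for an ego graph. The random-graph baselines used for comparison (C_rand ≈ k/n and L_rand ≈ ln n / ln k, where k is the mean degree) are the approximations discussed by Watts & Strogatz; check the paper itself before relying on them in your answer.

```python
import math
from collections import Counter

import networkx as nx

# Stand-in graph (an assumption); replace with your ego graph.
G = nx.karate_club_graph()

# Optional item 2: degree distribution, degree -> number of nodes.
degree_counts = Counter(d for _, d in G.degree())
for degree in sorted(degree_counts):
    print(degree, degree_counts[degree])
# These pairs can be passed to any bar-chart utility
# (degree on the x-axis, node count on the y-axis).

# Optional item 5: average path length is only defined on a connected
# graph, so restrict to the largest component.
giant = G.subgraph(max(nx.connected_components(G), key=len))
n = giant.number_of_nodes()
k = 2 * giant.number_of_edges() / n  # mean degree

C = nx.average_clustering(giant)
L = nx.average_shortest_path_length(giant)

# Random-graph baselines for a graph with the same n and mean degree k.
C_rand = k / n
L_rand = math.log(n) / math.log(k)

print(f"C = {C:.3f}, C_rand = {C_rand:.3f}")
print(f"L = {L:.3f}, L_rand = {L_rand:.3f}")
```

Roughly, a network looks like a small world when L is close to L_rand while C is much larger than C_rand.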