In this project you will investigate the degrees of separation of Hollywood actors, also known as the Kevin Bacon game. As you may know, Kevin Bacon is a prolific actor who has appeared in many movies. We assign Kevin Bacon himself a Kevin-Bacon-number of 0. Any actor (except Kevin Bacon himself) who has appeared in a movie with Kevin Bacon has a Kevin-Bacon-number of 1. Any remaining actor who has been in the same cast as an actor whose Kevin-Bacon-number is 1 has a Kevin-Bacon-number of 2, and so on.
For example, Meryl Streep has a Kevin-Bacon-number of 1 because she appeared in The River Wild with Kevin Bacon. Nicole Kidman has a Kevin-Bacon-number of 2 because she did not appear with Kevin Bacon in any movie, but she was in Cold Mountain with Donald Sutherland, and Sutherland appeared in Animal House with Kevin Bacon.
Check out the Wiki page on the six degrees of Kevin Bacon. And check out an online version of this game, The Oracle of Bacon.
Generally speaking, the goal of this lab is to:
You may ask, is there anything special about Kevin Bacon? One should be able to compute shortest paths between any two actors, and one should be able to evaluate any actor as a center of Hollywood. Your lab should handle the following:
For Kevin Bacon, you'll see that the average KB-number is much
smaller than you expect. This phenomenon is known as the
Here are various lists of movies and the actors that you'll be using:
'Breaker' Morant (1980)/Fitz-Gerald, Lewis/Steele, Rob (I)/Wilson, Frank (II)/Tingwell, Charles 'Bud'/Cassell, Alan (I)/Rodger, Ron/Knez, Bruno/Woodward, Edward/Cisse, Halifa/Quin, Don/Kiefel, Russell/Meagher, Ray/Procanin, Michael/Bernard, Hank/Gray, Ian (I)/Brown, Bryan (I)/Ball, Ray (I)/Mullinar, Rod/Donovan, Terence (I)/Ball, Vincent (I)/Pfitzner, John/Currer, Norman/Thompson, Jack (I)/Nicholls, Jon/Haywood, Chris (I)/Smith, Chris (I)/Mann, Trevor (I)/Henderson, Dick (II)/Lovett, Alan/Bell, Wayne (I)/Waters, John (III)/Osborn, Peter/Peterson, Ron/Cornish, Bridget/Horseman, Sylvia/Seidel, Nellie/West, Barbara/Radford, Elspeth/Reed, Maria/Erskine, Ria/Dick, Judy/Walton, Laurie (I)
'burbs, The (1989)/Gage, Kevin/Hahn, Archie/Feldman, Corey/Gordon, Gale/Drier, Moosie/Theodore, Brother/Katt, Nicky/Miller, Dick (I)/Hanks, Tom/Dern, Bruce/Turner, Arnold F./Howard, Rance/Ducommun, Rick/Danziger, Cory/Ajaye, Franklyn/Scott, Carey/Kramer, Jeffrey (I)/Olsen, Dana (I)/Gains, Courtney/Picardo, Robert/Hays, Gary/Davis, Sonny Carl/Gibson, Henry (I)/Jayne, Billy/Stevenson, Bill (I)/Katz, Phyllis/Vorgan, Gigi/Darbo, Patrika/Schaal, Wendy/French, Leigh/Fisher, Carrie/Benner, Brenda/Newman, Tracy (I)/Stewart, Lynne Marie/Haase, Heather (I)
...
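The sample records suggest a simple layout: a movie title followed by its cast, with the fields separated by "/". A minimal sketch of splitting one record, assuming that each record sits on its own line in the file and that "/" never occurs inside a title or a name; the class name CastLine and its fields are hypothetical, not part of the lab:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: one record of the form "Movie (year)/Actor1/Actor2/...".
final class CastLine {
    final String movie;
    final List<String> actors;

    private CastLine(String movie, List<String> actors) {
        this.movie = movie;
        this.actors = actors;
    }

    // Assumption: '/' is only ever a field separator.
    static CastLine parse(String line) {
        String[] fields = line.split("/");
        String movie = fields[0].trim();
        List<String> actors = new ArrayList<>();
        for (int i = 1; i < fields.length; i++) {
            String name = fields[i].trim();
            if (!name.isEmpty()) {
                actors.add(name);   // skip empty fields, keep everything else verbatim
            }
        }
        return new CastLine(movie, actors);
    }

    public static void main(String[] args) {
        CastLine c = parse("'burbs, The (1989)/Gage, Kevin/Hahn, Archie/Feldman, Corey");
        System.out.println(c.movie);          // prints: 'burbs, The (1989)
        System.out.println(c.actors.size());  // prints: 3
    }
}
```

Note that the names are kept exactly as they appear, including suffixes such as (I), since those appear to distinguish different actors with the same name.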
You have movies, and you have actors. Actors are linked to the movies that they played in, and the other way around. The mathematical model for such a structure that stores pairwise connections between entities is called a graph.
A graph consists of a set of vertices and a set of edges. Each edge represents a connection between two vertices. A graph represents a network on the set of vertices. Many, many problems in the world can be modeled as graphs, from telephone and computer networks, to transportation networks, to the Internet (websites and links), to social networks, to genetic and neural networks.
Not surprisingly, you'll use graphs to model the movie-actor relationship. The first question is how to model the Hollywood world with a graph:
To decide on a representation you need to understand what exactly you need to do with the graph. Think of the pros and cons for each of the options above. Keep in mind that whatever structure you choose to represent the graph, you have to build it based on one of the text files above.
The second question is what is a good way to store the graph. The graph consists of a set of vertices, which you can store as an array/vector, list, or map. For each vertex, you need a list of the edges that are connected to it; you can store these "adjacency lists" as arrays/vectors, or lists, or maps.
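As an illustration of one of these storage options, here is a minimal sketch that keeps the vertices as the keys of a map and stores each adjacency list as a set of neighbor names. The class name AdjacencyMapSketch is hypothetical, and the choice of a map of sets is only one possibility, not the required design:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: vertex name -> set of the names of its neighbors.
final class AdjacencyMapSketch {
    private final Map<String, Set<String>> adj = new HashMap<>();

    // Adds the undirected edge u-v, creating either vertex if it is new.
    void addEdge(String u, String v) {
        adj.computeIfAbsent(u, k -> new HashSet<>()).add(v);
        adj.computeIfAbsent(v, k -> new HashSet<>()).add(u);
    }

    // Returns the neighbors of v, or an empty set if v is not a vertex.
    Set<String> neighbors(String v) {
        return adj.getOrDefault(v, Collections.emptySet());
    }

    int degree(String v) {
        return neighbors(v).size();
    }

    int nV() {
        return adj.size();
    }

    public static void main(String[] args) {
        AdjacencyMapSketch g = new AdjacencyMapSketch();
        g.addEdge("Kevin Bacon", "Meryl Streep");
        g.addEdge("Kevin Bacon", "Donald Sutherland");
        g.addEdge("Donald Sutherland", "Nicole Kidman");
        System.out.println(g.nV());                   // prints: 4
        System.out.println(g.degree("Kevin Bacon"));  // prints: 2
    }
}
```

With a hash map, looking up a vertex by name takes expected constant time, which matters because the input files identify actors and movies by their names rather than by indices.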
Once you decide what the graph represents and what data structure you'll use to represent it, you'll start developing a MovieGraph class. This class should be able to construct a movie-graph from a file. Encapsulate all necessary getters and setters, and all basic functionality that you may expect from a class that implements a MovieGraph. For example,
//create an empty movie graph
MovieGraph()
//read graph from the file
MovieGraph(String fname)
//add edge u-v
void addEdge(String u, String v)
//number of vertices
int nV()
//number of edges
int nE()
//return the vertices adjacent to vertex v
Iterable<String> neighbors(String v)
//return the degree of vertex v (degree = number of neighbors)
int degree(String v)
//is v a vertex in the graph
boolean hasVertex(String v)
//is u-v an edge in the graph
boolean hasEdge(String u, String v)
Include testing functions that print the vertices and edges in your graph.
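// requires: import java.util.Scanner;
void queryMovies() {
    Scanner in = new Scanner(System.in);
    while (true) {
        // ask the user to enter a movie name, or Q to exit
        System.out.print("Enter a movie name (Q to exit): ");
        String movie = in.nextLine().trim();
        if (movie.equals("Q")) break;
        // call queryMovie on the movie that the user entered
        queryMovie(movie);
    }
}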
Note that to find the Kevin-Bacon-number of an actor X, we need to find the shortest path connecting X to Kevin Bacon. Generally speaking, for an arbitrary actor A, we need to find the shortest path connecting X to A.
Your goal is to write a method that takes two actor names A and B, finds the A-number of B (that is, a shortest path from A to B), and displays the movie-actor chain from B to A in a readable form. Shortest paths are not necessarily unique; that is, there may be several paths of the same minimum length connecting A to B. In this case, we just want to compute one of them (it does not matter which one).
It turns out that you can compute shortest paths in a graph using a strategy that you have seen while searching: breadth-first search (BFS). Start from the vertex representing the source (actor A); add all its neighbors to a queue. These are all the actors with an A-number of 1. Then add to the queue all neighbors of these neighbors that have not been seen before; these are the actors with an A-number of 2, and so on. It is not hard to see (and we'll argue this in class) that using breadth-first search from A you find the shortest paths from A to all other vertices (that are connected to A).
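The queue-based strategy described above can be sketched as follows. This is only an illustration under stated assumptions: the graph is given as a plain map from each actor to the list of actors who shared a cast with them (an actor-to-actor model, which is just one of the possible modeling choices), and the class name BfsSketch is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

final class BfsSketch {
    // Returns the number of edges on a shortest path from source
    // to every vertex that is reachable from source.
    static Map<String, Integer> distances(Map<String, List<String>> adj, String source) {
        Map<String, Integer> dist = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        dist.put(source, 0);
        queue.add(source);
        while (!queue.isEmpty()) {
            String u = queue.remove();
            for (String v : adj.getOrDefault(u, List.of())) {
                if (!dist.containsKey(v)) {        // v is seen for the first time
                    dist.put(v, dist.get(u) + 1);  // one step farther than u
                    queue.add(v);
                }
            }
        }
        return dist;
    }

    public static void main(String[] args) {
        Map<String, List<String>> adj = Map.of(
            "Kevin Bacon", List.of("Meryl Streep", "Donald Sutherland"),
            "Meryl Streep", List.of("Kevin Bacon"),
            "Donald Sutherland", List.of("Kevin Bacon", "Nicole Kidman"),
            "Nicole Kidman", List.of("Donald Sutherland"));
        Map<String, Integer> dist = distances(adj, "Kevin Bacon");
        System.out.println(dist.get("Nicole Kidman"));  // prints: 2
    }
}
```

The map of distances doubles as the "already seen" marker: a vertex gets a distance the first time it is discovered and is never enqueued again, so each vertex and each edge is processed a bounded number of times.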
Some things to think of:
Note that there is nothing special about Kevin Bacon, and that the same approach can be used to compute shortest paths between any two actors in Hollywood. You want to make your methods general enough, not customized for Kevin Bacon.
In terms of style, you will probably want to implement computing paths as a separate class. Call it MoviePath. This class has to essentially perform BFS from a given vertex on a given graph and has to store all the necessary data for this as class instance variables. I imagine you will have a couple of methods in MoviePath. First, you'll have a constructor that takes as parameters a MovieGraph and a vertex in this graph and runs BFS from this vertex in the graph. Then you'll have functions that will return the actual path and distance to the source vertex.
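A minimal sketch of what such a class might look like is given below. It is an assumption-laden illustration, not the required design: to stay self-contained it takes a plain adjacency map instead of a MovieGraph, it assumes an actor-to-actor graph, and the names MoviePathSketch, hasPathTo, distanceTo, and pathTo are hypothetical. The key idea is that BFS records, for every vertex, the vertex from which it was first discovered, so a path can be rebuilt by walking these links back to the source.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.Queue;

final class MoviePathSketch {
    private final Map<String, Integer> dist = new HashMap<>();
    private final Map<String, String> parent = new HashMap<>();

    // Runs BFS from source and stores distances and discovery links.
    MoviePathSketch(Map<String, List<String>> adj, String source) {
        Queue<String> queue = new ArrayDeque<>();
        dist.put(source, 0);
        queue.add(source);
        while (!queue.isEmpty()) {
            String u = queue.remove();
            for (String v : adj.getOrDefault(u, List.of())) {
                if (!dist.containsKey(v)) {
                    dist.put(v, dist.get(u) + 1);
                    parent.put(v, u);   // v was first reached from u
                    queue.add(v);
                }
            }
        }
    }

    boolean hasPathTo(String v) {
        return dist.containsKey(v);
    }

    // Returns -1 if v cannot be reached from the source.
    int distanceTo(String v) {
        return dist.getOrDefault(v, -1);
    }

    // Returns the vertices from the source to v, or an empty list if unreachable.
    List<String> pathTo(String v) {
        LinkedList<String> path = new LinkedList<>();
        if (!hasPathTo(v)) return path;
        for (String x = v; x != null; x = parent.get(x)) {
            path.addFirst(x);           // the source has no parent, which ends the walk
        }
        return path;
    }

    public static void main(String[] args) {
        Map<String, List<String>> adj = Map.of(
            "Kevin Bacon", List.of("Meryl Streep", "Donald Sutherland"),
            "Meryl Streep", List.of("Kevin Bacon"),
            "Donald Sutherland", List.of("Kevin Bacon", "Nicole Kidman"),
            "Nicole Kidman", List.of("Donald Sutherland"));
        MoviePathSketch p = new MoviePathSketch(adj, "Kevin Bacon");
        System.out.println(p.distanceTo("Nicole Kidman"));  // prints: 2
        System.out.println(p.pathTo("Nicole Kidman"));
        // prints: [Kevin Bacon, Donald Sutherland, Nicole Kidman]
    }
}
```

Because all the BFS results live in instance variables, one object answers any number of path and distance queries for the same source without searching again.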
Efficiency: One thing to think about is efficiency. Some of the graphs are very large. Note that, to compute a path from A to B, you need to run BFS from A until reaching B. So, one way to compute the average path length from A for all actors is to run this process for each actor B. This is extremely inefficient, and you will not be able to use it on anything but the smallest graph. You want to think about running BFS from A until the end (until reaching all nodes that can be reached), and in this way computing all the paths from A at the same time.
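To make the single-pass idea concrete, here is a hedged sketch that runs one complete BFS from A and averages the distances of all actors it reaches. It assumes an actor-to-actor graph stored as a plain adjacency map, and it assumes that unreachable actors are simply left out of the average; the class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

final class AverageDistanceSketch {
    // Average BFS distance from source over all reachable vertices,
    // excluding the source itself. Returns 0.0 if nothing else is reachable.
    static double averageDistance(Map<String, List<String>> adj, String source) {
        Map<String, Integer> dist = new HashMap<>();
        Queue<String> queue = new ArrayDeque<>();
        dist.put(source, 0);
        queue.add(source);
        long total = 0;
        while (!queue.isEmpty()) {
            String u = queue.remove();
            total += dist.get(u);          // each reached vertex is counted once
            for (String v : adj.getOrDefault(u, List.of())) {
                if (!dist.containsKey(v)) {
                    dist.put(v, dist.get(u) + 1);
                    queue.add(v);
                }
            }
        }
        int reached = dist.size() - 1;     // do not count the source
        return reached == 0 ? 0.0 : (double) total / reached;
    }

    public static void main(String[] args) {
        Map<String, List<String>> adj = Map.of(
            "Kevin Bacon", List.of("Meryl Streep", "Donald Sutherland"),
            "Meryl Streep", List.of("Kevin Bacon"),
            "Donald Sutherland", List.of("Kevin Bacon", "Nicole Kidman"),
            "Nicole Kidman", List.of("Donald Sutherland"));
        // Distances are 1, 1, and 2, so the average is 4/3.
        System.out.println(averageDistance(adj, "Kevin Bacon"));
    }
}
```

One BFS visits every vertex and edge a bounded number of times, whereas restarting the search for every target B repeats that work once per actor.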
It is due no later than Wednesday, December 2nd. You can work with one partner. You are strongly encouraged to find a partner. Once you have the background, working with a partner is both fun and challenging.
When you turn in the code, include a brief README file that describes the structure of your code, instructs the user how to run it, and specifies how each team member contributed to the lab.
Since the lab gives you little guidance on how to structure the code, you will find that the amount of time you put into this project is directly proportional to how clean your design is.
These are some things to keep in mind as you decide how to model the problem. You need to understand that there is not one "right" way to do it. There are easier ways, and there are harder ways. There are more efficient ways, and less efficient ways. There are ways that will be easy to program, and there are ways that will take a lot of effort to make work. YOU are the creator of your world. Understand what it is that your world needs to do, decide how to model your world, keep it consistent, and make it work.
Lessons to learn: