Assignment 1: Finding the closest pair of points--two algorithms and their experimental evaluation

In this assignment you will write code to find the closest pair of a set of points in the plane using two methods, and you will perform an experimental evaluation of the two methods in practice.

  1. First: the naive, quadratic algorithm discussed in class.

  2. Second: a grid-based heuristic. You will not implement the optimal divide-and-conquer algorithm; instead you will develop an approach based on a widely used heuristic that gives good results in practice under certain assumptions about the data. Namely, you'll bucket the points into grid cells and then use the grid to speed up the search for the closest pair (see the last problem in the closest pair exercises).
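
To make the bucketing step concrete, here is a minimal sketch, not the required design. It assumes a k x k grid laid over the bounding box of the points and stored as one flat vector of cells; the names cell_index and bucket_points are made up for illustration. Choosing k and searching the grid for the closest pair are deliberately left out, since that is the part you need to work out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// A point in the plane (same fields as the point2d struct in the review section).
struct point2d {
  double x, y;
};

// Hypothetical helper: map a coordinate v in [lo, hi] to a cell index in 0..k-1.
int cell_index(double v, double lo, double hi, int k) {
  assert(k > 0 && hi >= lo);
  if (hi == lo) return 0;  // all points share this coordinate
  int c = (int)((v - lo) / (hi - lo) * k);
  // Clamp: v == hi would otherwise land in cell k, one past the last cell.
  return std::max(0, std::min(c, k - 1));
}

// Hypothetical helper: bucket the points into a k x k grid over their
// bounding box. Cell (row, col) is stored at index row * k + col.
std::vector<std::vector<point2d>> bucket_points(const std::vector<point2d>& pts, int k) {
  assert(k > 0);
  std::vector<std::vector<point2d>> grid((std::size_t)k * k);
  if (pts.empty()) return grid;

  // Find the bounding box of the points.
  double minx = pts[0].x, maxx = pts[0].x;
  double miny = pts[0].y, maxy = pts[0].y;
  for (const point2d& p : pts) {
    minx = std::min(minx, p.x);
    maxx = std::max(maxx, p.x);
    miny = std::min(miny, p.y);
    maxy = std::max(maxy, p.y);
  }

  // Drop each point into the cell that contains it.
  for (const point2d& p : pts) {
    int col = cell_index(p.x, minx, maxx, k);
    int row = cell_index(p.y, miny, maxy, k);
    grid[(std::size_t)row * k + col].push_back(p);
  }
  return grid;
}
```

With this layout, points that are close together end up in the same cell or in neighboring cells, which is the property the heuristic relies on.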

Experimental evaluation: Denote the number of points by n. Generate point sets of increasing size, and time both methods separately, until the difference in running times is significant. For example, you could pick n = 100, 1000, 10000, .... Record the running times in a table.
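
One possible way to set up the timing is sketched below, using std::chrono::steady_clock. The helper names random_points and time_seconds are hypothetical, the points are assumed to be pseudo-random in the unit square, and closest_pair_naive is just one straightforward way to write the quadratic method; treat all of it as a starting point rather than the expected solution.

```cpp
#include <cassert>
#include <chrono>
#include <cmath>
#include <cstddef>
#include <cstdlib>
#include <limits>
#include <vector>

struct point2d {
  double x, y;
};

// Naive quadratic method: compare every pair and return the smallest distance.
double closest_pair_naive(const std::vector<point2d>& pts) {
  assert(pts.size() >= 2);  // a closest pair needs at least two points
  double best = std::numeric_limits<double>::infinity();
  for (std::size_t i = 0; i < pts.size(); i++) {
    for (std::size_t j = i + 1; j < pts.size(); j++) {
      double dx = pts[i].x - pts[j].x;
      double dy = pts[i].y - pts[j].y;
      double d = std::sqrt(dx * dx + dy * dy);
      if (d < best) best = d;
    }
  }
  return best;
}

// Hypothetical helper: n pseudo-random points in the unit square.
std::vector<point2d> random_points(std::size_t n, unsigned seed) {
  std::srand(seed);
  std::vector<point2d> pts(n);
  for (point2d& p : pts) {
    p.x = std::rand() / (double)RAND_MAX;
    p.y = std::rand() / (double)RAND_MAX;
  }
  return pts;
}

// Hypothetical helper: wall-clock seconds taken by one call of method(pts).
template <typename F>
double time_seconds(F method, const std::vector<point2d>& pts) {
  auto start = std::chrono::steady_clock::now();
  // volatile so the compiler keeps the result and the call that produces it
  volatile double result = method(pts);
  (void)result;
  auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double>(stop - start).count();
}
```

For example, time_seconds(closest_pair_naive, random_points(1000, 1)) gives the seconds for one run with n = 1000. Passing the same point set to both methods keeps the comparison fair, and it also lets you check that both return the same distance.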

Comments

Pair programming

You are encouraged to find a partner and work as a team (make sure you read and follow the pair-programming policy on the class website). But if you would rather work alone, or if you don't know anyone in the class yet, that's totally fine: just work alone. There are pros to both.

Help

My office hours are: Come any time to discuss any issues you are encountering or simply to chat.

Slack: Use Slack as much as you can, ask questions, and answer questions.

Getting your code to work: Expect you will have to debug your code. Remember you are here to learn, so don't compare yourself with others. When your code finally works, you created that world and you feel like God (paraphrasing Linus Torvalds); but, along the way, you make mistakes and you feel stupid. That's the beauty of programming. That's normal!

What and how to turn in

You will receive the assignment on GitHub, but there will be no startup code. You are encouraged to do pair programming, but feel free to work alone. Push your code to your GitHub repository for this assignment. Provide a README that briefly describes how your gridded approach works and how you chose the grid size, and that includes the running times.

If you are done with this assignment and you want an extra challenge, it would be cool to implement the divide-and-conquer algorithm discussed in class.

Enjoy!


Review: Arrays and vectors in C++

Feel free to use C, C++, or a combination of the two. You will likely need to work with arrays, so I created a simple test program that demos how you allocate arrays, which you can find here.

Arrays of integers, C style

  int *a; 
  
  /*  DON'T DO THIS: 
  
      int a[n]; 
  
      The array would live on the stack, which is limited in size, so
      you're sure to get segfaults for larger values of n. YOU NEED TO
      ALLOCATE dynamically using malloc() because you don't know n at
      compile time.
  */ 
  
  //allocate the space dynamically
  a = (int*)malloc(n * sizeof(int)); 
  
  //put some data in it 
  for (i=0; i < n; i++) 
    a[i] = 1; 

  //compute something 
  sum=0; 
  for (i=0;i< n;i++)
    sum += a[i]; 

  //free the space 
  free(a); 

Array of integers, C++ style


 //an array of n ints, C++ style 
  int *b; 
  
  /*  DON'T DO THIS: 
      
      int b[n]; 
      
      The array would live on the stack, which is limited in size, so
      you're sure to get segfaults for larger values of n. YOU NEED TO
      ALLOCATE dynamically using new because you don't know n at
      compile time.
  */ 
    

  //allocate the space  dynamically
  b = new int[n]; 

  //put some data in it 
  for (i=0; i < n; i++) 
    b[i] = 1; 

  //compute something 
  sum=0; 
  for (i=0;i < n;i++)
    sum += b[i]; 

  //free the space 
  delete [] b; 

Array of vector< int >


 //an array of Vectors, C++ style 
  vector< int > *d; 

 /*  DON'T DO THIS: 
      
      vector< int > d[n]; 
      
      The array would live on the stack, which is limited in size, so
      you're sure to get segfaults for larger values of n. YOU NEED TO
      ALLOCATE dynamically using new because you don't know n at
      compile time.
  */ 

  //allocate the space  dynamically
  d = new vector< int > [n]; 
  //NOTE: new[] default-constructs each element, so every d[i] starts out as an empty vector
  
  //put some data in it 
  for (i=0; i< n; i++) 
    //d[i] is a Vector, so we push 1 into it 
    d[i].push_back(1); 

  //compute something 
  sum=0; 
  for (i=0;i< n;i++)
    sum += d[i][0]; 

  //free the space 
  delete [] d; 

  printf("test4: sum=%f\n", sum); 

2D array of vector< point2d >


//a structure for a point in 2D
typedef struct _point2d {
  double x, y; 
} point2d; 


//a 2D array of Vectors of points 
  vector< point2d > **grid; 
  
  /*  DON'T DO THIS: 

     vector< point2d > grid [n][n] 

     The array would live on the stack, which is limited in size, so
     you're sure to get segfaults for larger values of n. YOU NEED TO
     ALLOCATE dynamically using new because you don't know n at
     compile time.
  */ 

  //allocate first an array of vector*
  grid = new vector< point2d >* [n]; 
  for (i=0; i < n;i++) {
    //grid[i] is a vector*, that is, an array (of vectors); we allocate it 
    grid[i] = new vector< point2d > [n]; 
  }

  //put some data in it 
  for (i=0; i < n; i++) {
    for (j=0; j < n; j++) {
      //grid[i][j] is a vector of points, so we push a point into it 
      assert(grid[i][j].size()==0); 
      point2d p = {1.0, 1.0};
      grid[i][j].push_back(p); 
      assert(grid[i][j].size()==1); 
    }
  }
 
  //compute something 
  sum=0; 
  for (i=0; i < n; i++) {
    for (j=0; j < n; j++) {
      // grid[i][j] is a vector; grid[i][j][0] is the first point in that vector
      sum += grid[i][j][0].x;
    } 
  }
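  //free the space: first each row of vectors, then the array of row pointers
  for (i=0; i < n; i++) {
    delete [] grid[i]; 
  }
  delete [] grid;  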
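
If you'd rather not manage new and delete by hand, one alternative is to nest std::vector, which allocates on the heap and frees its memory automatically. The sketch below assumes you are free to use nested vectors for the grid; the names point_grid, make_grid and fill_and_sum are made up for illustration, and fill_and_sum simply repeats the test above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// a structure for a point in 2D
struct point2d {
  double x, y;
};

// An n x n grid where each cell grid[i][j] is a vector of points.
typedef std::vector<std::vector<std::vector<point2d>>> point_grid;

// Hypothetical helper: build an n x n grid of empty cells.
point_grid make_grid(std::size_t n) {
  return point_grid(n, std::vector<std::vector<point2d>>(n));
}

// Same test as above: push one point into each cell, then sum the x coordinates.
double fill_and_sum(point_grid& grid) {
  double sum = 0;
  for (std::size_t i = 0; i < grid.size(); i++) {
    for (std::size_t j = 0; j < grid[i].size(); j++) {
      assert(grid[i][j].empty());
      point2d p = {1.0, 1.0};
      grid[i][j].push_back(p);
      sum += grid[i][j][0].x;
    }
  }
  return sum;
}
```

There is no delete here: the grid releases all of its cells when it goes out of scope.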

Last modified: Wed Sep 8th, 17:03:04 EDT 2021