Assignment 1: Finding the closest pair of points--two algorithms and their experimental evaluation

In this assignment you will write code to find the closest pair of a set of points in the plane using two methods, and you will perform an experimental evaluation of the two methods in practice.

  1. First: the naive, quadratic algorithm discussed in class.
  2. Second: a gridded approach that buckets the points into grid cells and then uses the grid to speed up the search for the closest pair (see last problem in closest pair exercises).
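
As a rough illustration of the first method, here is a minimal sketch of the all-pairs approach. The names dist and closest_pair_naive are made up for this example, and it uses std::vector for brevity; the version discussed in class may differ in its details.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// a point in 2D
struct point2d {
  double x, y;
};

// hypothetical helper: Euclidean distance between two points
double dist(const point2d &p, const point2d &q) {
  return std::hypot(p.x - q.x, p.y - q.y);
}

// hypothetical function name: checks all n(n-1)/2 pairs and returns the
// smallest distance found; requires at least two points
double closest_pair_naive(const std::vector<point2d> &pts) {
  assert(pts.size() >= 2);
  double best = std::numeric_limits<double>::infinity();
  for (std::size_t i = 0; i < pts.size(); i++) {
    for (std::size_t j = i + 1; j < pts.size(); j++) {
      double d = dist(pts[i], pts[j]);
      if (d < best) best = d;
    }
  }
  return best;
}
```

The two nested loops are where the quadratic running time comes from: each point is compared against every point after it.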

Experimental evaluation: Perform an experimental evaluation of both methods to see how their performance compares in practice. Denote the number of points by n. Generate point sets of increasing size, and time both methods separately, until the difference in running times is significant. For example, you could pick n = 100, 1000, 10000, .... Record the running times in a table.
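
One possible way to set up the experiments is sketched below. The helpers random_points and time_seconds are hypothetical names chosen for this example, and drawing points uniformly from the unit square is an assumption, not a requirement of the assignment.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

// a point in 2D
struct point2d {
  double x, y;
};

// hypothetical helper: n points drawn uniformly at random from the unit
// square; a fixed seed makes the experiment repeatable
std::vector<point2d> random_points(std::size_t n, unsigned seed) {
  std::mt19937 gen(seed);
  std::uniform_real_distribution<double> coord(0.0, 1.0);
  std::vector<point2d> pts(n);
  for (std::size_t i = 0; i < n; i++) {
    pts[i].x = coord(gen);
    pts[i].y = coord(gen);
  }
  return pts;
}

// hypothetical helper: runs f() once and returns the elapsed wall-clock
// time in seconds
template <typename F>
double time_seconds(F f) {
  auto start = std::chrono::steady_clock::now();
  f();
  auto stop = std::chrono::steady_clock::now();
  double elapsed = std::chrono::duration<double>(stop - start).count();
  assert(elapsed >= 0.0);  // steady_clock never goes backwards
  return elapsed;
}
```

To fill in the table, you could call random_points once per value of n, pass the same point set to both methods, and wrap each call in time_seconds.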

This first assignment is meant as a warmup to get everyone up to speed with C/C++. It is also an opportunity to learn what quadratic complexity means in practice.

Comments

Arrays and vectors in C++

Feel free to use C, C++, or a combination of the two. You will likely need to work with arrays, so I created a simple test program that demos how you allocate arrays, which you can find here.

Arrays of integers, C style

  int *a; 
  
  /*  DON'T DO THIS: 
  
      int a[n]; 
  
      It puts the array on the stack, so you are likely to get segfaults
      for larger values of n. YOU NEED TO ALLOCATE dynamically using
      malloc() because you don't know n at compile time.
  */ 
  
  //allocate the space  dynamically
  a =(int*)malloc(n * sizeof(int)); 
  
  //put some data in it 
  for (i=0; i < n; i++) 
    a[i] = 1; 

  //compute something 
  sum=0; 
  for (i=0;i< n;i++)
    sum += a[i]; 

  //free the space 
  free(a); 

Array of integers, C++ style


 //an array of n ints, C++ style 
  int *b; 
  
  /*  DON'T DO THIS: 
      
      int b[n]; 
      
      It puts the array on the stack, so you are likely to get segfaults
      for larger values of n. YOU NEED TO ALLOCATE dynamically using new
      because you don't know n at compile time.
  */ 
    

  //allocate the space  dynamically
  b = new int[n]; 

  //put some data in it 
  for (i=0; i < n; i++) 
    b[i] = 1; 

  //compute something 
  sum=0; 
  for (i=0;i < n;i++)
    sum += b[i]; 

  //free the space 
  delete [] b; 

Array of vector< int >


 //an array of Vectors, C++ style 
  vector< int > *d; 

 /*  DON'T DO THIS: 
      
      vector< int > d[n]; 
      
      It puts the array on the stack, so you are likely to get segfaults
      for larger values of n. YOU NEED TO ALLOCATE dynamically using new
      because you don't know n at compile time.
  */ 

  //allocate the space  dynamically
  d = new vector< int > [n]; 
  //NOTE: new[] calls the default constructor for each element, so every d[i] starts as an empty vector
  
  //put some data in it 
  for (i=0; i< n; i++) 
    //d[i] is a Vector, so we push 1 into it 
    d[i].push_back(1); 

  //compute something 
  sum=0; 
  for (i=0;i< n;i++)
    sum += d[i][0]; 

  //free the space 
  delete [] d; 

  printf("test4: sum=%f\n", sum); 

2D-array of vector< point >


//a structure for a point in 2D
typedef struct _point2d {
  double x, y; 
} point2d; 


//a 2D array of Vectors of points 
  vector< point2d > **grid; 
  
  /*  DON'T DO THIS: 

     vector< point2d > grid [n][n] 

     It puts the array on the stack, so you are likely to get segfaults
     for larger values of n. YOU NEED TO ALLOCATE dynamically using new
     because you don't know n at compile time.
  */ 

  //allocate first an array of vector*
  grid = new vector< point2d >* [n]; 
  for (i=0; i < n;i++) {
    //grid[i] is a vector*, that is, an array (of vectors); we allocate it 
    grid[i] = new vector< point2d > [n]; 
  }

  //put some data in it 
  for (i=0; i < n; i++) {
    for (j=0; j < n; j++) {
      //grid[i][j] is a vector, so we push a point into it 
      assert(grid[i][j].size()==0); 
      point2d p = {1.0, 1.0};
      grid[i][j].push_back(p); 
      assert(grid[i][j].size()==1); 
    }
  }
 
  //compute something 
  sum=0; 
  for (i=0; i < n; i++) {
    for (j=0; j < n; j++) {
      // grid[i][j] is a vector; grid[i][j][0] is the first point in that vector
      sum += grid[i][j][0].x;
    } 
  }
  //free the space: first each row, then the array of row pointers 
  for (i=0; i < n; i++)
    delete [] grid[i]; 
  delete [] grid; 
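
If you would rather not manage the memory by hand, one alternative is to store the grid as a single std::vector of k*k buckets. The sketch below shows only the bucketing step, not the closest-pair search. The names cell_index and build_grid, the parameter k (cells per side), and the assumption that all points lie in the square [lo, hi] x [lo, hi] are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// a point in 2D
struct point2d {
  double x, y;
};

// hypothetical helper: maps a coordinate v in [lo, hi] to a cell index
// in [0, k-1]
std::size_t cell_index(double v, double lo, double hi, std::size_t k) {
  assert(k > 0 && hi > lo && v >= lo && v <= hi);
  double t = (v - lo) / (hi - lo);  // relative position in [0, 1]
  std::size_t c = static_cast<std::size_t>(t * k);
  return std::min(c, k - 1);  // v == hi falls into the last cell
}

// hypothetical function: buckets the points into a k-by-k grid stored
// row-major in one vector; cell (i, j) is grid[i * k + j]
std::vector<std::vector<point2d>> build_grid(const std::vector<point2d> &pts,
                                             double lo, double hi,
                                             std::size_t k) {
  std::vector<std::vector<point2d>> grid(k * k);
  for (const point2d &p : pts) {
    std::size_t i = cell_index(p.y, lo, hi, k);  // row
    std::size_t j = cell_index(p.x, lo, hi, k);  // column
    grid[i * k + j].push_back(p);
  }
  return grid;
}
```

Because the grid lives in a std::vector, its storage is on the heap and is released automatically when the vector goes out of scope, so no delete is needed.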

What and how to turn in

You will receive the assignment on GitHub, but there will be no startup code. You are encouraged to do pair-programming, but feel free to work alone. Push your code to your GitHub repository for this assignment. Provide a README in which you briefly describe how your gridded approach works and how you chose the grid size, and include the table of running times.

Enjoy!


Last modified: Thu Aug 26 17:03:04 EDT 2021