First you will need to define a data structure to encode a kd-tree such as below --- feel free to refine as needed.
typedef struct _treeNode treeNode;
struct _treeNode {
    point2D p; /* If this is a leaf node, p represents the point stored in
                  this leaf. If this is not a leaf node, p represents the
                  horizontal or vertical line stored in this node. For a
                  vertical line, p.y is ignored. For a horizontal line,
                  p.x is ignored. */
    char type; /* This can be 'h' (horizontal), 'v' (vertical), or 'l' (leaf),
                  depending on whether the node splits with a horizontal line
                  or a vertical line. Technically this should be an enum. */
    treeNode *left, *right; /* left/below and right/above children. */
};

typedef struct _kdtree {
    treeNode* root;
    int count;  //number of nodes in the tree
    int height; //height of tree
} kdtree;
You'll need to write the basic primitives for operating on a treeNode and on a kdtree, such as creating a node and creating an empty tree, printing a node, and printing a tree.
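As a starting point, here is a minimal sketch of a few such primitives. The layout of point2D, the helper names (treeNode_create, kdtree_create_empty, subtree_count, subtree_height), and the convention that an empty tree has height 0 and a single leaf has height 1 are all assumptions made for illustration; adapt them to your own code.

```c
#include <stdlib.h>

/* Assumed layout of point2D; the startup code may define it differently. */
typedef struct { double x, y; } point2D;

typedef struct _treeNode treeNode;
struct _treeNode {
    point2D p;
    char type;               /* 'h', 'v', or 'l' */
    treeNode *left, *right;
};

typedef struct _kdtree {
    treeNode *root;
    int count;
    int height;
} kdtree;

/* Allocate a node with no children; returns NULL if malloc fails. */
treeNode *treeNode_create(point2D p, char type) {
    treeNode *node = malloc(sizeof *node);
    if (node == NULL) return NULL;
    node->p = p;
    node->type = type;
    node->left = node->right = NULL;
    return node;
}

/* Allocate an empty tree: no root, zero nodes, height 0 (assumed convention). */
kdtree *kdtree_create_empty(void) {
    kdtree *t = malloc(sizeof *t);
    if (t == NULL) return NULL;
    t->root = NULL;
    t->count = 0;
    t->height = 0;
    return t;
}

/* Number of nodes in the subtree rooted at node. */
int subtree_count(const treeNode *node) {
    if (node == NULL) return 0;
    return 1 + subtree_count(node->left) + subtree_count(node->right);
}

/* Number of nodes on the longest path from node down to a leaf. */
int subtree_height(const treeNode *node) {
    if (node == NULL) return 0;
    int hl = subtree_height(node->left);
    int hr = subtree_height(node->right);
    return 1 + (hl > hr ? hl : hr);
}
```

Computing count and height with a traversal like this is also a handy cross-check against values you maintain while building the tree.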
For example, include a function that prints some basic info about the kd-tree, such as the number of nodes and the height. Call this function in the main function so that we can see its output.
void kdtree_print(kdtree* t);
The main function that you will write for Part 1 is building a kd-tree from a set of points. You will write a function to build and return a kd-tree as follows:
/* Build a kd-tree for the set of n points, where each leaf cell contains
   1 point. Return a pointer to the root. */
treeNode* kdtree_build(point2D* points, int n);
The function takes as input the set of points and returns the kd-tree.
int ndp; //number of distinct points
//this should be made a global, same as n
ndp = remove_coincident_points(points, n);
//ndp is smaller than or equal to n
This helper function should remove the coincident points in points and return the number of distinct points. One way it can do this is by sorting points and then, in a subsequent pass, shifting elements left when they are equal. Basically the array points will stay the same, except that the ndp elements of it that are distinct will appear at the beginning.
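The sort-then-compact idea can be sketched as follows. This is one possible version, not the required one: the layout of point2D and the comparison function cmp_xy are assumptions made for illustration.

```c
#include <stdlib.h>

/* Assumed layout of point2D; the startup code may define it differently. */
typedef struct { double x, y; } point2D;

/* Order by x, breaking ties by y, so coincident points end up adjacent. */
static int cmp_xy(const void *a, const void *b) {
    const point2D *p = a, *q = b;
    if (p->x != q->x) return (p->x < q->x) ? -1 : 1;
    if (p->y != q->y) return (p->y < q->y) ? -1 : 1;
    return 0;
}

/* Sort points, then move the distinct ones to the front of the array.
   Returns the number of distinct points. */
int remove_coincident_points(point2D *points, int n) {
    if (n <= 0) return 0;
    qsort(points, (size_t)n, sizeof(point2D), cmp_xy);
    int ndp = 1;                        /* points[0] is always kept */
    for (int i = 1; i < n; i++) {
        if (points[i].x != points[ndp - 1].x ||
            points[i].y != points[ndp - 1].y) {
            points[ndp++] = points[i];  /* shift the next distinct point left */
        }
    }
    return ndp;
}
```

Note that this version leaves the array sorted by x (ties by y) as a side effect, and the contents beyond index ndp-1 are leftovers that should be ignored.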
point2D *points_by_x, *points_by_y;
//allocate them of size ndp, copy data from points, then sort them
You need to use system qsort for this by defining appropriate comparison functions.
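A minimal sketch of what the comparison functions and the sorted copies might look like is below. The names cmp_by_x, cmp_by_y, and sorted_copy are hypothetical, as is the layout of point2D. Breaking ties on the other coordinate is a design choice, not a requirement: it gives distinct points a strict order, which can make the splitting logic easier to reason about.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed layout of point2D; the startup code may define it differently. */
typedef struct { double x, y; } point2D;

/* qsort comparison: by x, ties broken by y. */
int cmp_by_x(const void *a, const void *b) {
    const point2D *p = a, *q = b;
    if (p->x != q->x) return (p->x < q->x) ? -1 : 1;
    if (p->y != q->y) return (p->y < q->y) ? -1 : 1;
    return 0;
}

/* qsort comparison: by y, ties broken by x. */
int cmp_by_y(const void *a, const void *b) {
    const point2D *p = a, *q = b;
    if (p->y != q->y) return (p->y < q->y) ? -1 : 1;
    if (p->x != q->x) return (p->x < q->x) ? -1 : 1;
    return 0;
}

/* Return a newly allocated copy of the first n points, sorted with cmp.
   The caller is responsible for calling free() on the result. */
point2D *sorted_copy(const point2D *points, int n,
                     int (*cmp)(const void *, const void *)) {
    point2D *copy = malloc((size_t)n * sizeof(point2D));
    if (copy == NULL) return NULL;
    memcpy(copy, points, (size_t)n * sizeof(point2D));
    qsort(copy, (size_t)n, sizeof(point2D), cmp);
    return copy;
}
```

With these, the two arrays could be set up as points_by_x = sorted_copy(points, ndp, cmp_by_x) and points_by_y = sorted_copy(points, ndp, cmp_by_y).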
kdtree_build_rec(point2D* points_by_x, point2D* points_by_y, int ndp, ...)
This helper function should build the kd-tree recursively. It should probably take the depth of the current node as a parameter and use it to decide whether to split vertically or horizontally.
The main challenge in this function will be to catch all the cases that can happen and make sure the recursion stops.
With an even number of points, the median should be either a value in between the two middle points or the smaller of the two middle points. In the latter case, make sure the point on the line goes into the left (or below) subtree, in order to avoid infinite recursion. Stop the recursion when the node contains 1 point (and perhaps earlier if necessary).
It's possible that all points go on one side. For example, consider the points
(2,6), (3,6), (3,5)
examined in the x-coordinate. The middle point is (3,6). But the third point's x-value is also 3, so it will go on the left side. Thus the entire array is passed to the next level. Then we examine the points in the y-coordinate:
(3,5), (3,6), (2,6)
The middle point is (3,6). But the third point has the same y-coordinate as the median, which means it will also go on the left side. Thus the entire array is passed to the next level again, i.e. infinite recursion. These points are not coincident, but they are collinear in just the wrong way to cause infinite recursion (example thanks to Rob).
You'll need to find a way to deal with this.
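One possible way to guarantee progress, sketched below, is to move the splitting line whenever the median would send every point to the same side, and to report when no line in the current direction can separate the points so that the caller can try the other direction. This is only one approach among several, not the intended solution. The function name split_count and the layout of point2D are assumptions; the sketch also assumes the array is already sorted on the coordinate being split.

```c
/* Assumed layout of point2D; the startup code may define it differently. */
typedef struct { double x, y; } point2D;

/* Coordinate that matters for a split: x for 'v', y for 'h'. */
static double coord(const point2D *p, char type) {
    return type == 'v' ? p->x : p->y;
}

/* sorted: n points in nondecreasing order of the split coordinate.
   Returns k, the number of points that go left/below, with 0 < k < n.
   The splitting line then passes through sorted[k-1], and every point
   on the line is among the first k.
   Returns 0 if all n points share the split coordinate (or n < 2),
   meaning no line in this direction separates them. */
int split_count(const point2D *sorted, int n, char type) {
    if (n < 2) return 0;
    if (coord(&sorted[0], type) == coord(&sorted[n - 1], type)) return 0;

    int m = (n - 1) / 2;              /* smaller of the two medians */
    double med = coord(&sorted[m], type);

    /* Points on the median line go left/below. */
    int k = m + 1;
    while (k < n && coord(&sorted[k], type) == med) k++;
    if (k < n) return k;

    /* The median equals the maximum, so everything would go left.
       Back up to the last point strictly below the median instead. */
    k = m;
    while (k > 0 && coord(&sorted[k - 1], type) == med) k--;
    return k;
}
```

On the example above, the x-coordinates in sorted order are 2, 3, 3. The median 3 is also the maximum, so the sketch backs up and returns 1: the line moves to x=2, one point goes left, and two go right. Since coincident points have been removed, a set of two or more points that cannot be separated in one direction can always be separated in the other.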
To test your kd-tree, run it on sets of random points with values of n ranging from 1, 2, 3, 4, 5, ... up to 1000000. For each value of n, press the space bar to get a different set of random points. For small values of n you'll want to start by printing the entire tree. Once your code works for small n, you'll probably want to switch to just printing the info of the kd-tree (number of nodes and height). Write a few different functions for initialization (in addition to initialize_points_random()), for example
//initializes array points with n points on a horizontal line
void initialize_points_case1() { .. }

//initializes array points with n points on a vertical line
void initialize_points_case2() { .. }

//initializes array points with 3 points as in the example above
//that may trigger infinite recursion
void initialize_points_case3() { n=3; .. }
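To make the test cases concrete, here is a hedged sketch of how such point sets might be filled in. Unlike the stubs above, these hypothetical helpers take the array and n as parameters rather than using globals, so they can be tested in isolation. The layout of point2D and the value of WINDOWSIZE are assumptions.

```c
#define WINDOWSIZE 500  /* assumed value; use the one from your own code */

/* Assumed layout of point2D; the startup code may define it differently. */
typedef struct { double x, y; } point2D;

/* n points evenly spaced on the horizontal line y = WINDOWSIZE/2. */
void fill_horizontal_line(point2D *points, int n) {
    for (int i = 0; i < n; i++) {
        points[i].x = (double)i * WINDOWSIZE / n;
        points[i].y = WINDOWSIZE / 2.0;
    }
}

/* n points evenly spaced on the vertical line x = WINDOWSIZE/2. */
void fill_vertical_line(point2D *points, int n) {
    for (int i = 0; i < n; i++) {
        points[i].x = WINDOWSIZE / 2.0;
        points[i].y = (double)i * WINDOWSIZE / n;
    }
}

/* The three points from the infinite-recursion example.
   points must have room for at least 3 elements. */
void fill_degenerate_triple(point2D *points) {
    points[0] = (point2D){2, 6};
    points[1] = (point2D){3, 6};
    points[2] = (point2D){3, 5};
}
```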
//for each node in the tree in some order
{
    glBegin(GL_LINES);
    //identify the endpoints p1 and p2 of the line segment that you
    //need to draw
    glVertex2f(p1.x, p1.y);
    glVertex2f(p2.x, p2.y);
    glEnd();
}
The harder part of the rendering is identifying the endpoints of the line segment for that node. Note that the line x=x1 through the root is infinite in the y-direction. The lines in the nodes left and right of the root are infinite on one side, and bounded by x=x1 on the other side. And so on. The region corresponding to a node (and thus the endpoints of the segment that will split it) can be computed based on the ancestors of the node in the tree.
The input points are generated in the range [0,WINDOWSIZE] x [0, WINDOWSIZE]. Thus a value of infinity in x or y direction should be set to WINDOWSIZE.
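One way to compute the endpoints, sketched below, is to carry each node's region down the recursion as a bounding box that starts at the full window and shrinks at every split. To keep the sketch independent of OpenGL, it only collects the segments into an array; the drawing loop can then pass each pair of endpoints to glVertex2f. The segment type, the function name collect_segments, the layout of point2D, and the value of WINDOWSIZE are all assumptions.

```c
#include <stddef.h>

#define WINDOWSIZE 500  /* assumed value; use the one from your own code */

/* Assumed layout of point2D; the startup code may define it differently. */
typedef struct { double x, y; } point2D;

typedef struct _treeNode treeNode;
struct _treeNode {
    point2D p;
    char type;               /* 'h', 'v', or 'l' */
    treeNode *left, *right;
};

typedef struct { point2D p1, p2; } segment;

/* Append the splitting segment of every internal node under node to out,
   starting at index count, and return the new count. The box
   [xmin,xmax] x [ymin,ymax] is the region owned by node. The caller must
   make sure out has room for one segment per internal node. */
int collect_segments(const treeNode *node,
                     double xmin, double xmax, double ymin, double ymax,
                     segment *out, int count) {
    if (node == NULL || node->type == 'l') return count;
    if (node->type == 'v') {
        /* Vertical line x = p.x, clipped to the node's y-range. */
        out[count].p1 = (point2D){ node->p.x, ymin };
        out[count].p2 = (point2D){ node->p.x, ymax };
        count++;
        count = collect_segments(node->left, xmin, node->p.x, ymin, ymax, out, count);
        count = collect_segments(node->right, node->p.x, xmax, ymin, ymax, out, count);
    } else {
        /* Horizontal line y = p.y, clipped to the node's x-range. */
        out[count].p1 = (point2D){ xmin, node->p.y };
        out[count].p2 = (point2D){ xmax, node->p.y };
        count++;
        count = collect_segments(node->left, xmin, xmax, ymin, node->p.y, out, count);
        count = collect_segments(node->right, xmin, xmax, node->p.y, ymax, out, count);
    }
    return count;
}
```

The initial call would pass 0 and WINDOWSIZE for both ranges, which is where the "infinity" bounds from the paragraph above enter.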
Ideally, you would develop this code from scratch. I have placed some startup code in the usual place: Code. You do not need to use this code. Use as much or as little as you want.
How to turn in: Please use the svn folder provided for the class! If you would like to change partners, just come talk to me or send me an email, and we'll create new svn folders for you.
The usual comments:
Always code assuming you'll have to debug. Think and structure your code incrementally so that it is easy to debug. Test one piece before you move on to the next one. Keep in mind that pointer errors do not always manifest, and sometimes they manifest in different ways on different computers.
And finally, if your code has bugs, you need to make it to the study group and talk to Max (the TA). He may not be able to tell you exactly what is wrong, but going over your code with someone else may show you what's wrong. I've been in many situations where someone would walk me through their code, only to find the problem as they were explaining how it works.
Enjoy!