First you will need to define a data structure to encode a kd-tree, such as the one below. Feel free to refine it as needed.
typedef struct _treeNode treeNode;

struct _treeNode {
    point2D p; /* If this is a leaf node, p represents the point stored in this leaf.
                  If this is not a leaf node, p represents the horizontal or vertical
                  line stored in this node. For a vertical line, p.y is ignored.
                  For a horizontal line, p.x is ignored. */
    char type; /* This can be 'h' (horizontal), 'v' (vertical), or 'l' (leaf),
                  depending on whether the node splits with a horizontal line or a
                  vertical line. Technically this should be an enum. */
    treeNode *left, *right; /* left/below and right/above children */
};

typedef struct _kdtree {
    treeNode* root;
    int count;  //number of nodes in the tree
    int height; //height of tree
} kdtree;
You'll need to write the basic primitives for operating on a treeNode and on a kdtree, such as creating a node and creating an empty tree, printing a node, and printing a tree.
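As a starting point, the primitives might look like the minimal sketch below. The names treeNode_create, kdtree_create, and treeNode_print are hypothetical (only kdtree_print appears in this handout), and the definition of point2D as a pair of doubles is an assumption, since the real type comes from your geometry header.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed point type; replace with the point2D from your geometry header. */
typedef struct { double x, y; } point2D;

typedef struct _treeNode treeNode;
struct _treeNode {
    point2D p;
    char type;               /* 'h', 'v', or 'l' */
    treeNode *left, *right;
};

typedef struct _kdtree {
    treeNode *root;
    int count;   /* number of nodes in the tree */
    int height;  /* height of the tree */
} kdtree;

/* Hypothetical helper: allocate a node with no children. */
treeNode *treeNode_create(point2D p, char type) {
    treeNode *node = malloc(sizeof *node);
    assert(node != NULL);
    node->p = p;
    node->type = type;
    node->left = NULL;
    node->right = NULL;
    return node;
}

/* Hypothetical helper: allocate an empty tree. */
kdtree *kdtree_create(void) {
    kdtree *t = malloc(sizeof *t);
    assert(t != NULL);
    t->root = NULL;
    t->count = 0;
    t->height = 0;
    return t;
}

/* Hypothetical helper: print a single node according to its type. */
void treeNode_print(const treeNode *node) {
    assert(node != NULL);
    switch (node->type) {
    case 'l': printf("leaf (%g, %g)\n", node->p.x, node->p.y); break;
    case 'v': printf("vertical line x = %g\n", node->p.x); break;
    case 'h': printf("horizontal line y = %g\n", node->p.y); break;
    default:  printf("unknown node type '%c'\n", node->type); break;
    }
}

/* Print basic info about the tree: number of nodes and height. */
void kdtree_print(kdtree *t) {
    assert(t != NULL);
    printf("kd-tree: %d nodes, height %d\n", t->count, t->height);
}
```

This sketch assumes that count and height are filled in by whatever code builds the tree; kdtree_print only reports them.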
For example, include a function that prints some basic info about the kd-tree, such as the number of nodes and the height. Call this function in the main function so that we can see its output.
void kdtree_print(kdtree* t);

The main function that you will write for Part 1 is building a kd-tree from a set of points. You will write a function to build and return a kd-tree as follows:
/* Build a kd-tree for the set of n points, where each leaf cell contains
   1 point. Return a pointer to the root. */
treeNode* kdtree_build(point2D* points, int n)

The function takes as input the set of points and returns the kd-tree.
int ndp; //number of distinct points
         //this should be made a global same as n
ndp = remove_coincident_points(points,n);
//ndp is smaller or equal to n
This helper function should remove the coincident points in points and return the number of distinct points. One way to do this is to sort points and then, in a subsequent pass, shift elements left whenever two neighbors are equal. Basically the array points will stay the same, except that its ndp distinct elements will appear at the beginning.
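The sort-then-compact idea can be sketched as follows. This is one possible version, not the required one: the comparison function cmp_xy is a hypothetical helper, point2D is assumed to be a pair of doubles, and the array ends up sorted by x (ties by y) as a side effect. Elements past index ndp-1 are left with stale values and should be ignored.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed point type; replace with the point2D from your geometry header. */
typedef struct { double x, y; } point2D;

/* Hypothetical comparison: order by x, breaking ties by y, so that
   coincident points end up next to each other after sorting. */
static int cmp_xy(const void *a, const void *b) {
    const point2D *p = a, *q = b;
    if (p->x != q->x) return (p->x < q->x) ? -1 : 1;
    if (p->y != q->y) return (p->y < q->y) ? -1 : 1;
    return 0;
}

/* Sort the points, then keep only the first point of each run of equal
   points. Returns the number of distinct points, which now occupy
   points[0..ndp-1]. */
int remove_coincident_points(point2D *points, int n) {
    int i, ndp;

    if (n <= 0) return 0;
    assert(points != NULL);

    qsort(points, (size_t)n, sizeof points[0], cmp_xy);

    ndp = 1; /* points[0] is always kept */
    for (i = 1; i < n; i++) {
        if (cmp_xy(&points[i], &points[ndp - 1]) != 0) {
            points[ndp] = points[i]; /* shift the new distinct point left */
            ndp++;
        }
    }
    return ndp;
}
```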
point2D *points_by_x, *points_by_y;
//allocate them of size ndp, copy data from points then sort them
You need to use system qsort for this by defining appropriate comparison functions.
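A minimal sketch of such comparison functions is shown below. The names cmp_by_x, cmp_by_y, and sorted_copy are hypothetical, and point2D is assumed to be a pair of doubles. Breaking ties on the other coordinate is a design choice of this sketch, not a requirement of qsort; it gives distinct points a strict order.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed point type; replace with the point2D from your geometry header. */
typedef struct { double x, y; } point2D;

/* Hypothetical comparison for qsort: by x, ties broken by y. */
int cmp_by_x(const void *a, const void *b) {
    const point2D *p = a, *q = b;
    if (p->x != q->x) return (p->x < q->x) ? -1 : 1;
    if (p->y != q->y) return (p->y < q->y) ? -1 : 1;
    return 0;
}

/* Hypothetical comparison for qsort: by y, ties broken by x. */
int cmp_by_y(const void *a, const void *b) {
    const point2D *p = a, *q = b;
    if (p->y != q->y) return (p->y < q->y) ? -1 : 1;
    if (p->x != q->x) return (p->x < q->x) ? -1 : 1;
    return 0;
}

/* Hypothetical helper: return a newly allocated copy of points[0..ndp-1],
   sorted with the given comparison. The caller frees the result. */
point2D *sorted_copy(const point2D *points, int ndp,
                     int (*cmp)(const void *, const void *)) {
    point2D *copy;

    assert(points != NULL && ndp > 0);
    copy = malloc((size_t)ndp * sizeof *copy);
    assert(copy != NULL);
    memcpy(copy, points, (size_t)ndp * sizeof *copy);
    qsort(copy, (size_t)ndp, sizeof *copy, cmp);
    return copy;
}
```

With these helpers, the two sorted arrays could be obtained with sorted_copy(points, ndp, cmp_by_x) and sorted_copy(points, ndp, cmp_by_y).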
kdtree_build_rec(point2D* points_by_x, point2D* points_by_y, int ndp, ...)

This helper function should build the kd-tree recursively. It should probably take the depth of the current node as a parameter and use it to decide whether to split vertically or horizontally.
The main challenge in this function will be to catch all the cases that can happen and make sure the recursion stops.
With an even number of points, take as the median either a value in between the two middle points or the smaller of the two middle points. In the latter case, make sure the point on the line goes into the left (or below) subtree, in order to avoid infinite recursion. Stop the recursion when the node contains 1 point (and perhaps earlier if necessary).
It's possible that all points go on one side. For example, consider the points

(2,6), (3,6), (3,5)

examined in the x-coordinate. The middle point is (3,6). But the third point's x-value is also 3, so it will go on the left side. Thus the entire array is passed to the next level. Then we examine the points in the y-coordinate:

(3,5), (3,6), (2,6)

The middle point is (3,6). But the third point has the same y-coordinate as the median, which means it will also go on the left side. Thus the entire array is passed to the next level again, i.e. infinite recursion. These points are not coincident but are collinear in just the wrong way to cause infinite recursion (example thanks to Rob).
You'll need to find a way to deal with this.
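One possible way out, sketched below, is to break ties on the other coordinate and split by position in the sorted order rather than by coordinate value. Because the points are distinct, no two of them compare equal, so the left part always has at least one point and the right part always has at least one point, and the recursion must stop. This sketch is an assumption-laden simplification: it re-sorts the subarray at every level instead of using the two presorted arrays, the names build_rec_sketch, node_new, and count_leaves are hypothetical, and point2D is assumed to be a pair of doubles. Note also that with this tie-breaking, points lying exactly on a splitting line can end up on either side of it.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed point type; replace with the point2D from your geometry header. */
typedef struct { double x, y; } point2D;

typedef struct _treeNode treeNode;
struct _treeNode { point2D p; char type; treeNode *left, *right; };

/* Compare by x, breaking ties by y. */
static int cmp_xy(const void *a, const void *b) {
    const point2D *p = a, *q = b;
    if (p->x != q->x) return (p->x < q->x) ? -1 : 1;
    if (p->y != q->y) return (p->y < q->y) ? -1 : 1;
    return 0;
}

/* Compare by y, breaking ties by x. */
static int cmp_yx(const void *a, const void *b) {
    const point2D *p = a, *q = b;
    if (p->y != q->y) return (p->y < q->y) ? -1 : 1;
    if (p->x != q->x) return (p->x < q->x) ? -1 : 1;
    return 0;
}

static treeNode *node_new(point2D p, char type) {
    treeNode *node = malloc(sizeof *node);
    assert(node != NULL);
    node->p = p;
    node->type = type;
    node->left = node->right = NULL;
    return node;
}

/* Build a subtree for n DISTINCT points. Even depths split vertically,
   odd depths horizontally. The first mid+1 points in sorted order go
   left/below, the rest go right/above. */
treeNode *build_rec_sketch(point2D *points, int n, int depth) {
    int vertical = (depth % 2 == 0);
    int mid;
    treeNode *node;

    if (n <= 0) return NULL;
    if (n == 1) return node_new(points[0], 'l');

    qsort(points, (size_t)n, sizeof points[0], vertical ? cmp_xy : cmp_yx);
    mid = (n - 1) / 2; /* the smaller of the two middle points when n is even */
    node = node_new(points[mid], vertical ? 'v' : 'h');
    node->left = build_rec_sketch(points, mid + 1, depth + 1);
    node->right = build_rec_sketch(points + mid + 1, n - mid - 1, depth + 1);
    return node;
}

/* Count the leaves; each leaf holds exactly one input point. */
int count_leaves(const treeNode *node) {
    if (node == NULL) return 0;
    if (node->type == 'l') return 1;
    return count_leaves(node->left) + count_leaves(node->right);
}
```

On the three points above, this sketch splits first on the vertical line through (3,5), sends (2,6) and (3,5) to the left and (3,6) to the right, and terminates with three leaves.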
To test your kd-tree, run it on sets of random points with values of n ranging from 1, 2, 3, 4, 5, ... up to 1000000. For each value of n, press the space bar to get a different set of random points. For small values of n you'll want to start by printing the entire tree. Once your code works for small n, you'll probably want to switch to just printing the info of the kd-tree (number of nodes and height). Write a few different functions for initialization (in addition to initialize_points_random()), for example
//initializes array points with n points on a horizontal line
void initialize_points_case1() {
    ..
}

//initializes array points with n points on a vertical line
void initialize_points_case2() {
    ..
}

//initializes array points with 3 points as in the example above
//that may trigger infinite recursion
void initialize_points_case3() {
    n=3;
    ..
}
//for each node in the tree in some order
{
    glBegin(GL_LINES);
    //identify the endpoints p1 and p2 of the line segment that you
    //need to draw
    glVertex2f(p1.x, p1.y);
    glVertex2f(p2.x, p2.y);
    glEnd();
}

The harder part of the rendering is identifying the endpoints of the line segment for each node. Note that the line x=x1 through the root is infinite in the y-direction. The lines in the nodes left and right of the root are infinite on one side, and bounded by x=x1 on the other side. And so on. The region corresponding to a node (and thus the endpoints of the segment that will split it) can be computed based on the ancestors of the node in the tree.
The input points are generated in the range [0,WINDOWSIZE] x [0, WINDOWSIZE]. Thus a value of infinity in x or y direction should be set to WINDOWSIZE.
Needless to say, develop your code gracefully. Start by drawing infinite lines through all nodes, then refine it to compute the proper segment.
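One way to compute the proper segments is to pass each node's region down the recursion, shrinking it at every split. The sketch below collects the segments into an array instead of drawing them, so it can be checked without OpenGL. The names segment and collect_segments are hypothetical, point2D is assumed to be a pair of doubles, and the caller is assumed to provide an output array large enough for all internal nodes.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed point type; replace with the point2D from your geometry header. */
typedef struct { double x, y; } point2D;

typedef struct _treeNode treeNode;
struct _treeNode { point2D p; char type; treeNode *left, *right; };

/* Hypothetical type: one splitting segment from (x1,y1) to (x2,y2). */
typedef struct { double x1, y1, x2, y2; } segment;

/* Visit the subtree rooted at node, whose region is
   [xmin,xmax] x [ymin,ymax], and append the clipped splitting segment of
   every internal node to out. *count is the number of segments so far. */
void collect_segments(const treeNode *node,
                      double xmin, double xmax, double ymin, double ymax,
                      segment *out, int *count) {
    segment s;

    assert(out != NULL && count != NULL);
    if (node == NULL || node->type == 'l') return; /* leaves draw no line */

    if (node->type == 'v') {
        /* vertical line x = p.x, clipped to the region in y */
        s.x1 = node->p.x; s.y1 = ymin;
        s.x2 = node->p.x; s.y2 = ymax;
        out[(*count)++] = s;
        collect_segments(node->left, xmin, node->p.x, ymin, ymax, out, count);
        collect_segments(node->right, node->p.x, xmax, ymin, ymax, out, count);
    } else {
        /* horizontal line y = p.y, clipped to the region in x */
        s.x1 = xmin; s.y1 = node->p.y;
        s.x2 = xmax; s.y2 = node->p.y;
        out[(*count)++] = s;
        collect_segments(node->left, xmin, xmax, ymin, node->p.y, out, count);
        collect_segments(node->right, xmin, xmax, node->p.y, ymax, out, count);
    }
}
```

The initial call would use the whole window as the region, for example collect_segments(root, 0, WINDOWSIZE, 0, WINDOWSIZE, out, &count). Each collected segment can then be drawn with two glVertex2f calls between glBegin(GL_LINES) and glEnd().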
Make sure that your code is C (C99 standard) and not C++. For example, you should NOT have
for (int i=0; i< ...);

As handy as it is to write loops like this, it is not standard C style. If in doubt, compile your code on dover. If your code compiles, you are all set. If not, fix it!
Enjoy!