The traditional measure of an algorithm's efficiency, the number of instructions performed, assumes that all the data fits in main memory. Massive data sets that do not fit in main memory are becoming more common, and they have led to a new measure of an algorithm's efficiency. This measure takes into account disk accesses, which are orders of magnitude slower than main memory accesses and usually dominate the running time. IO-efficient algorithms try to minimize both the number of instructions and the number of disk accesses.
This class covers basic algorithms and data structures, techniques and paradigms, and applications. It looks at examples where IO-efficient algorithms make a difference in practice, and at the extent of this difference.
The class will consist of lectures, paper readings and presentations, discussions, and programming projects.
The class was developed with the support of NSF award no. 0728780.
Prerequisites: csci 210 (Data Structures)
Office hours: Mon, Tue 3-4:30pm. For quick questions you can come to my office anytime.
Class Email: csci345 at bowdoin
Class webpage: http://www.bowdoin.edu/~ltoma/teaching/cs345/spring11/. All material will be available from this page throughout the semester. This class does not have a Blackboard site.
Week | Topic |
Week 1,2,3 | C, Makefiles, Emacs, Linux. |
Week 4,5 | Project: Experiencing the IO bottleneck. |
Week 6 | Paging and the VMM in the OS. |
Week 7,8,9,10 | The IO-model and IO-efficient algorithms (B-trees, IO-efficient sorting, list ranking, IO-efficient priority queues, IO-efficient flow accumulation, IO-efficient visibility, TBD). Techniques to improve data locality (space-filling curves). |
Week 11,12,13,14 | Projects. |
Here is what happened in class.