COMSE6998
Advanced Data Structures
Welcome to COMSE6998 Advanced Data Structures!
Time: Thursdays, 10:00-12:00.
Location: 963 Schermerhorn Hall Extension.
The course will cover data structures in 4 main themes: Integers, Geometry, Graphs and Strings
(information retrieval). We will spend time on both upper and lower bounds in various memory models.
Final projects
Here is the list of suggested final projects.
Final projects are due on Friday, May 10th.
Assignments
Submit to CourseWorks.
- First homework, due February 21st (Thursday) at 6pm.
- Second homework, due March 9th (Saturday) at 11:59pm.
- Third homework, due March 28th (Thursday) at 6pm.
- Fourth homework, due April 11th (Thursday) at 6pm.
- Fifth homework, due May 4th (Saturday) at 6pm.
Lecture notes
- Lecture 1.
- Lecture 2.
- Lecture 3.
- Lecture 4.
- Lecture 5.
- Lecture 6.
- Lecture 7.
- Lecture 8.
- Lecture 9.
- Lecture 10.
- Lecture 11.
- Lecture 12.
- Lecture 13.
Syllabus
- Introduction (Motivation, DS and memory models, static/dynamic, performance guarantees).
- Advanced binary search trees (BST model of computation, instance optimality, Splay Trees).
- Dynamic optimality conjecture (geometric view, Arboral sets, Wilber’s LB, Tango Trees).
- Predecessor search UBs on the RAM: van-Emde Boas Trees and Fusion Trees (stratified trees, Indirection).
- Predecessor search LB via Round-Elimination (asymmetric communication complexity paradigm).
- Near-Neighbor Search (locality-sensitive hashing, Dynamization via Segment Trees, LBs from
lopsided set disjointness and metric embeddings, Cell-Sampling).
- Orthogonal Range Counting (Layered Range Trees, Fractional Cascading, Dynamization via BB[a] Trees,
delaying updates via Buffer Trees).
- Tight LBs for dynamic Range Counting in 1D and 2D (Chronogram method, dynamic Cell Sampling).
- Dictionaries and Hashing (Perfect Hashing, Cuckoo Hashing, dynamic dictionaries).
- Succinct DS for Information Retrieval (Suffix arrays, compressed pattern matching, Rank UB/LB).
- Dynamic undirected connectivity (Link-Cut Trees, Euler-Tour Trees, Distance Oracles).
- Tight LB for undirected connectivity (Information-Transfer method). The “Multiphase” Conjecture.
References
- Data Structures and Network Algorithms (by R.Tarjan — covers BSTs, splay trees, link-cut trees).
- Open Data Structures (by P.Morin — covers BSTs, B-trees, hashing, and some integer data structures).
- Advanced data structures (by Peter Brass. Open access e-book available on the web).
- Advanced Data Strucutres course at MIT (by Erik Demaine)
Data Structure Simulator
https://www.cs.usfca.edu/~galles/visualization/RedBlack.html
Logistics
There will be bi-weekly homeworks (~5-6 in total). Grading will be based on HW (35%), scribing
1 lecture (15%) and a final project (50%), which will entail exploring and extending a research
paper of your choice (to be discussed and approved by the instructor and the TAs).
TA: Gleb Posobin (Room : 522 CSB. Office Hours: Tuesday 4−6pm)
Instructor: Omri Weinstein (523 CSB. Office Hours: By appointment).