Administrative course information is available here
There will be created a mailing list for the course using your email addresses in Fronter.
There is not a textbook for the course. See below for mandatory readings.
· Lars Ailo Bongo (larsab@cs.uit.no), Office: A259
· Ibrahim Umar (ibrahim.umar@uit.no), Office: A135
All lectures are either on Tuesdays, 14:15-16:00 (MH U6) or Fridays, 11:15-12:00 (Store Aud).
Lecture |
Date |
Subject |
Lecturer |
L1 |
Fri
16.08 |
Lars
Ailo |
|
L2 |
Tue 20.08 |
Lars Ailo |
|
L3 |
Tue
20.08 |
Lars
Ailo |
|
V1 |
Fri 30.08 |
video |
|
L4 |
Tue
27.08 |
Lars
Ailo |
|
L5 |
Tue 03.09 |
Lars Ailo |
|
L6 |
Tue
10.09 |
Giacomo
Tartari |
|
L7 |
Tue 27.09 |
Lars Ailo |
|
L8 |
Tue
24.09 |
Lars
Ailo |
|
L9 |
Tue 01.10 |
Guest lecture: Stallo |
Jonas Juselius |
L10 |
Tue
08.10 |
Lars
Ailo |
|
Tue 15.10 |
No lecture |
||
L11 |
Tue
22.10 |
Lars
Ailo |
|
L12 |
Tue 29.10 |
BigTable , Hbase, and Spark (no slides) |
Lars Ailo |
|
|
|
|
L13 |
Tue 12.11 |
Summary (1st hour) |
Lars Ailo |
|
|
Course
evaluation (2nd hour) |
Jan
Fuglesteg |
|
Tue 19.11 |
No lecture |
|
|
Fri
22.11 |
Exam |
|
All are mandatory unless otherwise noticed.
1. Introduction
a. None
2. Threads and synchronization primitives
a. Modern Operating Systems, 3ed. Andrew S. Tanenbaum. Prentice Hall. 2007. Chapters 2.2, 2.3, 2.5, 10.3, 11.4
b. Alternatively: the chapters about threading, IPC mechanisms, and classical IPC problems from another operating systems textbook.
c. Additional reading: Futexes Are Tircky
3. Parallel architectures
a. Computer Organization and Design: the Hardware/Software Interface, 4ed. David A. Patterson, John L. Hennessy. Morgan Kaufmann. 2011. Chapter 8: “Multicores, Multiprocessors, and Clusters”.
4. Parallel programs
a. Chapter 1.3 and 2.0—2.3.5 from Parallel Computer Architecture: A Hardware/Software Approach, J.P. Singh, Anoop Gupta. Morgan Kaufmann. 1998.
5. Programming for performance
a. Chapter 3 from Parallel Computer Architecture: A Hardware/Software Approach, J.P. Singh, Anoop Gupta. Morgan Kaufmann. 1998.
6. Go
a. Rob Pike. SPLASH keynote talk
b. A tour of Go
d. Effective Go
e. Go concurrency patterns (video, slides)
7. Performance evaluation
a. Jayadev Misra. 1986. Distributed discrete-event simulation. ACM Comput. Surv. 18, 1 (March 1986), 39-65.
b. J. Liedtke. 1995. On micro-kernel construction. In Proceedings of the fifteenth ACM symposium on Operating systems principles (SOSP '95),.
c. Additional reading: Yanpei Chen, Kiran Srinivasan, Garth Goodson, and Randy Katz. 2011. Design implications for enterprise storage systems via multi-dimensional trace analysis. In Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles (SOSP '11).
d. Additional reading: Kirk Glerum, Kinshuman Kinshumann, Steve Greenberg, Gabriel Aul, Vince Orgovan, Greg Nichols, David Grant, Gretchen Loihle, and Galen Hunt. 2009. Debugging in the (very) large: ten years of implementation and experience. In Proceedings of the ACM SIGOPS 22nd symposium on Operating systems principles (SOSP '09).
8. Parallel program performance evaluation
a. Amdahl’s law
c. Christian Bienia, Sanjeev Kumar, Jaswinder Pal Singh and Kai Li. The PARSEC Benchmark Suite: Characterization and Architectural Implications. Proceedings of the 17th International Conference on Parallel Architectures and Compilation Techniques, October 2008.
9. Stallo
a. None
10. Data-intensive computing
a. Introduction by Jim Gray (on eScience), and Gray’s law (pp. 5—11) from The Fourth Paradigm: Data-Intensive Scientific Discovery. Edited by Tony Hey, Stewart Tansley, and Kristin Tolle.
b. On the Future of Genomic Data. Scott D. Kahn. Science 11 February 2011: 331 (6018), 728-729.
c. Data, data everywhere. The Economist Feb 25th, 2010.
d. Three Paradoxes of Big Data. Neil M. Richards and Jonathan H. King, Stanford Law Review Online, 2013.
e. Additional reading: rest of The Fourth Paradigm: Data-Intensive Scientific Discovery.
f. Additional reading: rest of Science special issue on Big Data.
g. Additional reading: rest of The Economist special report on the data deluge
11. MapReduce & BigTable
c. Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung. 2003. The Google file system. In Proceedings of the nineteenth ACM symposium on Operating systems principles (SOSP '03). ACM, New York, NY, USA, 29-43.
d. Jeffrey Dean and Sanjay Ghemawat. 2008. MapReduce: simplified data processing on large clusters. Commun. ACM 51, 1 (January 2008), 107-113.
12. Data analytics
a. Hbase tutorial (or another Hbase tutorial)
b. Fay Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, and Robert E. Gruber. 2008. Bigtable: A Distributed Storage System for Structured Data. ACM Trans. Comput. Syst. 26, 2, Article 4 (June 2008), 26 pages.
c. Matei Zaharia, Mosharaf Chowdhury, Michael J. Franklin, Scott Shenker, Ion Stoica. Spark: Cluster Computing with Working Sets. HotCloud 2010. June 2010.
13. Summary
a. None
· Parallel Computer Architecture: A Hardware/Software Approach. David Culler, J.P. Singh, Anoop Gupta. Morgan Kaufmann. 1998.
o This book has a great introduction to parallel programming.
o There is one copy in the library. Please be nice to your fellow students and do not lend that copy for an extended period of time.
· The Art of Computer Systems Performance Analysis: Techniques for Experimental Design, Measurement, Simulation, and Modeling. R. K. Jain. Wiley. 1991.
o A very good book about performance analysis.
o There is one copy in the library. Please be nice to your fellow students and do not lend that copy for an extended period of time.
· Computer Architecture, Fifth Edition: A Quantitative Approach, 5ed. John L. Hennessy, David A. Patterson. Morgan Kaufmann. 2011.
o This book has a throughout description of different parallel architectures.
o The book can be purchased from your favorite bookstore.
· The Fourth Paradigm: Data-Intensive Scientific Discovery. Edited by Tony Hey, Stewart Tansley, and Kristin Tolle. 2010.
o This collection of essays describe many of the opportunities and challenges for data-intensive computing in different scientific fields.
o The book is freely available as an ebook.
· Hadoop: The Definitive Guide, 3ed. Tom White. O’Reilly. 2012.
o Nice overview of the Hadoop ecosystem, included detailed description of HDFS and hadoop MapReduce.
o This book is available in the library as a Safari ebook.
· HBase: The Definitive Guide. Lars George. O’Reilly. 2012.
o Detailed description of HBase included tips for tuning the system.
o This book is available in the library as a Safari ebook.
· ACM parallel computing techpack
· Intel Academic Resources especially Introduction to Parallel Programming for Shared Memory Parallelism (Intel)
· Python threads and Python multiprocessing
· The Go programming language, Go Wiki, Go blog, Go examples, and Go forum
· Communicating Sequential Processes by C. A. R. Hoare, CACM vol. 21 (8), 1978. (paper describes concepts used by Go)
· Parallel programming patterns
1. Introduction
a. None
2. Threads and synchronization primitives
a. Compare the overhead of forking a process vs. creating a Pthread
b. Compare the overhead of forking a process vs. creating a Python thread
c. Implement a solution the following classical IPC problems using pthreads/Python threads and semaphores/condition variables. Note that you also need to generate a use case, test data, and useful output:
i. Producer/ consumer
ii. Reader/ writer
iii. Sleeping barber
iv. Dining philosophers
d. Modify the code in c) to use message passing.
3. Parallel architectures
a. None
4. Parallel programs
a. Implement the kernel in chapter 2.3 with shared memory orchestration using Python threads. You also need to generate a use case, test data, and some way to see the result (tip: if you use a heatmap visualization you can visually see the changes in the output as the computation proceeds).
5. Programming for performance
a. Implement a tuple space in Python with semantics similar to Linda. Use your tuple space to implement a parallel version of Mandelbrot that uses dynamic assignment (pool of tasks).
6. Go
a. Implement the classical IPC problems in exercise 2.c. in Go.
7. Performance evaluation
a. Find two systems research papers. Study the methodology. Do they follow the recommended steps for measurements? Do they avoid all common mistakes? How have they monitored the system? Did they use simulation? Did they presented an analytical model?
8. Parallel program performance evaluation
a. Go through PARSEC paper and find out how they did each of the “Steps for a performance evaluation study”.
9. Stallo guest lecture
a. None
10. Data-intensive computing
a. Read a few of the additional readings articles.
11. GFS/HDFS and MapReduce
a. Implement the programs in the MapReduce tutorial
b. Implement the programs in the Pydoop tutorial
12. BigTable/HBase:
a. Implement the programs in the Hbase tutorial
13. Summary
a. Sample exam
All group/labs are on Thursdays, 12:15-14:00, MH U6 or lg-lab.
Project |
Date |
Subject |
Lecturer |
P1 |
Thu
22.08 |
Ibrahim |
|
|
Thu 12.09 |
P1 due |
|
P2 |
Thu
20.09 |
Lars
Ailo |
|
|
Thu 24.10 |
P2 due |
|
P3 |
Thu
24.10 |
Lars
Ailo |
|
|
Tue 05.11 |
P3 due |
|
The exam is on Friday November 22nd, 09.00-13.00, Åsgårdvegen 9. More details are below.
Time and place for written exams - autumn semester 2013
Overview of the time and place for the written examinations in the
autumn of 2012 can be found here:
http://uit.no/Content/105768/oppslag%2Cuts%2Ckont.%2CH%C3%98ST-13.pdf
The schedule for the written exams in computer science can be found on
page 6 in the document.
NB! Check carefully the time and venue for the
specific examination. Remember, you must identify yourself at the exam. Bring
valid ID.
Show up early, to sit in place no later than 15 minutes before the
exam starts.
Rules during examination (written exams) ca
be found here: http://en.uit.no/utdanning/art?dim=179018&p_document_id=347763