RIT Computer Engineering




   Presentations 2:   Thursday December 10
           5:00-7:00 PM in GLE 9/3149.

           Email talks by 2:00 PM Wednesday, December 9

   Presentations 3:   Tuesday December 15
           5:00-7:00 PM in GLE 9/3149.

           Email talks by 2:00 PM Monday, December 14

Project papers due at the time of the second presentations.
Email the paper and bring a hard copy to the talks.


This course covers a number of important issues in the design and use of high-performance parallel computing systems, including: parallel computer models, parallel program characteristics and creation steps, parallel system performance evaluation, the concept of scalable performance, parallel and scalable architectures, parallel programming concepts, network requirements for parallel computing, and cache coherence in shared-memory machines. A number of current parallel machines will be studied.


  Assignment #1:  PDF, Word, Zipped file (contains sequential code files).

            Due Tuesday, October 27 (@ 23:59).

            Submit to myCourses "Assignment 1" Dropbox.

  Assignment #2:  PDF, Word, Zipped file.

            Due Thursday, November 19 (@ 23:59).

            Submit to myCourses "Assignment 2" Dropbox.

  CE Cluster information:  PDF, Word (also handed out in class).

  Additional programming help and information are available at: mps.ce.rit.edu

  Homework Assistants/TAs/Graders:

            Stephen Moskal     e-mail:   sfm5015@rit.edu

            Vignesh Kothandapani     e-mail:   vxk5499@rit.edu

  CE Cluster System Administrators:

            Emilio Del Plato     e-mail:   ehdeec@rit.edu

            Paul Mezzanini     e-mail:   pfmeec@rit.edu


Each of the following lectures can be downloaded or viewed as an Acrobat PDF file or as a Microsoft PowerPoint file:

8-25-2015 The Need and Feasibility of Parallel Computing, Technology Trends, Microprocessor Performance Attributes, Goal of Parallel Computing. Computing Elements, Programming Models, Flynn's Classification, Multiprocessors Vs. Multicomputers. Current Trends In Parallel Architectures, Communication Architecture.
(PCA Chapter 1.1, 1.2)

9-1-2015 Parallel Architectures Convergence: Communication Architecture, Communication Abstraction. Naming, Operations, Ordering, Replication. Communication Cost Model.
(PCA Chapter 1.2, 1.3)

9-8-2015 Parallel Computations/Programs: Conditions of Parallelism. Asymptotic Notations for Algorithm Analysis, PRAM. Levels of Parallelism, Hardware Vs. Software Concurrency. Data Vs. Functional Parallelism. Amdahl's Law, DOP, Concurrency Profile. Steps in Creating Parallel Programs: Decomposition, Assignment, Orchestration, Mapping.
(PCA Chapter 2.1, 2.2)

9-24-2015 Parallelization of An Example Problem/Program: Ocean simulation Iterative equation solver (2D Grid).
(PCA Chapter 2.3)

10-6-2015 Cluster Computing: Origins, Broad Issues in Heterogeneous Computing (HC). Message-Passing Programming. Overview of Message Passing Interface (MPI 1.2).
(PP Chapter 2, Appendix A, MPI and HC References Below)

10-20-2015 Considerations in Parallel Program Creation Steps for Performance.
(PCA Chapter 3)

10-29-2015 Basic/Fundamental Parallel Computing/Programming Techniques and Examples. Massively Parallel Computations: Pixel-based Image Processing. Divide-and-conquer Problem Partitioning: Parallel Bucket Sort, Numerical Integration, Gravitational N-Body Problem. Pipelined Computations: Addition, Insertion Sort, Solving Upper-triangular System of Linear Equations. Synchronous Iteration: Barriers, Iterative Solution of Linear Equations. Dynamic Load Balancing: Centralized, Distributed, Moore's Shortest Path Algorithm.
(PP Chapters 3-7, 12)

11-10-2015 Network Properties, Scalability and Requirements For Parallel Processing. Static Point-to-point Connection Network Topologies. Network Embeddings. Dynamic Connection Networks.
(PP Chapter 1.3, PCA Chapter 10, handout)

11-19-2015 Parallel System Performance: Evaluation & Scalability. Workload Selection. Parallel Performance Metrics Revisited. Application/Workload Scaling Models of Parallel Computers. Parallel System Scalability.
(PP Chapter 1, PCA Chapter 4, handout)

11-24-2015 The Cache Coherence Problem in Shared Memory Multiprocessors. Cache Coherence Approaches. Bus-Snooping (Snoopy) Cache Coherence Protocols: Write-Invalidate (MSI, MESI), Write-Update (Dragon).
(PCA Chapter 5, handout)

12-1-2015 Cache Coherence in Scalable Distributed Memory Machines: Hierarchical Snooping, Directory-based cache coherence.
(PCA Chapter 8)


Tuesday and Thursday: 5:00-6:15 PM, 9/3149


Dr. Muhammad Shaaban
e-mail: meseec@rit.edu
Office: 9-3469   X52373

Office Hours:
My Fall 2015 schedule


Quizzes: 40%
Homework: 30%
Special Topic paper and presentation: 30%

Quizzes are announced one class in advance and are given only during the first thirty to forty minutes of the specified class. Quizzes are closed-reference (e.g. no books, notes, handouts, etc.). Calculators will be helpful. There are no makeup quizzes.

Special Topic paper and presentation:
Students will select one partner from the class to research a topic in the field of parallel processing & parallel computer architecture, write a report, and give a presentation. Each group's topic must be presented to and approved by Dr. Shaaban. Duplicate topics are not permitted, and proposals are accepted on a first-come, first-served basis.

The Paper: Each group will write a joint report (~6-8 pages) on their research using the IEEE journal format/guidelines/template. DO NOT CHANGE THE TEMPLATE! Take great care in following the guidelines, especially when citing illustrations and graphs and when quoting from sources. The paper is due (hard copy and electronic) at the beginning of the last day of the presentations. Late submissions will be significantly penalized. Plagiarism will result in a zero (see page 14 of the KGCOE 2014-2015 Graduate Student Handbook).

The Presentation: Each group will give a 20-minute PowerPoint presentation of their research to the entire class. This is a joint presentation and the group must be thoroughly prepared to answer questions. A signup sheet for a time slot will be available towards the end of the quarter. Attendance is mandatory for all presentation sessions. Missing your presentation slot and/or the electronic submission deadline will result in a zero. You must submit your Microsoft PowerPoint presentation electronically to Dr. Shaaban 24 hours prior to your presentation time slot. Samples of prior presentations are available on the course website.


Current:           http://people.rit.edu/meseec/cmpe655-fall2015/
Fall 2014:        http://people.rit.edu/meseec/cmpe655-fall2014/
Spring 2014:    http://people.rit.edu/meseec/cmpe655-spring2014/
Fall 2013:        http://people.rit.edu/meseec/cmpe655-fall2013/
Spring 2013:    http://people.rit.edu/meseec/eecc756-spring2013/
Spring 2012:    http://people.rit.edu/meseec/eecc756-spring2012/
Spring 2011:    http://people.rit.edu/meseec/eecc756-spring2011/
Spring 2010:    http://people.rit.edu/meseec/eecc756-spring2010/
Spring 2009:    http://people.rit.edu/meseec/eecc756-spring2009/
Spring 2008:    http://people.rit.edu/meseec/eecc756-spring2008/
Spring 2007:    http://people.rit.edu/meseec/eecc756-spring2007/
Spring 2006:    http://people.rit.edu/meseec/eecc756-spring2006/
Spring 2005:    http://people.rit.edu/meseec/eecc756-spring2005/
Spring 2004:    http://people.rit.edu/meseec/eecc756-spring2004/
Spring 2003:    http://people.rit.edu/meseec/eecc756-spring2003/
Spring 2002:    http://people.rit.edu/meseec/eecc756-spring2002/
Spring 2001:    http://people.rit.edu/meseec/eecc756-spring2001/
Spring 2000:    http://people.rit.edu/meseec/eecc756-spring2000/
Spring 99:       http://people.rit.edu/meseec/eecc756-spring99/
Spring 98:       http://people.rit.edu/meseec/eecc756-spring98/


Computer Architecture CMPE-550. Working knowledge of the C language.



Parallel Computer Architecture (PCA): A Hardware/Software Approach, David E. Culler, Jaswinder P. Singh, Morgan Kaufmann Publishers, 1999.

Parallel Programming (PP): Techniques and Applications Using Networked Workstations and Parallel Computers, Second Edition, Barry Wilkinson, Michael Allen, Prentice Hall, 2004.


Designing and Building Parallel Programs, Ian Foster, Addison-Wesley, 1995, complete textbook online (includes a chapter on MPI).

Scalable Parallel Computing, Kai Hwang, Zhiwei Xu, McGraw-Hill, 1998.

Advanced Computer Architecture: Parallelism, Scalability, Programmability, Kai Hwang, McGraw-Hill, 1993.

Parallel Virtual Machine (PVM):

PVM (Parallel Virtual Machine) Home Page

PVM: Parallel Virtual Machine: A Users' Guide and Tutorial for Networked Parallel Computing, Al Geist (Editor), et al., MIT Press, 1994, complete online version, PDF version, PostScript version.

Advanced Tutorial on PVM 3.4 New Features and Capabilities, Al Geist, presented at EuroPVM-MPI'97, 1997.

Message-Passing Interface (MPI):

Open MPI Home Page.

MPI Home Page.

MPICH Home Page A Portable Implementation of MPI.

Beginner's Guide to MPI, University of Delaware.

MPI: The Complete Reference, Marc Snir et al., First Edition, 1995 (HTML version of book).

Heterogeneous Computing (HC)

Heterogeneous Computing: Challenges and Opportunities, PDF,
Ashfaq A. Khokhar, Viktor K. Prasanna, Muhammad E. Shaaban, Cho-Li Wang
IEEE Computer, June 1993 (Vol. 26, No. 6), pp. 18-27.

Heterogeneous Distributed Computing, PDF,
Muthucumaru Maheswaran, Tracy D. Braun, Howard Jay Siegel,
Edited version of a chapter appearing in the Encyclopedia of Electrical and Electronics Engineering, J. G. Webster, editor, John Wiley & Sons, New York, NY, 1999 Vol. 8, pp. 679-690.

A Comparison Study of Static Mapping Heuristics for a Class of Meta-tasks on Heterogeneous Computing Systems, PDF,
Tracy D. Braun, Howard Jay Siegel, Noah Beck, Ladislau L. Boloni, Albert I. Reuther, Mitchell D. Theys, Bin Yao, Richard F. Freund
Proceedings 8th Heterogeneous Computing Workshop, 1999. (HCW 1999), 1999, pp. 15-29.

Segmented min-min: a static mapping algorithm for meta-tasks on heterogeneous computing systems, PDF,
Min-You Wu, Wei Shu, H. Zhang,
Proceedings 9th Heterogeneous Computing Workshop, 2000. (HCW 2000), 2000, pp. 375-385.

Performance-Effective and Low-Complexity Task Scheduling for Heterogeneous Computing, PDF,
H. Topcuoglu, S. Hariri, M.Y. Wu,
IEEE Transactions on Parallel and Distributed Systems, March 2002 (Vol. 13, No. 3).

Greedy Heuristics for Resource Allocation in Dynamic Distributed Real-Time Heterogeneous Computing Systems, PDF,
S. Ali, J. Kim, H. J. Siegel, A. Maciejewski, Y. Yu, S. Gundala, S. Gertphol, V. K. Prasanna,
Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA '02), June 2002, (Volume 2), pp. 519-530.

Selected papers.

EECC756 Spring 1999 Class Projects.


Attending all lecture sessions is expected.


Week 1: Motivation for Parallel Computing, Parallel Programming Models, Classification of Parallel Architectures.
Week 2: Parallel Architectures Convergence: Communication Architecture, Communication Abstraction. Naming, Operations, Ordering, Replication. Communication Cost Model.
Week 3: Parallel Program Characteristics and Creation Steps.
Program Parallelization Example: Ocean simulation Iterative equation solver.
Weeks 4-5: Heterogeneous & Cluster Computing. Message-Passing Programming & Environments.
Message Passing Interface (MPI).
Weeks 6-7: Parallel Programming for Performance.
Weeks 8-9: Message-Passing Computing Examples.
Weeks 10-11: Network Requirements For Parallel Computing. Static Point-to-point Connection Network Topologies. Dynamic Connection Networks.
Week 12: Parallel System Performance: Evaluation & Scalability Metrics.
Week 13: Shared Memory Multiprocessors. The Cache Coherence Problem. Scalable Distributed Memory Parallel Machines.
Week 14: Cache Coherence in Scalable Distributed Memory Parallel Machines.
Weeks 15-17: Project Presentations.

