CS385 – Computer Architecture (Section 01)

Spring-2018

Classes: MW 3:05pm - 4:20pm, Maria Sanford Hall 204
Instructor: Dr. Zdravko Markov, MS 30307, (860)-832-2711, http://www.cs.ccsu.edu/~markov/, e-mail: markovz at ccsu dot edu
Office hours: MW 4:30pm - 6:00pm, TR 10:45am - 12:00pm, or by appointment

Catalog description: The architecture of the computer is explored by studying its various levels: physical level, operating-system level, conventional machine level and higher levels. An introduction to microprogramming and computer networking is provided.

Course Prerequisites: CS 354

Prerequisites by topic

Course description

The course provides comprehensive coverage of computer architecture. It discusses the main components of the computer and the basic principles of its operation. It demonstrates the relationship between the software and the hardware and focuses on the foundational concepts that are the basis for current computer design. The course is based on the MIPS processor, a simple, clean RISC processor whose architecture is easy to learn and understand. The major topics covered in the course are the following:
  1. MIPS instruction set
  2. Computer arithmetic and ALU design
  3. Datapath and control
  4. Using Hardware Description Language to design and simulate the CPU
  5. Pipelining
  6. Memory hierarchy, caches and virtual memory
  7. Interfacing CPU and peripherals, buses
  8. Multiprocessors, networks of multiprocessors, parallel programming
  9. Performance issues

Course Learning Outcomes (CLO)

  1. Understand the fundamentals of different instruction set architectures and their relationship to the CPU design.
  2. Understand the principles and the implementation of computer arithmetic.
  3. Understand the operation of modern CPUs including pipelining, memory systems and buses.
  4. Understand the principles of operation of multiprocessor systems and parallel programming.
  5. Design and emulate a single-cycle or pipelined CPU from given specifications using Hardware Description Language (HDL).
  6. Work in teams to design and implement CPUs.
  7. Write reports and make presentations of computer architecture projects.
The CS 385 Course Learning Outcomes support the following Student Outcomes (SO):

Required textbook

Required software

  1. Icarus Verilog: HDL compiler and simulator, available from the book companion website or at http://bleyer.org/icarus/. Note about installation: don't use folder names that include spaces (like Program Files). Read book sections B.4 and 5.8 for using HDL. An online simulator is available at https://www.tutorialspoint.com/compile_verilog_online.php
  2. SPIM simulator: A free software simulator for running MIPS R2000 assembly language programs available for Windows and other platforms.
  3. Other simulators that may be used for drawing logic diagrams and experimenting with small circuits (note that the semester project should be done with Verilog):
Semester project: There will be a semester project to build a simplified MIPS machine. The projects will be done in teams of 2-3 people each and will require three progress reports, a final report and a presentation. The machine must be implemented in HDL Verilog, tested with a sample MIPS program and properly documented. The progress and the final reports must be submitted in Blackboard Learn at https://ccsu.blackboard.com/.

Class Participation: Active participation in class is expected of all students. Regular attendance is also expected. If you must miss a class, try to inform the instructor of this in advance. In case of missed classes and work due to plausible reasons (such as illness or accidents) limited assistance will be offered. Unexcused absences will result in the student being totally responsible for the make-up process.

Honesty policy: The CCSU honor code for Academic Integrity is in effect in this class. It is expected that all students will conduct themselves in an honest manner and NEVER claim work which is not their own. Violating this policy will result in a substantial grade penalty, and may lead to expulsion from the University. You may find it online at http://web.ccsu.edu/academicintegrity/. Please read it carefully.

Grading: Grading will be based on one programming assignment (10%), a midterm test (20%), a final exam (25%) and a semester project (45%, including progress reports, the final documentation, and the presentation). The letter grades will be calculated according to the following table:
 
A       A-     B+     B      B-     C+     C      C-     D+     D      D-     F
95-100  90-94  87-89  84-86  80-83  77-79  74-76  70-73  67-69  64-66  60-63  0-59

Unexcused late submission policy: Submissions made more than two days after the due date will be graded one letter grade down. Submissions made more than a week late will receive two letter grades down. No submissions will be accepted more than two weeks after the due date.

Students with disabilities: Students who believe they need course accommodations based on the impact of a disability, medical condition, or emergency should contact me privately to discuss their specific needs. I will need a copy of the accommodation letter from Student Disability Services in order to arrange class accommodations. Contact Student Disability Services, Willard Hall, 101-04 if you are not already registered with them. Student Disability Services maintains the confidential documentation of your disability and assists you in coordinating reasonable accommodations with your faculty.


Tentative schedule of classes and assignments

Note: Dates for classes, assignments and tests may change (see also University Calendar). The lecture notes may also be updated. Check the schedule and the class pages regularly for updates!
  1. Jan 22: Introduction: Computer Architecture = Instruction Set Architecture + Machine Organization
  2. Jan 24: Review of HDL
  3. Jan 29: MIPS Instructions: arithmetic, registers, memory, fetch & execute cycle
  4. Jan 31: MIPS Instructions: control and addressing modes
  5. Jan 31: Submit Digital Design Review Assignment (extra credit)
  6. Feb 5: Computer arithmetic and ALU design: representing numbers, arithmetic and logic operations
  7. Feb 7: Snow day
  8. Feb 12: ALU design: full adder, slt operation, HDL design, carry lookahead
  9. Feb 12: Assignment 1 due (10 pts.)
  10. Feb 14: ALU design: multiplication, representing floating point numbers
  11. Feb 21: The Processor: Building a datapath
  12. Feb 26: The Processor: Control (single cycle approach)
  13. Feb 28: Using a Hardware Description Language to Design and Simulate the MIPS processor. Review of Semester Project Reports #1 and #2
  14. March 5: Introduction to pipelining
  15. March 7: Progress Report #1 due (10 pts.): A simplified single-cycle datapath capable of executing the addi instruction and all R-type instructions. See Semester Project for details.
  16. March 7: Snow day
  17. March 19: Solving pipeline hazards
  18. March 21: Snow day
  19. March 23-25: Midterm Test (20 pts.) to be taken online in Blackboard
  20. March 26: Implementing pipeline datapath and control, Implementing data and branch hazards control
  21. March 28: Review of Datapath, Control and Pipelining, HDL implementation
  22. April 2: Progress Report #2 due (10 pts.): Complete single-cycle datapath. See Semester Project for details.
  23. April 2: Memory hierarchy
  24. April 4: The Basics of caches
  25. April 9: Improving cache performance
  26. April 11: Virtual Memory basics
  27. April 16: Progress Report #3 due (10 pts.): 3-stage pipelined datapath for addi and R-type instructions. See Semester Project for details.
  28. April 16: Virtual Memory optimization
  29. April 18: A general framework of memory hierarchies
  30. April 23: Interfacing Processors and Peripherals - Buses
  31. April 25: Interfacing I/O devices to Memory, CPU and OS
  32. April 30: Multiprocessors
  33. May 2: Networks of multiprocessors, Review of Memory System, Buses, I/O and Multiprocessors
  34. May 7-10: Final Exam (25 pts.) to be taken online in Blackboard
  35. May 10: Semester Project Final Report due (10 pts.)
  36. May 7, 1:00pm - 3:00pm: Semester project presentation (5 pts.)

CS385 – Computer Architecture, Lecture 1

Reading: Chapter 1
Topics: Introduction, Computer Architecture = Instruction Set Architecture + Machine Organization.
Lecture slides (PDF)

Lecture Notes

  1. Levels of Abstraction
  2. Computer Architecture = Instruction Set Architecture + Machine Organization
  3. Instruction Set – The Software Hardware Interface
  4. Levels of Computer Architecture in More Depth
  5. Basic Components of a Computer
  6. Computer Organization
  7. Performance (Lecture slides (PDF))

CS385 – Computer Architecture, Lecture 2

Reading: Patterson & Hennessy - Sections 2.1 - 2.3, 2.5 - 2.7, 2.10, 2.13, 2.16 - 2.20, A.9, Tutorials/Getting Started with PCSpim (book companion website).
Topics: MIPS instructions, arithmetic, registers, memory, fetch & execute cycle, SPIM simulator
Lecture slides (PDF)

Lecture Notes

  1. Design goal: maximize performance and minimize cost. Primitive (low level) and very restrictive instructions (fixed number and type of operands).
  2. Design principles:
  3. MIPS arithmetic: 3 operands, fixed order, registers only.
  4. Using only registers: R-type instructions.
  5. Registers: 32-bits long, conventions.
  6. Memory organization: words and byte addressing.
  7. Data transfer (load and store) instructions. Example: accessing array elements.
  8. Translating C code into MIPS instructions – the swap example.
  9. Machine Language: instruction format, I-type (Immediate) format for data transfer
  10. Stored program concept: programs in memory, fetch & execute cycle
Exercises: Load this program in the SPIM simulator and analyze the format of the instructions. Run the program with different values of X and Y and trace the execution in step mode.
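
The instruction formats in items 4 and 9 can also be explored outside SPIM with a small decoder. The following is a minimal Python sketch, not part of the course materials: the function name decode and the layout of the returned dictionary are illustrative assumptions, and J-type instructions are deliberately not handled.

```python
def decode(word):
    """Split a 32-bit MIPS instruction word into its fields.

    Hypothetical helper for illustration: handles R-type (opcode 0)
    and I-type words only.
    """
    opcode = (word >> 26) & 0x3F      # bits 31..26
    rs = (word >> 21) & 0x1F          # bits 25..21
    rt = (word >> 16) & 0x1F          # bits 20..16
    if opcode == 0:
        # R-type: three register numbers, shift amount, function code
        return {
            "format": "R",
            "rs": rs,
            "rt": rt,
            "rd": (word >> 11) & 0x1F,
            "shamt": (word >> 6) & 0x1F,
            "funct": word & 0x3F,
        }
    if opcode in (2, 3):
        raise NotImplementedError("J-type words are outside this sketch")
    # I-type: the low 16 bits are a signed immediate
    imm = word & 0xFFFF
    if imm & 0x8000:
        imm -= 0x10000
    return {"format": "I", "opcode": opcode, "rs": rs, "rt": rt, "imm": imm}
```

For example, decode(0x02324020) returns the fields of add $t0, $s1, $s2 (rs = 17, rt = 18, rd = 8, funct = 32), and decode(0x8E680020) returns the fields of lw $t0, 32($s3) (opcode = 35, rs = 19, rt = 8, imm = 32).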

CS385 – Computer Architecture, Lecture 3

Reading: Patterson & Hennessy - Chapter 2, Appendix A
Topics: MIPS Instructions: control and addressing modes
Lecture slides (PDF)

Lecture Notes

  1. Implementing the C code for if in MIPS: conditional branch.
  2. Implementing the C code for if–else in MIPS: unconditional branch
  3. Simple for loop
  4. Check for less-than: building a pseudoinstruction for branch if less-than.
  5. Addressing in branch instructions: PC-relative and pseudodirect.
  6. Constants: use of immediate addressing (constants as operands – addi, slti, andi, ori).
  7. 32-bit constants – manipulate upper 2 bytes separately (load upper immediate)
  8. Summary of MIPS addressing: register (add), immediate (addi), base or displacement (lw), PC-relative (bne), pseudodirect (j)
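
The address calculations behind items 5 and 7 can be written out as a few lines of arithmetic. This is a minimal Python sketch under the usual MIPS conventions (word-aligned instructions, 16-bit branch offsets, 26-bit jump fields); the function names are hypothetical and chosen only for illustration.

```python
def branch_target(pc, offset16):
    """PC-relative addressing: (PC + 4) + sign_extend(offset) * 4."""
    if offset16 & 0x8000:             # sign-extend the 16-bit field
        offset16 -= 0x10000
    return (pc + 4 + (offset16 << 2)) & 0xFFFFFFFF


def jump_target(pc, addr26):
    """Pseudodirect addressing: upper 4 bits of PC + 4, then addr26 * 4."""
    return ((pc + 4) & 0xF0000000) | ((addr26 & 0x3FFFFFF) << 2)


def split_constant(value):
    """Return the (upper, lower) 16-bit halves used by lui and ori."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF
```

For instance, split_constant(4000000) gives (61, 2304), so the constant 4,000,000 can be built with lui using 61 followed by ori using 2304.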

CS385 – Computer Architecture, Lecture 4

Reading: Patterson & Hennessy - Sections 3.1, 3.2, B1-6.
Topics: Computer arithmetic and ALU design: representing numbers, arithmetic and logic operations
Lecture slides (PDF)

Lecture Notes

  1. Representing numbers: sign bit, one's complement, two's complement.
  2. Arithmetic: addition, subtraction, detecting overflow.
  3. Logical operations: shift, and, or.
  4. Basic ALU building components: and-gate, or-gate, inverter, multiplexor.
  5. ALU for logical operations.
  6. ALU for add, and, or.
  7. Supporting subtraction
Exercises: Implement an overflow detection unit using only the CarryIn and CarryOut bits of ALU-31
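
Items 1 and 2 can be checked numerically with a short script. This is a minimal Python sketch assuming 32-bit words; the helper names are illustrative. Overflow is detected here by comparing operand and result signs, which is a different formulation from the carry-based unit asked for in the exercise.

```python
BITS = 32
MASK = (1 << BITS) - 1
SIGN = 1 << (BITS - 1)


def to_twos(value):
    """Encode a Python integer as a 32-bit two's complement bit pattern."""
    return value & MASK


def from_twos(pattern):
    """Decode a 32-bit two's complement bit pattern to a signed integer."""
    return pattern - (1 << BITS) if pattern & SIGN else pattern


def negate(pattern):
    """Two's complement negation: invert all bits, then add one."""
    return (~pattern + 1) & MASK


def add_with_overflow(a, b):
    """Add two bit patterns; report signed overflow.

    Overflow occurs when both operands have the same sign and the
    sign of the result differs from it.
    """
    total = (a + b) & MASK
    overflow = ((a ^ total) & (b ^ total) & SIGN) != 0
    return total, overflow
```

For example, adding 1 to the largest positive value 0x7FFFFFFF wraps to 0x80000000 and reports overflow, while adding 1 to the pattern for -1 gives 0 with no overflow.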

Tutorials and practice quizzes on two’s complement numbers:


CS385 – Computer Architecture, Lecture 5

Reading: Patterson & Hennessy - B1-6.
Topics: ALU design: full adder, slt operation, HDL design, carry lookahead
Lecture slides (PDF)
Programs: 4-bit-adder.vl, more examples of Verilog programs, mips-alu.vl, ALU4-mixed.vl
Lecture Notes
  1. Implementation of a full adder:
  2. Supporting set on less-than (slt).
  3. Test for equality (needed for branching)
  4. Designing the ALU in Verilog
  5. Carry Lookahead
Exercises: Implement the 4-bit adder with carry lookahead logic in Verilog using the structural specification approach (gate-level modeling).
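
Before writing the gate-level Verilog, the full adder equations can be sanity-checked in software. The following is a minimal Python sketch of a ripple-carry adder built from one-bit full adders (not the carry lookahead design the exercise asks for); function names and the default width of 4 bits are assumptions for illustration.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum, carry_out) for bits a, b, carry_in."""
    total = a ^ b ^ carry_in
    carry_out = (a & b) | (a & carry_in) | (b & carry_in)
    return total, carry_out


def ripple_add(x, y, carry_in=0, bits=4):
    """Chain full adders from the least significant bit upward."""
    result = 0
    carry = carry_in
    for i in range(bits):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result, carry


def subtract(x, y, bits=4):
    """x - y computed as x + (inverted y) + 1, as in the ALU's subtraction."""
    mask = (1 << bits) - 1
    return ripple_add(x, ~y & mask, carry_in=1, bits=bits)
```

With 4-bit operands, ripple_add(15, 1) returns (0, 1): the sum wraps to 0 and the carry out of the top bit is 1.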

CS385 – Computer Architecture, Lecture 6

Reading: Patterson & Hennessy - Sections 3.3, 3.5.
Topics: ALU design: multiplication, representing floating point numbers
Lecture slides (PDF)

Lecture Notes

  1. Implementing multiplication:
  2. Floating point numbers
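
Both topics above can be illustrated in a few lines. This is a minimal Python sketch: the first function follows the basic shift-and-add scheme for unsigned multiplication, and the second splits an IEEE 754 single-precision value into its sign, exponent and fraction fields. The function names are hypothetical.

```python
import struct


def shift_add_multiply(multiplicand, multiplier, bits=32):
    """Unsigned multiplication by repeated test, add and shift."""
    product = 0
    for _ in range(bits):
        if multiplier & 1:            # test the low bit of the multiplier
            product += multiplicand   # add the (shifted) multiplicand
        multiplicand <<= 1            # shift multiplicand left
        multiplier >>= 1              # shift multiplier right
    return product


def float_fields(x):
    """Return (sign, biased exponent, fraction) of x as a 32-bit float."""
    (pattern,) = struct.unpack(">I", struct.pack(">f", x))
    sign = pattern >> 31
    exponent = (pattern >> 23) & 0xFF     # 8 bits, bias 127
    fraction = pattern & 0x7FFFFF         # 23 bits, implicit leading 1
    return sign, exponent, fraction
```

For example, float_fields(-0.75) returns (1, 126, 0x400000): the value is -1.1 (binary) times 2 to the power -1, so the biased exponent is 127 - 1 = 126.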
Tutorials and practice quizzes on floating point numbers:

CS385 – Computer Architecture, Lecture 7

Reading: Sections 4.1 - 4.3
Topics: The Processor, Building a Datapath
Lecture slides (PDF)
Programs: mips-regfile.vl, mips-r-type.vl, mips-r-type_addi.vl

Lecture Notes

  1. Abstract level implementation:
  2. Basic building elements
  3. Fetching instructions and incrementing the program counter
  4. Register file and execution of R-type instructions
  5. Datapath for lw and sw instructions (add data memory and sign extend)
  6. Datapath for branch instructions
Demo:

CS385 – Computer Architecture, Lecture 8

Reading: Patterson & Hennessy - Section 4.4
Topics: Single-cycle control
Lecture slides (PDF)
Programs: mips-r-type.vl, mips-r-type_addi.vl, mips-simple.vl

Lecture Notes

  1. ALU control: mapping the opcode and function bits to the ALU control inputs
  2. Designing the main control unit
  3. Operation of the Datapath (single-cycle implementation):
  4. Problems of the single-cycle implementation
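
The mapping in item 1 is a small truth table and can be sketched in software. The following Python sketch assumes the common 2-bit ALUOp scheme and 4-bit ALU control encodings for a simple single-cycle MIPS; these encodings are an assumption here and should be checked against Section 4.4 and the course Verilog files before being relied on.

```python
# Assumed 4-bit ALU control codes (verify against the textbook and project files)
ALU_AND, ALU_OR, ALU_ADD, ALU_SUB, ALU_SLT = 0b0000, 0b0001, 0b0010, 0b0110, 0b0111

# R-type function codes mapped to ALU operations
FUNCT_TO_ALU = {
    0b100000: ALU_ADD,   # add
    0b100010: ALU_SUB,   # sub
    0b100100: ALU_AND,   # and
    0b100101: ALU_OR,    # or
    0b101010: ALU_SLT,   # slt
}


def alu_control(alu_op, funct):
    """Derive the ALU control input from ALUOp and the funct field."""
    if alu_op == 0b00:               # lw / sw: add to compute the address
        return ALU_ADD
    if alu_op == 0b01:               # beq: subtract to compare
        return ALU_SUB
    if alu_op == 0b10:               # R-type: look at the function field
        return FUNCT_TO_ALU[funct & 0x3F]
    raise ValueError("unsupported ALUOp in this sketch")
```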

CS385 – Computer Architecture, Lecture 11

Reading: Patterson & Hennessy - B.4, Section 4.13 (http://booksite.elsevier.com/9780124077263/appendices.php)
Topic: Using Hardware Description Language to Design and Simulate the MIPS processor
  1. Behavior model of MIPS - single cycle implementation: mips-simple.vl
  2. Project version (progress report #2). Changes needed:
Exercises


CS385 – Computer Architecture, Lecture 13

Reading: Patterson & Hennessy - Section 4.5
Topic: Introduction to Pipelining
Lecture slides (PDF)

Lecture Notes

  1. Pipelining by analogy (laundry example):
  2. Five stages of the load MIPS instruction
  3. The pipelined datapath
  4. Single cycle, multiple cycle vs. pipeline
  5. Advantages of pipelined execution
  6. Problems with pipelining (pipeline hazards)
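
The comparison in items 4 and 5 comes down to simple arithmetic. This is a minimal Python sketch assuming an ideal pipeline with no hazards; the stage latencies used in the example are illustrative values in picoseconds, not measurements.

```python
def single_cycle_time(num_instructions, stage_times):
    """Each instruction takes one long cycle covering all stages."""
    return num_instructions * sum(stage_times)


def pipelined_time(num_instructions, stage_times):
    """Clock period is set by the slowest stage; the pipeline needs
    (stages + instructions - 1) cycles to drain, assuming no stalls."""
    cycles = len(stage_times) + num_instructions - 1
    return cycles * max(stage_times)


# Illustrative latencies for IF, ID, EX, MEM, WB (assumed values, in ps)
STAGE_TIMES = [200, 100, 200, 200, 100]
```

With these assumed latencies, three instructions take 2400 ps on the single-cycle design and 1400 ps on the pipeline. The speedup approaches the number of stages only for long instruction streams and balanced stages.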

CS385 – Computer Architecture, Lecture 14

Reading: Patterson & Hennessy - Section 4.5, 4.6
Topic: Solving pipeline hazards, Designing a pipelined processor
Lecture slides I (PDF)
Lecture slides II (PDF)

Lecture Notes

  1. Structural hazards: single memory
  2. Control hazards:
    add $4, $5, $6            beq $1, $2, 40
    beq $1, $2, 40     ==>    add $4, $5, $6
    lw $3, 300($0)            lw $3, 300($0)
  3. Data hazards (dependencies backwards in time):
    lw $t0, 0($t1)               lw $t0, 0($t1)
    lw $t2, 4($t1)      ==>      lw $t2, 4($t1)
    sw $t2, 0($t1)               sw $t0, 4($t1)
    sw $t0, 4($t1)               sw $t2, 0($t1)
  4. Designing a pipelined processor
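
The effect of reordering the load and store sequence above can be counted with a small script. This is a minimal Python sketch under a simplifying assumption: a one-cycle stall is charged whenever an instruction reads a register loaded by the immediately preceding lw, and the data register of sw is treated as a source. The tuple representation of instructions is hypothetical.

```python
def load_use_stalls(program):
    """Count instructions that read the destination of the previous lw.

    program is a list of (opcode, destination, sources) tuples.
    """
    stalls = 0
    for previous, current in zip(program, program[1:]):
        opcode, destination, _ = previous
        if opcode == "lw" and destination in current[2]:
            stalls += 1
    return stalls


ORIGINAL = [
    ("lw", "$t0", ["$t1"]),
    ("lw", "$t2", ["$t1"]),
    ("sw", None, ["$t2", "$t1"]),   # uses $t2 right after it is loaded
    ("sw", None, ["$t0", "$t1"]),
]

REORDERED = [
    ("lw", "$t0", ["$t1"]),
    ("lw", "$t2", ["$t1"]),
    ("sw", None, ["$t0", "$t1"]),   # $t0 was loaded two instructions earlier
    ("sw", None, ["$t2", "$t1"]),
]
```

Under this model load_use_stalls(ORIGINAL) is 1 and load_use_stalls(REORDERED) is 0, which is the point of swapping the two stores.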

CS385 – Computer Architecture, Lecture 15

Reading: Patterson & Hennessy - Section 4.6, Section 4.13 (http://booksite.elsevier.com/9780124077263/appendices.php)
Topic: Implementing pipeline datapath and control
Lecture slides (PDF)

Lecture Notes

  1. Splitting datapath into stages: using registers to store parts of the instruction
  2. Transferring data forward and backward between the stages: lw example
  3. Corrected datapath: storing rd for the write back stage.
  4. Graphically representing pipelines: multiple-clock-cycle vs. single-clock-cycle diagram
  5. Pipeline control:
  6. Datapath with control
  7. Example: running this code through the pipeline in 9 cycles.

    lw   $10, 20($1)
    sub  $11, $2, $3
    and  $12, $4, $5
    or   $13, $6, $7
    add  $14, $8, $9
Exercises: Section 4.13 (http://booksite.elsevier.com/9780124077263/appendices.php), pages 16-30.
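
The 9-cycle figure for the five-instruction example follows from 5 stages + 5 instructions - 1. The following minimal Python sketch builds the corresponding multiple-clock-cycle diagram, assuming an ideal five-stage pipeline with no stalls; the function names are illustrative.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(instructions):
    """Return (total_cycles, rows); each row lists the stage an
    instruction occupies in every clock cycle ('' when idle)."""
    total_cycles = len(instructions) + len(STAGES) - 1
    rows = []
    for start, text in enumerate(instructions):
        cells = [""] * total_cycles
        for offset, stage in enumerate(STAGES):
            cells[start + offset] = stage   # instruction i enters IF in cycle i
        rows.append((text, cells))
    return total_cycles, rows


def render(rows):
    """Format the diagram as fixed-width text, one instruction per line."""
    return "\n".join(
        text.ljust(18) + " ".join(cell.ljust(3) for cell in cells)
        for text, cells in rows
    )


PROGRAM = [
    "lw   $10, 20($1)",
    "sub  $11, $2, $3",
    "and  $12, $4, $5",
    "or   $13, $6, $7",
    "add  $14, $8, $9",
]
```

Calling print(render(pipeline_diagram(PROGRAM)[1])) prints a staircase diagram in which the last instruction reaches WB in cycle 9.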


CS385 – Computer Architecture, Lecture 16

Reading: Patterson & Hennessy - Section 4.7, 4.8
Topic: Implementing data and branch hazard control
Lecture slides (PDF)

Lecture Notes

  1. Detecting data dependencies
  2. Forwarding
  3. Data hazards and stalls:

    if (ID/EX.MemRead and
        (ID/EX.Rt = IF/ID.Rs or
         ID/EX.Rt = IF/ID.Rt))
       stall the pipeline
  4. Branch hazards
  5. Advanced pipelining
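
The stall condition and the forwarding decision can be written as plain boolean functions over the pipeline register fields. This is a minimal Python sketch; the parameter names mirror the pipeline register notation, and the 2-bit forwarding codes (10 for the EX/MEM result, 01 for the MEM/WB result) are an assumed convention to be checked against Section 4.7.

```python
def must_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """Load-use hazard: the instruction in EX is a load whose target
    register is a source of the instruction being decoded."""
    return bool(id_ex_mem_read) and id_ex_rt in (if_id_rs, if_id_rt)


def forward_a(ex_mem_reg_write, ex_mem_rd, mem_wb_reg_write, mem_wb_rd, id_ex_rs):
    """Select the first ALU operand: 0b10 from EX/MEM, 0b01 from MEM/WB,
    0b00 from the register file. Register 0 is never forwarded."""
    if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == id_ex_rs:
        return 0b10                   # most recent result takes priority
    if mem_wb_reg_write and mem_wb_rd != 0 and mem_wb_rd == id_ex_rs:
        return 0b01
    return 0b00
```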


CS385 – Computer Architecture, Lecture 17

Reading: Patterson & Hennessy - Chapter 4, Section 4.13 (http://booksite.elsevier.com/9780124077263/appendices.php)
Topics: Review of Datapath, Control and Pipelining, HDL implementation (mips-pipe.vl), 3-stage pipeline (mips-pipe3.vl)
Programs: mips-pipe.vl, mips-pipe3.vl

Lecture slides (PDF)

Lecture Notes

Datapath

  1. Abstract level implementation:
  2. Basic building elements
  3. Basic operations

Control

  1. ALU control: mapping the opcode and function bits to the ALU control inputs
  2. Designing the main control unit
  3. Operation of the Datapath (single-cycle implementation):
  4. Problems of the single-cycle implementation

Pipelining

  1. Basic principles of pipelining
  2. MIPS pipelining: the five stages of the lw instruction
  3. Problems with pipelining:
  4. Designing a pipelined processor
  5. Transferring data forward and backward between the stages
  6. Pipeline control
  7. Implementing data and branch hazard control
  8. Advanced pipelining

Demo:


CS385 – Computer Architecture, Lecture 18

Reading: Patterson & Hennessy - Section 5.1
Topic: Memory Hierarchy
Lecture slides (PDF)

Lecture Notes

  1. Memory technologies and trends
  2. Impact on performance
  3. The need for hierarchical memory organization
  4. The principle of locality
  5. Memory hierarchy terminology
  6. Basics of RAM implementation

CS385 – Computer Architecture, Lecture 19

Reading: Patterson & Hennessy - Sections 5.1-5.3
Topic: The Basics of caches
Lecture slides (PDF)
Programs: cache.vl, cache2.vl

Lecture Notes

  1. Direct-mapped cache
  2. Accessing a cache
  3. Writing to the cache (write-through and write-back schemes)
  4. Handling cache misses
  5. Example: DECStation 3100 cache
  6. Spatial locality caches: keeping consistency on write
  7. Main memory organization
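
Items 1, 2 and 4 can be tried out with a tiny simulator. The following is a minimal Python sketch of a direct-mapped cache that tracks only tags (no data, no writes); it is unrelated to the course files cache.vl and cache2.vl, and the class name and parameters are illustrative assumptions.

```python
class DirectMappedCache:
    """Tag-only model of a direct-mapped cache."""

    def __init__(self, num_blocks, block_size=1):
        self.num_blocks = num_blocks
        self.block_size = block_size          # addressable units per block
        self.tags = [None] * num_blocks       # None marks an invalid entry

    def access(self, address):
        """Return True on a hit; on a miss, load the block and return False."""
        block_address = address // self.block_size
        index = block_address % self.num_blocks    # which cache entry
        tag = block_address // self.num_blocks     # remaining upper bits
        hit = self.tags[index] == tag
        if not hit:
            self.tags[index] = tag
        return hit


def run_trace(cache, addresses):
    """Access each address in order and collect the hit/miss results."""
    return [cache.access(address) for address in addresses]
```

With 8 one-word blocks, the word-address trace 22, 26, 22, 26, 16, 3, 16, 18, 16 produces miss, miss, hit, hit, miss, miss, hit, miss, hit: address 18 maps to the same entry as 26 and replaces it.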

CS385 – Computer Architecture, Lecture 20

Reading: Patterson & Hennessy - Section 5.4
Topic: Improving cache performance
Lecture slides (PDF)

Lecture Notes

  1. Measuring cache performance
  2. Flexible placement of blocks in the cache
  3. Locating a block in the cache: N-way cache requires N comparators and N-way multiplexor
  4. Choosing which block to replace: least recently used
  5. Multilevel caches
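
Items 1, 2 and 4 can be made concrete with two small functions. This is a minimal Python sketch: the first computes average memory access time as hit time plus miss rate times miss penalty, and the second counts misses for a set-associative cache with least-recently-used replacement. The function names and the example trace are illustrative assumptions.

```python
from collections import OrderedDict


def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time, in the units of hit_time and miss_penalty."""
    return hit_time + miss_rate * miss_penalty


def count_misses(block_addresses, num_blocks, ways):
    """Count misses for a cache of num_blocks blocks organised into
    sets of `ways` blocks each, using LRU replacement within a set."""
    num_sets = num_blocks // ways
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    for block in block_addresses:
        entries = sets[block % num_sets]
        if block in entries:
            entries.move_to_end(block)        # mark as most recently used
        else:
            misses += 1
            if len(entries) == ways:
                entries.popitem(last=False)   # evict the least recently used
            entries[block] = True
    return misses
```

For a four-block cache and the block-address trace 0, 8, 0, 6, 8, count_misses gives 5 misses when direct mapped (ways=1), 4 when two-way set associative and 3 when fully associative (ways=4).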
Exercises

CS385 – Computer Architecture, Lecture 21

Reading: Patterson & Hennessy - Section 5.7
Topic: Virtual Memory
Lecture slides (PDF)

Lecture Notes

  1. The need for VM
  2. VM organization and terminology: virtual address, physical address, page, page offset, page fault, memory mapping (translation).
  3. Design decisions motivated by the very high cost of page faults:
  4. Addressing pages:
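
The translation step named in item 2 can be sketched in a few lines. This minimal Python example assumes 4 KiB pages and models the page table as a dictionary from virtual page number to physical page number; both choices, and all names, are illustrative assumptions rather than the organization of any particular machine.

```python
OFFSET_BITS = 12                    # assumed page size: 2**12 = 4096 bytes
PAGE_SIZE = 1 << OFFSET_BITS


class PageFault(Exception):
    """Raised when the virtual page has no mapping in the page table."""


def translate(virtual_address, page_table):
    """Map a virtual address to a physical address.

    The page offset passes through unchanged; only the page number
    is translated.
    """
    virtual_page = virtual_address >> OFFSET_BITS
    offset = virtual_address & (PAGE_SIZE - 1)
    if virtual_page not in page_table:
        raise PageFault(virtual_page)
    return (page_table[virtual_page] << OFFSET_BITS) | offset
```

For example, with the mapping {3: 7}, virtual address 0x3ABC translates to physical address 0x7ABC, and any address on an unmapped page raises PageFault.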

CS385 – Computer Architecture, Lecture 22

Reading: Patterson & Hennessy - Section 5.8
Topic: Virtual Memory optimization
Lecture slides (PDF)

Lecture Notes

  1. Optimizing address translation - Translation Lookaside Buffer (TLB):
  2. MIPS R2000 (DECStation 3100) TLB
  3. Overall operation of a memory hierarchy
  4. Memory protection with VM
  5. Using exceptions for handling TLB misses and page faults: using EPC and Cause registers
  6. Summary of VM
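
The role of the TLB in item 1 is that of a small cache of recent translations. The following is a minimal Python sketch of a fully associative TLB with LRU replacement placed in front of a dictionary page table; the class, its fields and the replacement policy are assumptions made for illustration and do not describe the MIPS R2000 TLB.

```python
from collections import OrderedDict


class TLB:
    """Fully associative translation cache with LRU replacement."""

    def __init__(self, num_entries):
        self.num_entries = num_entries
        self.entries = OrderedDict()      # virtual page -> physical page
        self.hits = 0
        self.misses = 0

    def lookup(self, virtual_page, page_table):
        """Return the physical page, consulting the page table on a miss.

        A KeyError from page_table stands in for a page fault here.
        """
        if virtual_page in self.entries:
            self.hits += 1
            self.entries.move_to_end(virtual_page)   # most recently used
            return self.entries[virtual_page]
        self.misses += 1
        physical_page = page_table[virtual_page]
        if len(self.entries) == self.num_entries:
            self.entries.popitem(last=False)         # evict the LRU entry
        self.entries[virtual_page] = physical_page
        return physical_page
```

With a two-entry TLB and the page reference sequence 0, 1, 0, 2, 1, only the second reference to page 0 hits; the other four references miss and go to the page table.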

CS385 – Computer Architecture, Lecture 23

Reading: Patterson & Hennessy - Section 5.5, 5.6.
Topic: A general framework of memory hierarchies
Lecture slides (PDF)

Lecture Notes

  1. Associativity schemes
  2. Placing blocks
  3. Miss rates and cache sizes
  4. Finding blocks
  5. Why do we use full associativity and a separate lookup table (page table) in VM
  6. Choosing a block to replace
  7. Writing blocks
  8. The sources of misses
  9. The challenge: reducing the miss rate has a negative effect on the overall performance
  10. Pentium Pro and PowerPC 604

Exercises

5.7.1, 5.7.2, 5.7.3, 5.11

CS385 – Computer Architecture, Lecture 24

Reading: Sections 6.1 - 6.5 (COD 4th Edition - see https://ccsu.blackboard.com/)
Topic: Interfacing Processors and Peripherals - Buses
Lecture slides (PDF)

Lecture Notes

  1. Buses: lines, transactions, types
  2. Synchronous and asynchronous buses
  3. Handshaking protocol
  4. Bus access: master and slave
  5. Bus arbitration schemes
  6. Bus standards

CS385 – Computer Architecture, Lecture 25

Reading: Section 6.6 - 6.8 (COD 4th Edition - see https://ccsu.blackboard.com/)
Topic: Interfacing I/O devices to Memory, CPU and OS
Lecture slides (PDF)

Lecture Notes

  1. The role of the operating system in interfacing I/O devices to Memory
  2. Controlling the I/O devices
  3. Communicating with the processor
  4. Direct memory access (DMA)
  5. DMA and the memory system
  6. Designing an I/O system: latency and bandwidth constraints.

CS385 – Computer Architecture, Lecture 26

Reading: Chapter 6, Section 2.11
Topic: Multiprocessors
Lecture slides (PDF)
COD-Chapter7.pdf

Lecture Notes

  1. Amdahl's Law
  2. Basic approaches to sharing data and types of connectivity
  3. Programming multiprocessors
  4. Multiprocessors connected by a single bus
  5. A parallel program
  6. Multiprocessor cache coherency
  7. Implementing a multiprocessor cache coherency protocol
  8. Synchronization using coherency, locks, atomic swap operation
    again: addi $t0, $0, 1      # copy locked value
           ll   $t1, 0($s1)     # load linked
           sc   $t0, 0($s1)     # store conditional
           beq  $t0, $0, again  # branch if store fails
           add  $s4, $0, $t1    # put load value in $s4
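
Amdahl's Law from item 1 can be evaluated directly. This is a minimal Python sketch assuming the usual form, speedup = 1 / ((1 - f) + f / n), where f is the fraction of execution time that can be parallelized and n is the number of processors; the function name is illustrative.

```python
def amdahl_speedup(parallel_fraction, num_processors):
    """Overall speedup when only parallel_fraction of the work scales."""
    sequential = 1.0 - parallel_fraction
    return 1.0 / (sequential + parallel_fraction / num_processors)
```

For example, with 90% of the work parallelizable, 10 processors give a speedup of about 5.26 rather than 10, and even with 100 processors the speedup stays below 10 because the sequential 10% dominates.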


CS385 – Computer Architecture, Lecture 27

Reading: Chapter 6
Topic: Networks of multiprocessors and clusters
Lecture slides (PDF)
COD-Chapter7.pdf

Lecture Notes

  1. Shared memory vs. multiple private memories
  2. Centralized memory vs. distributed memory
  3. Parallel programming by message passing
  4. Distributed memory communication
  5. Memory allocation
  6. Clusters and network topology
  7. Modern clusters:

Digital Design Review Assignment

Log on to Blackboard to see and submit the assignment.

CS385 Assignment 1: Assembly Programming in MIPS (maximum grade 10 points)

Log on to Blackboard to see and submit the assignment.

CS385 Semester Project: Building a mini MIPS machine (maximum grade 45 points including the presentation)

Log on to Blackboard to see and submit the project.

CS385 Midterm Test (maximum grade 20 points)

The midterm test will be available in Blackboard Learn. There will be 20 multiple choice and short answer questions that have to be answered in 90 minutes. The topics include:

CS385 Final Exam (maximum grade 25 points)

The Final Exam will be available in Blackboard Learn at https://ccsu.blackboard.com/. There will be 25 multiple choice and short answer questions that have to be answered in 2 hours. The topics include: