CS502 - Computing and Communication Technology

Fall-2009

Classes: TR 4:00 pm - 5:15 pm, classroom: Frank J. DiLoreto Hall 011
Instructor: Dr. Zdravko Markov, MS 307, (860)-832-2711, http://www.cs.ccsu.edu/~markov/, e-mail: markovz at ccsu.edu
Office hours: TR 10:00 am - 11:00 am, 12:30 pm - 2:00 pm or by appointment

Prerequisites: Must be enrolled in graduate level

Description:  The course offers a comprehensive coverage of the basic concepts of computing and data technologies.  The first part of the course is devoted to the computer organization and design. This part discusses the main components of computers and the basic principles of their operation. It demonstrates the relationship between the software and hardware and focuses on the foundational concepts that are the basis for current computer design. The second part of the course discusses the architecture of the distributed computer systems. The main emphasis here is on data distribution, as the way data are generated, stored and used is naturally distributed. This part discusses various levels of transmitting, storing and using information, data and knowledge and surveys important areas as Information Theory, Switching, Databases, Data classification, Distributed memory and data retrieval, World Wide Web and Data Mining.

Course objectives: Upon successful completion of the course the student will be able to

Required textbook: David A. Patterson and John L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, Fourth Edition, Elsevier, 2008, ISBN: 978-0-12-374493-7.

Required software: SPIM simulator: A free software simulator for running MIPS R2000 assembly language programs available for Unix, DOS, and Windows.

WEB resources:

Honesty policy: It is expected that all students will conduct themselves in an honest manner (see the CCSU Student handbook), and NEVER claim work which is not their own. Violating this policy will result in a substantial grade penalty, and may lead to expulsion from the University.

Assignments, projects and grading: There will be a mid-term test, a mid-term project and a term paper. There will be also a programming assignment and a quiz. The final grade will be based 40% on the term paper, 30% on the mid-term project, 20% on the test, 5% on the programming assignment and 5% on the quiz. The letter grades will be calculated according to the following table:
 
A A- B+ B B- C+ C C- D+ D D- F
95-100 90-94 87-89 84-86 80-83 77-79 74-76 70-73 67-69 64-66 60-63 0-59

All assignments with their due dates are listed below (the project descriptions are given on separate pages) and must be submitted through Blackboard Vista available through CentralPipeline (Student > Blackboard Vista > CS 502) or directly at https://vista.csus.ct.edu/webct/logon/459901950011

Unexcused late submission policy: Assignments submitted more than two days after the due date will be graded one letter grade down. Projects submitted more than a week late will receive two letter grades down. No submissions will be accepted more than two weeks after the due date.

Tentative schedule of classes (weeks) and assignments

NOTE THAT THE MATERIAL FOR WEEKS 1-6 WILL BE UPDATED TO REFLECT THE NEW EDITION OF THE BOOK

  1. September 1-3: Computer instruction set architecture
  2. September 8-10: Computer arithmetic and the ALU design
  3. September 17: Programming assignment due
  4. September 29 - October 1: CPU datapath and control
  5. October 6 - 8: Pipelining
  6. October 15: Building a 16-bit ALU (part of the Midterm Project) due.
  7. October 20 - 22:  Memory hierarchies
  8. October 27 - 29:  Interfacing peripherals and multiprocessors
  9. October 29:  Midterm Project due
  10. October 27:  Mid-term test. To be taken online through Blackboard Vista
  11. Fundamentals of distributed systems
  12. Information Theory
  13. Switching
  14. Database management concepts
  15. Distributed memory and data retrieval
  16. Quiz.due. Week topic: The World Wide Web
  17. Introduction to Data Mining
  18. Term paper due

CS502 - Week 1

Computer Architecture = Instruction Set Architecture + Machine Organization

Reading: Patterson & Hennessy - Chapter 1, Sections 2.1 - 2.3, 2.5 - 2.7, 2.10, 2.13 (optional), 2.16 - 2.20 (optional), B.9, "Spim, pcspim, and xspim" in Section "Software" on the CD.

Lecture Slides: PDF

Lecture Notes:

  1. Levels of Abstraction
  2. Basic Components of a Computer
  3. Example: implementing A=B+C (from instructions to gates)
  4. MIPS instructions
  5. Stored program concept: programs in memory, fetch&execute cycle
  6. The SPIM simulator
Exercises: Load this program in the SPIM simulator and analyze the format of the insturctions. Run the program with different values of X and Y and trace the execution in step mode.

CS502 - Week 2

Computer arithmetic and ALU design

Reading: Patterson & Hennessy - Sections  2.4, 3.1 - 3.2, 3.3 - 3.4 (optional), 3.5 (floating-point representation only), 3.6 - 3.10 (optional), C.5 (CD).

Lecture Slides (PDF), ALU diagram (PDF)

Lecture Notes:

  1. Representing numbers: sign bit, one's complement, two's complement.
  2. Arithmetic: addition, subtraction, detecting overflow.
  3. Building ALU - hierarchical approach
  4. Floating point numbers
Tutorials and practice quizzes:

CS502 - Week 3

CPU datapath and control

Reading: Patterson & Hennessy - Sections 4.1 - 4.4

Lecture Slides: PDF

Lecture Notes:

I. Building a Datapath

  1. Abstract level implementation:
  2. Basic building elements
  3. Fetching instructions and incrementing the program counter
  4. Register file and execution of R-type instructions
  5. Datapath for lw and sw instructions (add data memory and sign extend)
  6. Datapath for branch instructions
Demo: II. Control
  1. ALU control: mapping the opcode and function bits to the ALU control inputs
  2. Designing the main control unit
  3. Operation of the Datapath (single-cycle implementation):
  4. Problems of the single-cycle implementation

CS502 - Week 4

Pipelining

Reading: Patterson & Hennessy - Sections 4.5 - 4.8 (without implementation details).

Lecture Slides: PDF

Lecture notes:

I. Introduction to Pipelining

  1. Pipelining by analogy (laundry example):
  2. Five stages of the load MIPS instruction
  3. The pipelined datapath
  4. Single cycle, multiple cycle vs. pipeline
  5. Advantages of pipelined execution
II. Problems with pipelining (pipeline hazards)
  1. Structural hazards: single memory
  2. Control hazards:
  3. Data hazards (dependecies backwards in time):
Demo:

CS502 - Week 5

Memory hierarchies

Reading: Patterson & Hennessy - Sections 5.1 - 5.5.

Lecture Slides: PDF

Lecture notes:

  1. Memory technologies and trends
  2. Impact on performance
  3. The need of hierarchical memory organization
  4. The principle of locality
  5. Memory hierarchy terminology
  6. The Basics of caches
  7. The need of virtual memory
  8. VM organization and terminology: virtual address, physical address, page, page offset, page fault, memory mapping (translation).
  9. Addressing pages:

CS502 - Week 6

Interfacing peripherals and multiprocessors

Reading: Patterson & Hennessy - Chapters 6.1 - 6.8, 7.1 - 7.4

Lecture Slides: Chapter7.pdf

Lecture notes:

  1. Interfacing Processors and Peripherals - Buses (slides in PDF)
  2. Interfacing I/O devices to Memory, CPU and OS (slides in PDF)
  3. Multiprocessors (slides in PDF)
  4. Networks of muiltiprocessors (slides in PDF)
  5. Modern clusters

CS502 - Week 7

Fundamentals of distributed systems

Additional Reading: Andrew S. Tanenbaum, Maarten van Steen, Distributed Systems: Principles and Paradigms. Lecture Notes (in PDF):
  1. Categories of computer systems
  2. Conventional systems with special purpose components
  3. Multiprocessor systems:
  4. Distributed computer systems

CS502 - Week 8

Information Theory

  1. Fundamentals of information theory (information vs. data)
  2. Sampling Theorem

CS502 - Week 9

Switching

  1. The importance of switching in communication - the cost of switching is high
  2. Definition: transfer input sample points to the correct output ports at the correct time
  3. Terminology
  4. Voice digitization: W=3KHz, sampling at 2*3=6 or 8KHz, 256 levels for quantization (8 bits), Bit rate=64Kb/s
  5. Telephone switching
  6. General framework for switching
  7. Circuit (synchronous) vs. packet (asynchronous) switching: control and routing overhead, virtual packet switching
  8. Switching techniques and networking: switching is the technology allowing to get a message between the nodes of a network
Lecture slides in PPT

CS502 - Week 10

Database management concepts

  1. Database Management Systems (DBMS)
  2. An example of a database (relational): relations (tables), attributes (columns), tuples (rows). Example query: Salesperson='Mary' AND Price>100.
  3. Database schema (e.g. relational): names and types of attributes, addresses, indexing, statistics, authorization rules to access data etc.
  4. Data independence: separation of the physical and logical data (particularly important for distributed systems). The mapping between them is provided by the schema.
  5. Architecture of a DBMS - three levels: external, conceptual and internal schema
  6. Types of DBMS
  7. Basic DBMS types
  8. Retrieving and manipulating data: query processing
  9. Database views: creating user defined subsets of the database, improving the user interface. Example:
    1. CREATE VIEW MarySales(ItemName,Price)
      AS SELECT ItemName, Price
      FROM ITEM, SALES
      WHERE ITEM.Item#=SALES.Item# AND Salesperson="Mary"

      Then the query:

      SELECT ItemName
      FROM MarySales
      WHERE Proce>100

      translates to:

      SELECT ItemName
      FROM ITEM, SALES
      WHERE ITEM.Item#=SALES.Item# AND Salesperson="Mary" AND Price>100
       

  10. Data integrity
  11. Client-Server architectures
  12. Knowledge Bases and KBS (and area of AI)
Lecture slides in PPT

CS502 - Week 11

Distributed memory and data/informition retrieval

  1. The need of memory hierarchy
  2. Data location factors
  3. Directory - a mechanism to locate a data object
  4. Directory systems
  5. Directories and access control (different functions). Points of access control
    1. Application program (problems with using data provided by other applications)
    2. DBMS (the most common approach)
    3. Control at the physical location of data (e.g. library)
    4. Function of the communication networks (e.g. preventing a user from accessing another node, telephone system)
  6. Information retrieval (Slides in PPT)
  7. Web document retrieval
  8. Document classification and clustering:
  9. Web Mining
  10. Web Mining books

CS502 - Week 12

The World Wide Web

  1. A little history
  2. The client side
  3. The server side
  4. Writing WB page in HTML
  5. Java
  6. Locating information on the Web
  7. Web Agents
  8. Semantic Web

MIPS Programming assignment

Due date: September 17

Write a program in MIPS assembler to perform some simple computation (e.g. average of three integers, converting miles into kilometers, Fahrenheit into Celsius etc.). The program must include:
  1. At least one instruction from each instruction type: R-type arithmetic, I-type arithmetic and Memory transfer.
  2. Input and output through system calls.
  3. Comments explaining the format and the meaning of each instruction.
Use the SPIM simulator to debug and run the program and read Patterson & Hennessy,  B.9 and "Spim, pcspim, and xspim" in Section "Software" on the CD. You may also find additional information about MIPS programming using SPIM in CS 254 - Computer organization and assembly language programming and Introduction to RISC Assembly Language Programming (see example programs).

Documentation and submission: Submit the source text of the program as a file attachment through Blackboard Vista > CS 502 > MIPS Programming Assignment.


Mid-term test

The Midterm test is available from Blackboard Vista. There are 20 multiple choice or short answer questions that have to be answered within 3 hours.

The test includes the following topics:


Building a 16-bit ALU

Due date: October 15

Design a 16-bit ALU unit that performs 5 functions on two 16-bit data inputs (a and b) according to the following table:
 
ALU operation bits Operation Result
000 Logical AND a & b
001 Logical OR a | b
010 Add a + b
110 Subtract a - b
011 Rotate left one bit Result0 = a15; Result(i+1) = a(i), i=0,...,14

Along with the Result output (delivering the result of the operation) the ALU must have the following additional outputs:

Use hierarchical design and draw the logic diagram of the 1-bit ALU and then use block level diagram to draw the whole 16-bit ALU.  Include also a short text description of your design.

Documentation and submission: Write a report containing descriptive text, the ALU function code table, the overflow truth table, the 1-bit ALU diagram and the 16-bit ALU diagram. Use MS Word, PDF or HTML format. Submit your report as a file attachment through CCSU Pipeline > Student > Blackboard Vista > CS-502 > ALU Assignment. The maximum grade for this assignment is 10 points (1/3 of the Midterm Project grade).


Mid-term project (30 ponits): Designing a mini MIPS machine

Due date: October 29

Description: Design a simplified version of the MIPS machine - single cycle implementation. Use the components described in the lecture slides and build the machine hierarchically, i.e. draw and describe simpler elements and then connect them into a more complex diagram. The top level diagram should describe the datapath and control (including all control signals). Include a table describing the functionality of the control unit. Do the drawing by a computer drawing program (e.g. Paint). You can also use and edit diagrams from the lecture slides.

The machine should have the following components (described as blocks with proper inputs and outputs):

  1. General purpose registers (register file): 4 registers, 16-bit long, numbered 0 - 3.
  2. Other registers: Program counter (16 bits).
  3. Instruction Memory: Byte addressing, 16-bit word and 16-bit address.
  4. Data Memory: Byte addressing, 16-bit word and 16-bit address.
  5. ALU: 16 bit (include the assignment for week #5 described here).
  6. Main control unit (include a truth table).
  7. Other components necessary to connect the main components: sign extend, shift unit, multiplexes, gates (for the branch logic).
Instruction set:
 
Instruction Opcode
add 0000
sub 0001
and 0010
or 0011
rol (rotate left one bit) 0100
addi 0101
lw 0110
sw 0111
beq (branch on equal) 1000
bne (branch on not equal) 1001

For the meaning of the instructions see the Patterson & Hennessy's text.

Instruction formats:

R-format (add, sub, and, or, rol)
op rs rt rd unused
4 2 2 2 6

I-format (lw, sw, beq, bne, addi)
op rs rt address/offset/value
4 2 2 8

The project report should include the following:

  1. Student name and project title
  2. A descriptive text outlining the design of the machine (instruction set architecture, main components);
  3. A description of the 16-bit ALU including a descriptive text and diagrams (include your report for the ALU assignment or a reference to it);
  4. A diagram showing the datapath and control including all data and control lines marked with their width (number of bits);
  5. A table describing the input/output behavior of the main control unit.
Submission: Write a report with your name on it containing descriptive text and all diagrams and tables. Use MS Word, PDF or HTML format. Submit your report as a file attachment through CCSU Pipeline > Student > Blackboard Vista > CS-502 > MIPS Machine. The maximum grade for this projects is 30 points, which includes the maximum grade for the ALU assignment (10 points).

Extra credit (maximum 10% of the project grade): Implementing additional MIPS instructions, which require changes both in the diagram and in the control table.


Term paper (40 points)

Posted: 11/05/2009
Due:     12/17/2009

Write a paper that analyzes a particular distributed computing environment. Examples of such environments are: a company computer and/or telephone network, a university campus network, hospital computer network, library network, ticket reservation system etc. Choose an environment that you know best and that is rich enough so that you can cover all of the following issues (including the sub items):

  1. General description (describe the environment in general terms, i.e. the name/type and location of the network, who are the users and for what purpose it is used, what is the general structure etc.)
  2. General requirements for the distributed system architecture to be used (in terms of cost, efficiency, security etc.).
  3. Description of the distributed system architecture currently being used
  4. Problems with the current distributed system architecture (requirements that are not met, inadequate resources or technology)
  5. Suggestions for improvement of the existing system architecture or brief description of a new architecture to meet the requirement

Format of the term paper

Write the paper following the general format of a scientific paper: title, author, abstract, introduction (short description of the subject, goals, structure of the paper), main text  (structured into sections and covering the above 5 issues), conclusion and references. Include all materials you use (including WEB resources) in the reference section and cite them properly in the text. Include as much as possible illustrative material (graphs, charts, diagrams, pictures). Type and print the paper using a word processing system (such as Microsoft Word). An example of a term paper can be found here.

Policy on copying

The term paper must be original, i.e. entirely written by yourself. Since the paper describes an existing system obviously you have to use other materials (books, reference materials, web sources, personal communication). Every time you use such a material you have to include a citation and put the title and authors of this material in the reference section. Then you can rephrase or even copy a part of the material (provided that you put it in quotes). If you describe general ideas and approaches you have also to cite their authors. If  the latter is yourself, then use something like "the author believes that .." or " we suggest ...". Papers that include copied paragraphs and phrases without acknowledgment to their authors will get a grade of F. For more information see the CCSU Graduate Policy on Academic Misconduct.

Term paper grading

The term paper grade will be based on the paper format (according to the above mentioned structure), comprehensibility and  presentation (self-contained explanations of the major issues with minimal reference to technical terms and brand names outside the area) and completeness of the material (i.e. all 5 issues mentioned above including the subitems if item 3 should be discussed in sufficient detail).

Submission

Submit the paper as an attachment through Vista/Term paper.

Extra credit

An extra credit of maximum 10% will be given for the preparation of presentation slides (e.g. in PowerPoint format). Prepare no more than 10 slides, that outline your paper. Slides should omit the details and discuss the most important issues including the problems and suggestions for improvement of the presented distributed environment.

Quiz (max grade 5 pts.)

The quiz is available in Vista. There are 20 multiple choice or short answer questions that will have to be answered within 1 hour.

Review topics

  1. Categories of computer systems:
    • Conventional sequential computers
    • Conventional systems with special purpose components
    • Multiprocessor systems
    • Distributed systems
  2. Distributed systems:
    • Issues: data location and security, load distribution, process migration, fault tolerance
    • Types: homogeneous systems, heterogeneous systems
  3. Information Theory
    • Probability: Bayes's theorem
    • Measuring information: basic formula (logarithmic scale, halving strategy), entropy (average information)
  4. Signals:
    • Representation: time, amplitude, frequency
    • Transmission: sampling (Sampling Theorem), digitizing (AD and DA conversion), bandwidth
  5. Switching:
    • Circuit switching
    • Packet switching
    • Time division multiplexing
    • Ethernet approach: packets, conflicts and resending
  6. DBMS:
    • Application area:  large quantity of structured data
    • Types:  tables (relational), trees, networks, objects
    • Data independence: database schema
  7. Relational DBMS
    • Basics of SQL
    • Query processing: plans, optimization
    • Database views
  8. Data integrity
    • Integrity constraints
    • Concurrency control
    • Backup and recovery
    • Security and access control
  9. Directory systems: purpose, types
  10. Information retrieval
    • Application area: finding relevant data using irrelevant keys (not well structured data)
    • Text document retrieval: retrieval queries, retrieval quality
    • Document classification and clustering: document representation (vector space model), similarity measures.