CS502 - Computing and Communication Technology

Fall-2012

Classes: MW 7:20 pm - 8:35 pm, Maria Sanford Hall 204
Instructor: Dr. Zdravko Markov, MS 307, (860)-832-2711, http://www.cs.ccsu.edu/~markov/, e-mail: markovz at ccsu.edu
Office hours: MTWR 11:00 am - 12:15 pm, or by appointment

Prerequisites: Must be enrolled at the graduate level

Description:  The course offers comprehensive coverage of the basic concepts of Computing and Data Science.  The first part of the course is devoted to Computer Organization and Design. This part discusses the main components of computers and the basic principles of their operation. It demonstrates the relationship between software and hardware and focuses on the foundational concepts that are the basis for current computer design. The second part of the course discusses the fundamentals of Data Science and some related areas. The main emphasis here is on data distribution, as the way data are generated, stored and used is naturally distributed. This part discusses various levels of transmitting, storing and using information, data and knowledge and surveys important areas such as Information Theory, Switching, Databases, Data classification, Distributed memory and data retrieval, World Wide Web and Data/Web Mining.

Course objectives: Upon successful completion of the course the student will be able to

Required textbook: David A. Patterson and John L. Hennessy, Computer Organization and Design: The Hardware/Software Interface, Fourth Edition, Elsevier, 2008, ISBN: 978-0-12-374493-7.

Required software: SPIM simulator: A free software simulator for running MIPS R2000 assembly language programs available for Unix, DOS, and Windows.

WEB resources:

Class Participation: Regular attendance and active class participation are expected from all students. If you must miss a class, try to inform the instructor of this in advance. In case of missed classes and work due to plausible reasons (such as illness or accidents) limited assistance will be offered. Unexcused absences will result in the student being totally responsible for the make-up process.

Assignments, projects and grading: There will be a mid-term test, a mid-term project and a term paper. There will also be a programming assignment and a quiz. The final grade will be based 40% on the term paper, 30% on the mid-term project, 20% on the test, 5% on the programming assignment and 5% on the quiz, and will be affected by classroom participation, conduct and attendance. All grades will be available in Blackboard Vista. The letter grades will be calculated according to the following table:
 
Letter grade  A       A-     B+     B      B-     C+     C      C-     D+     D      D-     F
Score (%)     95-100  90-94  87-89  84-86  80-83  77-79  74-76  70-73  67-69  64-66  60-63  0-59

All assignments with their due dates are listed below in the class schedule (the project descriptions are given on separate pages) and must be submitted in Blackboard Vista available through CentralPipeline (Blackboard Vista link) or directly at https://vista.csus.ct.edu/webct/logon/34918482122051.

Unexcused late submission policy: Assignments submitted more than two days after the due date will be graded one letter grade down. Projects submitted more than a week late will receive two letter grades down. No submissions will be accepted more than two weeks after the due date.

Honesty policy: The CCSU honor code for Academic Integrity is in effect in this class. It is available online at http://web.ccsu.edu/academicintegrity/GradAcadMisconductPolicy.htm. Please read it carefully. It is expected that all students will conduct themselves in an honest manner and NEVER claim work which is not their own. Violating this policy will result in a substantial grade penalty, and may lead to expulsion from the University.

Tentative schedule of classes, assignments, projects and tests (by week)

Note: Dates will be posted for all classes and for all project and test due dates. Check the schedule regularly for updates!
  1. August 29, September 5, 10, 12: Computer instruction set architecture
  2. September 17, 19, 24: Computer arithmetic and the ALU design
  3. September 26: Programming assignment due
  4. September 26, October 1: CPU datapath and control
  5. October 3, 8, 10: Pipelining
  6. October 10: Building a 16-bit ALU (part of the Midterm Project) due.
  7. October 15, 17: Memory hierarchies
  8. October 22, 24: Interfacing peripherals and multiprocessors
  9. October 29: Midterm Project due
  10. November 1-7: Mid-term test. To be taken online through Blackboard Vista
  11. November 5, 7: No classes
  12. November 12: Fundamentals of distributed systems
  13. November 14: Information Theory
  14. November 19: Switching
  15. November 26: Database management concepts
  16. November 28: Distributed memory and data retrieval
  17. December 3: The World Wide Web
  18. December 5: Introduction to Data Mining or term paper presentations
  19. December 5-18: Quiz. To be taken online through Blackboard Vista
  20. December 17, 7-9: Last class (optional)
  21. December 18: Term paper due

CS502 - Week 1

Computer Architecture = Instruction Set Architecture + Machine Organization

Reading: Patterson & Hennessy - Chapter 1, Sections 2.1 - 2.3, 2.5 - 2.7, 2.10, 2.13 (optional), 2.16 - 2.20 (optional), Appendix B, "Spim, pcspim, and xspim" in Section "Software" on the CD.

Lecture Slides: PDF

Lecture Notes:

  1. Levels of Abstraction
  2. Basic Components of a Computer
  3. Example: implementing A=B+C (from instructions to gates)
  4. MIPS instructions
  5. Stored program concept: programs in memory, fetch&execute cycle
  6. The SPIM simulator
Exercises: Load this program in the SPIM simulator and analyze the format of the instructions. Run the program with different values of X and Y and trace the execution in step mode.
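
As a possible aid for analyzing instruction formats, the following Python sketch splits a 32-bit MIPS instruction word into its bit fields. The function name decode_fields and the sample word are illustrative assumptions and are not part of the course materials. The rd, shamt and funct fields are meaningful for R-type instructions, while imm is meaningful for I-type instructions.

```python
def decode_fields(word):
    """Split a 32-bit MIPS instruction word into its bit fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 31-26
        "rs": (word >> 21) & 0x1F,      # bits 25-21
        "rt": (word >> 16) & 0x1F,      # bits 20-16
        "rd": (word >> 11) & 0x1F,      # bits 15-11 (R-type only)
        "shamt": (word >> 6) & 0x1F,    # bits 10-6  (R-type only)
        "funct": word & 0x3F,           # bits 5-0   (R-type only)
        "imm": word & 0xFFFF,           # bits 15-0  (I-type only)
    }


# Sample word: add $t0, $t1, $t2 (register numbers $t0=8, $t1=9, $t2=10)
fields = decode_fields(0x012A4020)
print(fields)
```

An opcode of 0 indicates an R-type instruction, in which case the funct field selects the operation (0x20 for add).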

CS502 - Week 2

Computer arithmetic and ALU design

Reading: Patterson & Hennessy - Sections  2.4, 3.1 - 3.2, 3.3 - 3.4 (optional), 3.5 (floating-point representation only), 3.6 - 3.10 (optional), C.5 (CD).

Lecture Slides (PDF), ALU diagram (PDF)

Lecture Notes:

  1. Representing numbers: sign bit, one's complement, two's complement.
  2. Arithmetic: addition, subtraction, detecting overflow.
  3. Building ALU - hierarchical approach
  4. Floating point numbers
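
The two's complement representation and overflow detection from items 1 and 2 can be sketched in Python as follows. This is a minimal illustration assuming an 8-bit word; the function names are hypothetical. Overflow is flagged when both operands have the same sign and the result has the opposite sign.

```python
def to_twos(value, bits=8):
    """Return the bit pattern (as a non-negative int) of a signed value."""
    return value & ((1 << bits) - 1)


def from_twos(pattern, bits=8):
    """Interpret a bit pattern as a signed two's complement number."""
    sign = 1 << (bits - 1)
    return (pattern & (sign - 1)) - (pattern & sign)


def add_with_overflow(a, b, bits=8):
    """Add two signed values in a fixed word size; report overflow."""
    raw = (to_twos(a, bits) + to_twos(b, bits)) & ((1 << bits) - 1)
    result = from_twos(raw, bits)
    # Overflow: operands share a sign, but the result's sign differs.
    overflow = (a < 0) == (b < 0) and (result < 0) != (a < 0)
    return result, overflow


print(add_with_overflow(100, 27))   # fits in 8 bits
print(add_with_overflow(100, 28))   # exceeds +127, so it wraps around
```

Subtraction a - b can be handled by the same adder as a + (-b), which is why the ALU needs only one adder plus a way to negate the second operand.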
Tutorials and practice quizzes:

CS502 - Week 3

CPU datapath and control

Reading: Patterson & Hennessy - Sections 4.1 - 4.4

Lecture Slides: PDF

Lecture Notes:

I. Building a Datapath

  1. Abstract level implementation:
  2. Basic building elements
  3. Fetching instructions and incrementing the program counter
  4. Register file and execution of R-type instructions
  5. Datapath for lw and sw instructions (add data memory and sign extend)
  6. Datapath for branch instructions
Demo:
II. Control
  1. ALU control: mapping the opcode and function bits to the ALU control inputs
  2. Designing the main control unit
  3. Operation of the Datapath (single-cycle implementation):
  4. Problems of the single-cycle implementation
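
The ALU control mapping in item II.1 can be sketched as a small lookup in Python. The 2-bit ALUOp values and 4-bit ALU control codes below follow the convention commonly used for the textbook's single-cycle design; treat them as an assumption and check them against Section 4.4. The function and constant names are hypothetical.

```python
# 4-bit ALU control codes (assumed textbook convention)
ALU_AND, ALU_OR, ALU_ADD, ALU_SUB, ALU_SLT = 0b0000, 0b0001, 0b0010, 0b0110, 0b0111

# funct field of R-type instructions -> ALU operation
FUNCT_TO_ALU = {
    0b100000: ALU_ADD,  # add
    0b100010: ALU_SUB,  # sub
    0b100100: ALU_AND,  # and
    0b100101: ALU_OR,   # or
    0b101010: ALU_SLT,  # slt
}


def alu_control(alu_op, funct):
    """Map ALUOp (from the main control unit) and funct to ALU control bits."""
    if alu_op == 0b00:      # lw / sw: add base register and offset
        return ALU_ADD
    if alu_op == 0b01:      # beq: subtract and test for zero
        return ALU_SUB
    if alu_op == 0b10:      # R-type: operation chosen by funct
        return FUNCT_TO_ALU[funct]
    raise ValueError("unsupported ALUOp")


print(format(alu_control(0b10, 0b101010), "04b"))
```

Splitting the decoding into a main control unit (opcode to ALUOp) and a separate ALU control (ALUOp plus funct) keeps each piece of logic small.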

CS502 - Week 4

Pipelining

Reading: Patterson & Hennessy - Sections 4.5 - 4.8 (without implementation details).

Lecture Slides: PDF

Lecture notes:

I. Introduction to Pipelining

  1. Pipelining by analogy (laundry example):
  2. Five stages of the load MIPS instruction
  3. The pipelined datapath
  4. Single cycle, multiple cycle vs. pipeline
  5. Advantages of pipelined execution
II. Problems with pipelining (pipeline hazards)
  1. Structural hazards: single memory
  2. Control hazards:
  3. Data hazards (dependencies backwards in time):
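
The advantage of pipelined execution can be estimated with a simple timing model, sketched here in Python. The model assumes five stages of equal length (200 ps is an arbitrary illustrative value) and no stalls from hazards; the function names are hypothetical.

```python
STAGES = 5          # IF, ID, EX, MEM, WB
STAGE_TIME_PS = 200  # assumed time per stage, in picoseconds


def single_cycle_time(n):
    """Each instruction takes one long cycle covering all stages."""
    return n * STAGES * STAGE_TIME_PS


def pipelined_time(n):
    """First instruction fills the pipeline; each later one adds one stage time."""
    return (STAGES + n - 1) * STAGE_TIME_PS


for n in (3, 1000):
    speedup = single_cycle_time(n) / pipelined_time(n)
    print(n, single_cycle_time(n), pipelined_time(n), round(speedup, 2))
```

Under these assumptions the speedup approaches the number of stages as the number of instructions grows; hazards reduce it in practice.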
Demo:

CS502 - Week 5

Memory hierarchies

Reading: Patterson & Hennessy - Sections 5.1 - 5.5.

Lecture Slides: PDF

Lecture notes:

  1. Memory technologies and trends
  2. Impact on performance
  3. The need for hierarchical memory organization
  4. The principle of locality
  5. Memory hierarchy terminology
  6. The Basics of caches
  7. The need for virtual memory
  8. VM organization and terminology: virtual address, physical address, page, page offset, page fault, memory mapping (translation).
  9. Addressing pages:
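
The virtual memory terminology in item 8 can be illustrated with a short Python sketch of address translation. It assumes 4 KiB pages (a 12-bit page offset) and models the page table as a dictionary from virtual page number to physical page number; these choices and all names are illustrative assumptions.

```python
OFFSET_BITS = 12                 # assumed page size: 2**12 = 4096 bytes
PAGE_SIZE = 1 << OFFSET_BITS


class PageFault(Exception):
    """Raised when a virtual page has no mapping in the page table."""


def translate(vaddr, page_table):
    """Translate a virtual address to a physical address."""
    vpn = vaddr >> OFFSET_BITS           # virtual page number
    offset = vaddr & (PAGE_SIZE - 1)     # page offset is copied unchanged
    if vpn not in page_table:
        raise PageFault(vpn)
    return (page_table[vpn] << OFFSET_BITS) | offset


page_table = {0: 5, 1: 2}        # hypothetical mapping
print(hex(translate(0x1ABC, page_table)))
```

Only the page number is translated; the offset within the page is the same in the virtual and the physical address.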
Exercises

CS502 - Week 6

Interfacing peripherals and multiprocessors

Reading: Patterson & Hennessy - Sections 6.1 - 6.8, 7.1 - 7.4

Lecture Slides: Chapter7.pdf

Lecture notes:

  1. Interfacing Processors and Peripherals - Buses (slides in PDF)
  2. Interfacing I/O devices to Memory, CPU and OS (slides in PDF)
  3. Multiprocessors (slides in PDF)
  4. Networks of multiprocessors (slides in PDF)
  5. Modern clusters

CS502 - Week 7

Fundamentals of distributed systems

Additional Reading: Andrew S. Tanenbaum, Maarten van Steen, Distributed Systems: Principles and Paradigms.

Lecture Notes (in PDF):
  1. Categories of computer systems
  2. Conventional systems with special purpose components
  3. Multiprocessor systems:
  4. Distributed computer systems
Examples of distributed systems

CS502 - Week 8

Information Theory

  1. Fundamentals of information theory (information vs. data)
  2. Sampling Theorem
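
The basic measure of information can be sketched in Python as follows. This is a minimal illustration of the logarithmic formula and of entropy as average information; the function names are hypothetical.

```python
import math


def information(p):
    """Information (in bits) carried by an event of probability p."""
    return -math.log2(p)


def entropy(probs):
    """Average information of a source with the given outcome probabilities."""
    return sum(p * information(p) for p in probs if p > 0)


print(information(1 / 8))            # one of 8 equally likely outcomes
print(entropy([0.5, 0.5]))           # fair coin
print(entropy([0.9, 0.1]))           # biased coin carries less information
```

For equally likely outcomes the information equals the number of yes/no questions needed to identify the outcome by repeated halving.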

CS502 - Week 9

Switching

  1. The importance of switching in communication - the cost of switching is high
  2. Definition: transfer input sample points to the correct output ports at the correct time
  3. Terminology
  4. Voice digitization: bandwidth W = 3 kHz, minimum sampling rate 2*3 = 6 kHz (8 kHz is used in practice), 256 quantization levels (8 bits), bit rate = 8 kHz * 8 bits = 64 kb/s
  5. Telephone switching
  6. General framework for switching
  7. Circuit (synchronous) vs. packet (asynchronous) switching: control and routing overhead, virtual packet switching
  8. Switching techniques and networking: switching is the technology that moves a message between the nodes of a network
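
The voice digitization arithmetic in item 4 can be checked with a few lines of Python. The function names are hypothetical; the numbers are the ones given in the item.

```python
def nyquist_rate(bandwidth_hz):
    """Minimum sampling rate required by the Sampling Theorem."""
    return 2 * bandwidth_hz


def bits_per_sample(levels):
    """Number of bits needed to encode the given number of quantization levels."""
    return (levels - 1).bit_length()


def bit_rate(sample_rate_hz, levels):
    """Bits per second produced by sampling and quantizing a signal."""
    return sample_rate_hz * bits_per_sample(levels)


print(nyquist_rate(3000))        # minimum rate for a 3 kHz voice channel
print(bit_rate(8000, 256))       # 8 kHz sampling with 256 levels
```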
Lecture slides in PPT

CS502 - Week 10

Database management concepts

  1. Database Management Systems (DBMS)
  2. An example of a database (relational): relations (tables), attributes (columns), tuples (rows). Example query: Salesperson='Mary' AND Price>100.
  3. Database schema (e.g. relational): names and types of attributes, addresses, indexing, statistics, authorization rules to access data etc.
  4. Data independence: separation of the physical and logical data (particularly important for distributed systems). The mapping between them is provided by the schema.
  5. Architecture of a DBMS - three levels: external, conceptual and internal schema
  6. Types of DBMS
  7. Basic DBMS types
  8. Retrieving and manipulating data: query processing
  9. Database views: creating user defined subsets of the database, improving the user interface. Example:
      CREATE VIEW MarySales(ItemName,Price)
      AS SELECT ItemName, Price
      FROM ITEM, SALES
      WHERE ITEM.Item#=SALES.Item# AND Salesperson='Mary'

      Then the query:

      SELECT ItemName
      FROM MarySales
      WHERE Price>100

      translates to:

      SELECT ItemName
      FROM ITEM, SALES
      WHERE ITEM.Item#=SALES.Item# AND Salesperson='Mary' AND Price>100
       

  10. Data integrity
  11. Client-Server architectures
  12. Knowledge Bases and KBS (an area of AI)
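
The database view example in item 9 can be tried out with Python's built-in sqlite3 module. The sketch below makes several assumptions that are not in the notes: the column Item# is renamed ItemNo (a plain identifier), Price is stored in the ITEM table, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ITEM (ItemNo INTEGER PRIMARY KEY, ItemName TEXT, Price REAL);
CREATE TABLE SALES (ItemNo INTEGER, Salesperson TEXT);
INSERT INTO ITEM VALUES (1, 'Desk', 250), (2, 'Lamp', 40), (3, 'Chair', 120);
INSERT INTO SALES VALUES (1, 'Mary'), (2, 'Mary'), (3, 'John');

-- The view: a user-defined subset of the database
CREATE VIEW MarySales (ItemName, Price) AS
    SELECT ItemName, Price
    FROM ITEM, SALES
    WHERE ITEM.ItemNo = SALES.ItemNo AND Salesperson = 'Mary';
""")

# Query against the view
via_view = [row[0] for row in conn.execute(
    "SELECT ItemName FROM MarySales WHERE Price > 100")]

# Equivalent query against the base tables
direct = [row[0] for row in conn.execute(
    "SELECT ItemName FROM ITEM, SALES "
    "WHERE ITEM.ItemNo = SALES.ItemNo AND Salesperson = 'Mary' AND Price > 100")]

print(via_view, direct)
```

Both queries return the same rows, which is the point of the translation: the view is not stored data but a saved query that the DBMS expands.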
Lecture slides in PPT

CS502 - Week 11

Distributed memory and data/information retrieval

  1. The need for memory hierarchy
  2. Data location factors
  3. Directory - a mechanism to locate a data object
  4. Directory systems
  5. Directories and access control (different functions). Points of access control
    1. Application program (problems with using data provided by other applications)
    2. DBMS (the most common approach)
    3. Control at the physical location of data (e.g. library)
    4. Function of the communication networks (e.g. preventing a user from accessing another node, telephone system)
  6. Information retrieval (Slides in PPT)
  7. Web document retrieval (Slides in PDF)
  8. Document classification and clustering:
  9. Web Mining
  10. Web Mining books

CS502 - Week 12

The World Wide Web

  1. A little history
  2. The client side
  3. The server side
  4. Writing Web pages in HTML
  5. Java
  6. Locating information on the Web
  7. Web Agents
  8. Semantic Web

MIPS Programming assignment (5 points)

Due date: September 26

Write a program in MIPS assembler to perform some simple computation (e.g. average of three integers, converting miles into kilometers, Fahrenheit into Celsius etc.). The program must include:
  1. At least one instruction from each instruction type: R-type arithmetic, I-type arithmetic and Memory transfer.
  2. Input and output through system calls.
  3. Comments explaining the format and the meaning of each instruction.
Use the SPIM simulator to debug and run the program and read Patterson & Hennessy,  B.9 and "Spim, pcspim, and xspim" in Section "Software" on the CD. You may also find additional information about MIPS programming using SPIM in CS 254 - Computer organization and assembly language programming and Introduction to RISC Assembly Language Programming (see example programs).

Documentation and submission: Submit the source text of the program as a file attachment through Blackboard Vista > CS 502 > MIPS Programming Assignment.


Mid-term test (20 points)

The Midterm test will be available from Blackboard Vista (the dates will be posted around midterm). There are 20 multiple choice or short answer questions that have to be answered within 3 hours.

The test includes the following topics:


Building a 16-bit ALU (10 points)

Not available at this time

Mid-term project (20 points): Designing a mini MIPS machine

Not available at this time

Term paper (40 points)

Not available at this time.

Quiz (5 points)

The quiz will be available in Blackboard Vista. There are 20 multiple choice questions that will have to be answered within 1 hour.

Review topics

  1. Categories of computer systems:
    • Conventional sequential computers
    • Conventional systems with special purpose components
    • Multiprocessor systems
    • Distributed systems
  2. Distributed systems:
    • Issues: data location and security, load distribution, process migration, fault tolerance
    • Types: homogeneous systems, heterogeneous systems
  3. Information Theory
    • Probability: Bayes's theorem
    • Measuring information: basic formula (logarithmic scale, halving strategy), entropy (average information)
  4. Signals:
    • Representation: time, amplitude, frequency
    • Transmission: sampling (Sampling Theorem), digitizing (AD and DA conversion), bandwidth
  5. Switching:
    • Circuit switching
    • Packet switching
    • Time division multiplexing
    • Ethernet approach: packets, conflicts and resending
  6. DBMS:
    • Application area:  large quantity of structured data
    • Types:  tables (relational), trees, networks, objects
    • Data independence: database schema
  7. Relational DBMS
    • Basics of SQL
    • Query processing: plans, optimization
    • Database views
  8. Data integrity
    • Integrity constraints
    • Concurrency control
    • Backup and recovery
    • Security and access control
  9. Information retrieval
    • Application area: finding relevant data using irrelevant keys (not well structured data)
    • Text document retrieval: inverted index, retrieval queries, retrieval quality (precision, recall), relevance ranking
    • Analyzing the Web structure: page rank
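
For review topic 9, the following Python sketch shows one simple way to build an inverted index, answer a conjunctive keyword query, and measure precision and recall. The tiny document collection, the relevance judgment and all function names are invented for illustration; real systems add steps such as stemming and relevance ranking.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def retrieve(index, query):
    """Return ids of documents containing every term of the query (AND)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


def precision_recall(retrieved, relevant):
    """Precision: share of retrieved that is relevant. Recall: share of relevant retrieved."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


docs = {
    1: "distributed database systems",
    2: "web mining and data mining",
    3: "distributed web systems",
}
index = build_inverted_index(docs)
found = retrieve(index, "distributed systems")
print(found, precision_recall(found, relevant={1}))
```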