Google Cluster Computing Faculty Training Workshop
Module I: Introduction to MapReduce
This presentation includes course content © University of Washington
Redistributed under the Creative Commons Attribution license.
All other contents:
© Spinnaker Labs, Inc.
Workshop Syllabus
Seven lecture modules
Information about teaching the course
Technical info about Google tools & Hadoop
Example course lectures
Four lab exercises
Assigned to students in UW course
Overview
University of Washington Curriculum
Teaching Methods
Reflections
Student Background
Course Staff Requirements
Introductory Lecture Material
UW: Course Summary
Course title: “Problem Solving on Large Scale Clusters”
Primary purpose: developing large-scale problem solving skills
Format: 6 weeks of lectures + labs, followed by a 4-week project
UW: Course Goals
Think creatively about large-scale problems in a parallel fashion; design parallel solutions
Manage large data sets under memory and bandwidth limitations
Develop a foundation in parallel algorithms for large-scale data
Identify and understand engineering trade-offs in real systems
Lectures
2 hours, once per week
Half formal lecture, half discussion
Mostly covered systems & background
Included group activities for reinforcement
Classroom Activities
Worksheets included pseudo-code programming, working through examples
Performed in groups of 2-3
Small-group discussions about engineering and systems design
Groups of ~10
Course staff facilitated, but mostly open-ended
Readings
No textbook
One academic paper per week
e.g., “MapReduce: Simplified Data Processing on Large Clusters”
Short homework covering comprehension
Formed basis for discussion
Lecture Schedule
Introduction to Distributed Computing
MapReduce: Theory and Implementation
Networks and Distributed Reliability
Real-World Distributed Systems
Distributed File Systems
Other Distributed Systems