Introduction to Distributed and Cloud Computing
HW 1
Utility Computing, Distributed Computing, and Paradigms
- Utility computing offers computing resources as a metered service.
- Distributed computing refers to a system in which several interconnected computers share the computing tasks assigned to the system.
- Paradigms of distributed computing include cluster computing, grid computing, and cloud computing.
Improving Performance
- Three ways to improve performance: work harder, work smarter, and get help.
- In the computer analogy, the three ways to run applications faster are: using faster hardware, using optimized algorithms and techniques to solve computational tasks, and using multiple computers to solve a particular task.
Clusters and Parallel Computing
- A cluster is a type of parallel and distributed system, which consists of a collection of interconnected stand-alone computers working together as a single integrated computing resource.
- Parallel computing is a form of computation in which many calculations are carried out simultaneously. Large problems can often be divided into smaller ones, which are then solved concurrently (“in parallel”).
Memory Models and Grid Computing
- Two basic models for memory models in a cluster include: shared and distributed.
- Grid computing coordinates resource sharing and problem-solving in dynamic, multi-institutional virtual organizations.
Cloud Services and Pricing
- Two major services provided by the Cloud include computing service and storage service.
- Cloud pricing models are based on paying for what is used.
Acronyms
- SOA stands for Service Oriented Architecture. SLA stands for Service Level Agreement. SLO stands for Service Level Objective. QoS stands for Quality of Service. IaaS stands for Infrastructure as a Service. PaaS stands for Platform as a Service. SaaS stands for Software as a Service.
Parallelism Levels and Load Balancing
- Parallelism in different levels include: bit-level parallelism, instruction-level parallelism, data-level parallelism, and task-level parallelism (please follow the order in the slide).
- Load balancing is a technique to distribute workload evenly across two or more computers, network links, CPUs, hard drives, or other resources, in order to get optimal resource utilization, maximize throughput, minimize response time, and avoid overload.
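To make the idea concrete, here is a minimal round-robin load balancer sketch. The server names and the `dispatch` helper are illustrative only, not any specific product's API.

```python
# Minimal round-robin load balancer sketch: spread incoming requests evenly
# across a fixed set of workers. All names are illustrative.

from itertools import cycle

class RoundRobinBalancer:
    def __init__(self, workers):
        self._next_worker = cycle(workers)   # rotate through workers forever

    def dispatch(self, request):
        worker = next(self._next_worker)
        return f"request {request!r} sent to {worker}"

lb = RoundRobinBalancer(["server-1", "server-2", "server-3"])
for i in range(5):
    print(lb.dispatch(i))   # 0 -> server-1, 1 -> server-2, 2 -> server-3, 3 -> server-1, ...
```

Real load balancers also weigh server health and current load; round-robin is only the simplest policy.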
Thin Clients and IaaS, PaaS, SaaS
- A thin client is a computer or a computer program that depends heavily on some other computer to fulfill its traditional computational roles. This stands in contrast to the traditional fat client, a computer designed to take on these roles by itself.
- IaaS is the deployment platform that abstracts the infrastructure. PaaS is the development platform that abstracts the infrastructure, OS, and middleware to drive developer productivity. SaaS is the finished applications that you rent and customize.
- IaaS enabling technique is virtualization including server virtualization, storage virtualization, and network virtualization. PaaS enabling technique is runtime environment design. SaaS enabling technique is web service.
- IaaS provided services include resource management interface and system monitoring interface. PaaS provides services including programming IDE and system control interface. SaaS provides services including web-based applications and web portals.
- In Infrastructure as a Service (IaaS), the capability provided to the consumer is to provision processing, storage, networks, and other fundamental computing resources where the consumer can deploy and run arbitrary software, which can include operating systems and applications.
- In Platform as a Service (PaaS), the capability provided to the consumer is to deploy onto the cloud infrastructure consumer-created or acquired applications created using programming languages and tools supported by the provider.
- In Software as a Service (SaaS), the capability provided to the consumer is to use the provider’s applications running on a cloud infrastructure. The applications are accessible from various client devices through a thin client interface such as a web browser (e.g., web-based email).
Examples of IaaS, PaaS, and SaaS
Amazon EC2 is IaaS, Microsoft Windows Azure is PaaS, Google App Engine is PaaS, Hadoop is PaaS, Google Apps (e.g., Gmail, Google Docs) are SaaS, SalesForce.com is SaaS.
Advantages of Parallel Computing
Advantage(s) of parallel computing include:
- High Performance
- Cost Efficient
- Improve Utilization
Flynn’s Taxonomy
Which program/computer can be classified based on Flynn’s Taxonomy:
- A. SISD
- B. MISD
- C. SIMD
- D. MIMD
- E. All of the above
Answer Key: E
Benefits of Cloud Computing
Cloud computing brings many benefits. For the market and enterprises, which of the following should be reduced?
- A. Capital expenditure
- D. Initial investment
Cloud computing brings many benefits. For the end-user and individuals, which of the following should be reduced?
- A. Local computing power
- B. Local storage power
Important Security and Privacy Issues
- Data Protection: To be considered protected, data from one customer must be properly segregated from that of another.
- Identity Management: Every enterprise will have its own identity management system to control access to information and computing resources.
- Application Security: Cloud providers should ensure that applications available as a service via the cloud are secure.
- Privacy: Providers ensure that all critical data are masked and that only authorized users have access to data in its entirety.
Cloud Deployment Models
There are four primary cloud deployment models including:
- A. Hybrid cloud
- B. Private cloud
- C. Public cloud
- D. Inter-cloud
- E. Community cloud
- F. Virtual private cloud
Answer Key: A, B, C, E
HW 2
Problems in Conventional Case
What are the problems in the conventional case?
- A. Companies IT investment for peak capacity
- B. Lack of agility for IT infrastructure
- C. IT maintenance cost for every company
- D. Usually suffered from hardware failure risk
True Statements about IaaS
Which of the statements below are true?
- A. IaaS cloud provider takes care of all the IT infrastructure complexities
- B. IaaS cloud provider provides all the infrastructure functionalities
- C. IaaS cloud provider guarantees qualified infrastructure services
- D. IaaS cloud provider charges clients according to their resource usage
Answer Key: A, B, C, D
IaaS Cloud Providers and Virtualization
IaaS cloud providers
- A. Allocate a new physical machine for each incomer
- B. Prepare a pool of pre-installed machines for different requests
- C. Use virtualization
- D. Provide a platform allowing customers to develop, run, and manage applications
Answer Key: C
Virtualization techniques can help virtualize
- A. computation resources
- B. storage resources
- C. communication resources
- D. application resources
Answer Key: A, B, C
Physical Resources and XEN
Different physical resources include:
- A. CPU
- B. Storage
- C. Network
- D. Application
Answer Key: A, B, C
XEN uses
- A. Emulation technique
- B. Virtualization technique
- C. Process virtual machine
- D. System virtual machine
Answer Key: B, D
Virtualization Approaches
In which of the following virtualization approaches, VMM simulates enough hardware to allow an unmodified guest OS?
- A. Full-Virtualization
- B. Para-Virtualization
- C. Half-Virtualization
- D. Partial virtualization
Answer Key: A
In which of the following virtualization approaches, VMM does not necessarily simulate hardware, but instead offers a special API that can only be used by the modified guest OS?
- A. Full-Virtualization
- B. Para-Virtualization
- C. Half-Virtualization
- D. Partial Virtualization
Answer Key: B
Emulation Implementations and Code Cache
Three emulation implementations include
- A. Interpretation
- B. Register Mapping
- C. Static Binary Translation
- D. Dynamic Binary Translation
Answer Key: A, C, D
Which of the following implementations use code cache?
- A. Interpretation
- B. Static Binary Translation
- C. Dynamic Binary Translation
- D. Register Mapping
Answer Key: B, C
Binary Code Optimization and VMM Properties
How to optimize binary codes?
- A. Static optimization (compiling time optimization)
- B. System optimization
- C. Dynamic optimization (run time optimization)
- D. Software Optimization
Answer Key: A, C
In Popek and Goldberg terminology, a VMM must present all three properties
- A. Simple
- B. Equivalence (Same as a real machine)
- C. Resource control (Totally control)
- D. Efficiency (Native execution)
Answer Key: B, C, D
Trap Types and Trap and Emulate Model
Trap types include:
- A. System Call
- B. Critical instructions
- C. Hardware Interrupts
- D. Exception
Answer Key: A, C, D
In the Trap and Emulate Model, when executing privileged instructions, hardware will make the processor trap into the ___.
- A. VMM
- B. Guest
- C. Host
- D. VM
Answer Key: A
Virtualizing Unvirtualizable Hardware and Virtual Machine Vendors
How to virtualize unvirtualizable hardware:
- A. Para-virtualization
- B. Binary translation
- C. Hardware assistance
- D. Full virtualization
Answer Key: A, B, C
Virtual machine vendors include:
- A. VMware
- B. Xen
- C. KVM
Answer Key: A, B
Virtualization Enabled Cloud Properties and Live VM Migration
Virtualization enabled cloud properties include:
- A. Scalability
- B. Availability
- C. Manageability
- D. Performance
Answer Key: A, B, C, D
Live VM migration includes the following steps:
- A. Pre-migration process
- B. Reservation process
- C. Iterative pre-copy
- D. Stop and copy
- E. Commitment
Answer Key: A, B, C, D, E
Fill in the Blanks
Pros of Interpretation include a. Easy to implement; Cons of Interpretation include b. Poor performance. Pros of Static Binary Translation include d. High performance; Cons of Static Binary Translation include c. High implementation complexity.
Interpretation interprets a. Instructions one by one. Static Binary Translation translates b. Blocks one by one.
Register Mapping is an issue in a. Emulation. If the number of host registers is c. Smaller than the guest, it will involve more effort in register mapping.
b. Privileged instructions are those instructions that trap if the machine is in user mode and do not trap if the machine is in kernel mode. a. Sensitive instructions are those instructions that interact with hardware, which include control-sensitive and behavior-sensitive instructions. c. Innocuous instructions are all other instructions. d. Critical instructions are those sensitive but not privileged instructions.
True or False
Virtualization is a new idea in computer science. False
Emulation techniques simulate an independent environment where guest ISA and host ISA are the same. False
Virtualization techniques simulate an independent environment where guest ISA and host ISA are different. False
System virtual machine adds up layer over an operating system which is used to simulate the programming environment for the execution of individual processes. False
Process virtual machine simulates the complete system hardware stack and supports the execution of a complete operating system. False
In full-virtualization, VMM simulates enough hardware to allow an unmodified guest OS; In para-Virtualization, VMM does not necessarily simulate hardware, but instead offers a special API that can only be used by the modified guest OS. True
In the type 1 – bare metal virtualization type, VMMs are software applications running within a conventional operating system. False
In the type 2 – hosted virtualization type, VMMs run directly on the host’s hardware as a hardware control and guest operating system monitor. False
Unlike emulation, virtualization needs to translate each binary instruction to host ISA. False
Unlike emulation, virtualization does not need to worry about unmatched special register mapping. True
In virtualization, instruction privileges should be well-controlled. True
In trap, CPU will jump to hardware exception handler vector, and execute system operations in user mode. False
Under the virtualization theorem, x86 architecture cannot be virtualized directly. Other techniques are needed. True
Legacy processors were not designed for virtualization purposes at the beginning. True
In order to straighten those problems out, Intel introduced additional operation modes for the x86 architecture: VMX Root Operation (Root Mode) and VMX Non-Root Operation (Non-Root Mode). True
Live migration of virtual machines is not an essential technique for cloud property implementation. False
VMware Distributed Resource Scheduler does not automatically balance workloads according to set limits and guarantees. False
VMware Site Recovery Manager enables an easy transition from a production site to a disaster recovery site. True
VMware VMotion makes it possible to move virtual machines without interrupting the applications running inside. True
VMware makes all servers and applications protected against component and complete system failure. True
HW 3
True or False
1. PaaS cannot guarantee the quality of resources, services, and applications. False
2. In PaaS, consumers can acquire and return resources from the resource pool on demand. True
3. PaaS reduces the complexity and responsibility of cloud infrastructure. True
4. PaaS acts as a bridge between consumer and hardware. True
5. In PaaS, the runtime environment is automatically controlled so that consumers can focus on their services. True
6. PaaS increases the responsibility of managing the development environment. False
7. Compared to IaaS, PaaS increases the development period. False
8. In PaaS, end-users can see the alert about the lack of memory. False
9. Security is not important in PaaS. False
10. PaaS cannot run applications in parallel. False
11. MapReduce is scale-up instead of scale-out. False
12. MapReduce does not provide fault tolerance. False
13. In Chord, the identifier of successor(k) definitely is not equal to k’s identifier. False
14. Chord has strategies to handle node joins, departures, and failures. True
15. Hadoop is an open-source implementation of MapReduce and is currently enjoying wide popularity. True
Short Answers
16. Hadoop presents MapReduce as an analytics engine and under the hood uses a distributed storage layer referred to as the Hadoop Distributed File System (HDFS).
17. MapReduce adopts a master-slave architecture. The master node in MapReduce is referred to as Job Tracker (JT). Each slave node in MapReduce is referred to as Task Tracker (TT). MapReduce adopts a pull scheduling strategy rather than a push one. I.e., JT does not push map and reduce tasks to TTs but rather TTs pull them by making pertaining requests.
18. In MapReduce, the scheduling includes map task scheduling and reduce task scheduling. Examples of the schedulers include FIFO scheduler and Fair scheduler.
19. In contrast to Hadoop’s two-stage disk-based MapReduce paradigm, Spark’s in-memory primitives provide performance up to 100 times faster for certain applications.
20. P2P networks can be classified into structured P2P networks and unstructured P2P networks.
21. Chord uses consistent hashing to hash a file’s key or a node’s IP address to an ID. Key k is assigned to the successor(k).
22. In Chord, when a node searches key k, it searches its finger table for the node j, whose ID most immediately precedes k, and asks j for the node it knows whose ID is closest to k.
Multiple Choice
23. In IaaS, users manage the following layers: A. Applications, B. Data, C. Runtime, D. Middleware, E. OS
24. In PaaS, users manage the following layers: A. Applications, B. Data
25. ____ is a computing platform that abstracts the infrastructure, OS, and middleware to drive developer productivity. B. PaaS
26. _____ provides a running environment or a development and testing platform to consumers to design their applications or services. B. PaaS
27. _______ provides automatic decision-making for dispatching jobs to available resources. B. PaaS
28. In PaaS, the runtime environment is automatically controlled so that consumers can focus on their services. Specifically, PaaS provides: A. Dynamic provisioning, B. Load balancing, C. Fault tolerance, D. System monitoring
29. Which of the following services are PaaS? A. Microsoft Windows Azure, B. Hadoop, C. Google App Engine
30. MapReduce provides: A. Automatic parallelization/distribution, B. I/O scheduling, D. Fault tolerance
Short Answers
31. PaaS provides a development and testing platform for running developed applications on the runtime environment.
32. PaaS supports automatic backup and disaster recovery such that consumers do not need to worry about system failures.
33. For availability, PaaS also needs to provide system resilience by duplicating applications or services.
34. For manageability, PaaS needs to provide automatic control, analysis, and measurement for the resource usage.
35. PaaS needs to provide authentication and authorization to differentiate the access rights of different users.
36. In PaaS, consumers can develop and test their applications via web browsers or other thin clients.
37. MapReduce is a programming model for data processing.
38. MapReduce assumes a tree-style network topology. Nodes are spread over different racks embraced in one or many data centers. A salient point is that the bandwidth between two nodes is dependent on their relative locations in the network topology. For example, nodes that are on the same rack will have higher bandwidth between them as opposed to nodes that are off-rack.
39. MapReduce divides the workload into multiple independent tasks and schedules them across cluster nodes. The work performed by each task is done in isolation from that of the others.
40. In MapReduce, chunks are processed in isolation by tasks called Mappers. The outputs from the Mappers are denoted as intermediate outputs (IOs) and are brought into a second set of tasks called Reducers. The process of bringing together IOs into a set of Reducers is known as the shuffling process. The Reducers produce the final outputs (FOs). Overall, MapReduce breaks the data flow into two phases, map phase and reduce phase.
Lecture Notes
Lecture 1: Overview of Distributed Computing
Trends of Computing
- Utility Computing: Utility computing is a service provisioning model in which a service provider makes computing resources and infrastructure management available to the customer as needed and charges them for specific usage rather than a flat rate.
- Cluster Computing: A computer cluster consists of a set of loosely or tightly connected computers that work together so that, in many respects, they can be viewed as a single system. Unlike grid computers, computer clusters have each node set to perform the same task, controlled and scheduled by software.
- Grid Computing: Grid computing is a computer network in which each computer’s resources are shared with every other computer in the system. Processing power, memory, and data storage are all community resources that authorized users can tap into and leverage for specific tasks.
How to Run Applications Faster
- There are 3 ways to improve performance: Work Harder, Work Smarter, Get Help
- Computer analogy: Using faster hardware, Using optimized algorithms and techniques to solve computational tasks, Using multiple computers to solve a particular task
Parallel Computing
Parallel computing: A form of computation in which many calculations are carried out simultaneously.
Memory Models
- Shared Memory Model: Memory can be simultaneously accessed by multiple processes with an intent to provide communication among them or avoid redundant copies.
- Distributed Memory Model: A multiple-processor computer system in which each processor has its own private memory. Computational tasks can only operate on local data, and if remote data is required, the computational task must communicate with one or more remote processors.
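The two memory models can be contrasted with a small sketch using Python's multiprocessing module: in the shared-memory style, workers update one counter directly, while in a distributed-memory style they keep local data and exchange messages. This is a single-machine illustration only; the worker functions are made up for the example.

```python
# Shared vs. distributed memory sketch using multiprocessing (illustrative).

from multiprocessing import Process, Value, Queue

def shared_worker(counter, n):
    # Shared memory model: all workers update the same memory location.
    with counter.get_lock():
        counter.value += n

def distributed_worker(outbox, n):
    # Distributed memory model: each worker operates on local data and
    # communicates its result by message passing.
    outbox.put(n)

if __name__ == "__main__":
    # Shared memory: three workers add to one shared counter.
    counter = Value("i", 0)
    procs = [Process(target=shared_worker, args=(counter, i)) for i in (1, 2, 3)]
    for p in procs: p.start()
    for p in procs: p.join()
    print("shared total:", counter.value)           # 6

    # Distributed memory: workers send local results as messages.
    q = Queue()
    procs = [Process(target=distributed_worker, args=(q, i)) for i in (1, 2, 3)]
    for p in procs: p.start()
    total = sum(q.get() for _ in range(3))          # receive messages first
    for p in procs: p.join()
    print("message-passing total:", total)          # 6
```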
Flynn Taxonomy
Programs and computers are classified by whether they operate on a single or multiple instruction streams, and whether those instructions use a single or multiple data streams.
Applications of Cloud Computing
- Scientific application
- Computer animation
- Computer games
- Image processing
- Data mining
The “Cloud”
The term “cloud” is often used as a metaphor for the Internet: a simplified way to represent the complicated operations in the network. Currently, the term is further used as an abstraction of broader complexity, e.g., servers, applications, data, and heterogeneous platforms.
Cloud Computing in IT
Cloud computing is an acquisition and delivery model of IT resources that helps improve business performance and control the costs of delivering IT resources to the organization. From a user perspective, it provides a means of acquiring computing services via the Internet while making the technology beyond the user device almost invisible. From an organization perspective, it delivers services for consumer and business needs in a simplified way, providing unbounded scale and differentiated quality of service to foster rapid innovation and decision-making.
Pay for what is used
Cloud Services
- Upstream – IaaS provider: Amazon EC2, CHT hicloud, etc.
- Midstream – PaaS provider: Google GAE, Windows Azure, etc.
- Downstream – SaaS provider: Salesforce.com, Google Docs, etc.
- End Users: Thin client
Summary
- Long united, must divide; long divided, must unite.
- Modern IT needs to increase capacity or add capabilities to its infrastructure dynamically without investing money in new infrastructure, without training new personnel, and without licensing new software.
- Cloud computing is a solution to these demands and is the next big thing in the world of IT.
Lecture 2: Intro to Cloud Computing
- Central ideas: Utility Computing, SOA – Service Oriented Architecture, SLA – Service Level Agreement
- Properties and characteristics: High scalability and elasticity, High availability and reliability, High manageability and interoperability, High accessibility and portability, High performance and optimization
- Enabling techniques: Hardware virtualization, Parallelized and distributed computing, Web service
A service-level agreement (SLA) is a contract between a network service provider and a customer that specifies, usually in measurable terms (QoS), what services the network service provider will furnish.
Scalability and Elasticity
- What is scalability? A desirable property of a system, a network, or a process, which indicates its ability to either handle growing amounts of work in a graceful manner or to be readily enlarged.
- What is elasticity? The degree to which a system is able to adapt to workload changes by provisioning and de-provisioning resources in an autonomic manner, such that at each point in time the available resources match the current demand as closely as possible.
- But how to achieve these properties? Dynamic provisioning, Multi-tenant design
Dynamic Provisioning is a simplified way to explain a complex networked server computing environment where server computing instances are provisioned or deployed from an administrative console or client application by the server administrator, network administrator, or any other enabled user.
What is multi-tenant design? Multi-tenant refers to a principle in software architecture where a single instance of the software runs on a server, serving multiple client organizations. With a multi-tenant architecture, a software application is designed to virtually partition its data and configuration thus each client organization works with a customized virtual application instance.
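A minimal sketch of how a multi-tenant application virtually partitions data: every row carries a tenant ID and every query is filtered by it, while per-tenant configuration customizes the same shared instance. The in-memory table and field names are illustrative, not a real SaaS schema.

```python
# Multi-tenant data partitioning sketch: one application instance, data and
# configuration virtually partitioned per tenant. Schema is illustrative.

RECORDS = [
    {"tenant_id": "acme",   "customer": "Alice"},
    {"tenant_id": "acme",   "customer": "Bob"},
    {"tenant_id": "globex", "customer": "Carol"},
]

TENANT_CONFIG = {                      # per-tenant customization of the same app
    "acme":   {"theme": "dark"},
    "globex": {"theme": "light"},
}

def list_customers(tenant_id):
    """Every query is scoped to the calling tenant's partition."""
    return [r["customer"] for r in RECORDS if r["tenant_id"] == tenant_id]

print(list_customers("acme"), TENANT_CONFIG["acme"])      # ['Alice', 'Bob'] {'theme': 'dark'}
print(list_customers("globex"), TENANT_CONFIG["globex"])  # ['Carol'] {'theme': 'light'}
```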
Thin client is a computer or a computer program which depends heavily on some other computer to fulfill its traditional computational roles. This stands in contrast to the traditional fat client, a computer designed to take on these roles by itself.
Benefits of Cloud Computing
- For the market and enterprises: Reduce initial investment, Reduce capital expenditure, Improve industrial specialization, Improve resource utilization
- For the end-user and individuals: Reduce local computing power, Reduce local storage power, Variety of thin client devices in daily life
Infrastructure as a Service – IaaS
- Examples: Amazon EC2, Eucalyptus, OpenNebula
IaaS is the deployment platform that abstracts the infrastructure.
- IaaS enabling technique: Virtualization (Server Virtualization, Storage Virtualization, Network Virtualization)
- IaaS provided services: Resource Management Interface, System Monitoring Interface
Platform as a Service – PaaS
- Examples: Microsoft Windows Azure, Google App Engine, Hadoop, … etc
PaaS is the development platform that abstracts the infrastructure, OS, and middleware to drive developer productivity.
- PaaS enabling technique: Runtime Environment
- PaaS provides services: Programming IDE (Programming APIs, Development tools), System Control Interface (Policy-based approach, Workflow-based approach)
Software as a Service – SaaS
- Examples: Google Apps (e.g., Gmail, Google Docs, Google sites, …etc), SalesForce.com, EyeOS
SaaS is the finished applications that you rent and customize.
- SaaS enabling technique: Web Service
- SaaS provides services: Web-based Applications (General applications, Business applications, Scientific applications, Government applications), Web Portal
Cloud Deployment Models
- Public Cloud: The cloud infrastructure is made available to the general public or a large industry group and is owned by an organization selling cloud services. Also known as external cloud or multi-tenant cloud, this model essentially represents a cloud environment that is openly accessible. Basic characteristics: Homogeneous infrastructure, Common policies, Shared resources and multi-tenant, Leased or rented infrastructure, Economies of scale.
- Private Cloud: The cloud infrastructure is operated solely for an organization. It may be managed by the organization or a third party and may exist on-premise or off-premise. Also referred to as internal cloud or on-premise cloud, a private cloud intentionally limits access to its resources to service consumers that belong to the same organization that owns the cloud. Basic characteristics: Heterogeneous infrastructure, Customized and tailored policies, Dedicated resources, In-house infrastructure, End-to-end control.
- Community Cloud: The cloud infrastructure is shared by several organizations and supports a specific community that has shared concerns (e.g., mission, security requirements, policy, and compliance considerations).
- Hybrid Cloud: The cloud infrastructure is a composition of two or more clouds (private, community, or public) that remain unique entities but are bound together by standardized or proprietary technology that enables data and application portability (e.g., cloud bursting for load-balancing between clouds).
Lecture 3: Intro to IaaS (i.e. Virtualization)
Virtualization techniques will help.
- For computation resources: Virtual Machine technique
- For storage resources: Virtual Storage technique
- For communication resources: Virtual Network technique
Machine Level Abstraction: ISA
This is the major division between hardware and software. Examples: X86, ARM, MIPS
OS Level Abstraction
- For compiler or library developers, a machine is defined by ABI (Application Binary Interface).
- This defines the basic OS interface which may be used by libraries or users.
- Examples: User ISA, OS system call
Library Level Abstraction
- For application developers, a machine is defined by API (Application Programming Interface).
- This abstraction provides well-rounded functionality.
- Examples: User ISA, Standard C library, Graphical library
Emulation vs. Virtualization
- Emulation technique: Simulate an independent environment where guest ISA and host ISA are different. Example: Emulate x86 architecture on the ARM platform.
- Virtualization technique: Simulate an independent environment where guest ISA and host ISA are the same. Example: Virtualize x86 architecture to multiple instances.
Process Virtual Machine
Adds up layer over an operating system which is used to simulate the programming environment for the execution of individual processes.
System Virtual Machine
Simulates the complete system hardware stack and supports the execution of a complete operating system. Examples: Xen, KVM, VMware (used in IaaS).
Virtualization Types
- Type 1 – Bare metal: VMMs run directly on the host’s hardware as a hardware control and guest operating system monitor.
- Type 2 – Hosted: VMMs are software applications running within a conventional operating system.
Virtualization Approaches
- Full-Virtualization: VMM simulates enough hardware to allow an unmodified guest OS.
- Para-Virtualization: VMM does not necessarily simulate hardware, but instead offers a special API that can only be used by the modified guest OS.
Lecture 3.1: Virtualization
Emulation Implementations
- Interpretation: Emulator interprets only one instruction at a time. Pros: Easy to implement. Cons: Poor performance.
- Static Binary Translation: Emulator translates a block of guest binary at a time and further optimizes for repeated instruction executions. Pros: Emulator can reuse the translated host code. Emulator can apply more optimization when translating guest blocks. Cons: Implementation complexity will increase.
- Dynamic Binary Translation: This is a hybrid approach of emulator, which mixes two approaches above. Pros: Transparently implement binary translation. Cons: Hard to implement.
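The difference between plain interpretation and translation with a code cache can be seen in a small sketch. The three-instruction "guest ISA" below (ADD, JNZ, HALT) and all helper names are toy stand-ins; a real emulator decodes actual guest machine code and emits host machine code rather than Python callables.

```python
# Contrast of interpretation and dynamic binary translation with a code cache.

def interpret(program, acc=0):
    """Interpretation: decode and execute one guest instruction at a time."""
    pc = 0
    while program[pc][0] != "HALT":
        op, arg = program[pc]
        if op == "ADD":
            acc += arg
            pc += 1
        elif op == "JNZ":                   # jump if accumulator is non-zero
            pc = arg if acc != 0 else pc + 1
    return acc

def translate_block(program, start):
    """Translate one straight-line block (up to a branch or HALT) into a
    host-level callable, standing in for generated host code."""
    body, pc = [], start
    while program[pc][0] == "ADD":
        body.append(program[pc][1])
        pc += 1
    def block(acc):                         # the "translated" host code
        return acc + sum(body), pc          # new accumulator, next guest pc
    return block

def run_translated(program, acc=0):
    """Dynamic binary translation: translate each block once, cache it,
    and reuse the cached translation on every later visit."""
    code_cache, pc = {}, 0
    while program[pc][0] != "HALT":
        if program[pc][0] == "JNZ":
            pc = program[pc][1] if acc != 0 else pc + 1
            continue
        if pc not in code_cache:            # translate only on first visit
            code_cache[pc] = translate_block(program, pc)
        acc, pc = code_cache[pc](acc)
    return acc

loop = [("ADD", 5), ("ADD", -1), ("JNZ", 1), ("HALT", 0)]
print(interpret(loop), run_translated(loop))   # both print 0
```

The cached block at the loop body is translated once and re-executed on every iteration, which is where the performance win over pure interpretation comes from.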
Design Challenges and Issues
- Register mapping problem
- Performance improvement: how to optimize binary code?
  - Static optimization (compile-time optimization): optimization techniques are applied when the binary code is generated, based on semantic information in the source code.
  - Dynamic optimization (run-time optimization): optimization techniques are applied to the generated binary code, based on run-time information related to the program's input data.
- Why use dynamic optimization? It can benefit from dynamic profiling, it is not constrained by a compilation unit, and it knows the exact execution environment.
Popek and Goldberg Requirements
- In Popek and Goldberg terminology, a VMM must present all three properties:
  - Equivalence (same behavior as a real machine)
  - Resource control (total control of resources)
  - Efficiency (native execution)
Instruction Types
- By the classification of CPU modes, instructions are divided into the following types:
  - Privileged instructions: those that trap if the machine is in user mode and do not trap if the machine is in kernel mode.
  - Sensitive instructions: those that interact with hardware, including control-sensitive and behavior-sensitive instructions.
  - Innocuous instructions: all other instructions.
  - Critical instructions: those that are sensitive but not privileged.
Trap Types
- System call: invoked by applications in user mode, e.g., an application asks the OS for system I/O.
- Hardware interrupt: invoked by hardware events in any mode, e.g., the hardware clock timer triggers an event.
- Exception: invoked when an unexpected error or system malfunction occurs, e.g., executing a privileged instruction in user mode.
How to Virtualize Unvirtualizable Hardware
- Para-virtualization: modify the guest OS to skip the critical instructions, and implement hyper-calls that trap the guest OS into the VMM.
- Binary translation: use emulation techniques to make the hardware virtualizable, skipping the critical instructions by means of these translations.
- Hardware assistance: modify or enhance the ISA of the hardware to provide a virtualizable architecture, reducing the complexity of the VMM implementation.
Hardware Assistance
- VMX Root Operation (Root Mode): all instruction behaviors in this mode are no different from traditional ones, so all legacy software runs in this mode correctly. The VMM should run in this mode and control all system resources.
- VMX Non-Root Operation (Non-Root Mode): all sensitive instruction behaviors in this mode are redefined, so sensitive instructions trap to Root Mode. The guest OS should run in this mode and be fully virtualized through the typical "trap and emulate" model.
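A minimal sketch of the trap-and-emulate control flow. Every class and method below is a toy stand-in, not a real hypervisor API: a privileged guest instruction traps to the VMM, which emulates its effect on the guest's virtual state instead of letting it touch real hardware.

```python
# Minimal trap-and-emulate sketch; all names are illustrative.

class Trap(Exception):
    def __init__(self, kind, detail):
        super().__init__(kind)
        self.kind, self.detail = kind, detail

class ToyVM:
    """Guest state visible to the VMM: a virtual register file and a queue
    of pending virtual interrupts."""
    def __init__(self):
        self.registers = {"cr3": 0}
        self.pending_interrupts = []

    def run_guest(self, instructions):
        """Execute guest instructions; privileged ones trap to the VMM
        instead of touching real hardware."""
        for instr in instructions:
            if instr[0] == "LOAD_CR3":            # privileged instruction
                raise Trap("privileged_instruction", instr)
            # innocuous instructions would execute natively here

def vmm_trap_handler(vm, trap):
    """Root-Mode handler: emulate the trapped operation against the guest's
    virtual state, then the guest would resume in Non-Root Mode."""
    if trap.kind == "privileged_instruction":
        op, value = trap.detail
        if op == "LOAD_CR3":
            vm.registers["cr3"] = value           # update the virtual CR3 only
    elif trap.kind == "hardware_interrupt":
        vm.pending_interrupts.append(trap.detail)

vm = ToyVM()
try:
    vm.run_guest([("LOAD_CR3", 0x1000)])
except Trap as t:
    vmm_trap_handler(vm, t)
print(vm.registers)   # {'cr3': 4096} -- the guest sees the effect, the host is untouched
```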
Migration
- Challenges of live migration:
  - VMs have lots of state in memory.
  - Some VMs have soft real-time requirements (for example web servers, databases, and game servers), so down-time must be minimized.
- Relocation strategy:
  1. Pre-migration process
  2. Reservation process
  3. Iterative pre-copy
  4. Stop and copy
  5. Commitment
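The iterative pre-copy step above can be sketched as a loop that keeps re-sending pages dirtied during the previous round until the remaining dirty set is small enough for a short stop-and-copy. This is a simplified illustration under made-up page counts and callbacks; it is not the Xen implementation.

```python
# Simplified live-migration pre-copy loop (illustrative only; the page tracker
# and transport callbacks are toy stand-ins).

def live_migrate(all_pages, get_dirty_pages, send_pages, pause_vm,
                 resume_on_destination, max_rounds=30, stop_copy_threshold=64):
    """Iteratively copy memory while the VM keeps running, then pause it
    briefly to copy the last dirty pages (stop-and-copy) and commit."""
    send_pages(all_pages)                          # first full copy
    for _ in range(max_rounds):
        dirty = get_dirty_pages()                  # pages written since last round
        if len(dirty) <= stop_copy_threshold:
            break                                  # remainder is small enough
        send_pages(dirty)                          # iterative pre-copy round
    pause_vm()                                     # stop-and-copy begins
    send_pages(get_dirty_pages())                  # last dirty pages (+ CPU state)
    resume_on_destination()                        # commitment

# Toy usage: the dirty set shrinks each round until stop-and-copy is cheap.
dirty_rounds = iter([set(range(500)), set(range(120)), set(range(30)), set()])
live_migrate(
    all_pages=set(range(4096)),
    get_dirty_pages=lambda: next(dirty_rounds),
    send_pages=lambda pages: print(f"sent {len(pages)} pages"),
    pause_vm=lambda: print("VM paused on source"),
    resume_on_destination=lambda: print("VM resumed on destination"),
)
```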
Summary
- Server virtualization techniques:
  - CPU virtualization: ring compression, Intel VT-x, etc.
  - Memory virtualization: shadow page table, Intel EPT, etc.
  - I/O virtualization: device model, Intel VT-d, PCIe SR-IOV, etc.
- Ecosystem:
  - VMware implements both type-1 and type-2 virtualization.
  - Xen implements both para- and full virtualization.
  - KVM is implemented in the Linux mainstream kernel.
- Cloud properties:
  - Enabled by the live migration technique.
  - Scalability, availability, manageability, and performance.
Lecture 4: PaaS
Runtime Environment
- PaaS providers provide a runtime environment for the developer platform.
- The runtime environment is automatically controlled so that consumers can focus on their services:
  - Dynamic provisioning: on-demand resource provisioning.
  - Load balancing: distribute workload evenly among resources.
  - Fault tolerance: continue operating in the presence of failures.
  - System monitoring: monitor the system status and measure the usage of resources.
Enabling Services
- Enabling services are the main focus of consumers, who can make use of these sustaining services to develop their applications:
  - Programming IDE: integrates the full functionality supported by the runtime environment and provides development tools such as a profiler, a debugger, and a testing environment.
  - System control interfaces: make decisions according to principles and requirements, and describe the flow of installation and configuration of resources.
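As an illustration of dynamic provisioning under a simple policy, the sketch below scales a worker pool out when load is high and in when load is low. The `WorkerPool` class, thresholds, and load samples are all hypothetical; real PaaS controllers use provider-specific monitoring and provisioning APIs.

```python
# Toy policy-based dynamic provisioning: scale out when average CPU load is
# high, scale in when it is low. All names and thresholds are illustrative.

class WorkerPool:
    def __init__(self, size=2):
        self.size = size
    def scale_to(self, n):
        print(f"scaling pool from {self.size} to {n} workers")
        self.size = max(1, n)

def autoscale(pool, avg_cpu_load, high=0.80, low=0.30):
    """One iteration of a simple scaling control loop."""
    if avg_cpu_load > high:
        pool.scale_to(pool.size + 1)     # dynamic provisioning: add capacity
    elif avg_cpu_load < low and pool.size > 1:
        pool.scale_to(pool.size - 1)     # release unused capacity

pool = WorkerPool()
for load in (0.85, 0.90, 0.25, 0.50):    # simulated monitoring samples
    autoscale(pool, load)
```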
Lecture 4.1: MapReduce
- A programming paradigm for data-intensive computing.
- A distributed and parallel execution model.
- Simple to program: the framework automates many tedious tasks (machine selection, failure handling, etc.).
Lecture 12: Cloud Issues and Challenges
- Centralization increases the efficiency of management.
Cloud Security Overview
- There are many attacks on the Internet, and cloud computing cannot escape them. Some properties of the cloud prevent certain attacks, but other properties can increase the scope of damage.
- Cloud security standards classify the security issues and provide security recommendations. Two representative concerns are law and privacy.
What Is Security
- Security is the degree of protection against danger, damage, loss, and crime. It focuses on hardware mechanisms, software vulnerabilities, and information hiding. Malicious attackers want to steal data or destroy systems.
- The evolution of security can be divided into three parts: secure techniques, secure management, and cyber security.
Secure Techniques
- The original form of security, providing the basic protection mechanisms:
  - Authorization specifies access rights to resources and rejects requests from unknown users.
  - Encryption makes information unreadable to anyone except those possessing special knowledge.
  - Uninterruptible power supplies (UPS) and remote backup servers prevent data loss or server interruption caused by natural disasters.
Secure Management
- One or more feasible security policies are proposed and executed:
  - Access-control policies are highly flexible; they authorize a group of users to perform a set of actions on a set of resources.
  - Industry guidelines train and teach employees basic computer security concepts.
  - Incident response plans are the standard procedures for dealing with emergencies.
Cyber Security
- The Internet is now a new battlefield, since everyone is on the Internet every day:
  - An intrusion prevention system (IPS) monitors, identifies, and logs network and/or system activity for malicious behavior, attempts to block or stop that activity, and reports it immediately.
  - Social engineering manipulates people into performing actions or divulging confidential information, rather than breaking in or using cracking techniques.
- Information security means preventing unauthorized access, recording, disclosure, modification, and destruction of information. Its key properties are:
  - Confidentiality: users access sensitive data only when they have permission. Data or information can be separated into several security levels, such as normal, secret, or top secret.
  - Integrity: data cannot be modified undetectably, and information stays accurate and complete during transmission.
  - Availability: users can get all the data they need anywhere and anytime.
Cloud Properties and Security
- Some properties of cloud computing reduce part of the security problem:
  - Availability provides services anytime and reduces the probability of downtime.
  - Scalability avoids resource starvation and can accommodate a large number of users.
- But cloud computing still needs to pay a high degree of attention to security:
  - Cloud computing provides services to everyone, and there is a lot of sensitive information.
  - Users do not want any delay or service interruption.
  - Cloud vendors want more and more users to use cloud services, and some people also see this as a business opportunity.
- Malicious attackers hide on the Internet and try to steal important or sensitive information, collect and sell user data, and control or destroy compromised systems.
Raise Awareness
- Focus on the attackers' motivations: money, philosophy, or showing off.
How Does a Hacker Attack a System or Steal Information?
- Hackers are not all-powerful: they still need background knowledge and practice, preparatory work, and tools.
Attack Behavior
- There are two parts to attack behavior: penetration, and attack and destroy.
  - Penetration: collect all public information, try to find weaknesses or vulnerabilities, and try to gain access rights or administrator privileges.
  - Attack and destroy: steal or delete data, crash the system, or otherwise tamper with it.
Penetration
- A hacker finds a weak point to access the target server and leaves without a trace: a weak password or SQL injection, a program vulnerability, erasing all log files.
- A penetration test (also called a pentest) is a method of evaluating security: it simulates an attack from a malicious source and analyzes the feasibility of an attack to guide improvements to system security.
Attack and Destroy
- Hackers try to steal data, block services, and destroy systems. Compared with penetration, attacks do not need to be hidden: they paralyze the operational capabilities of the host and the firewall, and install malicious code to destroy the system and data.
- An attack may be accompanied by penetration.
- Penetration attempts are constant: a server on the Internet suffers hundreds of attacks, such as ping and port scans or attempts to log in and guess the correct password.
- Cloud computing also collects tens to hundreds of PB of information per day, so a hacker may obtain millions of information items with a single successful attack; malicious attacks will not stop.
Methods
- A cloud computing environment is still a group of computers: old attack methods can still pose a threat in some cases, but some methods can no longer damage the system.
- Cloud properties can reduce the probability of the system coming under attack: scalability, accessibility, and management.
Lecture 5: Intro to SaaS and Techniques
ASP
- An application service provider (ASP) is one choice: the ASP moves applications to the Internet and packages software into an object that can be rented.
- The provider is responsible for maintaining the hardware and managing the system, so customers can focus on usage and need not worry about the system's operation.
SaaS as Another Solution
- Software is not sold outright; it is pay-as-you-go, with interaction between the client and a back-end platform.
- Comparison: an ASP is a provider, while SaaS is a service model.
Software as a Service (SaaS)
- Delivers software as a service over the Internet, eliminating the need to install it and simplifying maintenance and support.
- SaaS not only has the advantages of the ASP model but also extra benefits: it fits your requirements, resources can be increased or decreased on demand, and it is easy to apply, easy to use, and easy to leave.
SaaS Maturity Levels
In general, SaaS can be classified into four maturity levels by the mode of delivery and system maturity:
- Ad hoc: the simplest SaaS architecture. The vendor provides hardware on which the software each user needs can be installed. Each customer has his own hosts and the applications run on these hosts; the hosts and applications are independent of each other. Customers can reduce the costs of hardware and administration.
- Configurable: the second level of SaaS. It provides greater program flexibility through configurable metadata, provides many instances to different users through detailed configuration, and simplifies maintenance and updating of a common code base.
- Multi-tenant: the third level of SaaS. It adds multi-tenancy to the second level and provides a single program service to all customers through configuration settings. It uses server resources more efficiently than level 2 but is ultimately limited in its scalability.
- Scalable and customizable: the last level of SaaS. It adds scalability and load balancing, so the system's capacity can increase or decrease to match demand. Using an application on this level of SaaS feels similar to using it on the local host.
Cloud Computing Properties for SaaS
- Accessibility: access the service anywhere and anytime.
- Elasticity: serve all consumers.
- Manageability: easy to control, maintain, and modify.
- Reliability: access control and avoiding phishing web pages.
Centralization
- Cloud computing collects all data and all computing capacity in one or a few data centers, enabling centralized management, deployment, and updates.
- Consumers are not limited to local residents; everyone can become a user of a cloud computing service. A low-capability vendor would degrade the user experience, and the distance between a user and the data center can range from hundreds to thousands of kilometers.
Thin Client
- If a user cares about the price of the device or its usage time, a thin client keeps only the core functionality and necessary interfaces:
  - Reduces hardware and design cost.
  - Reduces energy usage.
  - Reduces the management and maintenance load.
  - Removes the hard disk, heavy OS, etc., keeping only a monitor, I/O interfaces, a communication interface, and flash memory or a CD-ROM.
Benefits
- Management: removing mechanical moving devices such as the CD-ROM reduces the failure rate, and most data is not stored on the client.
- Cost: removing the CD-ROM reduces the total cost, and removing extra hardware lowers energy consumption.
Thin Client vs. Thick Client
- Thick clients, also called heavy clients, are full-featured computers. Compared with thin clients, which lose functionality without a network connection, thick clients provide full functionality on their own; they are also heavier and more complex, and should be used with care.
- Both are used in many scenarios: when complex jobs must be computed, and when work must be possible anywhere.
Salesforce
- Its CRM (Customer Relationship Management) offering is the well-known product.
- Salesforce provides many CRM solutions: customers pay rent per month and can focus on their business rather than getting stuck in the development environment.
Summary
- SaaS is neither a single technique nor a new term; it is a group of old and new techniques spanning server, platform, communication, interface, and client device.
- The server and platform can use cloud techniques or traditional techniques.
- Communication methods have to consider cross-platform and multi-user environments.
- The interface needs to be simple and easy to use.
- The user device and interface affect the user experience.
Lecture 4.1: Chord
Chord Protocol: Overview
- Fast distributed computation of a hash function mapping keys to nodes.
- Uses consistent hashing for load balance, moving only the minimum number of keys necessary to maintain a balanced load.
- Scalability: a node needs only a small amount of routing information.
Chord Protocol: Consistent Hashing
- Each node and key has an m-bit identifier. A node's identifier is obtained by hashing the node's IP address; a key's identifier is obtained by hashing the key.
- Key k is assigned to successor(k), the node whose identifier is equal to or follows k's identifier on the circular identifier space (e.g., a circular 7-bit key ID space). In other words, a key is stored at its successor: the node with an equal or next-higher ID.
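A minimal sketch of consistent hashing and successor lookup on the identifier circle. It uses SHA-1 truncated to m bits and a linear scan over a sorted node list rather than finger tables, so it illustrates key placement only, not Chord's O(log N) routing; the IP addresses and key name are made up.

```python
# Minimal consistent-hashing sketch for key placement on a Chord-style ring.

import hashlib
from bisect import bisect_left

M = 7                                     # m-bit identifier space (2^7 IDs)

def chord_id(value):
    """Hash a node's IP address or a key name onto the m-bit circle."""
    digest = hashlib.sha1(value.encode()).hexdigest()
    return int(digest, 16) % (2 ** M)

def successor(key_id, node_ids):
    """Return the first node ID equal to or following key_id on the circle."""
    idx = bisect_left(node_ids, key_id)
    return node_ids[idx % len(node_ids)]  # wrap around the circle

nodes = sorted(chord_id(ip) for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.3"])
k = chord_id("my_file.txt")
print(f"key {k} is stored at node {successor(k, nodes)}")
```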
Lecture 4.1: MapReduce
MapReduce: A Bird's-Eye View
- In MapReduce, chunks are processed in isolation by tasks called Mappers. The outputs from the Mappers are denoted as intermediate outputs (IOs) and are brought into a second set of tasks called Reducers; the process of bringing IOs together into a set of Reducers is known as the shuffling process. The Reducers produce the final outputs (FOs). Overall, MapReduce breaks the data flow into two phases: the map phase and the reduce phase.
- Map: extract something you care about from each record. Reduce: aggregate, summarize, filter, or transform.
- The programmer has to specify two functions, the map function and the reduce function, which implement the Mapper and the Reducer in a MapReduce program.
- Data elements are always structured as key-value (i.e., (K, V)) pairs; the map and reduce functions receive and emit (K, V) pairs.
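A minimal word-count sketch in plain Python showing the map, shuffle, and reduce phases described above. It runs sequentially in one process; a real framework such as Hadoop distributes the same logic across a cluster. The driver function and input records are illustrative.

```python
# Word count expressed as map, shuffle, and reduce phases (single-process sketch).

from collections import defaultdict

def map_fn(_key, line):
    """Map: emit (word, 1) for every word in an input record."""
    for word in line.split():
        yield word, 1

def reduce_fn(word, counts):
    """Reduce: aggregate all counts that share the same key."""
    yield word, sum(counts)

def run_mapreduce(records, map_fn, reduce_fn):
    # Map phase: process each record in isolation.
    intermediate = defaultdict(list)
    for key, value in records:
        for k, v in map_fn(key, value):
            intermediate[k].append(v)      # shuffle: group values by key
    # Reduce phase: one call per distinct intermediate key.
    output = {}
    for k, vs in intermediate.items():
        for out_k, out_v in reduce_fn(k, vs):
            output[out_k] = out_v
    return output

lines = [(0, "the quick brown fox"), (1, "the lazy dog and the fox")]
print(run_mapreduce(lines, map_fn, reduce_fn))
# {'the': 3, 'quick': 1, 'brown': 1, 'fox': 2, 'lazy': 1, 'dog': 1, 'and': 1}
```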
Partitions
- In MapReduce, intermediate output values are not usually reduced together; all values with the same key are presented to a single Reducer together.
- More specifically, a different subset of the intermediate key space is assigned to each Reducer. These subsets are known as partitions.
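As an illustration of how keys are assigned to partitions, the sketch below hashes each intermediate key modulo the number of Reducers, which mirrors the idea behind a default hash partitioner. The function name is illustrative; a stable checksum is used so the example is deterministic.

```python
# Hash partitioning sketch: every occurrence of a key maps to the same
# reducer, so all values with that key end up in one partition.

import zlib

def partition(key, num_reducers):
    return zlib.crc32(key.encode()) % num_reducers

num_reducers = 3
for key in ["the", "fox", "dog", "the"]:
    print(key, "-> reducer", partition(key, num_reducers))
```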
Hadoop Components
- Distributed file system (HDFS): a single namespace for the entire cluster; data is replicated 3x for fault tolerance.
- MapReduce framework: executes user jobs specified as "map" and "reduce" functions, and manages work distribution and fault tolerance.
Task Scheduling in MapReduce
- MapReduce adopts a master-slave architecture. The master node is referred to as the Job Tracker (JT); each slave node is referred to as a Task Tracker (TT).
- MapReduce adopts a pull scheduling strategy rather than a push one: the JT does not push map and reduce tasks to TTs; rather, TTs pull them by making pertaining requests.
Map and Reduce Task Scheduling
- Every TT periodically sends a heartbeat message to the JT that includes a request for a map or reduce task to run.
- Map task scheduling: the JT satisfies requests for map tasks by attempting to schedule mappers in the vicinity of their input splits (i.e., it considers locality).
- Reduce task scheduling: the JT simply assigns the next yet-to-run reduce task to a requesting TT regardless of the TT's network location and its implied effect on the reducer's shuffle time (i.e., it does not consider locality).
Job Scheduling in MapReduce
- In MapReduce, an application is represented as a job; a job encompasses multiple map and reduce tasks.
- MapReduce in Hadoop comes with a choice of schedulers: the default is the FIFO scheduler, which schedules jobs in order of submission; there is also a multi-user scheduler called the Fair Scheduler, which aims to give every user a fair share of the cluster capacity over time.
Fault Tolerance in MapReduce
1. If a task crashes: retry it on another node. This is fine for a map because it has no dependencies, and fine for a reduce because map outputs are on disk. If the same task fails repeatedly, fail the job or ignore that input block (user-controlled).
2. If a node crashes: re-launch its current tasks on other nodes, and re-run any maps the node previously ran, because their output files were lost along with the crashed node.
3. If a task is going slowly (a straggler): launch a second copy of the task on another node ("speculative execution"), take the output of whichever copy finishes first, and kill the other. This is surprisingly important in large clusters: stragglers occur frequently due to failing hardware, software bugs, misconfiguration, etc., and a single straggler may noticeably slow down a job.
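A toy illustration of speculative execution: two copies of the same task run concurrently, and whichever finishes first wins while the other's result is discarded. It uses Python threads and made-up task durations, so it only mimics the scheduling idea, not Hadoop's implementation.

```python
# Speculative execution sketch: run two copies of a slow task and take the
# result of whichever finishes first. Durations are simulated, not measured.

import threading, time, queue

def run_task(copy_name, duration, results):
    time.sleep(duration)                 # simulate work (the straggler is slower)
    results.put((copy_name, f"output from {copy_name}"))

results = queue.Queue()
threading.Thread(target=run_task, args=("original (straggler)", 2.0, results), daemon=True).start()
threading.Thread(target=run_task, args=("speculative copy", 0.5, results), daemon=True).start()

winner, output = results.get()           # first result wins; the loser is ignored
print(f"accepted {output!r} from the {winner}")
```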
Lecture 4.1: PaaS File System
Design Considerations
- Namespace: physical mapping, logical volume.
- Consistency: what to do when more than one user reads/writes the same file?
- Security: who can do what to a file? Authentication / access control lists (ACLs).
- Reliability: can files survive a power outage or other hardware failures undamaged?
Local FS on Unix-like Systems: Journaling
- Changes to the filesystem are logged in a journal before they are committed.
  - Useful if an atomic action needs two or more writes, e.g., appending to a file (update metadata + allocate space + write the data).
  - The journal can be played back to recover data quickly in case of hardware failure.
- What to log?
  - Changes to file content: heavy overhead.
  - Changes to metadata: fast, but data corruption may occur.
- Implementations: ext3, XFS, ReiserFS, IBM's JFS, etc.
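A minimal write-ahead journaling sketch: each change is appended to a journal file before the data file is updated, so uncommitted work can be replayed on recovery. The file names and JSON-lines record format are made up for illustration; real filesystems journal blocks and metadata, and log idempotent block images so replay is always safe.

```python
# Write-ahead journaling sketch: log the intended change, apply it, then mark
# it committed. On recovery, replay journal entries that never committed.

import json, os

JOURNAL = "journal.log"
DATA = "data.txt"

def journaled_append(text):
    entry = {"op": "append", "data": text}
    with open(JOURNAL, "a") as j:
        j.write(json.dumps(entry) + "\n")                # 1. log the intent first
        j.flush(); os.fsync(j.fileno())
    with open(DATA, "a") as f:
        f.write(text)                                    # 2. apply the change
        f.flush(); os.fsync(f.fileno())
    with open(JOURNAL, "a") as j:
        j.write(json.dumps({"op": "commit"}) + "\n")     # 3. mark it done

def recover():
    """Replay any appends that were logged but never committed.
    (A real journal logs idempotent block images so replay cannot duplicate data.)"""
    if not os.path.exists(JOURNAL):
        return
    pending = []
    with open(JOURNAL) as j:
        for line in j:
            entry = json.loads(line)
            if entry["op"] == "append":
                pending.append(entry["data"])
            elif entry["op"] == "commit":
                pending.clear()                          # everything so far is durable
    with open(DATA, "a") as f:
        for text in pending:
            f.write(text)                                # redo uncommitted work

journaled_append("hello\n")
recover()   # no-op here, but would redo work after a crash between steps 1 and 3
```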
Snapshot
- A snapshot is a copy of a set of files and directories at a point in time.
  - Snapshots can be read-only or read-write.
  - Usually created by the filesystem itself, sometimes by LVMs.
  - Backups can be taken from a read-only snapshot without worrying about consistency.
- Copy-on-write is a simple and fast way to create snapshots: the current data is the snapshot, and a request to write to a file creates a new copy, which is used from then on.
- Implementations: UFS, Sun's ZFS, etc.
Distributed File System (DFS)
- A DFS allows access to files from multiple hosts sharing them via a computer network.
- It must support concurrency: implementations make varying guarantees about locking, who "wins" with concurrent writes, etc., and must gracefully handle dropped connections.
- It may include facilities for transparent replication and fault tolerance.
- Different implementations sit in different places on the complexity/feature scale.
Design Considerations of a DFS
- Interface: file system, block I/O, or custom made.
- Security: various authentication/authorization schemes.
- Reliability (fault tolerance): continue to function when some hardware fails (disks, nodes, power, etc.).
- Namespace (virtualization): provide a logical namespace that can span physical boundaries.
- Consistency: all clients get the same data all the time; related to locking, caching, and synchronization.
- Parallelism: multiple clients can access multiple disks at the same time.
- Scope: local area network vs. wide area network.
HDFS Features
- Large data sets and files: supports petabyte-scale sizes.
- Heterogeneous: can be deployed on different hardware.
- Streaming data access: batch processing rather than interactive user access; provides high aggregate data bandwidth.
- Fault tolerance: failure is the norm rather than the exception; recover automatically or report the failure.
- Coherency model: write-once-read-many; this assumption simplifies coherency.
- Data locality: move compute to the data.