A. Usage
1. find out the main usage scenarios
2. find out the side usage scenarios and group them into 3 categories:
a) need to be considered now
b) will possibly happen in the future, but not now
c) will most likely never be needed
B. Scale
consider the different practical resources:
a. processing: which might include CPU and RAM
b. network: not just something like LAN bandwidth; think from the viewpoint of the request/response processing flow and its corresponding path. This also covers whether data passing between two system modules goes through RAM, an Ethernet cable, or across the Internet.
c. storage
from the 3 aspects above, we will ask the questions below:
1. how many requests per second/day/week/month/year, on average
2. how many requests in the peak situation
3. how long the current system is expected to remain workable
4. how much storage is expected to serve each request
5. derive the expected reads/writes per second
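The questions above reduce to simple arithmetic. Below is a minimal back-of-envelope sketch; all the numbers (daily traffic, peak multiplier, bytes per request, retention period) are hypothetical placeholders you would replace with your own estimates.

```python
# Hypothetical inputs -- replace with your own estimates.
AVG_REQUESTS_PER_DAY = 10_000_000      # assumed average daily traffic
PEAK_MULTIPLIER = 3                    # assumed peak = 3x average
STORAGE_PER_REQUEST_BYTES = 500        # assumed data stored per request
RETENTION_YEARS = 5                    # how long the system must last

SECONDS_PER_DAY = 24 * 60 * 60

# Q1/Q2: average and peak requests per second
avg_rps = AVG_REQUESTS_PER_DAY / SECONDS_PER_DAY
peak_rps = avg_rps * PEAK_MULTIPLIER

# Q3/Q4: total storage needed over the system's expected lifetime
total_storage_bytes = (AVG_REQUESTS_PER_DAY * 365 * RETENTION_YEARS
                       * STORAGE_PER_REQUEST_BYTES)

print(f"average RPS: {avg_rps:.0f}")                       # ~116
print(f"peak RPS:    {peak_rps:.0f}")                      # ~347
print(f"storage:     {total_storage_bytes / 1e12:.1f} TB") # ~9.1 TB
```

These numbers then feed directly into the "Bottlenecks" discussion below: ~350 peak RPS is trivial for most web tiers, but ~9 TB of data already rules out a single commodity disk.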
C. Abstract Design
work out the core components and their connections. They might look like:
1. web server: for a web-type service. Delivers the web page to the user, passes the user's request to the application server, and collects the response from the application server to construct the result page for the user.
2. application layer (to serve requests): the main computing resource, handling the more complicated work.
3. data storage layer (to support the application)
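The three layers above can be sketched as plain objects to make the request flow concrete. This is only an illustrative toy (all class names, the sample row, and the HTML wrapping are assumptions), not a real server:

```python
class DataStore:
    """Data storage layer: supports the application."""
    def __init__(self):
        self._rows = {"42": {"name": "sample product"}}  # toy data

    def get(self, key):
        return self._rows.get(key)


class AppServer:
    """Application layer: the main computing resource."""
    def __init__(self, store):
        self.store = store

    def handle(self, request):
        return self.store.get(request["id"]) or {"error": "not found"}


class WebServer:
    """Web tier: passes the request on, then builds the result page."""
    def __init__(self, app):
        self.app = app

    def serve(self, request):
        result = self.app.handle(request)
        return f"<html><body>{result}</body></html>"


web = WebServer(AppServer(DataStore()))
print(web.serve({"id": "42"}))
```

Each arrow in the abstract design corresponds to one call here: WebServer -> AppServer -> DataStore and back.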
D. Bottlenecks
given the numbers from the "Scale" part, find the possible bottleneck components in the abstract design: is it traffic, computing, storage, or searching the storage?
E. Scaling
Treatments:
- Vertical scaling
- Horizontal scaling
- Caching
- Load balancing
- Database replication
- Database partitioning
- you have already targeted the bottleneck tier; first, list the characteristics of the bottleneck.
- for example, if the bottleneck is that the amount of data is very large, so read/write/search performance is the concern, figure out the detailed characteristics first: how frequently is data expected to be read? How frequently written? How much data does each read/write carry?
Rules
- Public servers of a scalable web service are hidden behind a load balancer. Every server contains exactly the same codebase and does not store any user-related data, like sessions or profile pictures, on local disk or memory.
- Sessions need to be stored in a centralized data store which is accessible to all your application servers. It can be an external database or an external persistent cache, like Redis.
- How can you make sure that a code change is sent to all your servers without one server still serving old code?
=> a deployment tool such as Capistrano
DB Scaling
- switch to a NoSQL database that is easier to scale, like MongoDB or CouchDB.
- Joins will now need to be done in your application code.
- cache
- this always means in-memory caches like Memcached or Redis. Never do file-based caching; it makes cloning and auto-scaling of your servers a pain.
- Cached Database Queries:
- Whenever you do a query to your database, you store the result dataset in cache.
- A hashed version of your query is the cache key.
- The main issue is expiration. It is hard to delete a cached result when you cache a complex query. When one piece of data changes (for example a table cell), you need to delete all cached queries that may include that table cell.
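The pattern is small enough to sketch. Here a plain dict stands in for Memcached/Redis, and `fake_db` is a hypothetical database function that counts its calls so the cache hit is visible:

```python
import hashlib

cache = {}  # stand-in for Memcached/Redis

def cached_query(db, sql):
    # A hashed version of the query is the cache key.
    key = hashlib.sha256(sql.encode()).hexdigest()
    if key in cache:
        return cache[key]          # cache hit: no database round trip
    result = db(sql)               # cache miss: hit the database
    cache[key] = result            # store the result dataset in cache
    return result

# Hypothetical "database" that counts how often it is actually queried.
calls = {"n": 0}
def fake_db(sql):
    calls["n"] += 1
    return [("row1",), ("row2",)]

cached_query(fake_db, "SELECT * FROM products")
cached_query(fake_db, "SELECT * FROM products")
print(calls["n"])  # -> 1: the second call never reached the database
```

Note what the sketch does not solve: if a row in `products` changes, nothing here invalidates the stale entry, which is exactly the expiration problem described above.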
- Cached Objects
- strongly recommended; I always prefer this pattern.
- For example, a class called “Product” which has a property called “data”. It is an array containing prices, texts, pictures, and customer reviews of your product. The property “data” is filled by several methods in the class doing several database requests which are hard to cache, since many things relate to each other. Now, do the following: when your class has finished the “assembling” of the data array, directly store the data array, or better yet the complete instance of the class, in the cache!
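A minimal sketch of that "Product" example, assuming Python and a dict in place of Memcached/Redis (both of which store byte strings, hence the serialization step):

```python
import pickle

cache = {}  # stand-in for Memcached/Redis

class Product:
    """Illustrative class: 'data' is assembled from several DB requests."""
    def __init__(self, product_id):
        self.product_id = product_id
        self.data = None

    def assemble(self):
        # In reality: several database requests that are hard to cache
        # separately, since many things relate to each other.
        self.data = {"prices": [9.99], "texts": ["..."], "reviews": []}
        return self

def get_product(product_id):
    key = f"product:{product_id}"
    if key in cache:
        return pickle.loads(cache[key])     # reuse the assembled instance
    product = Product(product_id).assemble()
    cache[key] = pickle.dumps(product)      # cache the complete instance
    return product

p = get_product("p-7")
p2 = get_product("p-7")  # second call is served from the cache
```

The key difference from cached queries: invalidation is easy. When the product changes, delete the single `product:<id>` key and reassemble.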
- sharding
- data for User A is stored on one server and the data for User B is stored on another server
- It doesn't use replication. Replicating data from a master server to slave servers is a traditional approach to scaling. Data is written to a master server and then replicated to one or more slave servers. At that point read operations can be handled by the slaves, but all writes happen on the master. Obviously the master becomes the write bottleneck and a single point of failure.
- problems
- Rebalancing data. Let's say some user has a particularly large friends list that blows your storage capacity for the shard. You need to move the user to a different shard.
- Joining data from multiple shards. To create a complex friends page, or a user profile page, or a thread discussion page, you usually must pull together lots of different data from many different sources.
- How do you partition your data in shards?
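One common answer to that partitioning question is hash-based sharding. The sketch below uses in-process dicts as stand-in "shard servers"; the shard count and user IDs are illustrative:

```python
import hashlib

NUM_SHARDS = 4
shards = [dict() for _ in range(NUM_SHARDS)]  # each dict = one DB server

def shard_for(user_id):
    # Hashing the ID spreads users evenly; the same user always
    # maps to the same shard.
    digest = hashlib.md5(str(user_id).encode()).hexdigest()
    return int(digest, 16) % NUM_SHARDS

def put(user_id, profile):
    shards[shard_for(user_id)][user_id] = profile

def get(user_id):
    return shards[shard_for(user_id)].get(user_id)

put("userA", {"friends": 120})
put("userB", {"friends": 45})
print(get("userA"))  # served from userA's shard only
```

This also illustrates the rebalancing problem above: changing `NUM_SHARDS` to 5 remaps most keys under plain modulo, which is why real systems often use consistent hashing or a directory service instead.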
Computing/processing asynchronism
- doing the time-consuming work in advance and serving the finished work with a low request time
- Very often this paradigm is used to turn dynamic content into static content. Pages of a website, maybe built with a massive framework or CMS, are pre-rendered and locally stored as static HTML files on every change.
- RabbitMQ is one of many systems which help to implement async processing. You could also use ActiveMQ or a simple Redis list. The basic idea is to have a queue of tasks or jobs that a worker can process.
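The queue-plus-worker idea can be shown without any broker at all. Here a `deque` stands in for a Redis list (where the producer would `LPUSH` and the worker `BRPOP`); the job payloads are illustrative:

```python
from collections import deque

queue = deque()  # stand-in for a Redis list or RabbitMQ queue

def enqueue(job):
    # Producer side: the web request just queues the job and returns
    # immediately, keeping response time low.
    queue.appendleft(job)

def worker():
    # Consumer side: a background worker processes jobs oldest-first.
    while queue:
        job = queue.pop()
        print(f"pre-rendering page for {job}")

enqueue("/products/42")
enqueue("/products/43")
worker()  # time-consuming work happens here, outside any request
```

With a real Redis list the worker would block on an empty queue instead of exiting, and multiple workers could pop from the same list to scale out processing.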
Peak
- add a load balancer with a cluster of machines to support peak traffic and ensure availability. The cluster does not need to be on during normal traffic.
Examples
Reference
- https://www.hiredintech.com/classrooms/system-design/lesson/52
- http://www.lecloud.net/post/7295452622/scalability-for-dummies-part-1-clones
- http://highscalability.com/blog/2009/8/6/an-unorthodox-approach-to-database-design-the-coming-of-the.html