Tuesday, July 12, 2022

What does a system design interview look like? How should you prepare for it and run it?


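I never ran into system design when interviewing in Taiwan, and I rarely heard of companies there having a system design round.
In the UK, not every interview includes system design either; it usually shows up when the company wants to judge whether an engineer is senior. Beyond technical skill, a senior engineer is expected to have a certain level of architecture design and communication ability, and some degree of leadership as well.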

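The scope of the question can be as small as designing a rate limiter or as large as designing Netflix, and the depth and breadth you need to consider vary accordingly. The answer can also change with company requirements, software and hardware constraints, and so on, so there is no single best answer.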

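While reading articles and watching mock system design interview videos online, I noticed that for the same question, interviewers focus on different things and structure the session differently. So what exactly must be covered in a system design interview?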

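Here I use designing a URL shortener as the example, structure the process as much as possible, and list the points to consider during the design, in the hope of having an outline in mind during the interview that walks through every must-cover point. System design covers such a wide range that I may have missed some things; please bear with me.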


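I. Day-to-day practice

1. Get familiar with system design concepts

Unlike coding interviews, where LeetCode already gives you an organized list of what to prepare, system design has no such list. I personally read the book System Design Interview – An insider's guide and look up tutorial videos online. The concepts below are the ones that come up most often.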

 

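  1. Network protocols and proxies: among the most common building blocks in system design

  2. Databases: store, read, update, and delete data; both relational and non-relational databases

  3. Latency, throughput and availability: three metrics commonly used to measure system performance

  4. Load balancing: distributing load across servers to improve performance and reliability

  5. Caching: a storage layer that holds previously fetched or computed data so it can be reused efficiently

  6. Sharding: scaling horizontally by splitting a database into shards

  7. Polling, SSE and WebSockets: techniques for streaming data from the server to the client, or from the client to the server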
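
As a small illustration of the sharding idea, here is a minimal sketch that picks a shard for a key with a stable hash. The function name `shard_for`, the shard count, and the sample keys are assumptions made up for this example; they are not part of any particular database.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash.

    Python's built-in hash() is salted per process for strings,
    so a cryptographic digest is used to get a repeatable result.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Hypothetical setup: 4 database shards and a few short URL keys.
NUM_SHARDS = 4
for short_url in ["abc1234", "Zx9Qw2L", "0000001"]:
    print(short_url, "-> shard", shard_for(short_url, NUM_SHARDS))
```

Plain modulo sharding like this moves most keys when the number of shards changes, which is why consistent hashing is often discussed as an alternative in interviews.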

 

 


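2. Play both interviewer and candidate

In everyday practice you can play the interviewer and the candidate at the same time. Besides helping you practice communicating in English, it trains you to think about what you would ask as the interviewer, which widens the range of what you prepare.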





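3. Practice typing speed and get comfortable with whiteboard tools

Solving quickly matters, and so does typing quickly.


The interview is neither long nor short. It would be a pity to have the idea ready and get stuck on typing speed. Also, because of the pandemic, you will often draw on an online whiteboard; if you know in advance which tool will be used, practice with it so you are not scrambling on the day.


Whiteboard tools I have come across: excalidraw, miro, whimsical, jamboard, coderpad, google draw, and so on. So far I personally find excalidraw the easiest to use.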




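4. Practice many different kinds of questions

Almost all system design questions are open-ended. The answer may change with software and hardware constraints, company requirements, and so on, so there is no single best answer.


The scope can be as small as a rate limiter or as large as Netflix, and the depth and breadth to consider vary accordingly. Practicing different question types builds familiarity with different kinds of systems, for example: design Twitter, a file transfer system, YouTube, a notification system, and so on.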

 

 


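5. Record and sketch the design with diagrams

With spoken description alone, both you and the interviewer easily forget conditions mentioned earlier. So when you ask about the system's goals and requirements, write them down first; later, when you get to the table and system structure design, diagrams present the architecture much more clearly.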


 


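6. Find people for mock interviews

Practicing alone is different from the real thing, so the best approach is to find someone to practice interviewing with. There are many programming groups on Facebook where you can look for partners for mutual mock interviews. It is best if you can find an engineer from the company you are applying to. If possible, practice with several different people: besides training your ear for different accents, it helps you adapt quickly to different interviewers' ways of thinking.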

 

 

 

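II. Before the interview

Make sure all the equipment you need works and has a backup, the network is stable, and the environment is quiet.

For example, check that your wireless headset, mouse, and keyboard have enough battery, and that you have a fallback if one suddenly runs out.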





III. Interview flow

1. Clarify (0 ~ 8 mins)

1.1 Clarify the goal


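Take time to clarify the system's goals and requirements with the interviewer; do not start solving as soon as you hear or read the question. Communication and collaboration matter as much as technical skill, and if you start solving without discussing with the interviewer, you may head in the wrong direction or overcomplicate the problem.

The system you are asked to design may be one you have or have not used before.


  1. If you have used it, describe how you understand the system to work and what features it has, and confirm with the interviewer that you are discussing it at the same level.


  2. If you have not used it, ask the interviewer for a short introduction: "I know what it is but I am not familiar with xxx. Would you mind giving me a quick overview of what we're looking for?"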


Interviewer: The question that we’ll be doing is designing a URL shortener.

Candidate: To make sure I fully understand what you're asking: as I understand it, a user submits a long URL and the URL shortener returns a shorter one, and when someone clicks the short URL they are redirected to the original URL. Am I correct?

Interviewer: Yes, that’s correct.





1.2 Clarify the functional and non-functional requirements

 

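Next, clarify the functional and non-functional requirements.


Questions to ask about functional requirements include:

  1. How many users are there per day?

  2. How many users are there at peak?

  3. What is the traffic (per day / hour / minute / second)?

  4. Which devices need to be supported (desktop, mobile)?

  5. What types of data are there?

    1. Text (title / name / description)

    2. Images (thumbnails)

    3. User data (last online time / creation time / favorites)

    4. Activity logs (error messages)


  6. Are there features such as subscribe / recommend / search / create / upload / edit / delete / comment?

  7. What other features need to be supported?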



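Questions to ask about non-functional requirements include:

  1. What are the performance requirements, such as high availability or low latency?

  2. Will the system need to scale in the future?

  3. Are there security considerations?

  4. How long does the data need to be stored?

  5. What is the business goal?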

 

 

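There are more functional and non-functional requirements than you can ever finish discussing. To focus on what the interviewer cares about, you can ask something like "Should we focus on xxx? Or should I drill into a specific component of the overall system?"

Each interviewer may care about different things, so asking this keeps you from going deep into a part they do not actually care about.

As you ask, list the goals and requirements on the whiteboard. This gives you notes to refer to and keeps the interviewer from forgetting what they said.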


Candidate: The first thing I'd like to dive into is the functional requirements. What's the traffic volume per day?

Interviewer: Let's assume 100 million URLs are generated per day.


Candidate: What characters are allowed in the shortened URL?

Interviewer: A combination of digits (0-9) and letters (a-z, A-Z).


Candidate: How long are the URLs we need to shorten?

Interviewer: The average URL length is 100 bytes.


Candidate: Can the shortened URLs be deleted or updated?

Interviewer: To keep things simple, the shortened URLs can't be deleted or updated.


Candidate: Considering availability and scalability, how long will these URLs be stored?

Interviewer: Let’s say 10 years.





1.3 Specify your assumptions


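Once you know the system's goals, requirements, and focus, describe your reasoning and assumptions, and keep asking questions and requesting feedback.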


Candidate: So I'm assuming this system would be available for a large number of users, and the links would need to be available for an extended period of time.


Interviewer: Yes, exactly.





2. Design high level (8 ~ 20 mins)

2.1 Calculate metrics


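Work out rough numbers, including CPU, memory, IO transfer, and so on, and pay attention to units.


Items to calculate include:

  1. Writes per second

  2. Reads per second

  3. Size of each record

  4. How long the system must be maintained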



Candidate: Let me do some rough calculations. With 100 million URLs generated per day, 100 million / 24 / 60 / 60 ≈ 1,157, so roughly 1,160 URLs are generated per second.

Interviewer: Keep going.


Candidate: And I'm assuming that the ratio of read operations to write operations is 10:1, so there are roughly 10 times more reads per second than writes.

Interviewer: Reads are definitely higher than writes, but that sounds a bit low as the ratio.


Candidate: Let's increase it by an order of magnitude, say, 100 times?

Interviewer: Okay, sounds good.


Candidate: Then can I assume roughly 1,160 * 100 = 116,000 reads per second?

Interviewer: Okay.


Candidate: For the storage requirement, 100 million URLs are generated per day and the average URL length is 100 bytes, so storing them for 10 years takes 100 million * 100 bytes * 365 * 10 = 36.5 TB. That is 100 million * 365 * 10 = 365 billion records. The short URL uses 62 possible characters, and when n = 7, 62^n ≈ 3.5 trillion, which is greater than 365 billion, so a hashValue of length 7 is enough.


(The length of hashValue and the corresponding maximal number of URLs it can support)
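
The rough numbers in the dialogue above can be reproduced with a few lines of arithmetic. This is only a back-of-the-envelope sketch: the inputs are the values agreed in the dialogue, and the variable names are made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Inputs taken from the mock interview dialogue.
urls_per_day = 100_000_000
read_write_ratio = 100
avg_url_bytes = 100
retention_years = 10
alphabet_size = 62  # 0-9, a-z, A-Z

writes_per_sec = urls_per_day / SECONDS_PER_DAY        # about 1,157
reads_per_sec = writes_per_sec * read_write_ratio      # about 115,741
total_records = urls_per_day * 365 * retention_years   # 365 billion
storage_tb = total_records * avg_url_bytes / 1e12      # 36.5 TB

# Smallest hashValue length n such that 62^n covers every record.
n = 1
while alphabet_size ** n < total_records:
    n += 1

print(f"writes/s ~ {writes_per_sec:,.0f}, reads/s ~ {reads_per_sec:,.0f}")
print(f"records = {total_records:,}, storage = {storage_tb} TB, n = {n}")

# Length of hashValue and the maximum number of URLs it can support.
for length in range(1, 8):
    print(length, f"{alphabet_size ** length:,}")
```

The dialogue rounds the write rate up to 1,160 per second, which is why it quotes 116,000 reads per second rather than about 115,741.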



2.2 Identify high-level components or services


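Points you may need to consider:

  1. Which servers and services are needed, such as cloud services, DB, cache, CDN

  2. Whether an API needs to be designed

  3. Whether a third-party system is needed (and whether it has regional restrictions)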



Candidate: Considering the scale of the system and the need to avoid a single point of failure, a single server clearly will not meet the load or availability requirements. We need multiple servers, a load balancer to distribute network traffic, a cache to store frequently used URLs, and a DB.

Interviewer: Great.
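
To make the load balancer in that answer a little more concrete, here is a minimal sketch of round-robin distribution, one of the simplest balancing strategies. The class name `RoundRobinBalancer` and the server names are hypothetical; a real load balancer would also handle health checks and failures.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers in a fixed rotating order."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = cycle(servers)

    def next_server(self):
        # Each call returns the next server, wrapping around at the end.
        return next(self._servers)


balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
picked = [balancer.next_server() for _ in range(5)]
print(picked)  # ['web-1', 'web-2', 'web-3', 'web-1', 'web-2']
```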






2.3 Design the database


Design the tables and the data access patterns, explain them, and decide which kind of database to use:

  1. RDBMS (relational databases), such as PostgreSQL and MySQL

  2. NoSQL (non-relational databases)

    1. Key-value / wide-column databases, such as Cassandra

    2. In-memory databases, such as Redis

    3. Graph databases, such as Neo4j

    4. Document databases, such as MongoDB



Candidate: Okay, so I’m going to determine the data schema. 

Interviewer: Go ahead.


Candidate: We will need to store the original URL and the shortened URL, plus a primary key to identify each record.

Interviewer: Are these the only data to store?


Candidate: We also need to store a timestamp. We can save either the created time or the expiry time; I prefer to store the created time.

Interviewer: Is there a difference there?


Candidate: If we store the date created, we’ll be able to change expiry policy parameters more easily, and enable more flexibility there. If we store the expiry date though, it means we won’t need to calculate it from the created date.

Interviewer: Okay, sounds good.


Candidate: The table schema will look like this. Since we only have one table, I will choose a non-relational database.

Interviewer: That makes sense.
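
The schema discussed above can be sketched as a single record type. This is an illustration only: the field names and the `UrlRecord` class are assumptions, and the 10-year retention comes from the requirements in the dialogue. It shows why storing the created time keeps the expiry policy flexible: the expiry is derived, so the policy can change without rewriting stored rows.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Assumed default policy, based on the 10-year retention in the dialogue.
RETENTION = timedelta(days=365 * 10)


@dataclass(frozen=True)
class UrlRecord:
    id: int               # primary key
    short_url: str
    long_url: str
    created_at: datetime  # stored instead of an expiry time

    def expires_at(self, retention: timedelta = RETENTION) -> datetime:
        # Expiry is computed from the created time and the current policy.
        return self.created_at + retention

    def is_expired(self, now: datetime, retention: timedelta = RETENTION) -> bool:
        return now >= self.expires_at(retention)


record = UrlRecord(
    id=1,
    short_url="0000001",
    long_url="https://example.com/some/long/path",
    created_at=datetime(2022, 7, 12, tzinfo=timezone.utc),
)
now = datetime(2023, 1, 1, tzinfo=timezone.utc)
print(record.is_expired(now))                      # False under the 10-year policy
print(record.is_expired(now, timedelta(days=30)))  # True under a 30-day policy
```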





3. Drill down the design (20 ~ 45 mins)

3.1 Draw an architecture diagram


Here you can go into more detail and draw the main components and the architecture diagram.


Possible components include:

  1. Web server

  2. Cache

  3. CDN

  4. Load balancer

  5. Retry mechanism

  6. Security

  7. Template

  8. User setting

  9. Authentication

  10. Rate limiting

  11. Monitor

  12. Event tracking


Candidate: Okay, so this is the basic system architecture. I added an ID generator because we need multiple databases, and if each server creates its own IDs, we are bound to get collisions. We could add a central, shared ID generator to guarantee that every generated ID is unique.


Interviewer: That makes sense.






3.2 Identify bottlenecks


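Once you have a draft of the architecture diagram, the next thing to consider is the bottlenecks you will run into, for example in scalability, performance, and flexibility.

  • Point out the bottlenecks

  • Point out what can be improved

  • Explain how you would design for scaling up

  • Discuss error cases, such as system failures and dropped connections

  • Discuss operational issues, such as how to monitor metrics and error logs, and how to roll out the system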


Candidate: The problem is that the whole system relies on the central ID generator; if it fails, the system won't be able to generate IDs. Another problem is that we only have one ID generator, which limits throughput.

Interviewer: That's certainly a problem.


Candidate: We could add more ID generators, each serving IDs from its own assigned range of numbers. If one is down, another can still generate IDs.

Interviewer: Sounds good.
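
The range-based idea in that answer can be sketched as follows. This is a simplified, in-process illustration: `RangeAllocator` and `IdGenerator` are hypothetical names, and in a real deployment the allocator state would have to live in durable, replicated storage rather than in memory.

```python
import threading


class RangeAllocator:
    """Hypothetical coordinator that hands out non-overlapping ID ranges."""

    def __init__(self, range_size: int = 1000):
        self._range_size = range_size
        self._next_start = 0
        self._lock = threading.Lock()

    def allocate(self) -> range:
        # The lock guarantees that no two callers get the same range.
        with self._lock:
            start = self._next_start
            self._next_start += self._range_size
        return range(start, start + self._range_size)


class IdGenerator:
    """Serves IDs from its current range and fetches a new range when empty."""

    def __init__(self, allocator: RangeAllocator):
        self._allocator = allocator
        self._ids = iter(())  # empty until the first request

    def next_id(self) -> int:
        try:
            return next(self._ids)
        except StopIteration:
            self._ids = iter(self._allocator.allocate())
            return next(self._ids)


allocator = RangeAllocator(range_size=3)
gen_a, gen_b = IdGenerator(allocator), IdGenerator(allocator)
ids_a = [gen_a.next_id() for _ in range(4)]  # uses ranges 0-2 and 3-5
ids_b = [gen_b.next_id() for _ in range(2)]  # uses range 6-8
print(ids_a, ids_b)  # [0, 1, 2, 3] [6, 7]
```

Each generator contacts the allocator only once per range, so the shared component is no longer on the path of every request, and a failed generator only wastes the unused part of its range.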





3.3 Dive into a specific component 


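You can start with the component you know best. If you are not sure which component is best to discuss, you can ask "If you don't mind, which component do you think would be best to explore?"


  • For a chat system: how to reduce latency

  • For a URL shortener: how to use a hash function to shrink a long URL into a short one

  • ...


A single component's design can be discussed in great depth, but going too deep may not be the point of this interview, while staying too shallow risks looking like you have not thought it through. In that case you can ask, for example, "Would you like to dig into the finer details of this component, or shall we move on?"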


Candidate: To shorten a long URL, we could apply a hash function (MD5, SHA256, etc.) and keep the first 7 characters, but truncated hashes might collide. So instead I will take the unique ID from the ID generator and encode it in base 62.

Interviewer: Yes, that solves the collision problem.
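
One common way to get collision-free short codes is to base-62 encode a unique numeric ID, such as the one produced by the ID generator from 3.1. The sketch below assumes the digit order 0-9, a-z, A-Z; the function names are made up for this example.

```python
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
BASE = len(ALPHABET)  # 62


def encode_base62(num: int) -> str:
    """Convert a non-negative integer ID into a base-62 string."""
    if num < 0:
        raise ValueError("ID must be non-negative")
    if num == 0:
        return ALPHABET[0]
    digits = []
    while num:
        num, rem = divmod(num, BASE)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


def decode_base62(text: str) -> int:
    """Convert a base-62 string back into the integer ID."""
    num = 0
    for ch in text:
        num = num * BASE + ALPHABET.index(ch)
    return num


print(encode_base62(11157))             # 2TX
print(decode_base62("2TX"))             # 11157
print(len(encode_base62(62 ** 7 - 1)))  # 7
```

Because every ID maps to exactly one string and back, two different IDs can never produce the same short URL, and any ID below 62^7 fits in 7 characters.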





4. Bring it all together (45 ~ 60 mins)


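Revisit the goals and requirements of the system, explain which parts of your design meet them, and describe why you made your trade-offs and under what circumstances you would decide differently. If there is time, briefly restate how the whole system runs end to end, mentioning the components used earlier.


These steps mainly help the interviewer review the system you just designed. Misunderstandings can still arise in back-and-forth communication, even more so at work, and recapping this way demonstrates your communication skills.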



Candidate: A potential problem we could face is malicious users. I have two methods in mind: one is to add a rate limiter that filters out excessive requests, and the other is to store user information and limit users through API keys.

Interviewer: What are your considerations in choosing between these two methods?


Candidate: I think it depends on the business considerations. We know that most people don't like creating a new account, but we can use social login to solve that problem and also collect user data for analytics.

Interviewer: That makes sense.
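
For the rate limiter option, a minimal sketch using the token bucket algorithm (one of several common rate-limiting algorithms, chosen here only as an illustration) might look like this. The class name, parameters, and the injectable `clock` are assumptions made for this example; in practice there would typically be one bucket per client or API key.

```python
import time


class TokenBucket:
    """Allow a request only if a token is available; tokens refill over time."""

    def __init__(self, capacity: int, refill_per_sec: float, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Add the tokens earned since the last call, capped at capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_sec)
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


# A fake clock keeps the example deterministic.
fake_now = [0.0]
bucket = TokenBucket(capacity=2, refill_per_sec=1.0, clock=lambda: fake_now[0])
results = [bucket.allow(), bucket.allow(), bucket.allow()]  # third is rejected
fake_now[0] = 1.0  # one second later, one token has been refilled
results.append(bucket.allow())
print(results)  # [True, True, False, True]
```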


Candidate: I’m going to check if the design satisfies our initial objectives now.

Interviewer: Okay.


Candidate: First, a user clicks a short URL and the load balancer forwards the request to the web servers. If the shortURL is already in the cache, the server returns the longURL directly. If the shortURL is not in the cache, the server fetches the longURL from the database.

Interviewer: Good, you’ve done a great job.
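
The read path described in that last answer can be sketched in a few lines. Plain dictionaries stand in for the cache and the database here, and the sample data is made up. Writing the value back into the cache after a miss is an assumption of this sketch (the usual cache-aside pattern), not something stated in the dialogue.

```python
from typing import Optional

# Stand-ins for the real database and cache (illustration only).
database = {"2TX": "https://example.com/some/very/long/path"}
cache = {}


def resolve(short_url: str) -> Optional[str]:
    """Return the long URL for a short URL, checking the cache first."""
    long_url = cache.get(short_url)
    if long_url is not None:
        return long_url  # cache hit
    long_url = database.get(short_url)  # cache miss: go to the database
    if long_url is not None:
        cache[short_url] = long_url  # assumed: populate the cache for next time
    return long_url


first = resolve("2TX")   # served from the database, then cached
second = resolve("2TX")  # served from the cache
print(first == second, "2TX" in cache, resolve("missing"))
```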










