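Redis Trilogy, Part 2: Sentinel Mode
Image source: https://hackmd.io/@tienyulin/redis-master-slave-replication-sentinel-cluster
Overview
Redis Sentinel is the officially recommended high-availability solution for Redis. It can monitor multiple master-replica (master-slave) groups and automatically fails over to a replica when it detects that a master is down. This exercise builds on the master-replica setup from the previous post and adds monitoring for those nodes.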
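Monitoring
Purpose: Sentinel monitors the health of the master and replica nodes.
Implementation: Sentinel periodically sends PING commands to each Redis node and checks whether it replies in time.
Behavior: If a master or replica does not respond, the Sentinel marks it as subjectively down.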
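Subjective Down
Definition: A Redis node (master or replica) is subjectively down when a single Sentinel considers it unavailable.
Trigger: This is one Sentinel's own judgment, which may be caused by a network problem, a hardware failure, or a similar fault.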
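As a rough illustration of the subjective-down rule described above, the sketch below flags a node once no valid reply has been seen for longer than a threshold. This is a simplified model, not Sentinel's actual implementation: the function name and the timestamps are hypothetical, and the 5000 ms default mirrors the down-after-milliseconds value used later in this post.

```python
def is_subjectively_down(last_valid_reply_ms: int, now_ms: int,
                         down_after_ms: int = 5000) -> bool:
    """Return True when the node has been silent for longer than down_after_ms."""
    return (now_ms - last_valid_reply_ms) > down_after_ms


# Hypothetical timestamps, in milliseconds.
print(is_subjectively_down(last_valid_reply_ms=1_000, now_ms=4_000))  # silent for 3.0 s
print(is_subjectively_down(last_valid_reply_ms=1_000, now_ms=7_500))  # silent for 6.5 s
```

Each Sentinel makes this decision on its own, which is why a single subjective-down mark is not enough to start a failover.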
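Objective Down
Definition: A master is objectively down when enough Sentinels independently consider it unavailable. Objective down applies only to masters.
Trigger: Once at least quorum Sentinels have each marked the master subjectively down, it is marked objectively down.
quorum: The number of Sentinels that must agree before a master is marked objectively down. It is the value given as the last argument of sentinel monitor, not an automatically computed "half plus one". A separate majority of all Sentinels is still required to authorize the failover itself.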
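The relationship between subjective down, objective down, and quorum can be sketched as a simple count. This is only an illustrative model under the assumption that each Sentinel's opinion is available as a boolean; the function and the report dictionary are hypothetical, and the sentinel names are borrowed from the containers used later in this post.

```python
def is_objectively_down(sdown_reports: dict[str, bool], quorum: int) -> bool:
    """A master is objectively down once at least `quorum` Sentinels
    report it as subjectively down."""
    agreeing = sum(1 for down in sdown_reports.values() if down)
    return agreeing >= quorum


# Hypothetical snapshot: two of three Sentinels cannot reach the master.
reports = {
    "redis-sentinel0": False,
    "redis-sentinel1": True,
    "redis-sentinel2": True,
}
print(is_objectively_down(reports, quorum=2))
print(is_objectively_down(reports, quorum=3))
```

With quorum set to 2, two agreeing Sentinels are enough, which matches the "#quorum 2/2" entry that appears in the failover logs later in this post.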
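Failover
Purpose: When the master is unavailable, promote one of the replicas to be the new master so that the service stays available.
Trigger: Failover starts once the master has been marked objectively down, that is, once at least quorum Sentinels agree that it is subjectively down.
Leader election: A Sentinel that sees the master as objectively down starts an election (similar to Raft's leader election) and asks the other Sentinels to vote for it as leader. The leader Sentinel is responsible for carrying out the failover. Winning requires votes from more than half of the Sentinels, which is why an odd number of Sentinels is recommended.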
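The majority rule in the leader election can be sketched as follows. This is a simplified model and not Sentinel's real protocol (which also involves epochs and one vote per Sentinel per epoch); the function name and the vote dictionaries are hypothetical. It assumes a candidate must collect a majority of all Sentinels and at least quorum votes.

```python
from collections import Counter
from typing import Optional


def elect_leader(votes: dict[str, str], total_sentinels: int,
                 quorum: int) -> Optional[str]:
    """votes maps each voter to the candidate it voted for.
    Return the winner if it holds a majority and at least `quorum` votes."""
    if not votes:
        return None
    candidate, count = Counter(votes.values()).most_common(1)[0]
    majority = total_sentinels // 2 + 1
    if count >= majority and count >= quorum:
        return candidate
    return None


# Three Sentinels, all voting for redis-sentinel1: a clear majority.
unanimous = {
    "redis-sentinel0": "redis-sentinel1",
    "redis-sentinel1": "redis-sentinel1",
    "redis-sentinel2": "redis-sentinel1",
}
print(elect_leader(unanimous, total_sentinels=3, quorum=2))

# Four Sentinels with a 2-2 split: nobody reaches the majority of 3.
split = {"s0": "s0", "s1": "s0", "s2": "s2", "s3": "s2"}
print(elect_leader(split, total_sentinels=4, quorum=2))
```

The second case shows why an even number of Sentinels is less attractive: a tied vote elects no leader, and the election has to be retried.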
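Steps:
Select a replica to become the new master.
The leader Sentinel reconfigures the other replicas to replicate from the new master.
Clients are updated to connect to the new master.
The failover is complete, and the new master serves traffic.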
File structure
Implementation
Create the volume storage directories
mkdir -p redis-shard/redis-sentinel0 \
  redis-shard/redis-sentinel1 \
  redis-shard/redis-sentinel2
Create the Dockerfile and sentinel.conf
FROM redis:latest

# Build arguments supplied by docker-compose
ARG REDIS_PORT
ARG REDIS_MASTER_PORT
ARG REDIS_MASTER_HOST

COPY sentinel.conf /etc/redis/sentinel.conf

# Replace the placeholders in sentinel.conf with the build arguments
RUN sed -i "s/REDIS_PORT/${REDIS_PORT}/g" /etc/redis/sentinel.conf && \
    sed -i "s/REDIS_MASTER_HOST/${REDIS_MASTER_HOST}/g" /etc/redis/sentinel.conf && \
    sed -i "s/REDIS_MASTER_PORT/${REDIS_MASTER_PORT}/g" /etc/redis/sentinel.conf

ENTRYPOINT ["redis-sentinel", "/etc/redis/sentinel.conf"]

port REDIS_PORT

# Master to monitor; the trailing 2 is the number of Sentinels required to mark it objectively down
sentinel monitor mymaster REDIS_MASTER_HOST REDIS_MASTER_PORT 2

# Mark the master subjectively down if it fails to answer PING for this many milliseconds (default 30 seconds; lowered to 5 seconds for testing)
sentinel down-after-milliseconds mymaster 5000

# A failover that takes longer than this many milliseconds is considered failed (default 3 minutes)
sentinel failover-timeout mymaster 180000
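To make the placeholder substitution easier to follow, here is a small sketch that mimics what the Dockerfile's sed commands do to sentinel.conf and then reads back the sentinel monitor directive. The helper names render and parse_monitor are hypothetical, the template is a comment-free copy of the file above, and the values are the ones used for redis-sentinel0 in the compose file later in this post.

```python
TEMPLATE = """\
port REDIS_PORT
sentinel monitor mymaster REDIS_MASTER_HOST REDIS_MASTER_PORT 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 180000
"""


def render(template: str, values: dict[str, str]) -> str:
    """Replace each placeholder with its value, like the three sed calls."""
    for placeholder, value in values.items():
        template = template.replace(placeholder, value)
    return template


def parse_monitor(conf: str) -> dict:
    """Extract name, host, port and quorum from the 'sentinel monitor' line."""
    for line in conf.splitlines():
        parts = line.split()
        if parts[:2] == ["sentinel", "monitor"]:
            name, host, port, quorum = parts[2:6]
            return {"name": name, "host": host,
                    "port": int(port), "quorum": int(quorum)}
    raise ValueError("no 'sentinel monitor' directive found")


rendered = render(TEMPLATE, {
    "REDIS_PORT": "26379",
    "REDIS_MASTER_HOST": "192.168.112.2",
    "REDIS_MASTER_PORT": "6379",
})
print(rendered)
print(parse_monitor(rendered))
```

The substitution is safe in this order because the text REDIS_PORT does not occur inside REDIS_MASTER_PORT or REDIS_MASTER_HOST.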
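Container planning
# Look up the IP address of the master container
docker container inspect redis-master0 | grep IPAddress | awk '{print $2}' | tail -1
# List the IP address and compose service name of each container
docker container inspect redis-master0 redis-master1 redis-master2 | grep '"IPAddress"\|com.docker.compose.service'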
Create the Sentinel containers
Set REDIS_MASTER_HOST and REDIS_MASTER_PORT to the host and port of the master node that the Sentinels monitor.
version: '3.4'

services:
  redis-sentinel0:
    container_name: redis-sentinel0
    build:
      context: redis
      args:
        - REDIS_PORT=26379
        - REDIS_MASTER_HOST=192.168.112.2
        - REDIS_MASTER_PORT=6379
    volumes:
      - ./redis-shard/redis-sentinel0:/data
    ports:
      - 26379:26379
    networks:
      - redis-net

  redis-sentinel1:
    container_name: redis-sentinel1
    build:
      context: redis
      args:
        - REDIS_PORT=26380
        - REDIS_MASTER_HOST=192.168.112.2
        - REDIS_MASTER_PORT=6379
    volumes:
      - ./redis-shard/redis-sentinel1:/data
    ports:
      - 26380:26380
    networks:
      - redis-net

  redis-sentinel2:
    container_name: redis-sentinel2
    build:
      context: redis
      args:
        - REDIS_PORT=26381
        - REDIS_MASTER_HOST=192.168.112.2
        - REDIS_MASTER_PORT=6379
    volumes:
      - ./redis-shard/redis-sentinel2:/data
    ports:
      - 26381:26381
    networks:
      - redis-net

networks:
  redis-net:
    name: redis-scaling-network
$ docker exec -it redis-sentinel1 redis-cli -p 26380 info sentinel
# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_tilt_since_seconds:-1
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=192.168.112.2:6379,slaves=2,sentinels=3
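The master0 line in the info sentinel output above packs several fields into one string. A minimal sketch for turning it into a dictionary, for example in a health-check script, might look like this. The helper name parse_master_line is hypothetical, and the input is the line copied from the output above.

```python
def parse_master_line(line: str) -> dict[str, str]:
    """Parse 'master0:name=mymaster,status=ok,...' into a dict of fields."""
    # Split off the 'master0' prefix at the first colon only, because the
    # address field itself contains a colon.
    _, _, fields = line.partition(":")
    return dict(item.split("=", 1) for item in fields.split(","))


line = ("master0:name=mymaster,status=ok,address=192.168.112.2:6379,"
        "slaves=2,sentinels=3")
info = parse_master_line(line)
print(info)

# A simple sanity check for this three-Sentinel, two-replica setup.
healthy = (info["status"] == "ok"
           and int(info["slaves"]) == 2
           and int(info["sentinels"]) == 3)
print(healthy)
```

Seeing slaves=2 and sentinels=3 confirms that this Sentinel has discovered both replicas and the two other Sentinels.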
Startup walkthrough
Using redis-sentinel0 as the example:
Redis Sentinel starts in sentinel mode, listens on port 26379, saves its new configuration to disk, and generates a Sentinel ID.
redis-sentinel0 | 7:X 09 Jan 2024 05:49:17.877 * Running mode=sentinel, port=26379.
redis-sentinel0 | 7:X 09 Jan 2024 05:49:17.881 * Sentinel new configuration saved on disk
redis-sentinel0 | 7:X 09 Jan 2024 05:49:17.881 * Sentinel ID is 21722b10bad0c77589da54afe783fe1119db7af2
The Sentinel starts monitoring the Redis master named mymaster at address 172.19.0.2, port 6379, with a quorum of 2, and registers the replicas (slaves) attached to mymaster.
redis-sentinel0 | 7:X 09 Jan 2024 05:49:17.881 # +monitor master mymaster 172.19.0.2 6379 quorum 2
redis-sentinel0 | 7:X 09 Jan 2024 05:49:17.882 * +slave slave 172.19.0.3:9001 172.19.0.3 9001 @ mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 05:49:17.884 * Sentinel new configuration saved on disk
redis-sentinel0 | 7:X 09 Jan 2024 05:49:17.884 * +slave slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 05:49:17.887 * Sentinel new configuration saved on disk
The Sentinels discover one another and establish communication, then fix the replicas' configuration so that they are correctly attached to mymaster.
redis-sentinel0 | 7:X 09 Jan 2024 05:49:20.092 * +sentinel sentinel 464859cce3e0077c7e6cf41bb4bf3c5346fff9a7 172.19.0.6 26381 @ mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 05:49:20.098 * Sentinel new configuration saved on disk
redis-sentinel0 | 7:X 09 Jan 2024 05:49:20.208 * +sentinel sentinel dab60cdcfaa8d039f105a24c26725919399595f5 172.19.0.7 26380 @ mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 05:49:20.212 * Sentinel new configuration saved on disk
redis-sentinel0 | 7:X 09 Jan 2024 05:52:18.677 * +fix-slave-config slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 05:52:18.677 * +fix-slave-config slave 172.19.0.3:9001 172.19.0.3 9001 @ mymaster 172.19.0.2 6379
Failover in practice
Stop the redis-master0 container and observe how the failover proceeds.
docker container stop redis-master0
First, all three Sentinels mark the master as sdown (subjectively down). redis-sentinel1 then marks it as odown (objectively down) and starts a vote to become the leader Sentinel.
redis-sentinel2 | 6:X 09 Jan 2024 08:46:04.856 # +sdown master mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.875 # +sdown master mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 08:46:04.916 # +sdown master mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.938 # +odown master mymaster 172.19.0.2 6379 #quorum 2/2
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.938 # +new-epoch 1
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.938 # +try-failover master mymaster 172.19.0.2 6379
The three Sentinels vote, and redis-sentinel1 is elected leader.
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.938 # +try-failover master mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.943 * Sentinel new configuration saved on disk
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.943 # +vote-for-leader dab60cdcfaa8d039f105a24c26725919399595f5 1
redis-sentinel2 | 6:X 09 Jan 2024 08:46:04.949 * Sentinel new configuration saved on disk
redis-sentinel2 | 6:X 09 Jan 2024 08:46:04.949 # +new-epoch 1
redis-sentinel0 | 7:X 09 Jan 2024 08:46:04.950 * Sentinel new configuration saved on disk
redis-sentinel0 | 7:X 09 Jan 2024 08:46:04.950 # +new-epoch 1
redis-sentinel0 | 7:X 09 Jan 2024 08:46:04.953 * Sentinel new configuration saved on disk
redis-sentinel0 | 7:X 09 Jan 2024 08:46:04.954 # +vote-for-leader dab60cdcfaa8d039f105a24c26725919399595f5 1
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.954 * 21722b10bad0c77589da54afe783fe1119db7af2 voted for dab60cdcfaa8d039f105a24c26725919399595f5 1
redis-sentinel2 | 6:X 09 Jan 2024 08:46:04.956 * Sentinel new configuration saved on disk
redis-sentinel2 | 6:X 09 Jan 2024 08:46:04.956 # +vote-for-leader dab60cdcfaa8d039f105a24c26725919399595f5 1
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.957 * 464859cce3e0077c7e6cf41bb4bf3c5346fff9a7 voted for dab60cdcfaa8d039f105a24c26725919399595f5 1
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.996 # +elected-leader master mymaster 172.19.0.2 6379
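To check the vote count in the log above without reading it line by line, the +vote-for-leader events can be tallied with a short script. This is just a log-reading sketch: the regular expression and the function name tally_votes are my own, and the sample lines are copied from the output above.

```python
import re
from collections import Counter

# Matches e.g. "redis-sentinel0 | ... # +vote-for-leader <run id> <epoch>"
VOTE = re.compile(
    r"^(?P<voter>\S+)\s+\|.*\+vote-for-leader "
    r"(?P<candidate>[0-9a-f]+) (?P<epoch>\d+)$"
)

LOG = """\
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.943 # +vote-for-leader dab60cdcfaa8d039f105a24c26725919399595f5 1
redis-sentinel0 | 7:X 09 Jan 2024 08:46:04.954 # +vote-for-leader dab60cdcfaa8d039f105a24c26725919399595f5 1
redis-sentinel2 | 6:X 09 Jan 2024 08:46:04.956 # +vote-for-leader dab60cdcfaa8d039f105a24c26725919399595f5 1
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.996 # +elected-leader master mymaster 172.19.0.2 6379
"""


def tally_votes(log: str) -> Counter:
    """Count +vote-for-leader events per candidate run ID."""
    tally = Counter()
    for raw in log.splitlines():
        match = VOTE.match(raw.strip())
        if match:
            tally[match["candidate"]] += 1
    return tally


for run_id, count in tally_votes(LOG).items():
    print(run_id, count)
```

The run ID dab60cdc... belongs to the Sentinel at 172.19.0.7:26380 (redis-sentinel1, according to the +sentinel discovery lines earlier), and it receives all three votes.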
Next, redis-sentinel1 selects redis-slave0 (slave 172.19.0.3 9001) as the new master and sends it SLAVEOF NO ONE, which detaches it from its replica role and promotes it to a standalone master.
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.996 # +failover-state-select-slave master mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 08:46:05.007 # +odown master mymaster 172.19.0.2 6379 #quorum 3/2
redis-sentinel0 | 7:X 09 Jan 2024 08:46:05.007 * Next failover delay: I will not start a failover before Tue Jan 9 08:52:05 2024
redis-sentinel1 | 7:X 09 Jan 2024 08:46:05.072 # +selected-slave slave 172.19.0.3:9001 172.19.0.3 9001 @ mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:05.073 * +failover-state-send-slaveof-noone slave 172.19.0.3:9001 172.19.0.3 9001 @ mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:05.173 * +failover-state-wait-promotion slave 172.19.0.3:9001 172.19.0.3 9001 @ mymaster 172.19.0.2 6379
redis-sentinel2 | 6:X 09 Jan 2024 08:46:05.923 # +odown master mymaster 172.19.0.2 6379 #quorum 3/2
redis-sentinel2 | 6:X 09 Jan 2024 08:46:05.923 * Next failover delay: I will not start a failover before Tue Jan 9 08:52:05 2024
redis-sentinel1 | 7:X 09 Jan 2024 08:46:05.977 * Sentinel new configuration saved on disk
redis-sentinel1 | 7:X 09 Jan 2024 08:46:05.978 # +promoted-slave slave 172.19.0.3:9001 172.19.0.3 9001 @ mymaster 172.19.0.2 6379
With the new master in place, redis-sentinel1 moves on to reconfiguring the remaining replicas: 172.19.0.4:9002 (redis-slave1) is told to replicate from the new master, 172.19.0.3 9001 (redis-slave0).
redis-sentinel1 | 7:X 09 Jan 2024 08:46:05.978 # +failover-state-reconf-slaves master mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:06.041 * +slave-reconf-sent slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.2 6379
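The +failover-state-* events in the logs so far trace the stages the leader Sentinel goes through. A small sketch like the one below can pull them out in order, which is handy when the output of several Sentinels is interleaved. The regular expression and the function name failover_states are hypothetical, and the sample lines are copied from the logs above.

```python
import re

STATE = re.compile(r"\+failover-state-([a-z-]+)")

LOG = """\
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.996 # +failover-state-select-slave master mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 08:46:05.007 # +odown master mymaster 172.19.0.2 6379 #quorum 3/2
redis-sentinel1 | 7:X 09 Jan 2024 08:46:05.073 * +failover-state-send-slaveof-noone slave 172.19.0.3:9001 172.19.0.3 9001 @ mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:05.173 * +failover-state-wait-promotion slave 172.19.0.3:9001 172.19.0.3 9001 @ mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:05.978 # +failover-state-reconf-slaves master mymaster 172.19.0.2 6379
"""


def failover_states(log: str) -> list[str]:
    """Return the failover stages in the order they appear in the log."""
    return [match.group(1) for match in STATE.finditer(log)]


print(failover_states(LOG))
# ['select-slave', 'send-slaveof-noone', 'wait-promotion', 'reconf-slaves']
```

The order matches the failover steps listed earlier: pick a replica, promote it, wait for the promotion to take effect, then repoint the remaining replicas.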
redis-sentinel0 and redis-sentinel2 then pick up the new configuration from redis-sentinel1 and update their own. At this point the failover is complete.
redis-sentinel0 | 7:X 09 Jan 2024 08:46:06.043 # +config-update-from sentinel dab60cdcfaa8d039f105a24c26725919399595f5 172.19.0.7 26380 @ mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 08:46:06.043 # +switch-master mymaster 172.19.0.2 6379 172.19.0.3 9001
redis-sentinel2 | 6:X 09 Jan 2024 08:46:06.042 # +config-update-from sentinel dab60cdcfaa8d039f105a24c26725919399595f5 172.19.0.7 26380 @ mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:06.041 * +slave-reconf-sent slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 08:46:06.044 * +slave slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.3 9001
redis-sentinel2 | 6:X 09 Jan 2024 08:46:06.042 # +switch-master mymaster 172.19.0.2 6379 172.19.0.3 9001
redis-sentinel0 | 7:X 09 Jan 2024 08:46:06.044 * +slave slave 172.19.0.2:6379 172.19.0.2 6379 @ mymaster 172.19.0.3 9001
redis-sentinel0 | 7:X 09 Jan 2024 08:46:06.049 * Sentinel new configuration saved on disk
redis-sentinel2 | 6:X 09 Jan 2024 08:46:06.043 * +slave slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.3 9001
redis-sentinel2 | 6:X 09 Jan 2024 08:46:06.043 * +slave slave 172.19.0.2:6379 172.19.0.2 6379 @ mymaster 172.19.0.3 9001
redis-sentinel2 | 6:X 09 Jan 2024 08:46:06.048 * Sentinel new configuration saved on disk
redis-sentinel1 | 7:X 09 Jan 2024 08:46:06.977 * +slave-reconf-inprog slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:06.978 * +slave-reconf-done slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:07.060 # -odown master mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:07.060 # +failover-end master mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:07.060 # +switch-master mymaster 172.19.0.2 6379 172.19.0.3 9001
redis-sentinel1 | 7:X 09 Jan 2024 08:46:07.060 * +slave slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.3 9001
redis-sentinel1 | 7:X 09 Jan 2024 08:46:07.060 * +slave slave 172.19.0.2:6379 172.19.0.2 6379 @ mymaster 172.19.0.3 9001
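The +switch-master event is the one that announces the new master address. As a sketch of how it could be read, the function below extracts the old and new addresses from such a log line. The function name parse_switch_master is hypothetical, and the sample line is copied from the log above.

```python
from typing import Optional, Tuple

Address = Tuple[str, int]


def parse_switch_master(line: str) -> Optional[Tuple[str, Address, Address]]:
    """Return (master_name, old_address, new_address) from a
    +switch-master log line, or None if the line is a different event."""
    marker = "+switch-master "
    start = line.find(marker)
    if start == -1:
        return None
    name, old_ip, old_port, new_ip, new_port = line[start + len(marker):].split()
    return name, (old_ip, int(old_port)), (new_ip, int(new_port))


line = ("redis-sentinel0 | 7:X 09 Jan 2024 08:46:06.043 # +switch-master "
        "mymaster 172.19.0.2 6379 172.19.0.3 9001")
print(parse_switch_master(line))
# ('mymaster', ('172.19.0.2', 6379), ('172.19.0.3', 9001))
```

In a real client you would normally not parse logs. Sentinel also publishes this event on a Pub/Sub channel of the same name, and the current master can be queried with SENTINEL get-master-addr-by-name mymaster.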
Because redis-master0 is still stopped, all three Sentinels continue to mark it as subjectively down. Since it is now registered as a replica, however, no failover is triggered.
redis-sentinel1 | 7:X 09 Jan 2024 08:46:07.065 * Sentinel new configuration saved on disk
redis-sentinel2 | 6:X 09 Jan 2024 08:46:11.046 # +sdown slave 172.19.0.2:6379 172.19.0.2 6379 @ mymaster 172.19.0.3 9001
redis-sentinel0 | 7:X 09 Jan 2024 08:46:11.088 # +sdown slave 172.19.0.2:6379 172.19.0.2 6379 @ mymaster 172.19.0.3 9001
redis-sentinel1 | 7:X 09 Jan 2024 08:46:12.071 # +sdown slave 172.19.0.2:6379 172.19.0.2 6379 @ mymaster 172.19.0.3 9001
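Takeaways
When choosing a new master, Sentinel considers factors such as each replica's health, its synchronization state, and its replication offset, so that the promoted node is the most suitable candidate and as little data as possible is lost. Each replica also has a priority that can be set in its configuration file (replica-priority): Sentinel prefers the replica with the higher priority, meaning the lower replica-priority value, and a value of 0 means the replica is never promoted. This post walked through a working Sentinel example as one option for Redis high availability. Combined with master-replica replication, it provides automatic failover and suits scenarios where the dataset is relatively small and fits in a single machine's memory.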
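The selection criteria mentioned above can be sketched as a sort key. This is a simplified model of the ordering described in the Redis Sentinel documentation (priority first, then replication offset, then run ID as a tie-breaker); it leaves out the checks on how long a replica has been disconnected from the old master. The Replica class, its field names, and all sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Replica:
    name: str
    priority: int       # replica-priority: lower is preferred, 0 = never promote
    repl_offset: int    # how much of the master's stream was processed
    run_id: str
    reachable: bool = True


def select_replica(replicas: list[Replica]) -> Optional[Replica]:
    """Pick the promotion candidate: lowest non-zero priority, then the
    highest replication offset, then the lexicographically smallest run ID."""
    candidates = [r for r in replicas if r.reachable and r.priority != 0]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


# Hypothetical values for the two replicas used in this post.
replicas = [
    Replica("redis-slave0", priority=100, repl_offset=5000, run_id="aaa"),
    Replica("redis-slave1", priority=100, repl_offset=4200, run_id="bbb"),
]
chosen = select_replica(replicas)
print(chosen.name if chosen else None)
```

With equal priorities, the replica that has processed more of the replication stream wins, which keeps data loss after the failover as small as possible.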
References
Redis Notes - Sentinel (for High Availability)
redis sentinel基本命令与参数
High availability with Redis Sentinel | Redis
Redis (六) - 主從複製、哨兵與叢集模式