Redis Sentinel

Redis Trilogy, Part 2: Sentinel Mode

image

Image source: https://hackmd.io/@tienyulin/redis-master-slave-replication-sentinel-cluster

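Overview

Redis Sentinel is the officially recommended high-availability solution for Redis. It can monitor multiple master-replica groups, and when it detects that a master node is down it automatically performs a failover to one of the replicas. This exercise continues from the master-replica setup in the previous post and adds monitoring for those nodes.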

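Monitoring

  • Purpose: Sentinel monitors the health of the master and replica nodes.
  • Mechanism: Sentinel periodically sends PING commands to the Redis nodes and checks whether they reply in time.
  • Behavior: If a master or replica does not reply within the configured time window, the Sentinel marks it as subjectively down.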
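
As a rough illustration of the check described above, the Python sketch below models how a single Sentinel could decide that a node is subjectively down. The `NodeView` class, the function name, and the timestamps are assumptions made for this example and are not part of Redis; the 5000 ms default mirrors the `down-after-milliseconds` value used in the sentinel.conf of this post.

```python
from dataclasses import dataclass


@dataclass
class NodeView:
    """What one Sentinel remembers about one monitored node (hypothetical model)."""
    name: str
    last_valid_reply_ms: int  # time of the last acceptable PING reply


def is_subjectively_down(node: NodeView, now_ms: int, down_after_ms: int = 5000) -> bool:
    """Return True when no valid PING reply arrived within the allowed window."""
    return now_ms - node.last_valid_reply_ms > down_after_ms


master = NodeView("mymaster", last_valid_reply_ms=10_000)
print(is_subjectively_down(master, now_ms=12_000))  # replied 2 s ago -> False
print(is_subjectively_down(master, now_ms=16_000))  # silent for 6 s -> True
```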

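Subjective Down (SDOWN)

  • Definition: When a single Sentinel considers a Redis node (master or replica) unavailable, the node is said to be subjectively down.
  • Trigger: The judgment of one Sentinel on its own, which may be caused by a network problem, a hardware failure, or similar.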

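Objective Down (ODOWN)

  • Definition: When enough Sentinels independently consider the master unavailable, it is said to be objectively down. This state applies only to masters.
  • Trigger: Once at least quorum Sentinels have each flagged the master as subjectively down, it is marked objectively down.
  • quorum: The number of Sentinels that must agree, given as the last argument of `sentinel monitor` in the configuration. It is commonly set to a majority (half of the Sentinels plus one), but it is a configured value rather than something Sentinel derives on its own.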
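
The quorum rule can be sketched in a few lines of Python. This is a simplified model, not Sentinel's actual implementation: the function name and the vote dictionary are hypothetical, and the quorum of 2 matches the value configured in this post.

```python
from typing import Mapping


def is_objectively_down(sdown_flags: Mapping[str, bool], quorum: int) -> bool:
    """True when at least `quorum` Sentinels report the master as subjectively down."""
    return sum(1 for flagged in sdown_flags.values() if flagged) >= quorum


# Two of the three Sentinels already consider the master down.
flags = {"redis-sentinel0": False, "redis-sentinel1": True, "redis-sentinel2": True}
print(is_objectively_down(flags, quorum=2))  # True
```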

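Failover

  • Purpose: When the master is unavailable, promote one of the replicas to be the new master so that the service stays available.
  • Trigger: Failover starts once the master has been marked objectively down, that is, once at least quorum Sentinels consider it subjectively down.
  • Leader election: The Sentinel that first detects that the master is down starts an election (based on the Raft algorithm) and asks the other Sentinels to vote for it as leader. The leader is responsible for carrying out the failover. A candidate must win the votes of more than half of the Sentinels, which is why an odd number of Sentinels is recommended.
  • Steps:
    1. Select one replica to become the new master.
    2. The leader Sentinel reconfigures the other replicas to replicate from the new master.
    3. Clients are updated to connect to the new master.
    4. Failover is complete and the new master serves requests.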
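
A minimal Python sketch of the leader election step, under the assumption that each Sentinel casts one vote per epoch. The function and its arguments are illustrative names, not a Redis API. It models the rule that the winner needs votes from a majority of the Sentinels and from at least quorum of them.

```python
from collections import Counter
from typing import Mapping, Optional


def elect_leader(votes: Mapping[str, str], total_sentinels: int, quorum: int) -> Optional[str]:
    """`votes` maps each voter to the candidate it voted for in one epoch."""
    if not votes:
        return None
    candidate, count = Counter(votes.values()).most_common(1)[0]
    majority = total_sentinels // 2 + 1
    # The winner needs both a majority of all Sentinels and at least `quorum` votes.
    return candidate if count >= max(majority, quorum) else None


unanimous = {
    "redis-sentinel0": "redis-sentinel1",
    "redis-sentinel1": "redis-sentinel1",
    "redis-sentinel2": "redis-sentinel1",
}
split = {name: name for name in unanimous}  # every Sentinel votes for itself

print(elect_leader(unanimous, total_sentinels=3, quorum=2))  # redis-sentinel1
print(elect_leader(split, total_sentinels=3, quorum=2))      # None
```

With an even number of Sentinels a split vote is more likely to leave no candidate with a majority, which is one reason an odd count is recommended.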

File structure

image

Implementation

  1. Create the volume storage directories
mkdir -p redis-shard/redis-sentinel0 \
    redis-shard/redis-sentinel1 \
    redis-shard/redis-sentinel2
  2. Create the Dockerfile and sentinel.conf
  • Dockerfile
FROM redis:latest

# Build arguments supplied by docker-compose (build.args)
ARG REDIS_PORT
ARG REDIS_MASTER_PORT
ARG REDIS_MASTER_HOST

COPY sentinel.conf /etc/redis/sentinel.conf

# Replace the placeholders in sentinel.conf with the build arguments
RUN sed -i 's/REDIS_PORT/'$REDIS_PORT'/g' /etc/redis/sentinel.conf && \
    sed -i 's/REDIS_MASTER_HOST/'$REDIS_MASTER_HOST'/g' /etc/redis/sentinel.conf && \
    sed -i 's/REDIS_MASTER_PORT/'$REDIS_MASTER_PORT'/g' /etc/redis/sentinel.conf

ENTRYPOINT redis-sentinel /etc/redis/sentinel.conf

  • sentinel.conf
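port REDIS_PORT

# Master to monitor; the trailing 2 is the quorum, i.e. the number of Sentinels that must agree before the master is marked objectively down
sentinel monitor mymaster REDIS_MASTER_HOST REDIS_MASTER_PORT 2

# A Sentinel marks the master subjectively down if it cannot be reached for this many milliseconds (default 30 seconds; lowered to 5 seconds for testing)
sentinel down-after-milliseconds mymaster 5000

# A failover that takes longer than this many milliseconds is considered failed (default 3 minutes)
sentinel failover-timeout mymaster 180000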
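
To make the placeholder substitution concrete, the Python sketch below performs the same replacement as the `sed` commands in the Dockerfile, on a comment-free copy of the template. The `render` helper is a hypothetical name used only for illustration; the argument values are the ones passed to redis-sentinel0 in the compose file.

```python
from typing import Mapping

TEMPLATE = """\
port REDIS_PORT
sentinel monitor mymaster REDIS_MASTER_HOST REDIS_MASTER_PORT 2
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 180000
"""


def render(template: str, args: Mapping[str, str]) -> str:
    """Replace each placeholder with its build argument, in the same order as the sed calls."""
    for placeholder in ("REDIS_PORT", "REDIS_MASTER_HOST", "REDIS_MASTER_PORT"):
        template = template.replace(placeholder, args[placeholder])
    return template


rendered = render(TEMPLATE, {
    "REDIS_PORT": "26379",
    "REDIS_MASTER_HOST": "192.168.112.2",
    "REDIS_MASTER_PORT": "6379",
})
print(rendered)
```
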
  3. Plan the containers
  • Look up the IP address of the master node that Sentinel will monitor
docker container inspect redis-master0 | grep IPAddress | awk '{print $2}' | tail -1
docker container inspect redis-master0 redis-master1 redis-master2 | grep '"IPAddress"\|com.docker.compose.service'
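
If grep and awk feel fragile, the same lookup can be done by parsing the JSON that `docker container inspect` prints. The sketch below is one possible approach: `container_ips` is a hypothetical helper, and the sample document is a heavily trimmed stand-in for real inspect output that keeps only the `Name` and `NetworkSettings.Networks` fields it reads.

```python
import json
from typing import Dict


def container_ips(inspect_json: str) -> Dict[str, str]:
    """Map container name -> first non-empty IP address found in its networks."""
    ips: Dict[str, str] = {}
    for container in json.loads(inspect_json):
        name = container["Name"].lstrip("/")  # Docker prefixes names with "/"
        for network in container["NetworkSettings"]["Networks"].values():
            if network.get("IPAddress"):
                ips[name] = network["IPAddress"]
                break
    return ips


# Trimmed sample for illustration; real output contains many more fields.
SAMPLE = """
[
  {
    "Name": "/redis-master0",
    "NetworkSettings": {
      "Networks": {
        "redis-scaling-network": {"IPAddress": "192.168.112.2"}
      }
    }
  }
]
"""

print(container_ips(SAMPLE))  # {'redis-master0': '192.168.112.2'}
```
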
  • Create the Sentinel containers
    Set the host and port of the master node that Sentinel monitors (the lines marked #edit).
version: '3.4'
services:
  redis-sentinel0:
    container_name: redis-sentinel0
    build:
      context: redis
      args:
        - REDIS_PORT=26379
        - REDIS_MASTER_HOST=192.168.112.2 #edit redis master host
        - REDIS_MASTER_PORT=6379 #edit redis master port
    volumes:
      - ./redis-shard/redis-sentinel0:/data
    ports:
      - 26379:26379
    networks:
      - redis-net

  redis-sentinel1:
    container_name: redis-sentinel1
    build:
      context: redis
      args:
        - REDIS_PORT=26380
        - REDIS_MASTER_HOST=192.168.112.2 #edit redis master host
        - REDIS_MASTER_PORT=6379 #edit redis master port
    volumes:
      - ./redis-shard/redis-sentinel1:/data
    ports:
      - 26380:26380
    networks:
      - redis-net

  redis-sentinel2:
    container_name: redis-sentinel2
    build:
      context: redis
      args:
        - REDIS_PORT=26381
        - REDIS_MASTER_HOST=192.168.112.2 #edit redis master host
        - REDIS_MASTER_PORT=6379 #edit redis master port
    volumes:
      - ./redis-shard/redis-sentinel2:/data
    ports:
      - 26381:26381
    networks:
      - redis-net

networks:
  redis-net:
    name: redis-scaling-network

  • Check the Sentinel status
$ docker exec -it redis-sentinel1 redis-cli -p 26380 info sentinel

# Sentinel
sentinel_masters:1
sentinel_tilt:0
sentinel_tilt_since_seconds:-1
sentinel_running_scripts:0
sentinel_scripts_queue_length:0
sentinel_simulate_failure_flags:0
master0:name=mymaster,status=ok,address=192.168.112.2:6379,slaves=2,sentinels=3
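
When checking this status from a script, the `master0` line can be split into fields. The following Python sketch assumes the line format shown above; `parse_master_line` is an illustrative helper, not part of any Redis client library.

```python
from typing import Dict, Union


def parse_master_line(line: str) -> Dict[str, Union[str, int]]:
    """Split 'master0:name=...,status=...,...' into a dict of fields."""
    _, _, rest = line.strip().partition(":")  # drop the 'master0' key
    fields: Dict[str, Union[str, int]] = {}
    for item in rest.split(","):
        key, _, value = item.partition("=")
        # Replica and Sentinel counts are numeric; everything else stays text.
        fields[key] = int(value) if key in ("slaves", "sentinels") else value
    return fields


line = "master0:name=mymaster,status=ok,address=192.168.112.2:6379,slaves=2,sentinels=3"
info = parse_master_line(line)
print(info["status"], info["slaves"], info["sentinels"])  # ok 2 3
```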

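Startup walkthrough

Using redis-sentinel0 as an example:

  1. Redis Sentinel starts in sentinel mode listening on port 26379, saves its new configuration to disk, and generates its Sentinel ID.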
redis-sentinel0  | 7:X 09 Jan 2024 05:49:17.877 * Running mode=sentinel, port=26379.
redis-sentinel0 | 7:X 09 Jan 2024 05:49:17.881 * Sentinel new configuration saved on disk
redis-sentinel0 | 7:X 09 Jan 2024 05:49:17.881 * Sentinel ID is 21722b10bad0c77589da54afe783fe1119db7af2
  2. The Sentinel starts monitoring the Redis master named mymaster at 172.19.0.2, port 6379, with a quorum of 2, and registers the replicas attached to mymaster.
redis-sentinel0  | 7:X 09 Jan 2024 05:49:17.881 # +monitor master mymaster 172.19.0.2 6379 quorum 2
redis-sentinel0 | 7:X 09 Jan 2024 05:49:17.882 * +slave slave 172.19.0.3:9001 172.19.0.3 9001 @ mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 05:49:17.884 * Sentinel new configuration saved on disk
redis-sentinel0 | 7:X 09 Jan 2024 05:49:17.884 * +slave slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 05:49:17.887 * Sentinel new configuration saved on disk
  3. The Sentinels discover one another and establish communication, then fix the replicas' configuration so that they are correctly attached to mymaster.
redis-sentinel0  | 7:X 09 Jan 2024 05:49:20.092 * +sentinel sentinel 464859cce3e0077c7e6cf41bb4bf3c5346fff9a7 172.19.0.6 26381 @ mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 05:49:20.098 * Sentinel new configuration saved on disk
redis-sentinel0 | 7:X 09 Jan 2024 05:49:20.208 * +sentinel sentinel dab60cdcfaa8d039f105a24c26725919399595f5 172.19.0.7 26380 @ mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 05:49:20.212 * Sentinel new configuration saved on disk
redis-sentinel0 | 7:X 09 Jan 2024 05:52:18.677 * +fix-slave-config slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 05:52:18.677 * +fix-slave-config slave 172.19.0.3:9001 172.19.0.3 9001 @ mymaster 172.19.0.2 6379

Failover in practice

Stop the redis-master0 container and observe how the failover proceeds.

docker container stop redis-master0
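  1. First, all three Sentinels mark the master as sdown (subjectively down). redis-sentinel1 then marks it as odown (objectively down) and starts a failover attempt, asking for votes to become the leader Sentinel.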
redis-sentinel2  | 6:X 09 Jan 2024 08:46:04.856 # +sdown master mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.875 # +sdown master mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 08:46:04.916 # +sdown master mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.938 # +odown master mymaster 172.19.0.2 6379 #quorum 2/2
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.938 # +new-epoch 1
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.938 # +try-failover master mymaster 172.19.0.2 6379
  2. The three Sentinels vote, and redis-sentinel1 is elected leader.
redis-sentinel1  | 7:X 09 Jan 2024 08:46:04.938 # +try-failover master mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.943 * Sentinel new configuration saved on disk
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.943 # +vote-for-leader dab60cdcfaa8d039f105a24c26725919399595f5 1
redis-sentinel2 | 6:X 09 Jan 2024 08:46:04.949 * Sentinel new configuration saved on disk
redis-sentinel2 | 6:X 09 Jan 2024 08:46:04.949 # +new-epoch 1
redis-sentinel0 | 7:X 09 Jan 2024 08:46:04.950 * Sentinel new configuration saved on disk
redis-sentinel0 | 7:X 09 Jan 2024 08:46:04.950 # +new-epoch 1
redis-sentinel0 | 7:X 09 Jan 2024 08:46:04.953 * Sentinel new configuration saved on disk
redis-sentinel0 | 7:X 09 Jan 2024 08:46:04.954 # +vote-for-leader dab60cdcfaa8d039f105a24c26725919399595f5 1
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.954 * 21722b10bad0c77589da54afe783fe1119db7af2 voted for dab60cdcfaa8d039f105a24c26725919399595f5 1
redis-sentinel2 | 6:X 09 Jan 2024 08:46:04.956 * Sentinel new configuration saved on disk
redis-sentinel2 | 6:X 09 Jan 2024 08:46:04.956 # +vote-for-leader dab60cdcfaa8d039f105a24c26725919399595f5 1
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.957 * 464859cce3e0077c7e6cf41bb4bf3c5346fff9a7 voted for dab60cdcfaa8d039f105a24c26725919399595f5 1
redis-sentinel1 | 7:X 09 Jan 2024 08:46:04.996 # +elected-leader master mymaster 172.19.0.2 6379
  3. redis-sentinel1 then selects redis-slave0 (slave 172.19.0.3 9001) as the new master and sends it SLAVEOF NO ONE so that it stops replicating and becomes a standalone master. redis-slave0 is then promoted to master.
redis-sentinel1  | 7:X 09 Jan 2024 08:46:04.996 # +failover-state-select-slave master mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 08:46:05.007 # +odown master mymaster 172.19.0.2 6379 #quorum 3/2
redis-sentinel0 | 7:X 09 Jan 2024 08:46:05.007 * Next failover delay: I will not start a failover before Tue Jan 9 08:52:05 2024
redis-sentinel1 | 7:X 09 Jan 2024 08:46:05.072 # +selected-slave slave 172.19.0.3:9001 172.19.0.3 9001 @ mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:05.073 * +failover-state-send-slaveof-noone slave 172.19.0.3:9001 172.19.0.3 9001 @ mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:05.173 * +failover-state-wait-promotion slave 172.19.0.3:9001 172.19.0.3 9001 @ mymaster 172.19.0.2 6379
redis-sentinel2 | 6:X 09 Jan 2024 08:46:05.923 # +odown master mymaster 172.19.0.2 6379 #quorum 3/2
redis-sentinel2 | 6:X 09 Jan 2024 08:46:05.923 * Next failover delay: I will not start a failover before Tue Jan 9 08:52:05 2024
redis-sentinel1 | 7:X 09 Jan 2024 08:46:05.977 * Sentinel new configuration saved on disk
redis-sentinel1 | 7:X 09 Jan 2024 08:46:05.978 # +promoted-slave slave 172.19.0.3:9001 172.19.0.3 9001 @ mymaster 172.19.0.2 6379
  4. With the new master in place, redis-sentinel1 reconfigures the remaining replicas: the replica 172.19.0.4:9002 (redis-slave1) is told to replicate from the new master 172.19.0.3 9001 (redis-slave0).
redis-sentinel1  | 7:X 09 Jan 2024 08:46:05.978 # +failover-state-reconf-slaves master mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:06.041 * +slave-reconf-sent slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.2 6379
  5. redis-sentinel0 and redis-sentinel2 pick up the new configuration from redis-sentinel1 and update their own. At this point the failover is complete.
redis-sentinel0  | 7:X 09 Jan 2024 08:46:06.043 # +config-update-from sentinel dab60cdcfaa8d039f105a24c26725919399595f5 172.19.0.7 26380 @ mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 08:46:06.043 # +switch-master mymaster 172.19.0.2 6379 172.19.0.3 9001
redis-sentinel2 | 6:X 09 Jan 2024 08:46:06.042 # +config-update-from sentinel dab60cdcfaa8d039f105a24c26725919399595f5 172.19.0.7 26380 @ mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:06.041 * +slave-reconf-sent slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.2 6379
redis-sentinel0 | 7:X 09 Jan 2024 08:46:06.044 * +slave slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.3 9001
redis-sentinel2 | 6:X 09 Jan 2024 08:46:06.042 # +switch-master mymaster 172.19.0.2 6379 172.19.0.3 9001
redis-sentinel0 | 7:X 09 Jan 2024 08:46:06.044 * +slave slave 172.19.0.2:6379 172.19.0.2 6379 @ mymaster 172.19.0.3 9001
redis-sentinel0 | 7:X 09 Jan 2024 08:46:06.049 * Sentinel new configuration saved on disk
redis-sentinel2 | 6:X 09 Jan 2024 08:46:06.043 * +slave slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.3 9001
redis-sentinel2 | 6:X 09 Jan 2024 08:46:06.043 * +slave slave 172.19.0.2:6379 172.19.0.2 6379 @ mymaster 172.19.0.3 9001
redis-sentinel2 | 6:X 09 Jan 2024 08:46:06.048 * Sentinel new configuration saved on disk
redis-sentinel1 | 7:X 09 Jan 2024 08:46:06.977 * +slave-reconf-inprog slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:06.978 * +slave-reconf-done slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:07.060 # -odown master mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:07.060 # +failover-end master mymaster 172.19.0.2 6379
redis-sentinel1 | 7:X 09 Jan 2024 08:46:07.060 # +switch-master mymaster 172.19.0.2 6379 172.19.0.3 9001
redis-sentinel1 | 7:X 09 Jan 2024 08:46:07.060 * +slave slave 172.19.0.4:9002 172.19.0.4 9002 @ mymaster 172.19.0.3 9001
redis-sentinel1 | 7:X 09 Jan 2024 08:46:07.060 * +slave slave 172.19.0.2:6379 172.19.0.2 6379 @ mymaster 172.19.0.3 9001
  6. Because redis-master0 is still stopped, all three Sentinels continue to mark it as subjectively down, but since it is now registered as a replica, no further failover is triggered.
redis-sentinel1  | 7:X 09 Jan 2024 08:46:07.065 * Sentinel new configuration saved on disk
redis-sentinel2 | 6:X 09 Jan 2024 08:46:11.046 # +sdown slave 172.19.0.2:6379 172.19.0.2 6379 @ mymaster 172.19.0.3 9001
redis-sentinel0 | 7:X 09 Jan 2024 08:46:11.088 # +sdown slave 172.19.0.2:6379 172.19.0.2 6379 @ mymaster 172.19.0.3 9001
redis-sentinel1 | 7:X 09 Jan 2024 08:46:12.071 # +sdown slave 172.19.0.2:6379 172.19.0.2 6379 @ mymaster 172.19.0.3 9001
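
For monitoring or testing, it can be handy to pull the `+switch-master` events out of logs like the ones above. The Python sketch below assumes the event format seen in these logs (master name, old address, new address); `parse_switch` and `Switch` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

# +switch-master <master name> <old ip> <old port> <new ip> <new port>
SWITCH_RE = re.compile(r"\+switch-master (\S+) (\S+) (\d+) (\S+) (\d+)")


class Switch(NamedTuple):
    master_name: str
    old_addr: str
    new_addr: str


def parse_switch(line: str) -> Optional[Switch]:
    """Return the master switch described by a log line, or None for other events."""
    match = SWITCH_RE.search(line)
    if match is None:
        return None
    name, old_ip, old_port, new_ip, new_port = match.groups()
    return Switch(name, f"{old_ip}:{old_port}", f"{new_ip}:{new_port}")


log = [
    "redis-sentinel1 | 7:X 09 Jan 2024 08:46:07.060 # +failover-end master mymaster 172.19.0.2 6379",
    "redis-sentinel1 | 7:X 09 Jan 2024 08:46:07.060 # +switch-master mymaster 172.19.0.2 6379 172.19.0.3 9001",
]
events = [event for event in map(parse_switch, log) if event is not None]
print(events[0].old_addr, "->", events[0].new_addr)  # 172.19.0.2:6379 -> 172.19.0.3:9001
```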

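Takeaways

When choosing a new master, Sentinel takes into account each replica's health, how long it has been disconnected from the master, and its replication offset, so that a healthy replica with the most up-to-date data is promoted and data loss is kept to a minimum. Each replica also has a priority, adjustable through replica-priority in its configuration file: Sentinel prefers the replica with the lowest non-zero value, and a value of 0 means the replica is never promoted. This post shared a working example of Sentinel mode as one option for Redis high availability. Combined with master-replica replication, it provides failover and high availability, and it suits scenarios where the dataset is relatively small and fits in a single machine's memory.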
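
The selection order can be sketched in Python as follows. This is a simplified model under stated assumptions: the `Replica` class and its fields are hypothetical, the offsets and run IDs are made-up sample values, and real Sentinel applies additional health checks before ranking candidates.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    name: str
    priority: int      # replica-priority; 0 means "never promote"
    repl_offset: int   # how much of the master's data stream was processed
    run_id: str
    healthy: bool = True


def select_replica(replicas: List[Replica]) -> Optional[Replica]:
    """Pick by lowest priority value, then highest offset, then smallest run ID."""
    candidates = [r for r in replicas if r.healthy and r.priority != 0]
    if not candidates:
        return None
    return min(candidates, key=lambda r: (r.priority, -r.repl_offset, r.run_id))


replicas = [
    Replica("redis-slave0", priority=100, repl_offset=5000, run_id="aaa"),
    Replica("redis-slave1", priority=100, repl_offset=4200, run_id="bbb"),
]
print(select_replica(replicas).name)  # redis-slave0
```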

References

Redis Notes - Sentinel (for High Availability)
redis sentinel基本命令与参数
High availability with Redis Sentinel | Redis
Redis (六) - 主從複製、哨兵與叢集模式

  • Copyright notice: Unless otherwise stated, all posts on this blog are licensed under BY-NC-SA. Please credit the source when republishing.