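This post verifies failover behavior in a replica set built from four mongod instances: what happens when each instance fails. I change the votes and priority values to try to improve availability, that is, whether a Primary instance can still be elected.

OS: amazon-linux-ami/2014.03 (64-bit)
MongoDB: 2.6.4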
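■System configuration overview diagram
■Policy
mongod0 and mongod1 are the instances that face the application.
mongod2 is a backup instance. Its priority is set to 0 so that it is never promoted to Primary. It is expected to be stopped on a planned basis during backups.
mongod3 is a dedicated Arbiter instance.
The goal is to keep providing the DB service to the application even if one of the three servers (mongo-a, mongo-b, mongo-c) fails.
A further goal is that a Primary can still be elected and the service continues even if another instance fails while mongod2 is in a planned stop.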
■Preparing the startup configuration files
## mongo-a server
[ec2-user@mongo-a ~]$ mkdir /data/mongo_rs/mongod0
[ec2-user@mongo-a ~]$ cat /data/mongo_rs/conf/mongod0.conf
bind_ip=mongo-a
port=30000
dbpath=/data/mongo_rs/mongod0
pidfilepath=/var/run/mongodb/mongod0.pid
logpath=/var/log/mongodb/mongod0.log
logappend=true
fork=true
replSet=rep1
httpinterface=true
[ec2-user@mongo-a ~]$
## mongo-b server
[ec2-user@mongo-b ~]$ mkdir /data/mongo_rs/mongod1
[ec2-user@mongo-b ~]$ cat /data/mongo_rs/conf/mongod1.conf
bind_ip=mongo-b
port=30001
dbpath=/data/mongo_rs/mongod1
pidfilepath=/var/run/mongodb/mongod1.pid
logpath=/var/log/mongodb/mongod1.log
logappend=true
fork=true
replSet=rep1
[ec2-user@mongo-b ~]$

Instances on different servers do not need different port numbers, but I matched each port to the instance number to make them easier to manage.
## mongo-c server
[ec2-user@mongo-c ~]$ mkdir /data/mongo_rs/mongod2
[ec2-user@mongo-c ~]$ mkdir /data/mongo_rs/mongod3
[ec2-user@mongo-c ~]$ cat /data/mongo_rs/conf/mongod2.conf
bind_ip=mongo-c
port=30002
dbpath=/data/mongo_rs/mongod2
pidfilepath=/var/run/mongodb/mongod2.pid
logpath=/var/log/mongodb/mongod2.log
logappend=true
fork=true
replSet=rep1
[ec2-user@mongo-c ~]$ cat /data/mongo_rs/conf/mongod3.conf
bind_ip=mongo-c
port=30003
dbpath=/data/mongo_rs/mongod3
pidfilepath=/var/run/mongodb/mongod3.pid
logpath=/var/log/mongodb/mongod3.log
logappend=true
fork=true
replSet=rep1
[ec2-user@mongo-c ~]$
■Preparing the start/stop/status scripts
## Start
[ec2-user@mongo-a ~]$ vi start-mongod0.sh
[ec2-user@mongo-a ~]$ cat start-mongod0.sh
sudo mongod --config /data/mongo_rs/conf/mongod0.conf
[ec2-user@mongo-a ~]$ chmod 755 start-mongod0.sh
[ec2-user@mongo-a ~]$

Note: the official site appears to recommend running "ulimit -n 64000" before startup, but I skipped it this time because the EC2 instance is low-spec.
## Stop
[ec2-user@mongo-a ~]$ vi stop-mongod0.sh
[ec2-user@mongo-a ~]$ cat stop-mongod0.sh
sudo kill -9 `ps -ef | grep mongod0.conf | grep -v grep | awk '{print $2}'`
[ec2-user@mongo-a ~]$ chmod 755 stop-mongod0.sh
[ec2-user@mongo-a ~]$
## Status check
[ec2-user@mongo-a ~]$ vi mongod-status.sh
[ec2-user@mongo-a ~]$ cat mongod-status.sh
mongo mongo-a:30000 --eval "printjson(rs.status())" | egrep "name|stateStr"
echo ""
mongo mongo-a:30000 --eval "printjson(rs.conf())" | egrep "host|votes|priority|arbiter"
[ec2-user@mongo-a ~]$ chmod 755 mongod-status.sh
[ec2-user@mongo-a ~]$

Prepare similar scripts on each server.
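As an alternative to filtering the printed JSON with egrep, the same summary can be built from the status document itself. The following is only a minimal sketch in plain JavaScript: `summarizeStatus` and `sampleStatus` are hypothetical names introduced here, and the sample data is a trimmed copy of the rs.status() output shown in this post. It runs under Node.js as written.

```javascript
// Hypothetical helper: reduce an rs.status()-shaped document to "name  stateStr" lines.
function summarizeStatus(status) {
  return status.members.map(function (m) {
    return m.name + "  " + m.stateStr;
  });
}

// Sample input: only the fields the helper reads, copied from this post's rs.status() output.
var sampleStatus = {
  set: "rep1",
  members: [
    { _id: 0, name: "mongo-a:30000", stateStr: "PRIMARY" },
    { _id: 1, name: "mongo-b:30001", stateStr: "SECONDARY" },
    { _id: 2, name: "mongo-c:30002", stateStr: "SECONDARY" },
    { _id: 3, name: "mongo-c:30003", stateStr: "ARBITER" }
  ]
};

summarizeStatus(sampleStatus).forEach(function (line) {
  console.log(line);
});
```

Inside the mongo shell, one would presumably pass `rs.status()` in place of `sampleStatus` and use `print()` in place of `console.log()`.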
■Starting mongod
[ec2-user@mongo-a ~]$ ./start-mongod0.sh
about to fork child process, waiting until server is ready for connections.
forked process: 1871
child process started successfully, parent exiting
[ec2-user@mongo-a ~]$

Start mongod on the other servers in the same way.
■Replication settings
## Initializing the ReplicaSet
[ec2-user@mongo-a ~]$ mongo mongo-a:30000
MongoDB shell version: 2.6.4
connecting to: mongo-a:30000/test
> rs.initiate();
{
    "info2" : "no configuration explicitly specified -- making one",
    "me" : "mongo-a:30000",
    "info" : "Config now saved locally. Should come online in about a minute.",
    "ok" : 1
}
> rs.conf();
{
    "_id" : "rep1",
    "version" : 1,
    "members" : [
        {
            "_id" : 0,
            "host" : "mongo-a:30000"
        }
    ]
}
rep1:PRIMARY>
## Adding the Secondary and the Arbiter
rep1:PRIMARY> rs.add( { _id: 1, host: "mongo-b:30001" } )
{ "ok" : 1 }
rep1:PRIMARY> rs.add( { _id: 2, host: "mongo-c:30002", priority: 0, votes: 0 } )
{ "ok" : 1 }
rep1:PRIMARY> rs.add( { _id: 3, host: "mongo-c:30003", arbiterOnly: true } )
{ "ok" : 1 }
rep1:PRIMARY>
## Checking the settings
rep1:PRIMARY> rs.conf();
{
    "_id" : "rep1",
    "version" : 4,
    "members" : [
        {
            "_id" : 0,
            "host" : "mongo-a:30000"
        },
        {
            "_id" : 1,
            "host" : "mongo-b:30001"
        },
        {
            "_id" : 2,
            "host" : "mongo-c:30002",
            "votes" : 0,
            "priority" : 0
        },
        {
            "_id" : 3,
            "host" : "mongo-c:30003",
            "arbiterOnly" : true
        }
    ]
}
rep1:PRIMARY> rs.status();
{
    "set" : "rep1",
    "date" : ISODate("2014-09-17T07:13:20Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "mongo-a:30000",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 592,
            "optime" : Timestamp(1410937918, 1),
            "optimeDate" : ISODate("2014-09-17T07:11:58Z"),
            "electionTime" : Timestamp(1410937614, 1),
            "electionDate" : ISODate("2014-09-17T07:06:54Z"),
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "mongo-b:30001",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 103,
            "optime" : Timestamp(1410937918, 1),
            "optimeDate" : ISODate("2014-09-17T07:11:58Z"),
            "lastHeartbeat" : ISODate("2014-09-17T07:13:19Z"),
            "lastHeartbeatRecv" : ISODate("2014-09-17T07:13:20Z"),
            "pingMs" : 0,
            "syncingTo" : "mongo-a:30000"
        },
        {
            "_id" : 2,
            "name" : "mongo-c:30002",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 91,
            "optime" : Timestamp(1410937918, 1),
            "optimeDate" : ISODate("2014-09-17T07:11:58Z"),
            "lastHeartbeat" : ISODate("2014-09-17T07:13:19Z"),
            "lastHeartbeatRecv" : ISODate("2014-09-17T07:13:19Z"),
            "pingMs" : 2,
            "syncingTo" : "mongo-a:30000"
        },
        {
            "_id" : 3,
            "name" : "mongo-c:30003",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 82,
            "lastHeartbeat" : ISODate("2014-09-17T07:13:18Z"),
            "lastHeartbeatRecv" : ISODate("2014-09-17T07:13:18Z"),
            "pingMs" : 2
        }
    ],
    "ok" : 1
}
rep1:PRIMARY>

The default value of both votes and priority is 1.
When a value is at its default, it does not appear in the rs.conf() output.
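Because rs.conf() omits values that are at their default, it can help to spell out the effective settings when reasoning about elections. The sketch below is an illustration only, assuming the default of 1 for votes and priority described above; `effectiveSettings` is a hypothetical helper, and `conf` is a hand-copied version of the rs.conf() output from this post. It runs under Node.js.

```javascript
// Hypothetical helper: fill in the defaults (votes = 1, priority = 1) that rs.conf() leaves out.
function effectiveSettings(member) {
  var result = {
    host: member.host,
    votes: member.votes === undefined ? 1 : member.votes
  };
  if (member.arbiterOnly) {
    // An arbiter only votes; it is never a candidate for Primary, so priority is left out.
    result.arbiterOnly = true;
  } else {
    result.priority = member.priority === undefined ? 1 : member.priority;
  }
  return result;
}

// Hand-copied from the rs.conf() output shown in this post.
var conf = {
  _id: "rep1",
  version: 4,
  members: [
    { _id: 0, host: "mongo-a:30000" },
    { _id: 1, host: "mongo-b:30001" },
    { _id: 2, host: "mongo-c:30002", votes: 0, priority: 0 },
    { _id: 3, host: "mongo-c:30003", arbiterOnly: true }
  ]
};

conf.members.map(effectiveSettings).forEach(function (m) {
  console.log(JSON.stringify(m));
});
```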
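## Intent of the settings above:
Setting votes to 0 for mongod2 makes the initial total of votes 3.
Even if two instances including mongod2 stop, the remaining instances hold 2 votes in total, which is a majority (2 of 3), so a Primary is elected.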
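The arithmetic above can be checked with a few lines of code. This is a simplified sketch of the majority rule only, not of MongoDB's full election logic; `voteTally` and the member list are hypothetical names, with the votes taken from the configuration in this post. It runs under Node.js.

```javascript
// Votes per instance as configured in this post (mongod2 has votes: 0).
var members = [
  { name: "mongod0", votes: 1 },
  { name: "mongod1", votes: 1 },
  { name: "mongod2", votes: 0 },
  { name: "mongod3", votes: 1 } // Arbiter
];

// Hypothetical helper: count total votes and the votes held by instances that are still up.
function voteTally(members, stopped) {
  var total = 0;
  var live = 0;
  members.forEach(function (m) {
    total += m.votes;
    if (stopped.indexOf(m.name) === -1) {
      live += m.votes;
    }
  });
  // A majority means strictly more than half of all configured votes.
  return { total: total, live: live, hasMajority: live * 2 > total };
}

// mongod2 in a planned stop, then mongod0 fails as well.
console.log(JSON.stringify(voteTally(members, ["mongod2", "mongod0"])));
```

With mongod2 and mongod0 stopped, mongod1 and the Arbiter still hold 2 of the 3 votes, which matches the reasoning above.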
◆Reference: changing the votes value
rep1:PRIMARY> cfg = rs.conf();
{
    "_id" : "rep1",
    "version" : 4,
    "members" : [
        {
            "_id" : 0,
            "host" : "mongo-a:30000"
        },
        {
            "_id" : 1,
            "host" : "mongo-b:30001"
        },
(rest omitted)
rep1:PRIMARY> cfg.members[1].votes = 0;
0
rep1:PRIMARY>
rep1:PRIMARY> rs.reconfig(cfg);
2014-09-17T16:23:13.386+0900 DBClientCursor::init call() failed
2014-09-17T16:23:13.388+0900 trying reconnect to mongo-a:30000 (10.54.10.90) failed
2014-09-17T16:23:13.388+0900 reconnect mongo-a:30000 (10.54.10.90) ok
reconnected to server after rs command (which is normal)
rep1:SECONDARY> rs.conf();
{
    "_id" : "rep1",
    "version" : 5,
    "members" : [
        {
            "_id" : 0,
            "host" : "mongo-a:30000"
        },
        {
            "_id" : 1,
            "host" : "mongo-b:30001",
            "votes" : 0
        },
(rest omitted)

Note: I have put off investigating the error messages for now. The rs.help() output later in this post does state that reconfiguration helpers disconnect from the database, so the shell displays an error even if the command succeeds.
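In the session above, `cfg.members[1]` addresses a member by its position in the members array, which here happens to equal its `_id`. Position and `_id` are separate things, so looking the member up by host is a slightly safer habit. The following is a minimal sketch under that assumption; `memberIndexByHost` is a hypothetical helper and `cfg` is a hand-written stand-in for the object returned by rs.conf(). It runs under Node.js.

```javascript
// Hypothetical helper: find a member's array position from its host string.
function memberIndexByHost(cfg, host) {
  for (var i = 0; i < cfg.members.length; i++) {
    if (cfg.members[i].host === host) {
      return i;
    }
  }
  return -1; // not found
}

// Stand-in for the object returned by rs.conf() in this post.
var cfg = {
  _id: "rep1",
  version: 4,
  members: [
    { _id: 0, host: "mongo-a:30000" },
    { _id: 1, host: "mongo-b:30001" },
    { _id: 2, host: "mongo-c:30002", votes: 0, priority: 0 },
    { _id: 3, host: "mongo-c:30003", arbiterOnly: true }
  ]
};

var idx = memberIndexByHost(cfg, "mongo-b:30001");
if (idx === -1) {
  throw new Error("member not found");
}
cfg.members[idx].votes = 0;
console.log(JSON.stringify(cfg.members[idx]));
```

In the mongo shell, the modified `cfg` would then be passed to `rs.reconfig(cfg)` as in the session above.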
◆Reference: removing a member from the replica set
Stop the mongod to be removed (mongod2 in this example) beforehand.
[ec2-user@mongo-c ~]$ ./stop-mongod2.sh
[ec2-user@mongo-c ~]$

rep1:PRIMARY> rs.remove("mongo-c:30002");
2014-09-17T16:29:42.296+0900 DBClientCursor::init call() failed
2014-09-17T16:29:42.297+0900 Error: error doing query: failed at src/mongo/shell/query.js:81
2014-09-17T16:29:42.298+0900 trying reconnect to mongo-a:30000 (10.54.10.90) failed
2014-09-17T16:29:42.298+0900 reconnect mongo-a:30000 (10.54.10.90) ok
rep1:PRIMARY>

[ec2-user@mongo-a ~]$ ./mongod-status.sh
"name" : "mongo-a:30000",
"stateStr" : "PRIMARY",
"name" : "mongo-b:30001",
"stateStr" : "SECONDARY",
"name" : "mongo-c:30003",
"stateStr" : "ARBITER",
"host" : "mongo-a:30000"
"host" : "mongo-b:30001"
"host" : "mongo-c:30003",
"arbiterOnly" : true
[ec2-user@mongo-a ~]$
## Adding it back
[ec2-user@mongo-c ~]$ ./start-mongod2.sh
about to fork child process, waiting until server is ready for connections.
forked process: 2723
child process started successfully, parent exiting
[ec2-user@mongo-c ~]$

[ec2-user@mongo-a ~]$ mongo mongo-a:30000
MongoDB shell version: 2.6.4
connecting to: mongo-a:30000/test
rep1:PRIMARY> rs.add( { _id: 2, host: "mongo-c:30002", priority: 0, votes: 0 } )
{ "ok" : 1 }
rep1:PRIMARY>
◆rs-related commands
rep1:PRIMARY> rs.help();
rs.status()                     { replSetGetStatus : 1 } checks repl set status
rs.initiate()                   { replSetInitiate : null } initiates set with default settings
rs.initiate(cfg)                { replSetInitiate : cfg } initiates set with configuration cfg
rs.conf()                       get the current configuration object from local.system.replset
rs.reconfig(cfg)                updates the configuration of a running replica set with cfg (disconnects)
rs.add(hostportstr)             add a new member to the set with default attributes (disconnects)
rs.add(membercfgobj)            add a new member to the set with extra attributes (disconnects)
rs.addArb(hostportstr)          add a new member which is arbiterOnly:true (disconnects)
rs.stepDown([secs])             step down as primary (momentarily) (disconnects)
rs.syncFrom(hostportstr)        make a secondary to sync from the given member
rs.freeze(secs)                 make a node ineligible to become primary for the time specified
rs.remove(hostportstr)          remove a host from the replica set (disconnects)
rs.slaveOk()                    shorthand for db.getMongo().setSlaveOk()
rs.printReplicationInfo()       check oplog size and time range
rs.printSlaveReplicationInfo()  check replica set members and replication lag
db.isMaster()                   check who is primary
reconfiguration helpers disconnect from the database so the shell will display
an error, even if the command succeeds.
see also http://:28017/_replSet for additional diagnostic info
rep1:PRIMARY>
■Failure simulation
## Normal operation
[ec2-user@mongo-a ~]$ ./mongod-status.sh
"name" : "mongo-a:30000",
"stateStr" : "PRIMARY",
"name" : "mongo-b:30001",
"stateStr" : "SECONDARY",
"name" : "mongo-c:30002",
"stateStr" : "SECONDARY",
"name" : "mongo-c:30003",
"stateStr" : "ARBITER",
## mongo-a server stopped
[ec2-user@mongo-c ~]$ ./mongod-status.sh
"name" : "mongo-a:30000",
"stateStr" : "(not reachable/healthy)",
"name" : "mongo-b:30001",
"stateStr" : "PRIMARY",
"name" : "mongo-c:30002",
"stateStr" : "SECONDARY",
"name" : "mongo-c:30003",
"stateStr" : "ARBITER",

The mongod1 instance on the mongo-b server is elected PRIMARY. The mongod2 instance is not elected because its priority is set to 0.
## mongo-b server stopped
[ec2-user@mongo-a ~]$ ./mongod-status.sh
"name" : "mongo-a:30000",
"stateStr" : "PRIMARY",
"name" : "mongo-b:30001",
"stateStr" : "(not reachable/healthy)",
"name" : "mongo-c:30002",
"stateStr" : "SECONDARY",
"name" : "mongo-c:30003",
"stateStr" : "ARBITER",
## mongo-c server stopped
[ec2-user@mongo-a ~]$ ./mongod-status.sh
"name" : "mongo-a:30000",
"stateStr" : "PRIMARY",
"name" : "mongo-b:30001",
"stateStr" : "SECONDARY",
"name" : "mongo-c:30002",
"stateStr" : "(not reachable/healthy)",
"name" : "mongo-c:30003",
"stateStr" : "(not reachable/healthy)",

With the default settings, both surviving instances would end up as SECONDARY, but thanks to the votes setting the PRIMARY is retained.
## mongo-a server stopped while the Secondary on mongo-c is stopped
[ec2-user@mongo-c ~]$ ./mongod-status.sh
"name" : "mongo-a:30000",
"stateStr" : "(not reachable/healthy)",
"name" : "mongo-b:30001",
"stateStr" : "PRIMARY",
"name" : "mongo-c:30002",
"stateStr" : "(not reachable/healthy)",
"name" : "mongo-c:30003",
"stateStr" : "ARBITER",

Even if another instance unluckily goes down while the backup instance mongod2 is stopped, the service survives as long as only one other instance is lost.
All patterns behaved as expected.
■Summary
## 3-instance configuration
votes/priority: not set (default)

| State | Stopped instance | mongod0 (votes:1, priority:1) | mongod1 (votes:1, priority:1) | mongod2 (votes:1, priority:1) |
|---|---|---|---|---|
| Initial State | | PRIMARY | SECONDARY | ARBITER |
| Failure | mongod0 | × | PRIMARY | ARBITER |
| Failure | mongod1 | PRIMARY | × | ARBITER |
| Failure | mongod2 | PRIMARY | SECONDARY | × |
## 4-instance configuration (1)
votes/priority: not set (default)

| State | Stopped instance(s) | mongod0 (votes:1, priority:1) | mongod1 (votes:1, priority:1) | mongod2 (votes:1, priority:1) | mongod3 (votes:1, ArbiterOnly) |
|---|---|---|---|---|---|
| Initial State | | PRIMARY | SECONDARY | SECONDARY | ARBITER |
| Failure | mongod0 | × | PRIMARY (SECONDARY) | SECONDARY (PRIMARY) | ARBITER |
| Failure | mongod1 | PRIMARY | × | SECONDARY | ARBITER |
| Failure | mongod2 | PRIMARY | SECONDARY | × | ARBITER |
| Failure | mongod3 | PRIMARY | SECONDARY | SECONDARY | × |
| Failure | mongod0,1 | × | × | SECONDARY | ARBITER |
| Failure | mongod0,2 | × | SECONDARY | × | ARBITER |
| Failure | mongod0,3 | × | SECONDARY | SECONDARY | × |
| Failure | mongod1,2 | SECONDARY | × | × | ARBITER |
| Failure | mongod1,3 | SECONDARY | × | SECONDARY | × |
| Failure | mongod2,3 | SECONDARY | SECONDARY | × | × |

In other words, as soon as the mongo-c server, which hosts both mongod2 and mongod3, fails, the set can no longer keep a Primary and the service cannot continue.
## 4-instance configuration (2)
votes/priority: customized

| State | Stopped instance(s) | mongod0 (votes:1, priority:1) | mongod1 (votes:1, priority:1) | mongod2 (votes:0, priority:0) | mongod3 (votes:1, ArbiterOnly) |
|---|---|---|---|---|---|
| Initial State | | PRIMARY | SECONDARY | SECONDARY | ARBITER |
| Failure | mongod0 | × | PRIMARY | SECONDARY | ARBITER |
| Failure | mongod1 | PRIMARY | × | SECONDARY | ARBITER |
| Failure | mongod2 | PRIMARY | SECONDARY | × | ARBITER |
| Failure | mongod3 | PRIMARY | SECONDARY | SECONDARY | × |
| Failure | mongod0,1 | × | × | SECONDARY | ARBITER |
| Failure | mongod0,2 | × | PRIMARY | × | ARBITER |
| Failure | mongod0,3 | × | SECONDARY | SECONDARY | × |
| Failure | mongod1,2 | PRIMARY | × | × | ARBITER |
| Failure | mongod1,3 | SECONDARY | × | SECONDARY | × |
| Failure | mongod2,3 | PRIMARY | SECONDARY | × | × |
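The difference between the default and the customized 4-instance configurations can also be enumerated mechanically. The sketch below is a simplified model, not MongoDB's actual election code: it assumes a Primary exists when the live instances hold a strict majority of the configured votes and at least one live instance is a data-bearing member with priority above 0. `buildMembers`, `primaryElectable`, and `failurePatterns` are hypothetical names. It runs under Node.js.

```javascript
// Build the 4-instance member list; only mongod2's votes/priority differ between the two setups.
function buildMembers(mongod2Votes, mongod2Priority) {
  return [
    { name: "mongod0", votes: 1, priority: 1 },
    { name: "mongod1", votes: 1, priority: 1 },
    { name: "mongod2", votes: mongod2Votes, priority: mongod2Priority },
    { name: "mongod3", votes: 1, arbiterOnly: true }
  ];
}

// Simplified rule: strict majority of votes among live members, plus one live electable member.
function primaryElectable(members, stopped) {
  var total = 0;
  var live = 0;
  var hasCandidate = false;
  members.forEach(function (m) {
    total += m.votes;
    if (stopped.indexOf(m.name) !== -1) {
      return;
    }
    live += m.votes;
    if (!m.arbiterOnly && m.priority > 0) {
      hasCandidate = true;
    }
  });
  return live * 2 > total && hasCandidate;
}

// All single failures followed by all double failures.
function failurePatterns(names) {
  var patterns = names.map(function (n) { return [n]; });
  for (var i = 0; i < names.length; i++) {
    for (var j = i + 1; j < names.length; j++) {
      patterns.push([names[i], names[j]]);
    }
  }
  return patterns;
}

var defaults = buildMembers(1, 1);
var custom = buildMembers(0, 0);
var names = ["mongod0", "mongod1", "mongod2", "mongod3"];

failurePatterns(names).forEach(function (stopped) {
  console.log(
    stopped.join(",") +
    ": default=" + (primaryElectable(defaults, stopped) ? "PRIMARY" : "no PRIMARY") +
    ", custom=" + (primaryElectable(custom, stopped) ? "PRIMARY" : "no PRIMARY")
  );
});
```

Under this model, every double failure leaves the default configuration without a Primary, while the customized one keeps a Primary whenever mongod2 is one of the two stopped instances.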
I hope this information is of some use.
I put a lot of effort into writing this up, so if you use or repost it, I would appreciate a comment. m(_ _)m