[TOC]
# MongoDB Sharded Replica Set Cluster
A replica set solves the problem of data loss or unavailability when the primary node fails, but it cannot help when you need to store massive amounts of data. A single machine in the replica set may not be able to hold all the data, or the set may not provide acceptable read/write throughput. This is where MongoDB sharding comes in, which is MongoDB's other cluster deployment mode.
Sharding is the process of splitting data and distributing it across different machines; the term partitioning is sometimes used for the same idea. By spreading data across machines, you can store more data and handle a larger load without needing a single powerful, expensive server.
MongoDB supports automatic sharding, which keeps the sharded architecture invisible to the application and simplifies administration. To the application it looks just like a single MongoDB server.
MongoDB's sharding mechanism lets you build a cluster of many machines in which each shard holds a subset of the overall data set. Compared with a single replica set, a sharded cluster gives the application far greater data-handling capacity.
The figure below shows the structure of a MongoDB sharded cluster:

The architecture has three main components:
- Shard:
Stores the actual data chunks. In production, each shard is normally a replica set made up of several machines, so that a single host failure does not take the shard down.
- Config Server:
mongod instances that store the metadata of the whole cluster, including chunk information. From version 4.0 onward, the config servers must run as a replica set of at least three members.
- Query Routers:
The front-end routers (mongos) that clients connect to. They make the whole cluster look like a single database, so the application can use it transparently.
# Setting Up the Cluster
Prepare three machines; three virtual machines are used here.
OS: CentOS-7.2
Software: MongoDB-4.2.7
role/port | 192.168.0.41 | 192.168.0.42 | 192.168.0.43
---|---|---|---
config | 30000 | 30000 | 30000
mongos | 30010 | 30010 | 30010
shard1 | 27001 | 27001 | 27001
shard2 | 27002 | 27002 | 27002
shard3 | 27003 | 27003 | 27003
Official documentation for the configuration file options:
[https://docs.mongodb.com/manual/reference/configuration-options/](https://docs.mongodb.com/manual/reference/configuration-options/)
## Config Server Replica Set
1. Create the directories and log file
```
[root@localhost /]# mkdir -p /usr/local/mongodb/data/config
[root@localhost /]# mkdir -p /usr/local/mongodb/log/config
[root@localhost /]# touch /usr/local/mongodb/log/config.log
```
2. Create config.cfg
```
[root@localhost /]# cat /usr/local/mongodb/conf/config.cfg
systemLog:
  destination: file
  logAppend: true
  path: /usr/local/mongodb/log/config.log
storage:
  dbPath: /usr/local/mongodb/data/config
  journal:
    enabled: true
#  engine:
#  wiredTiger:
processManagement:
  fork: true  # fork and run in background
#  pidFilePath: /var/run/mongodb/config1.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo
# network interfaces
net:
  port: 30000
  bindIp: 0.0.0.0
#security:
#  authorization: enabled
#  keyFile: /usr/local/mongodb/conf/mongodb.keyfile
#operationProfiling:
replication:
  replSetName: configs
sharding:
  clusterRole: configsvr
## Enterprise-Only Options
#auditLog:
#snmp:
```
3. Start the config server
```
[root@localhost mongodb]# mongod -f /usr/local/mongodb/conf/config.cfg
about to fork child process, waiting until server is ready for connections.
forked process: 223
child process started successfully, parent exiting
```
Note: port 30000 should now be listening.
Repeat the same steps on the other two machines.
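A quick way to confirm that the config mongod is actually up on each node (a minimal check, assuming the iproute `ss` utility that ships with CentOS 7; `netstat -lntp` works just as well):
```
[root@localhost mongodb]# ss -lntp | grep 30000
[root@localhost mongodb]# ps -ef | grep mongod | grep -v grep
```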
4. Initiate the config server replica set
Run this on any one of the machines:
```
[root@localhost mongodb]# mongo --port 30000
> rs.initiate(
  {
    _id: "configs",
    configsvr: true,
    members: [
      { _id : 0, host : "192.168.0.41:30000" },
      { _id : 1, host : "192.168.0.42:30000" },
      { _id : 2, host : "192.168.0.43:30000" }
    ]
  }
)
{
    "ok" : 1,
    "$gleStats" : {
        "lastOpTime" : Timestamp(1591233702, 1),
        "electionId" : ObjectId("000000000000000000000000")
    },
    "lastCommittedOpTime" : Timestamp(0, 0),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1591233702, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1591233702, 1)
}
```
Note: the `_id` passed to rs.initiate() must match the replSetName defined in config.cfg, otherwise initiation fails.
Use rs.status() to check the replica set status.
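For example, a minimal sketch that prints each member's state from the same mongo shell (after initiation it can take a few seconds before one member shows PRIMARY):
```
configs:PRIMARY> rs.status().members.forEach(function (m) { print(m.name + "  " + m.stateStr) })
// expect one PRIMARY and two SECONDARY members
```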
## Shard Replica Sets
1. Create the directories and log files
```
[root@localhost /]# mkdir -p /usr/local/mongodb/data/shard{1..3}
[root@localhost /]# mkdir -p /usr/local/mongodb/log/shard{1..3}
[root@localhost /]# touch /usr/local/mongodb/log/shard{1..3}/shard.log
```
2. Create shard1.cfg, shard2.cfg, and shard3.cfg
```
[root@localhost /]# cat /usr/local/mongodb/conf/shard1.cfg
systemLog:
  destination: file
  logAppend: true
  path: /usr/local/mongodb/log/shard1/shard.log
storage:
  dbPath: /usr/local/mongodb/data/shard1
  journal:
    enabled: true
#  engine:
#  wiredTiger:
processManagement:
  fork: true  # fork and run in background
#  pidFilePath: /var/run/mongodb/shard1.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27001
  bindIp: 0.0.0.0
#security:
#  authorization: enabled
#  keyFile: /usr/local/mongodb/conf/mongodb.keyfile
#operationProfiling:
replication:
  replSetName: shard1
sharding:
  clusterRole: shardsvr
## Enterprise-Only Options
#auditLog:
#snmp:
[root@localhost /]# cat /usr/local/mongodb/conf/shard2.cfg
systemLog:
  destination: file
  logAppend: true
  path: /usr/local/mongodb/log/shard2/shard.log
storage:
  dbPath: /usr/local/mongodb/data/shard2
  journal:
    enabled: true
#  engine:
#  wiredTiger:
processManagement:
  fork: true  # fork and run in background
#  pidFilePath: /var/run/mongodb/shard2.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 27002
  bindIp: 0.0.0.0
#security:
#operationProfiling:
replication:
  replSetName: shard2
sharding:
  clusterRole: shardsvr
## Enterprise-Only Options
#auditLog:
#snmp:
```
shard3.cfg follows the same pattern as shard2.cfg (use port 27003, the shard3 paths, and replSetName: shard3).
3. Start the shard servers
```
[root@localhost mongodb]# mongod -f /usr/local/mongodb/conf/shard1.cfg
[root@localhost mongodb]# mongod -f /usr/local/mongodb/conf/shard2.cfg
[root@localhost mongodb]# mongod -f /usr/local/mongodb/conf/shard3.cfg
```
Repeat the same steps on the other two machines.
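As with the config servers, a quick check that the three shard mongod processes are listening (a minimal sketch, assuming `ss` is available):
```
[root@localhost mongodb]# ss -lntp | grep -E ':2700[123]'
```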
4. Initiate the shard replica sets
Run this on any one of the machines:
```
[root@localhost mongodb]# mongo --port 27001
> rs.initiate(
  {
    _id: "shard1",
    members: [
      { _id : 0, host : "192.168.0.41:27001" },
      { _id : 1, host : "192.168.0.42:27001" },
      { _id : 2, host : "192.168.0.43:27001" }
    ]
  }
)
[root@localhost mongodb]# mongo --port 27002
> rs.initiate(
  {
    _id: "shard2",
    members: [
      { _id : 0, host : "192.168.0.41:27002" },
      { _id : 1, host : "192.168.0.42:27002" },
      { _id : 2, host : "192.168.0.43:27002" }
    ]
  }
)
[root@localhost mongodb]# mongo --port 27003
> rs.initiate(
  {
    _id: "shard3",
    members: [
      { _id : 0, host : "192.168.0.41:27003" },
      { _id : 1, host : "192.168.0.42:27003" },
      { _id : 2, host : "192.168.0.43:27003" }
    ]
  }
)
```
At this point shard1, shard2, and shard3 have each been created as three-member replica sets; they will be joined into the sharded cluster in a later step. A quick way to confirm that each one has elected a primary is shown below.
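A minimal verification sketch using mongo --eval to query each shard replica set non-interactively (right after initiation the primary field may be empty for a few seconds while the election finishes):
```
[root@localhost mongodb]# for p in 27001 27002 27003; do mongo --port $p --quiet --eval 'print(rs.isMaster().setName + " -> primary: " + rs.isMaster().primary)'; done
```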
## The mongos Service
1. Prepare the log file
mongos is the entry point for client access to the cluster and stores no data itself, so only a log file needs to be created. Run one mongos on each server to improve availability.
```
[root@localhost mongodb]# touch /usr/local/mongodb/log/mongos.log
```
2. Create mongos.cfg
```
[root@localhost /]# cat /usr/local/mongodb/conf/mongos.cfg
systemLog:
  destination: file
  logAppend: true
  path: /usr/local/mongodb/log/mongos.log
processManagement:
  fork: true  # fork and run in background
#  pidFilePath: /var/run/mongodb/mongos1.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo
net:
  port: 30010
  bindIp: 0.0.0.0
#security:
#operationProfiling:
#replication:
sharding:
  # the config server replica set this mongos talks to
  configDB: configs/192.168.0.41:30000,192.168.0.42:30000,192.168.0.43:30000
## Enterprise-Only Options
#auditLog:
#snmp:
```
Note: configDB points to the addresses and ports of the three config servers, prefixed with the name of the config replica set (configs/).
3. Start mongos
```
[root@localhost mongodb]# mongos -f /usr/local/mongodb/conf/mongos.cfg
```
Note: mongos is started with the mongos binary, not mongod.
Start a mongos service on every machine.
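Before any shards are added, it is worth confirming that each mongos can reach the config server replica set. A minimal check: sh.status() should run without error and list no shards at this stage.
```
[root@localhost mongodb]# mongo --port 30010
mongos> sh.status()
mongos> exit
```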
## Adding Keyfile Authentication to the Replica Sets
1. Generate the keyfile
Run on any one machine:
```
[root@localhost mongodb]# openssl rand -base64 700 > /usr/local/mongodb/conf/mongodb.keyfile
[root@localhost mongodb]# chmod 600 /usr/local/mongodb/conf/mongodb.keyfile
```
2. Copy the keyfile to the other servers
```
[root@localhost mongodb]# scp /usr/local/mongodb/conf/mongodb.keyfile root@192.168.0.42:/usr/local/mongodb/conf
[root@localhost mongodb]# scp /usr/local/mongodb/conf/mongodb.keyfile root@192.168.0.43:/usr/local/mongodb/conf
```
3. Update the configuration files
Add the following to all three kinds of files (config.cfg, the shard*.cfg files, and mongos.cfg):
```
security:
  keyFile: /usr/local/mongodb/conf/mongodb.keyfile
```
## Creating Users for Client Connections
1. Create the users from the mongo shell (connected through mongos)
```
mongos> use admin
switched to db admin
mongos> db.createUser({user:"mongouser",pwd:"mongopwd",roles:["root"]})
mongos> db.auth("mongouser","mongopwd")
mongos> use config
switched to db config
mongos> db.createUser({user:"mongouser",pwd:"mongopwd",roles:["root"]})
mongos> db.auth("mongouser","mongopwd")
```
2. Update the configuration files
Add authorization: enabled to the existing security section of config.cfg and the shard*.cfg files, so that it reads:
```
security:
  keyFile: /usr/local/mongodb/conf/mongodb.keyfile
  authorization: enabled
```
3. Restart the MongoDB services
4. Connect with an authenticated client
5. Common user-management commands (an application-user example follows this list)
- show users // list the users of the current database
- db.dropUser('testadmin') // delete a user
- db.updateUser('admin', {pwd: '654321'}) // change a user's password
- db.auth('admin', '654321') // authenticate with the password
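As a further illustration, a hedged sketch of creating a least-privilege application user for the notice_set database that is sharded later in this guide; the user name and password here are placeholders, not part of the original setup:
```
mongos> use notice_set
switched to db notice_set
mongos> db.createUser({user: "notice_app", pwd: "notice_app_pwd", roles: [{role: "readWrite", db: "notice_set"}]})
mongos> db.auth("notice_app", "notice_app_pwd")
```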
## Possible Issues
1. After enabling authentication, rs.status() fails with "command replSetGetStatus requires authentication"
```
configs:SECONDARY> rs.status()
{
    "operationTime" : Timestamp(1591253807, 2),
    "ok" : 0,
    "errmsg" : "command replSetGetStatus requires authentication",
    "code" : 13,
    "codeName" : "Unauthorized",
    "lastCommittedOpTime" : Timestamp(1591253807, 2),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1591253807, 2),
        "signature" : {
            "hash" : BinData(0,"QCmeAVDoq7jZzwHfPYNn/mU4eS8="),
            "keyId" : NumberLong("6834296761922617360")
        }
    }
}
configs:SECONDARY> use admin
switched to db admin
configs:SECONDARY> db.auth('mongouser','mongopwd')
1
```
2. A mongos instance hangs and never responds on startup
- It may be a start-order problem: the config servers must be running before mongos is started
- It may be a keyfile/authentication problem (for example, the keyfile differs between nodes or has the wrong permissions); the mongos log usually reveals the cause, as shown below
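In either case, inspecting the mongos log usually shows the underlying error; its path is the one defined in mongos.cfg above:
```
[root@localhost mongodb]# tail -n 50 /usr/local/mongodb/log/mongos.log
```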
# Usage Example
1. Connect to mongos
Connect on any one of the machines:
```
[root@localhost mongodb]# mongo --port 30010
```
2. Add the shards
```
mongos> use admin
switched to db admin
mongos> sh.addShard("shard1/192.168.0.41:27001,192.168.0.42:27001,192.168.0.43:27001")
{
    "shardAdded" : "shard1",
    "ok" : 1,
    "operationTime" : Timestamp(1591241739, 4),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1591241739, 4),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
mongos> sh.addShard("shard2/192.168.0.41:27002,192.168.0.42:27002,192.168.0.43:27002")
{
    "shardAdded" : "shard2",
    "ok" : 1,
    "operationTime" : Timestamp(1591241810, 2),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1591241810, 2),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
mongos> sh.addShard("shard3/192.168.0.41:27003,192.168.0.42:27003,192.168.0.43:27003")
{
    "shardAdded" : "shard3",
    "ok" : 1,
    "operationTime" : Timestamp(1591241833, 6),
    "$clusterTime" : {
        "clusterTime" : Timestamp(1591241833, 6),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    }
}
```
- Check shard status
sh.status()
- Remove a shard (removal drains the shard's data first, so the command must be repeated until it reports completion; see the sketch below)
db.runCommand({removeShard : "shard1"})
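A hedged sketch of the removal flow: removeShard first reports state "started", then "ongoing" while its chunks are migrated to the remaining shards, and the same command is simply re-run until it reports "completed":
```
mongos> use admin
switched to db admin
mongos> db.runCommand({removeShard: "shard1"})   // first call starts draining (state: "started")
mongos> db.runCommand({removeShard: "shard1"})   // re-run later; state stays "ongoing" while chunks migrate
mongos> db.runCommand({removeShard: "shard1"})   // when state is "completed" the shard has been removed
```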
3. Enable sharding for the database (this also creates the database)
```
mongos> use admin
mongos> sh.enableSharding("notice_set")
```
4. Choose a field of the collection as the shard key and shard it with the hashed strategy
```
mongos> sh.shardCollection("notice_set.notice",{"msgid": "hashed"})
```
5. Insert test data
```
mongos> use notice_set
mongos> for (i = 1; i <= 10; i++) db.notice.insert({"msgid":i})
```
You can then check how the 10 documents are spread across the shards (the per-shard counts add up to 10), as shown below.
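One way to see the per-shard document counts is the getShardDistribution() helper on the sharded collection (the exact split depends on the hashed key values):
```
mongos> use notice_set
switched to db notice_set
mongos> db.notice.getShardDistribution()
```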
# Ways to Connect to the Cluster
1. Single-mongos connection
Connect directly to any one mongos; if that mongos becomes unavailable, the service is unavailable too.
```
mongo --host <mongos_host> --port 30010
```
2. Cluster connection
List all mongos instances in the connection string; if one mongos becomes unavailable, the service keeps working.
```
mongodb://[username:password@]host1[:port1][,host2[:port2],...[,hostN[:portN]]][/[database][?options]]
```
- mongodb:// prefix indicating that this is a Connection String
- username:password@ user credentials, required if authentication is enabled
- hostX:portX the list of mongos addresses
- /database the database the user account belongs to when authenticating
- ?options extra connection options
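For this cluster, the connection string would look like the following sketch, using the mongouser account created above and the three mongos instances on port 30010:
```
[root@localhost mongodb]# mongo "mongodb://mongouser:mongopwd@192.168.0.41:30010,192.168.0.42:30010,192.168.0.43:30010/admin"
```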