
# MongoDB and GridFS for Inter- and Intra-Datacenter Data Replication

> Original: [http://highscalability.com/blog/2013/1/14/mongodb-and-gridfs-for-inter-and-intra-datacenter-data-repli.html](http://highscalability.com/blog/2013/1/14/mongodb-and-gridfs-for-inter-and-intra-datacenter-data-repli.html)

![](https://img.kancloud.cn/74/23/74232cfd34c405958951678e3500ac92_240x167.png)

*This is a guest post by [Jeff Behl](jbehl@logicmonitor.com), VP at LogicMonitor. [Over the past 20 years, Jeff](@jeffbehl) has designed and overseen infrastructure for a number of SaaS-based companies.*

## Data Replication for Disaster Recovery

An inevitable part of disaster recovery planning is ensuring that customer data exists in multiple locations. At LogicMonitor, a SaaS-based monitoring solution for physical, virtual, and cloud environments, we wanted to replicate customer data files both within and outside a data center. The former protects against the loss of a single server in a facility, while the latter allows recovery if a data center is lost entirely.

## Where We Were: Rsync

Like most people working in a Linux environment, we used our trusted friend rsync to replicate data.

![](https://img.kancloud.cn/92/02/9202c6f872e0f123bc901c90954c9cda_451x313.png)

Rsync is tried, true and tested, and works well when the number of servers, the amount of data, and the number of files is not horrendous. When any of these is no longer the case, problems arise, and once the number of rsync jobs needed grows beyond a handful, one inevitably faces a number of issues:

* Backup jobs overlap
* Backup job times increase
* Too many simultaneous jobs overload servers or the network
* There is no easy way to coordinate the additional steps needed after an rsync job completes
* There is no easy way to monitor job counts and job statistics, or to be alerted on failure

Here at LogicMonitor [our philosophy](http://blog.logicmonitor.com/2012/07/17/our-philosophy-of-monitoring/) and reason for being is rooted in the belief that everything in your infrastructure needs to be monitored, so the inability to easily monitor the status of rsync jobs was particularly vexing (and no, we do not believe that emailing job status is monitoring!). We needed better statistics and alerting, both to keep track of backup jobs and to put some logic into the jobs themselves to prevent issues like too many running simultaneously. The obvious solution was to store this information in a database.
A database repository for backup job metadata, where jobs themselves can report their status and where other backup components can get information to coordinate tasks such as removing old jobs, was clearly needed. It would also enable us to monitor backup job status via simple queries for information such as the number of jobs running (in total and per server), the time since the last backup, the size of the backup jobs, and so on.

## MongoDB as a Backup Job Metadata Store

The type of backup job statistics was more than likely going to evolve over time, so MongoDB came to light with its “[schemaless](http://blog.mongodb.org/post/119945109/why-schemaless)” document store design. It seemed the perfect fit: easy to set up, easy to query, schemaless, and a simple JSON-style structure for storing job information. As an added bonus, MongoDB replication is excellent: it is robust and extremely easy to implement and maintain. Compared to MySQL, adding members to a MongoDB replica set is auto-magic.

So the first idea was to keep using rsync, but track the status of jobs in MongoDB. But it was a kludge to have to wrap all sorts of reporting and querying logic in scripts surrounding rsync. The backup job metainfo and the actual backed up files were still separate and decoupled, with the metadata in MongoDB and the backed up files residing on a disk on some system (not necessarily the same). How nice it would be if the data and the database were combined. If I could query for a specific backup job, then use the same query language again for an actual backed up file and get it. If restoring data files was just a simple query away... [Enter GridFS](http://docs.mongodb.org/manual/applications/gridfs/).

## Why GridFS

You can read up on the details of GridFS on the MongoDB site, but suffice it to say it is a simple file system overlay on top of MongoDB (files are simply chunked up and stored in the same manner that all documents are).
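
To make the "chunked up" description concrete, here is a minimal Python sketch of the two-collection layout GridFS uses: one file document plus a series of ordered chunk documents. It does not talk to MongoDB at all. The 255 KiB chunk size, the `to_gridfs_documents` and `reassemble` helpers, the sample filename, and the `metadata` fields are assumptions chosen for illustration, and a real driver would assign an ObjectId rather than the UUID string used here.

```python
from datetime import datetime, timezone
from uuid import uuid4

# Assumed chunk size for this sketch; drivers let you configure it.
CHUNK_SIZE = 255 * 1024


def to_gridfs_documents(filename, data, metadata=None, chunk_size=CHUNK_SIZE):
    """Split `data` into one files document and a list of chunk documents.

    Mimics the fs.files / fs.chunks layout without contacting a server.
    """
    file_id = uuid4().hex  # stand-in for the ObjectId a driver would assign
    files_doc = {
        "_id": file_id,
        "filename": filename,
        "length": len(data),
        "chunkSize": chunk_size,
        "uploadDate": datetime.now(timezone.utc),
        "metadata": metadata or {},  # hypothetical backup job details
    }
    chunks = [
        {"files_id": file_id, "n": n, "data": data[offset:offset + chunk_size]}
        for n, offset in enumerate(range(0, len(data), chunk_size))
    ]
    return files_doc, chunks


def reassemble(chunks):
    """Rebuild the original bytes by joining chunks in ascending `n` order."""
    ordered = sorted(chunks, key=lambda chunk: chunk["n"])
    return b"".join(chunk["data"] for chunk in ordered)


# 600 KiB of dummy backup data, tagged with hypothetical job metadata.
payload = b"x" * (600 * 1024)
files_doc, chunks = to_gridfs_documents(
    "portal-backup.tar.gz",
    payload,
    metadata={"server": "app01", "status": "complete"},
)
```

Because the job details sit in the same document that describes the file, a single query can locate both the job and its data, which is the property the rest of this section relies on.
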
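Instead of having scripts surround rsync, our backup scripts store the data and the metadata at the same time and in the same place, so everything is easily queried.

And of course, MongoDB replication works with GridFS, which means backed up files are immediately replicated both within the data center and off-site. With a replica inside Amazon EC2, snapshots can be taken to keep as many historical backups as desired. Our setup now looks like this:

![](https://img.kancloud.cn/ff/12/ff12f1ea5ac50ef4d41260d836bcd188_447x338.png)

Advantages

* Job status information is available via simple queries
* Backup jobs themselves (including files) can be retrieved and deleted via queries
* Replication to the off-site location is practically immediate
* Sharding is possible
* With EBS volumes, MongoDB backups (metadata and actual backup data) via snapshots are both easy and limitless
* Automated status monitoring is easy

## Monitoring via LogicMonitor

LogicMonitor believes that all aspects of your infrastructure, from the physical level to the application level, should be in the same monitoring system: UPS, chassis temperature, OS statistics, database statistics, load balancers, caching layers, JMX statistics, disk write latency, and so on. It should all be there, and that includes backups. To this end, LogicMonitor can not only monitor MongoDB's general statistics and health, but can also execute arbitrary queries against MongoDB. These queries can ask for anything, from login statistics to page views to (guess what?) backup jobs completed in the last hour.

Now that our backups are all done via MongoDB, I can keep track of (and more importantly, be alerted on):

* The number of backup jobs running per server
* The number of backups running simultaneously across all servers
* Any customer portal that has not been backed up for more than 6 hours
* MongoDB replication lag

### Replication Lag

![](https://img.kancloud.cn/73/4c/734c744aadee0154935f9c6d11c5d63a_474x249.png)

### Jobs Completed

![](https://img.kancloud.cn/0d/48/0d484870b3f62eb17ae6695a2f8381ed_476x250.png)

Are you using FUSE to access GridFS, or are you writing all of your code against the API?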
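
As a rough illustration of the alert conditions listed in the monitoring section, the sketch below builds two MongoDB aggregation pipelines as plain Python data. The post does not show its actual schema or queries, so everything here is an assumption: a single hypothetical jobs collection whose documents carry `server`, `portal`, `status`, and `finished_at` fields, and the two helper function names.

```python
from datetime import datetime, timedelta, timezone


def running_jobs_per_server_pipeline():
    """Count jobs whose (assumed) status is "running", grouped by server."""
    return [
        {"$match": {"status": "running"}},
        {"$group": {"_id": "$server", "running": {"$sum": 1}}},
    ]


def stale_portals_pipeline(now, max_age=timedelta(hours=6)):
    """Find portals whose newest completed backup is older than `max_age`.

    Portals with no completed job at all never reach the final stage,
    so they would need a separate check.
    """
    cutoff = now - max_age
    return [
        {"$match": {"status": "complete"}},
        {"$group": {"_id": "$portal", "last_backup": {"$max": "$finished_at"}}},
        {"$match": {"last_backup": {"$lt": cutoff}}},
    ]


# Fixed timestamp so the resulting cutoff is easy to inspect.
now = datetime(2013, 1, 14, 12, 0, tzinfo=timezone.utc)
stale_pipeline = stale_portals_pipeline(now)
per_server_pipeline = running_jobs_per_server_pipeline()
```

With a driver such as PyMongo, each list would be passed to `collection.aggregate(...)`; summing the per-server counts gives the total number of simultaneous backups across all servers.
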