## Problem Background

GPDB has a central-coordinator architecture: a GreenPlum cluster consists of one Master node and multiple Segment nodes. The Master is the central control node; the Segments store the data. All Segment nodes are peers and are managed by the Master. The architecture looks like this:

![](https://box.kancloud.cn/2016-04-22_5719d9fde4d0e.jpg)

GreenPlum architecture diagram

When the GP Master fails, an external HA monitoring module can detect the failure and activate the standby; once the Standby Master is serving, the old Master is removed and a new standby is rebuilt.

Segment recovery works differently. As the diagram shows, Segments also come in pairs, called Primary and Mirror, the Mirror being the Primary's standby. Strong synchronization between Primary and Mirror guarantees data consistency and reliability, and the monitoring and failover between them is handled by the Master's FTS module. When FTS detects that a Primary is down while its Mirror is healthy, it activates the Mirror, marks the Primary as 'd', and the Mirror enters the ChangeTracking state. (The details are not repeated here; see [GPDB · 特性分析· GreenPlum Segment事務一致性與異常處理](http://mysql.taobao.org/monthly/2016/04/02/) in this issue and [GPDB · 特性分析· GreenPlum FTS 機制](http://mysql.taobao.org/monthly/2016/03/08/) in the previous one.)

Once a Segment is marked 'd', the Master no longer uses it, and starting (or restarting) the GP instance ignores it as well. At this point the whole GP cluster is in a risky state:

1. The Mirror that took over is under extra load (it has to do change tracking);
2. The Segment has become a single point, so the reliability risk increases.

The failed Segment therefore needs to be repaired promptly.

## GP Segment Recovery

GP ships a set of management scripts; the one used to repair Segments is gprecoverseg. It is simple to use and has only a few important options (typical invocations are sketched right after this list):

* -i The main option. It points to a configuration file that describes which Segments need to be recovered and where they should be recovered to.
* -F Optional. When given, gprecoverseg deletes the instances specified by "-i" (or marked 'd') and copies a complete new copy from the surviving Mirror to the target location.
* -r When FTS detects a Primary failure and fails over to the Mirror, the Mirror that is now acting as Primary does not switch back right after gprecoverseg has repaired the pair. Some hosts then end up with too many active Segments, which can become a performance bottleneck, so the Segments need to be restored to their original roles; this is called re-balance.
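To make the options concrete, here is a minimal sketch of typical invocations, assuming they are run as the GP administrative user on the Master host with the usual environment (e.g. MASTER_DATA_DIRECTORY) loaded. The file name recover.conf is a placeholder, and running gprecoverseg with no options for a plain in-place incremental recovery reflects common usage of the tool rather than anything shown in this article.

~~~
# Incremental, in-place recovery of every Segment currently marked 'd'
# (running gprecoverseg with no options is the common case):
gprecoverseg

# Recovery driven by an explicit configuration file ("-i");
# recover.conf is a placeholder name, its format is shown later in this article:
gprecoverseg -i recover.conf

# Full recovery ("-F"): remove the failed instance and copy a complete
# new copy from the surviving peer:
gprecoverseg -F

# After recovery has completed and the pair is back in sync,
# restore every Segment to its preferred role ("-r"):
gprecoverseg -r
~~~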
Here is a usage example. Below is a healthy instance:

~~~
$ gpstate -s
/opt/python27/lib/python2.7/site-packages/Crypto/Util/number.py:57: PowmInsecureWarning: Not using mpz_powm_sec. You should rebuild using libgmp >= 5 to avoid timing attack vulnerability.
_warn("Not using mpz_powm_sec. You should rebuild using libgmp >= 5 to avoid timing attack vulnerability.", PowmInsecureWarning)
20160418:21:39:29:016547 gpstate:host1:gpuser-[INFO]:-Starting gpstate with args: -s
20160418:21:39:29:016547 gpstate:host1:gpuser-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 4.3.99.00 build dev'
20160418:21:39:29:016547 gpstate:host1:gpuser-[INFO]:-master Greenplum Version: 'PostgreSQL 8.3 (Greenplum Database 4.3.99.00 build dev) compiled on Apr 11 2016 22:02:39'
20160418:21:39:29:016547 gpstate:host1:gpuser-[INFO]:-Obtaining Segment details from master...
20160418:21:39:29:016547 gpstate:host1:gpuser-[INFO]:-Gathering data from segments...
.
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:-----------------------------------------------------
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:--Master Configuration & Status
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:-----------------------------------------------------
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Master host = host1
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Master postgres process ID = 72447
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Master data directory = /workspace/gpuser/3007
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Master port = 3007
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Master current role = dispatch
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Greenplum initsystem version = 4.3.99.00 build dev
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Greenplum current version = PostgreSQL 8.3 (Greenplum Database 4.3.99.00 build dev) compiled on Apr 11 2016 22:02:39
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Postgres version = 8.3
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Master standby = host2
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Standby master state = Standby host passive
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:-----------------------------------------------------
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:-Segment Instance Status Report
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:-----------------------------------------------------
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Segment Info
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Hostname = host1
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Address = host1
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Datadir = /workspace/gpuser/3008
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Port = 3008
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Mirroring Info
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Current role = Primary
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Preferred role = Primary
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Mirror status = Synchronized
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Status
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- PID = 72388
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Configuration reports status as = Up
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Database status = Up
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:-----------------------------------------------------
......
[INFO]:-----------------------------------------------------
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Segment Info
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Hostname = host1
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Address = host1
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Datadir = /workspace/gpuser/3012
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Port = 3012
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Mirroring Info
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Current role = Mirror
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Preferred role = Mirror
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Mirror status = Synchronized
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Status
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- PID = 75247
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Configuration reports status as = Up
20160418:21:39:30:016547 gpstate:host1:gpuser-[INFO]:- Segment status = Up
~~~
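Besides the verbose per-Segment report of gpstate -s, gpstate also offers condensed views that are convenient while following a recovery. The -e and -m flags below are not used in this article and are listed from memory of the 4.3-era tool, so verify them with gpstate --help on your installation.

~~~
# Only Segments with mirroring or status problems (no rows on a healthy cluster):
gpstate -e

# Synchronization status of the mirror instances:
gpstate -m
~~~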
After killing one of the Segments (for example the instance on port 3012), run gprecoverseg. It starts like this:

~~~
20160418:21:40:58:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Sending msg getStatus and cmdStr $GPHOME/bin/gp_primarymirror -h host1 -p 3008
20160418:21:40:58:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Adding cmd to work_queue: $GPHOME/bin/gp_primarymirror -h host1 -p 3008
20160418:21:40:58:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Sending msg getStatus and cmdStr $GPHOME/bin/gp_primarymirror -h host2 -p 3014
20160418:21:40:58:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Adding cmd to work_queue: $GPHOME/bin/gp_primarymirror -h host2 -p 3014
20160418:21:40:58:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Sending msg getStatus and cmdStr $GPHOME/bin/gp_primarymirror -h host1 -p 3010
20160418:21:40:58:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Adding cmd to work_queue: $GPHOME/bin/gp_primarymirror -h host1 -p 3010
20160418:21:40:58:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Sending msg getStatus and cmdStr $GPHOME/bin/gp_primarymirror -h host2 -p 3015
20160418:21:40:58:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Adding cmd to work_queue: $GPHOME/bin/gp_primarymirror -h host2 -p 3015
20160418:21:40:58:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Sending msg getStatus and cmdStr $GPHOME/bin/gp_primarymirror -h host2 -p 3008
20160418:21:40:58:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Adding cmd to work_queue: $GPHOME/bin/gp_primarymirror -h host2 -p 3008
20160418:21:40:58:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Sending msg getStatus and cmdStr $GPHOME/bin/gp_primarymirror -h host1 -p 3011
20160418:21:40:58:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Adding cmd to work_queue: $GPHOME/bin/gp_primarymirror -h host1 -p 3011
20160418:21:40:58:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Sending msg getStatus and cmdStr $GPHOME/bin/gp_primarymirror -h host2 -p 3013
20160418:21:40:58:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Adding cmd to work_queue: $GPHOME/bin/gp_primarymirror -h host2 -p 3013
20160418:21:40:58:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Sending msg getStatus and cmdStr $GPHOME/bin/gp_primarymirror -h host1 -p 3012
20160418:21:40:58:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Adding cmd to work_queue: $GPHOME/bin/gp_primarymirror -h host1 -p 3012
......
20160418:21:41:18:017989 gpstate:host1:gpuser-[DEBUG]:-[worker6] finished cmd: Get segment status cmdStr='sshpass -e ssh -o 'StrictHostKeyChecking no' host1 ". /workspace/gpdb/greenplum_path.sh; $GPHOME/bin/gp_primarymirror -h host1 -p 3012"' had result: cmd had rc=15 completed=True halted=False
stdout=''
stderr='failed to connect: Connection refused (errno: 111)
Retrying no 1
failed to connect: Connection refused (errno: 111)
Retrying no 2
......
20160418:21:41:18:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Encountered error Not ready to connect to database
mode: PrimarySegment
segmentState: Fault
dataState: InSync
faultType: FaultMirror
mode: PrimarySegment
segmentState: Fault
dataState: InSync
faultType: FaultMirror
~~~

Getting status information from that instance fails at this point; the reason is explained later. After the failure, gprecoverseg retries 5 times, and the next attempt looks different:

~~~
20160418:21:41:23:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Sending msg getStatus and cmdStr $GPHOME/bin/gp_primarymirror -h host1 -p 3008
20160418:21:41:23:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Adding cmd to work_queue: $GPHOME/bin/gp_primarymirror -h host1 -p 3008
20160418:21:41:23:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Sending msg getStatus and cmdStr $GPHOME/bin/gp_primarymirror -h host2 -p 3014
20160418:21:41:23:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Adding cmd to work_queue: $GPHOME/bin/gp_primarymirror -h host2 -p 3014
20160418:21:41:23:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Sending msg getStatus and cmdStr $GPHOME/bin/gp_primarymirror -h host1 -p 3010
20160418:21:41:23:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Adding cmd to work_queue: $GPHOME/bin/gp_primarymirror -h host1 -p 3010
20160418:21:41:23:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Sending msg getStatus and cmdStr $GPHOME/bin/gp_primarymirror -h host2 -p 3015
20160418:21:41:23:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Adding cmd to work_queue: $GPHOME/bin/gp_primarymirror -h host2 -p 3015
20160418:21:41:23:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Sending msg getStatus and cmdStr $GPHOME/bin/gp_primarymirror -h host2 -p 3008
20160418:21:41:23:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Adding cmd to work_queue: $GPHOME/bin/gp_primarymirror -h host2 -p 3008
20160418:21:41:23:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Sending msg getStatus and cmdStr $GPHOME/bin/gp_primarymirror -h host1 -p 3011
20160418:21:41:23:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Adding cmd to work_queue: $GPHOME/bin/gp_primarymirror -h host1 -p 3011
20160418:21:41:23:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Sending msg getStatus and cmdStr $GPHOME/bin/gp_primarymirror -h host2 -p 3013
20160418:21:41:23:017989 gprecoverseg:host1:gpuser-[DEBUG]:-Adding cmd to work_queue: $GPHOME/bin/gp_primarymirror -h host2 -p 3013
~~~

One getStatus command is missing this time, and it is exactly the Segment we just killed.
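At this point the killed Segment has been marked down in the catalog, which is why the second round no longer queries it (the implementation section below explains this in detail). A quick way to confirm from the Master is to query gp_segment_configuration directly; in this sketch the port 3007 comes from the example above, while the database name postgres is only a placeholder.

~~~
# List every Segment currently marked down ('d') in the catalog:
psql -p 3007 -d postgres -c \
  "SELECT dbid, content, role, preferred_role, mode, status, hostname, port
     FROM gp_segment_configuration
    WHERE status = 'd';"
~~~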
Reading further through the output, gprecoverseg then runs the following:

~~~
20160418:23:16:20:085203 gprecoverseg:host1:gpuser-[DEBUG]:-[worker7] finished cmd: Get segment status information cmdStr='sshpass -e ssh -o 'StrictHostKeyChecking no' host2 ". /workspace/gpdb/greenplum_path.sh; $GPHOME/bin/gp_primarymirror -h host2 -p 3013"' had result: cmd had rc=1 completed=True halted=False
stdout=''
stderr='mode: PrimarySegment
segmentState: Ready
dataState: InChangeTracking
faultType: NotInitialized
mode: PrimarySegment
segmentState: Ready
dataState: InChangeTracking
faultType: NotInitialized
'
~~~

Why is this instance checked separately? Moreover, if this check fails, gprecoverseg exits immediately and cannot continue.

After this series of checks, it first updates the catalog table that records the operation:

~~~
UPDATE pg_catalog.gp_segment_configuration
~~~

and then invokes the command that actually restores the data:

~~~
/workspace/gpdb/bin/lib/gpconfigurenewsegment -c /workspace/gpuser/3012:3012:false:false:9 -v -B 16 --write-gpid-file-only
~~~

Finally it starts the Segment and updates the catalog again:

~~~
$GPHOME/sbin/gpsegstart.py -C en_US.utf8:C:C -M quiescent -V 'postgres (Greenplum Database) 4.3.99.00 build dev' -n 4 --era df86ca11ca2fc214_160418165251 -t 600 -v -p KGRwMApTJ2Ric0J5UG9ydCcKcDEKKGRwMgpJMzAxMgooZHAzClMndGFyZ2V0TW9kZScKcDQKUydtaXJyb3InCnA1CnNTJ2RiaWQnCnA2Ckk5CnNTJ2hvc3ROYW1lJwpwNwpTJzEwLjk3LjI0OC43MycKcDgKc1MncGVlclBvcnQnCnA5CkkzNTEzCnNTJ3BlZXJQTVBvcnQnCnAxMApJMzAxMwpzUydwZWVyTmFtZScKcDExClMncnQxYjA3MDI0LnRiYycKcDEyCnNTJ2Z1bGxSZXN5bmNGbGFnJwpwMTMKSTAwCnNTJ21vZGUnCnAxNApTJ3InCnAxNQpzUydob3N0UG9ydCcKcDE2CkkzNTEyCnNzcy4= -D '9|3|m|m|r|d|host1|host1|3012|3512|/workspace/gpuser/3012||'
......
20160419:01:21:05:042692 gprecoverseg:host1:gpuser-[DEBUG]:-UPDATE pg_catalog.gp_segment_configuration SET mode = 'r', status = 'u' WHERE dbid = 5
20160419:01:21:05:042692 gprecoverseg:host1:gpuser-[DEBUG]:-INSERT INTO gp_configuration_history (time, dbid, "desc") VALUES( now(), 5, 'gprecoverseg: segment resync marking mirrors up and primaries resync: segment mode and status' )
20160419:01:21:05:042692 gprecoverseg:host1:gpuser-[DEBUG]:-UPDATE pg_catalog.gp_segment_configuration SET mode = 'r', status = 'u' WHERE dbid = 9
20160419:01:21:05:042692 gprecoverseg:host1:gpuser-[DEBUG]:-INSERT INTO gp_configuration_history (time, dbid, "desc") VALUES( now(), 9, 'gprecoverseg: segment resync marking mirrors up and primaries resync: segment mode and status' )
20160419:01:21:05:042692 gprecoverseg:host1:gpuser-[DEBUG]:-UPDATE gp_fault_strategy
~~~

That is a complete gprecoverseg run. After it finishes, the affected Primary and Mirror are in the 'r' state, which means data resynchronization is in progress.
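To follow the resynchronization from the Master, one can poll gp_segment_configuration until the pair leaves the 'r' state and returns to 's', the mode the healthy cluster showed earlier. A minimal sketch, again with the example's master port 3007 and a placeholder database name:

~~~
# Show every Segment that is not yet back in synchronized mode ('s'):
psql -p 3007 -d postgres -c \
  "SELECT content, dbid, role, mode, status
     FROM gp_segment_configuration
    WHERE mode <> 's'
    ORDER BY content, role;"
~~~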
Let us now look at the detailed steps and the underlying mechanism.

## Implementation Details

The example above left a few open questions:

* during the gprecoverseg run, the first attempt to get the Segments' status failed;
* the second attempt queried one Segment fewer than the first;
* the instance "-h host2 -p 3013" was checked separately.

These are easy to answer once the mechanism is understood, and a good way to understand it is to walk through the execution steps. From the code, the rough sequence is as follows.

### Parameter handling

GP's scripts rely on quite a few environment variables, and different scripts use them in slightly different places. gprecoverseg, for instance, uses MASTER_DATA_DIRECTORY: it reads the Master-related information it needs (such as the port) from the directory MASTER_DATA_DIRECTORY points to and operates based on that.

The most important gprecoverseg option is "-i", which specifies the Segments to recover and can also relocate them to a different host, for example:

~~~
filespaceOrder=
host1:3012:/workspace/gpuser/3012 host2:3012:3512:/workspace/gpuser/3012
~~~

The execution details are not repeated here.

### Determining the current state of the Segments

gprecoverseg calls gp_primarymirror to send a message to each live Segment in order to determine its current state. This is a very important step and the one where problems show up most often, typically as "Unable to connect to database". There are many possible causes for this failure; the most common are:

* the corresponding Primary (or Mirror) is also down;
* the corresponding Primary is in a bad state, for example another gprecoverseg is already running (or a previous run failed and left the state inconsistent).

This step relies on the data in gp_segment_configuration: the relevant rows are first fetched from the GP Master, essentially as described in the next step.

If a Segment is marked 'd', no status request is sent to it at all.

If both the Primary and the Mirror of a pair are down, they will not both be marked 'd' (they may well both still be 'u'; when they fail at the same time, for example, FTS does not update them). Connecting to a Segment that is marked 'u' but is actually down then raises an error. That situation is beyond what gprecoverseg can handle; the only option is to restart the whole instance.

Back to the earlier questions: the first attempt failed because the killed Segment's status had not been updated yet; the second attempt queried one Segment fewer because, once the status had been updated to 'd', no connection was made to it.

After checking the connections to all Segments whose status is 'u', gprecoverseg checks, for each down Mirror, whether the corresponding primary is healthy and can be used as the source for recovery. That is the answer to the third question. The check output looks like:

~~~
stdout=''
stderr='mode: PrimarySegment
segmentState: Ready
dataState: InSync
faultType: NotInitialized
mode: PrimarySegment
segmentState: Ready
dataState: InSync
faultType: NotInitialized
'
~~~

or like this:

~~~
stdout=''
stderr='mode: PrimarySegment
segmentState: Ready
dataState: InChangeTracking
faultType: NotInitialized
mode: PrimarySegment
segmentState: Ready
dataState: InChangeTracking
faultType: NotInitialized
'
~~~

Under normal circumstances, when the Mirror fails, the Primary detects it and enters the ChangeTracking state. In this state the Primary records all changes made after the point of the switch, so that when the Mirror comes back only those changes need to be synchronized rather than a full copy every time.
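As noted above, a pair whose members are both down may still have both rows marked 'u', a situation gprecoverseg cannot repair. A small helper query (not part of gprecoverseg itself; same placeholder connection settings as before) to eyeball each content group's pair side by side, keeping in mind that the catalog can lag reality in exactly the way just described:

~~~
psql -p 3007 -d postgres <<'SQL'
-- For each content group, show the catalog status of the acting primary
-- and the acting mirror side by side.
SELECT content,
       max(CASE WHEN role = 'p' THEN status::text END) AS acting_primary_status,
       max(CASE WHEN role = 'm' THEN status::text END) AS acting_mirror_status
  FROM gp_segment_configuration
 WHERE content >= 0
 GROUP BY content
 ORDER BY content;
SQL
~~~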
### Getting Segment information from the Master

This includes IP, port, role, status, data directory, temporary space and so on, as shown below:

~~~
 dbid | content | role | preferred_role | mode | status | hostname | address | port | replication_port | oid  | fselocation
------+---------+------+----------------+------+--------+----------+---------+------+------------------+------+------------------------
    1 |      -1 | p    | p              | s    | u      | host1    | host1   | 3007 |                  | 3052 | /workspace/gpuser/3007
   10 |      -1 | m    | m              | s    | u      | host2    | host2   | 3007 |                  | 3052 | /workspace/gpuser/3007
    2 |       0 | p    | p              | s    | u      | host1    | host1   | 3008 |             3508 | 3052 | /workspace/gpuser/3008
    6 |       0 | m    | m              | s    | u      | host2    | host2   | 3014 |             3514 | 3052 | /workspace/gpuser/3014
    3 |       1 | p    | p              | s    | u      | host1    | host1   | 3010 |             3510 | 3052 | /workspace/gpuser/3010
    7 |       1 | m    | m              | s    | u      | host2    | host2   | 3015 |             3515 | 3052 | /workspace/gpuser/3015
    4 |       2 | p    | p              | s    | u      | host2    | host2   | 3008 |             3508 | 3052 | /workspace/gpuser/3008
    8 |       2 | m    | m              | s    | u      | host1    | host1   | 3011 |             3511 | 3052 | /workspace/gpuser/3011
    5 |       3 | p    | p              | s    | u      | host2    | host2   | 3013 |             3513 | 3052 | /workspace/gpuser/3013
    9 |       3 | m    | m              | s    | u      | host1    | host1   | 3012 |             3512 | 3052 | /workspace/gpuser/3012
~~~

The IP/port/role/status/directory/filespace information gathered here is what the later steps depend on: the list of Mirrors to repair, the temporary space, and the objects to operate on.

### Recovery preparation

After all Segment information has been collected, the configuration file, the options and related settings are evaluated, including:

* Recovery targets. Determine which Segments to recover and their data source, i.e. the Primary; there may be more than one Segment to recover. Collect the information about each Segment to be recovered, including port, replication port, data directory, temporary space and filespace, as well as whether a forced (full) recovery was requested.
* Host environment. Once the list of Segments to recover is known, make sure the target host environment is usable, checking for possible conflicts such as ports or directories already in use.

If no target host is specified, one of the existing hosts is chosen.

### Recovery

The recovery steps are:

* stop the failed Mirror and clean up its shared memory;
* make sure the Segments to be recovered are marked 'd';
* delete data if necessary, e.g. when "-F" is given;
* pack, compress and copy the data to the target location;
* ignore SIGINT (SIG_IGN), update the catalog, then restore the SIGINT handler.

After these steps the Segment has been recovered either in place or onto another host.

### re-balance

After the Segments have been recovered, the Segments that failed over to their Mirrors because of a Primary crash do not switch back on their own. The load can then stay skewed and hurt performance, so a "re-balance" is needed:

~~~
gprecoverseg -r
~~~

This command switches each Segment's role back to its preferred_role, keeping the roles balanced across the cluster so that no host ends up running too many Primaries and becomes a performance bottleneck.
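Before or after running gprecoverseg -r, it can be useful to check whether any Segment is currently acting in a role other than its preferred one; a minimal catalog check, with the same placeholder connection settings as in the earlier sketches:

~~~
# Segments whose current role differs from their preferred role:
psql -p 3007 -d postgres -c \
  "SELECT content, dbid, role, preferred_role, hostname, port
     FROM gp_segment_configuration
    WHERE role <> preferred_role;"
~~~

An empty result means every Segment is already in its preferred role and no re-balance is needed.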