MongoDB in Practice: Best Practices for Data Holes (Code Snippets)


Problem background: One day a colleague from the development team came over to report that the MongoDB data files were so large they were about to fill the disk, with one particular db taking up the most space (in the production environment that db actually holds very little data).

Analysis: The development environment sees heavy test insert/update/delete traffic. Because MongoDB writes sequentially (space is appended, not returned), storageSize does not shrink after we delete the obsolete data, which leaves large data holes.

Solutions

1. Use MongoDB's built-in compact command:

  db.collectionName.runCommand("compact")
 This compaction works at the collection level, so it only removes fragmentation inside one collection; since MongoDB allocates space at the db level, its effect is limited.
 It also runs online, and the heavy disk IO it generates can hurt live traffic.
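
To sweep a whole database collection by collection, here is a minimal shell sketch (assuming the legacy mongo shell and the test db used in this post; the printed fields are just for illustration):

  use test
  db.getCollectionNames().forEach(function (name) {
      var before = db.getCollection(name).stats().storageSize;
      var res = db.runCommand({ compact: name });   // heavy disk IO while it runs; check res.ok
      var after = db.getCollection(name).stats().storageSize;
      print(name + ": " + before + " -> " + after + " bytes, ok=" + res.ok);
  });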

2. Shrink node by node with the replica set (a rolling resync, offline)

1. Check that every node in the set is running normally (ps -ef | grep mongod).
2. Log in to the primary to be processed, step it down with rs.stepDown(), and confirm via rs.status() that it has been demoted.
3. Once the switchover succeeds, stop that node and delete its data files, e.g.: rm -fr /mongodb/data/*
4. Start the node again, e.g.: /mongodb/bin/mongod --config /mongodb/mongodb.conf
5. Watch the log to follow the process and the data-sync progress.
6. After the data sync completes, run rs.stepDown() on the new primary to hand the primary role back.

Done this way, the reclaim rate is 100% and the data ends up completely defragmented. The cost is extra operational work, and when the replica set has only two data-bearing members you also run a single-point-of-failure window for a while (exactly that happened in the experiment below). Still, the before/after difference from this offline shrink is dramatic.
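
Steps 2 and 6 both hinge on a clean stepDown. A slightly more defensive variant, as a minimal sketch (assuming a MongoDB 3.2-style shell; the 120/60 second values are illustrative):

  // on the current primary
  rs.stepDown(120, 60)   // stay demoted for at least 120s; give a secondary up to 60s to catch up
  rs.status().members.forEach(function (m) {
      print(m.name + " " + m.stateStr);   // this node should now report SECONDARY
  });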

System environment (a lab setup; for simplicity there is only one secondary. Do not do this in production!!!)

Primary: 192.168.2.130:27017
Secondary: 192.168.2.138:27017

1. Install mongo

2. Write the config file (on both machines)

[root@mongo bin]# vi mongodb.conf

dbpath=/u01/mongodb/data
port=27017
# oplog size in MB
oplogSize = 2048
logpath=/u01/mongodb/logs/mongodb.log
logappend = true
fork = true
# journaling is disabled only to speed up the lab; keep it on in production
nojournal = true
bind_ip=0.0.0.0
shardsvr=true
# must match the "_id" used in rs.initiate() below
replSet=repl1

3. Start mongod

[root@mongo bin]# ./mongod --config ./mongodb.conf

4. Initialize the replica set

[root@mongo bin]# ./mongo 192.168.2.130:27017

> cfg = {"_id":"repl1", "members":[{"_id":0, "host":"192.168.2.130:27017"}, {"_id":1, "host":"192.168.2.138:27017"}]};
{
        "_id" : "repl1",
        "members" : [
                {
                        "_id" : 0,
                        "host" : "192.168.2.130:27017"
                },
                {
                        "_id" : 1,
                        "host" : "192.168.2.138:27017"
                }
        ]
}
>
> rs.initiate(cfg)
{ "ok" : 1 }

5. Insert data

# Prep work: install setuptools, pip, and pymongo
[root@mongo mongodb]# tar zxvf setuptools-0.6c11.tar.gz
[root@mongo mongodb]# cd setuptools-0.6c11
[root@mongo mongodb]# python setup.py install

[root@mongo mongodb]# wget "https://pypi.python.org/packages/source/p/pip/pip-1.5.4.tar.gz#md5=834b2904f92d46aaa333267fb1c922bb" --no-check-certificate
[root@mongo mongodb]# tar zxvf pip-1.5.4.tar.gz
[root@mongo mongodb]# cd pip-1.5.4
[root@mongo pip-1.5.4]# python setup.py install


[root@mongo mongodb]# pip install pymongo

# The insert script
vi insert.py
#!/usr/bin/python
# insert 5,000,000 random student documents (Python 2 + pymongo)
import random
from pymongo import MongoClient

client = MongoClient('192.168.2.130', 27017)   # connect to the primary

test = client.test
students = test.students
students_count = students.count()
print "student count is ", students_count

for i in xrange(0, 5000000):
    classid = random.randint(1, 4)
    age = random.randint(10, 30)
    student = {"classid": classid, "age": age, "name": "fujun"}
    students.insert_one(student)
    print i

students_count = students.count()
print "student count is ", students_count

# Run the insert; you can open several windows and run it in parallel
[root@mongo mongodb]# python insert.py

# About 20 million documents inserted
repl1:PRIMARY> db.students.find().count() 
21511778

# Check the replica set status
repl1:PRIMARY> rs.status()
{
        "set" : "repl1",
        "date" : ISODate("2018-03-17T02:42:01.126Z"),
        "myState" : 1,
        "term" : NumberLong(3),
        "heartbeatIntervalMillis" : NumberLong(2000),
        "members" : [
                {
                        "_id" : 0,
                        "name" : "192.168.2.130:27017",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 60089,
                        "optime" : {
                                "ts" : Timestamp(1521231587, 19),
                                "t" : NumberLong(3)
                        },
                        "optimeDate" : ISODate("2018-03-16T20:19:47Z"),
                        "electionTime" : Timestamp(1521213635, 1),
                        "electionDate" : ISODate("2018-03-16T15:20:35Z"),
                        "configVersion" : 1,
                        "self" : true
                },
                {
                        "_id" : 1,
                        "name" : "192.168.2.138:27017",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 40892,
                        "optime" : {
                                "ts" : Timestamp(1521231587, 19),
                                "t" : NumberLong(3)
                        },
                        "optimeDate" : ISODate("2018-03-16T20:19:47Z"),
                        "lastHeartbeat" : ISODate("2018-03-17T02:42:01Z"),
                        "lastHeartbeatRecv" : ISODate("2018-03-17T02:42:00.376Z"),
                        "pingMs" : NumberLong(1),
                        "syncingTo" : "192.168.2.130:27017",
                        "configVersion" : 1
                }
        ],
        "ok" : 1
}


# Check storageSize and fileSize:
repl1:PRIMARY> db.stats()
{
        "db" : "test",
        "collections" : 1,
        "objects" : 21511778,
        "avgObjSize" : 60,
        "dataSize" : 1290706680,
        "storageSize" : 410132480,
        "numExtents" : 0,
        "indexes" : 1,
        "indexSize" : 217350144,
        "ok" : 1
}

# storageSize = dataSize + the space of deleted documents (after documents are deleted, storageSize does not shrink).
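
On WiredTiger, the size of such a hole can also be read straight from the collection statistics; a quick sketch (using the block-manager counter that WiredTiger reports in db.collection.stats()):

  repl1:PRIMARY> var s = db.students.stats()
  repl1:PRIMARY> s.wiredTiger["block-manager"]["file bytes available for reuse"]   // bytes freed by deletes but still held by the file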

6. Create data holes

# Delete 3/4 of the data
repl1:PRIMARY> db.students.remove({"classid": {"$lt": 4}})

WriteResult({ "nRemoved" : 9050980 })

# About 5.37 million documents remain
repl1:PRIMARY> db.students.find().count()
5376340


# storageSize has not gone down.
repl1:PRIMARY> db.stats()
{
        "db" : "test",
        "collections" : 1,
        "objects" : 5376340,
        "avgObjSize" : 60,
        "dataSize" : 322580400,
        "storageSize" : 425127936,
        "numExtents" : 0,
        "indexes" : 1,
        "indexSize" : 219590656,
        "ok" : 1
}

7. Step down and stop the primary

repl1:PRIMARY> rs.stepDown()
2018-03-17T13:46:49.588+0800 E QUERY    [thread1] Error: error doing query: failed: network error while attempting to run command 'replSetStepDown' on host '192.168.2.130:27017'  :
DB.prototype.runCommand@src/mongo/shell/db.js:135:1
DB.prototype.adminCommand@src/mongo/shell/db.js:153:16
rs.stepDown@src/mongo/shell/utils.js:1202:12
@(shell):1:1

2018-03-17T13:46:49.600+0800 I NETWORK  [thread1] trying reconnect to 192.168.2.130:27017 (192.168.2.130) failed
2018-03-17T13:46:49.618+0800 I NETWORK  [thread1] reconnect 192.168.2.130:27017 (192.168.2.130) ok
repl1:SECONDARY> 

# Stop 192.168.2.130
[root@mongo mongodb]# mongod --shutdown --config=/u01/mongodb/mongodb.conf
killing process with pid: 32322

8. Delete the data files on 192.168.2.130

[root@mongo data]# rm -rf /u01/mongodb/data/*

9. Restart 192.168.2.130

[root@mongo data]# mongod --config=/u01/mongodb/mongodb.conf
about to fork child process, waiting until server is ready for connections.
forked process: 48079
child process started successfully, parent exiting

10. Watch the log to track the process and the data-sync progress

[root@mongo mongodb]# cd /u01/mongodb/logs/
[root@mongo logs]# tail -50f mongodb.log

2018-03-17T13:53:31.077+0800 I REPL     [replExecDBWorker-0] Starting replication applier threads
2018-03-17T13:53:31.077+0800 I REPL     [ReplicationExecutor] 
2018-03-17T13:53:31.078+0800 I REPL     [ReplicationExecutor] ** WARNING: This replica set is running without journaling enabled but the 
2018-03-17T13:53:31.078+0800 I REPL     [ReplicationExecutor] **          writeConcernMajorityJournalDefault option to the replica set config 
2018-03-17T13:53:31.078+0800 I REPL     [ReplicationExecutor] **          is set to true. The writeConcernMajorityJournalDefault 
2018-03-17T13:53:31.078+0800 I REPL     [ReplicationExecutor] **          option to the replica set config must be set to false 
2018-03-17T13:53:31.078+0800 I REPL     [ReplicationExecutor] **          or w:majority write concerns will never complete.
2018-03-17T13:53:31.078+0800 I REPL     [ReplicationExecutor] 
2018-03-17T13:53:31.078+0800 I REPL     [ReplicationExecutor] New replica set config in use: { _id: "repl1", version: 1, protocolVersion: 1, members: [ { _id: 0, host: "192.168.2.130:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 }, { _id: 1, host: "192.168.2.138:27017", arbiterOnly: false, buildIndexes: true, hidden: false, priority: 1.0, tags: {}, slaveDelay: 0, votes: 1 } ], settings: { chainingAllowed: true, heartbeatIntervalMillis: 2000, heartbeatTimeoutSecs: 10, electionTimeoutMillis: 10000, getLastErrorModes: {}, getLastErrorDefaults: { w: 1, wtimeout: 0 }, replicaSetId: ObjectId('5aab9f92276e84b589038188') } }
2018-03-17T13:53:31.078+0800 I REPL     [ReplicationExecutor] This node is 192.168.2.130:27017 in the config
2018-03-17T13:53:31.078+0800 I REPL     [ReplicationExecutor] transition to STARTUP2
2018-03-17T13:53:31.079+0800 I REPL     [rsSync] ******
2018-03-17T13:53:31.079+0800 I REPL     [rsSync] creating replication oplog of size: 2048MB...
2018-03-17T13:53:31.298+0800 I REPL     [ReplicationExecutor] Member 192.168.2.138:27017 is now in state SECONDARY
2018-03-17T13:53:31.915+0800 I STORAGE  [rsSync] Starting WiredTigerRecordStoreThread local.oplog.rs
2018-03-17T13:53:31.915+0800 I STORAGE  [rsSync] The size storer reports that the oplog contains 0 records totaling to 0 bytes
2018-03-17T13:53:31.916+0800 I STORAGE  [rsSync] Scanning the oplog to determine where to place markers for truncation
2018-03-17T13:53:34.635+0800 I REPL     [rsSync] ******
2018-03-17T13:53:34.635+0800 I REPL     [rsSync] initial sync pending
2018-03-17T13:53:37.219+0800 I REPL     [ReplicationExecutor] syncing from: 192.168.2.138:27017
2018-03-17T13:53:39.164+0800 I REPL     [rsSync] initial sync drop all databases
2018-03-17T13:53:39.165+0800 I STORAGE  [rsSync] dropAllDatabasesExceptLocal 1
2018-03-17T13:53:39.165+0800 I REPL     [rsSync] initial sync clone all databases
2018-03-17T13:53:39.281+0800 I REPL     [rsSync] fetching and creating collections for test
2018-03-17T13:53:47.061+0800 I REPL     [rsSync] initial sync cloning db: test
2018-03-17T14:19:07.020+0800 I STORAGE  [rsSync] clone test.students 70015
2018-03-17T14:19:07.021+0800 I STORAGE  [rsSync] 70105 objects cloned so far from collection test.students
2018-03-17T14:20:34.083+0800 I STORAGE  [rsSync] 139955 objects cloned so far from collection test.students
2018-03-17T14:20:34.084+0800 I STORAGE  [rsSync] clone test.students 140031
2018-03-17T14:21:38.094+0800 I STORAGE  [rsSync] 489459 objects cloned so far from collection test.students
2018-03-17T14:21:38.094+0800 I STORAGE  [rsSync] clone test.students 489471
2018-03-17T14:22:51.519+0800 I STORAGE  [rsSync] 769113 objects cloned so far from collection test.students
2018-03-17T14:22:51.519+0800 I STORAGE  [rsSync] clone test.students 769151
2018-03-17T14:24:07.410+0800 I STORAGE  [rsSync] 978790 objects cloned so far from collection test.students
2018-03-17T14:24:07.410+0800 I STORAGE  [rsSync] clone test.students 978815
2018-03-17T14:25:10.198+0800 I STORAGE  [rsSync] 1468121 objects cloned so far from collection test.students
2018-03-17T14:25:10.198+0800 I STORAGE  [rsSync] clone test.students 1468159
2018-03-17T14:25:44.018+0800 I NETWORK  [initandlisten] connection accepted from 127.0.0.1:48042 #7 (4 connections now open)
2018-03-17T14:26:01.100+0800 I NETWORK  [conn5] end connection 127.0.0.1:48032 (3 connections now open)
2018-03-17T14:26:01.100+0800 I NETWORK  [conn7] end connection 127.0.0.1:48042 (3 connections now open)
2018-03-17T14:26:16.318+0800 I STORAGE  [rsSync] 1887602 objects cloned so far from collection test.students
2018-03-17T14:26:16.318+0800 I STORAGE  [rsSync] clone test.students 1887615
2018-03-17T14:27:19.201+0800 I STORAGE  [rsSync] clone test.students 2307071
2018-03-17T14:27:19.201+0800 I STORAGE  [rsSync] 2307083 objects cloned so far from collection test.students
2018-03-17T14:28:22.898+0800 I STORAGE  [rsSync] clone test.students 3145855
2018-03-17T14:28:22.899+0800 I STORAGE  [rsSync] 3145918 objects cloned so far from collection test.students
2018-03-17T14:29:25.991+0800 I STORAGE  [rsSync] 3565272 objects cloned so far from collection test.students
2018-03-17T14:29:25.991+0800 I STORAGE  [rsSync] clone test.students 3565311
2018-03-17T14:30:28.780+0800 I STORAGE  [rsSync] 3984753 objects cloned so far from collection test.students
2018-03-17T14:30:28.780+0800 I STORAGE  [rsSync] clone test.students 3984767
2018-03-17T14:31:28.001+0800 I STORAGE  [rsSync] clone test.students 4667519
2018-03-17T14:31:29.457+0800 I STORAGE  [rsSync] 4683761 objects cloned so far from collection test.students
2018-03-17T14:32:34.884+0800 I STORAGE  [rsSync] clone test.students 5103231
2018-03-17T14:32:34.884+0800 I STORAGE  [rsSync] 5103242 objects cloned so far from collection test.students
2018-03-17T14:33:46.017+0800 I STORAGE  [rsSync] 5452746 objects cloned so far from collection test.students
2018-03-17T14:33:46.017+0800 I STORAGE  [rsSync] clone test.students 5452799
2018-03-17T14:34:47.770+0800 I STORAGE  [rsSync] clone test.students 5872127
2018-03-17T14:34:47.770+0800 I STORAGE  [rsSync] 5872227 objects cloned so far from collection test.students
2018-03-17T14:35:52.885+0800 I STORAGE  [rsSync] 6291581 objects cloned so far from collection test.students
2018-03-17T14:35:52.885+0800 I STORAGE  [rsSync] clone test.students 6291583
2018-03-17T14:36:57.050+0800 I STORAGE  [rsSync] 6571235 objects cloned so far from collection test.students
2018-03-17T14:36:57.050+0800 I STORAGE  [rsSync] clone test.students 6571263
2018-03-17T14:37:59.514+0800 I STORAGE  [rsSync] 8248905 objects cloned so far from collection test.students
2018-03-17T14:37:59.515+0800 I STORAGE  [rsSync] clone test.students 8248959
2018-03-17T14:38:37.698+0800 I INDEX    [rsSync] build index on: test.students properties: { v: 1, key: { _id: 1 }, name: "_id_", ns: "test.students" }
2018-03-17T14:38:37.698+0800 I INDEX    [rsSync]         building index using bulk method; build may temporarily use up to 500 megabytes of RAM
2018-03-17T14:38:40.000+0800 I -        [rsSync]   Index Build: 1511900/12074974 12%
2018-03-17T14:38:43.000+0800 I -        [rsSync]   Index Build: 3571300/12074974 29%
2018-03-17T14:38:46.000+0800 I -        [rsSync]   Index Build: 5640600/12074974 46%
2018-03-17T14:38:49.001+0800 I -        [rsSync]   Index Build: 7599000/12074974 62%
2018-03-17T14:38:52.001+0800 I -        [rsSync]   Index Build: 9471000/12074974 78%
2018-03-17T14:38:55.001+0800 I -        [rsSync]   Index Build: 11341800/12074974 93%
2018-03-17T14:39:27.000+0800 I -        [rsSync]   Index: (2/3) BTree Bottom Up Progress: 8961500/12074974 74%
2018-03-17T14:39:30.222+0800 I INDEX    [rsSync]         done building bottom layer, going to commit
2018-03-17T14:39:31.465+0800 I NETWORK  [initandlisten] connection accepted from 192.168.2.138:50802 #8 (3 connections now open)
2018-03-17T14:39:31.960+0800 I COMMAND  [conn1] command local.replset.election command: replSetRequestVotes { replSetRequestVotes: 1, setName: "repl1", dryRun: false, term: 212, candidateIndex: 1, configVersion: 1, lastCommittedOp: { ts: Timestamp 1521264812000|7795, t: 193 } } keyUpdates:0 writeConflicts:0 numYields:0 reslen:63 locks:{ Global: { acquireCount: { r: 4, w: 2 } }, Database: { acquireCount: { r: 1, W: 2 } }, Collection: { acquireCount: { r: 1 } } } protocol:op_command 3219ms
2018-03-17T14:39:31.983+0800 I INDEX    [rsSync] build index done.  scanned 12074974 total records. 54 secs
2018-03-17T14:39:31.984+0800 I REPL     [rsSync] initial sync data copy, starting syncup
2018-03-17T14:39:31.984+0800 I REPL     [rsSync] oplog sync 1 of 3
2018-03-17T14:39:32.797+0800 I REPL     [ReplicationExecutor] syncing from: 192.168.2.138:27017
2018-03-17T14:39:33.749+0800 I REPL     [ReplicationExecutor] Member 192.168.2.138:27017 is now in state PRIMARY
2018-03-17T14:46:16.481+0800 I REPL     [rsSync] oplog sync 2 of 3
2018-03-17T14:46:16.708+0800 I REPL     [rsSync] initial sync building indexes
2018-03-17T14:46:16.708+0800 I REPL     [rsSync] initial sync cloning indexes for : test
2018-03-17T14:46:17.236+0800 I STORAGE  [rsSync] copying indexes for: { name: "students", options: {} }
2018-03-17T14:46:19.079+0800 I REPL     [rsSync] oplog sync 3 of 3
2018-03-17T14:46:19.083+0800 I REPL     [rsSync] initial sync finishing up
2018-03-17T14:46:19.483+0800 I REPL     [rsSync] initial sync done
2018-03-17T14:46:19.491+0800 I REPL     [rsSync] initial sync succeeded after 1 attempt(s).
2018-03-17T14:46:19.491+0800 I REPL     [ReplicationExecutor] transition to RECOVERING
2018-03-17T14:46:19.495+0800 I REPL     [ReplicationExecutor] transition to SECONDARY

The log shows the sync took a long time. The reasons:

1. Before 192.168.2.130 stepped down, replication had not finished; as a result, 192.168.2.138 was not promoted to primary right after the demotion, which left the set with a single point of failure!
2. After 130's data was deleted and the node restarted, 138 was still applying the replicated delete operations while 130 sat in the STARTUP2 state.
3. Only after 138 finished applying the deletes did it become primary (and because 130's data files had been removed before replication completed, the data 138 ended up with was not the latest!).

repl1:PRIMARY> db.students.find().count()
12074974

The query result shows the data is not the latest: there should be only about 5 million documents. It also proves that mongo replication is asynchronous; before demoting a primary you must check the replication state of the secondaries!!!
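
Before any stepDown, the secondary's lag can be checked in seconds; a minimal sketch (shell helpers as in MongoDB 3.2; the exact output format is illustrative):

  repl1:PRIMARY> rs.printSlaveReplicationInfo()   // per-secondary lag behind the primary's oplog
  repl1:PRIMARY> rs.status().members.map(function (m) { return m.name + " " + m.stateStr + " " + m.optimeDate; })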

I then ran the deletion again and this time tracked the secondary's sync progress first; the switchover became very short, and the drop in file size was impressive!

11. Check storageSize again: it has dropped sharply (from about 425 MB to about 102 MB)

repl1:PRIMARY> db.stats()
{
        "db" : "test",
        "collections" : 2,
        "objects" : 5376341,
        "avgObjSize" : 60.000029946017186,
        "dataSize" : 322580621,
        "storageSize" : 102256640,
        "numExtents" : 0,
        "indexes" : 2,
        "indexSize" : 54321152,
        "ok" : 1
}

12. Step down 192.168.2.138

repl1:PRIMARY> rs.stepDown()
2018-03-17T02:51:29.554-0400 E QUERY    [thread1] Error: error doing query: failed: network error while attempting to run command 'replSetStepDown' on host '127.0.0.1:27017'  :
DB.prototype.runCommand@src/mongo/shell/db.js:135:1
DB.prototype.adminCommand@src/mongo/shell/db.js:153:16
rs.stepDown@src/mongo/shell/utils.js:1202:12
@(shell):1:1

2018-03-17T02:51:29.652-0400 I NETWORK  [thread1] trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed
2018-03-17T02:51:29.791-0400 I NETWORK  [thread1] reconnect 127.0.0.1:27017 (127.0.0.1) ok

13. Repeat the steps used on 130 to shrink the space on 138.
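
After both nodes have been rebuilt, a final sanity check that the set is healthy and the space is really reclaimed (a sketch; run it on either member, against the test db):

  repl1:PRIMARY> rs.status().members.map(function (m) { return m.name + " " + m.stateStr + " health=" + m.health; })
  repl1:PRIMARY> db.stats().storageSize   // should no longer be inflated by deleted documents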
