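The 40 GB SSD on my cloud server filled up quickly. After I cleaned up the log files produced by my applications, the freed space filled up again just as fast, so I went looking for the culprit.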
du -sh /*
This command shows how much disk space each entry under the root directory uses.
# du -sh /*
15M     /bin
57M     /boot
0       /dev
227M    /etc
24K     /home
0       /initrd.img.old
490M    /lib
4.0K    /lib64
16K     /lost+found
8.0K    /media
4.0K    /meta.js
4.0K    /mnt
16K     /opt
du: cannot access '/proc/17141/task/17141/fd/4': No such file or directory
du: cannot access '/proc/17141/task/17141/fdinfo/4': No such file or directory
du: cannot access '/proc/17141/fd/4': No such file or directory
du: cannot access '/proc/17141/fdinfo/4': No such file or directory
0       /proc
3.2G    /root
85M     /run
13M     /sbin
8.0K    /snap
8.0K    /srv
0       /sys
640K    /tmp
1.9G    /usr
31G     /var
0       /vmlinuz.old
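The listing above is in glob order, so the big entries have to be spotted by eye. A minimal sketch of a helper that sorts the result and hides the "cannot access" noise follows. The function name top_usage is my own, not part of any tool, and it reports sizes in KiB (du -k) so that a plain numeric sort works everywhere.

```shell
#!/bin/sh
# top_usage is a hypothetical helper name, not a standard command.
# usage: top_usage [DIR] [COUNT]
# Prints the COUNT largest entries directly under DIR, biggest first.
top_usage() {
    dir=${1:-/}
    count=${2:-10}
    # Drop one trailing slash so "/" expands to /* instead of //*
    dir=${dir%/}
    # -s: one total per entry; -k: sizes in KiB so "sort -rn" orders them.
    # 2>/dev/null hides errors for files that vanish mid-scan (e.g. in /proc).
    # Note: the * glob does not match hidden (dot) entries.
    du -sk "$dir"/* 2>/dev/null | sort -rn | head -n "$count"
}
```

Calling top_usage /var 5 would print the five largest entries under /var. On systems with GNU coreutils, du -sh piped into sort -rh gives the same ordering with human-readable sizes.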
It looks like /var is eating almost everything, so I kept digging.
# cd var
# du -sh ./*
860K    ./backups
109M    ./cache
4.0K    ./crash
30G     ./lib
4.0K    ./local
0       ./lock
642M    ./log
4.0K    ./mail
4.0K    ./opt
0       ./run
4.0K    ./snap
28K     ./spool
4.0K    ./tmp
Nearly all of it is under lib, so I kept digging.
# cd lib
# du -sh ./*
12K     ./apparmor
182M    ./apt
636K    ./containerd
8.0K    ./dbus
6.2M    ./denyhosts
20K     ./dhcp
30G     ./docker
8.0K    ./docker-engine
30M     ./dpkg
4.0K    ./git
8.0K    ./initramfs-tools
4.0K    ./initscripts
4.0K    ./insserv
52K     ./jcloud
12K     ./locales
8.0K    ./logrotate
0       ./lxcfs
4.0K    ./lxd
8.0K    ./mdadm
4.0K    ./misc
8.0K    ./ntp
4.0K    ./openssh-known-hosts
4.0K    ./os-prober
28K     ./pam
4.0K    ./plymouth
28K     ./polkit-1
4.0K    ./python
4.0K    ./resolvconf
12K     ./sgml-base
108K    ./snapd
8.0K    ./sudo
328K    ./systemd
8.0K    ./ubuntu-release-upgrader
76K     ./ucf
12K     ./update-manager
20K     ./update-notifier
4.0K    ./update-rc.d
8.0K    ./urandom
8.0K    ./ureadahead
544K    ./usbutils
8.0K    ./vim
12K     ./xml-core
Nearly all of it is under docker, so I kept digging.
# cd docker
# du -sh ./*
20K     ./builder
72K     ./buildkit
26G     ./containers
17M     ./image
128K    ./network
4.3G    ./overlay2
20K     ./plugins
4.0K    ./runtimes
4.0K    ./swarm
4.0K    ./tmp
4.0K    ./trust
140K    ./volumes
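At this point I could roughly guess that Docker's containers were consuming the space. I still assumed, though, that the space was going to temporary files the containers had written inside their own file systems, so I checked:
# docker system df
TYPE            TOTAL   ACTIVE  SIZE      RECLAIMABLE
Images          18      12      3.986GB   1.936GB (48%)
Containers      12      11      331.5kB   0B (0%)
Local Volumes   11      1       368B      368B (100%)
Build Cache     0       0       0B        0B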
It turned out the containers had written very little to their writable layers (only 331.5kB in total), which made sense because I had been careful to keep them stateless. So I kept digging.
# cd containers
# du -sh ./*
172K    ./1711518f4d996fa85abcb7f8144a7b69fd802e026f06c998fc92fdbff6c292d7
68K     ./1d83a40a3953211f79254c2322f2ea6ad2138597ac8a38b8f5dbb77fa05bff40
187M    ./23340aa45da94ec0e42377598130adae1de8a99f282473b526f44ae9f140c0f3
40K     ./2ab9bddd896ee1bb9f756e57235697716e16b62157f3bee5b754c994c632b826
11M     ./3e98c6fc96117362962499063fb8442d96c45a35a38e874b44f5ca6c66a62b04
12G     ./4835ab725d9bf832f6a24ae3d4fab0fe9393352bdaf76712b8c34f09ab78558b
14G     ./68a8e0053aaa6e19c1f3a1bde1af378a95812d764abcb4229ccb77cb5bbe1bee
42M     ./71a7fcb449d3f36fc85ca243ce2ef2228c27dff803615916ec64ba802cf21ae7
156K    ./aa99ace4db6c8f9e5c69297cc64b21954c3bb3b0e24fbbfab64674c12a18809c
44K     ./e1435a44026a62d77256995e909d2283d599fe540bd0af60a18df8d7ae23db95
44K     ./eb2fbf65dd028e23cd2009fe81a6d99cc8da9e949ac7d1b1e6c74ee8410ae866
40K     ./f62642f5b84215e553b4f22e1dedd2b60ef27b2f85662e82444d5d0359aaa7f1
Two containers had each eaten more than ten gigabytes, so I looked inside the first one, the one using 12G.
# cd ./4835ab725d9bf832f6a24ae3d4fab0fe9393352bdaf76712b8c34f09ab78558b/
# du -sh ./*
12G     ./4835ab725d9bf832f6a24ae3d4fab0fe9393352bdaf76712b8c34f09ab78558b-json.log
4.0K    ./checkpoints
4.0K    ./config.v2.json
4.0K    ./hostconfig.json
4.0K    ./hostname
4.0K    ./hosts
4.0K    ./mounts
4.0K    ./resolv.conf
4.0K    ./resolv.conf.hash
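Once you know the space sits in the *-json.log files, the directory-by-directory descent can be skipped. Here is a rough sketch that lists those files by size in one pass. The function name largest_container_logs is made up for this post, and the default path assumes Docker's data root is /var/lib/docker as on my server.

```shell
#!/bin/sh
# largest_container_logs is a hypothetical helper name.
# usage: largest_container_logs [ROOT] [COUNT]
# Prints size in KiB and path of the COUNT largest container log files.
largest_container_logs() {
    # Assumption: default Docker data root; pass another path if yours differs.
    root=${1:-/var/lib/docker/containers}
    count=${2:-5}
    # Each container directory holds a <container-id>-json.log file;
    # the trailing * also picks up rotated copies if any are present.
    find "$root" -type f -name '*-json.log*' -exec du -k {} + 2>/dev/null |
        sort -rn | head -n "$count"
}
```

Reading /var/lib/docker normally requires root. The directory name is the full container ID, so docker inspect --format '{{.Name}}' followed by that ID should tell you which container a large log belongs to.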
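So this log file was eating the space. Then it hit me: every time I cleaned up logs, I only deleted the log files that the containers wrote to the host file system through -v mounts. But whatever is printed with console.log or System.out.println is also recorded by Docker so that the docker logs command can show it. So I removed the containers and, when running docker run again, added log limits: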
--log-opt max-size=10m --log-opt max-file=1
The available log options are:
--log-driver json-file             # logging driver
--log-opt max-size=[0-9]+[k|m|g]   # maximum size of each log file
--log-opt max-file=[0-9]+          # maximum number of log files to keep
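Passing these flags on every docker run is easy to forget. The same limits can be set as daemon-wide defaults in /etc/docker/daemon.json, which is not what I did here, so treat the following as a sketch. The function name write_log_defaults is hypothetical; it writes the JSON to whatever path you give it so you can review it before merging it into the real config. Note that the values in log-opts must be strings.

```shell
#!/bin/sh
# write_log_defaults is a hypothetical helper name.
# usage: write_log_defaults FILE
# Writes a daemon.json fragment with the same limits used above.
write_log_defaults() {
    target=$1
    cat > "$target" <<'EOF'
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "1"
  }
}
EOF
}

# Example (writes to a scratch file, not to /etc/docker/daemon.json):
# write_log_defaults ./daemon.json.example
```

The daemon has to be restarted to pick up the change, and the defaults only apply to containers created afterwards. Existing containers keep the logging settings they were created with.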
Going back into the containers folder and checking disk usage again:
# cd containers
# du -sh ./*
172K    ./1711518f4d996fa85abcb7f8144a7b69fd802e026f06c998fc92fdbff6c292d7
68K     ./1d83a40a3953211f79254c2322f2ea6ad2138597ac8a38b8f5dbb77fa05bff40
187M    ./23340aa45da94ec0e42377598130adae1de8a99f282473b526f44ae9f140c0f3
40K     ./2ab9bddd896ee1bb9f756e57235697716e16b62157f3bee5b754c994c632b826
52K     ./33f7e7a19df5afcbed58d464b74ee9315471214f84d3cb9a03e4fcded4af01be
11M     ./3e98c6fc96117362962499063fb8442d96c45a35a38e874b44f5ca6c66a62b04
42M     ./71a7fcb449d3f36fc85ca243ce2ef2228c27dff803615916ec64ba802cf21ae7
156K    ./aa99ace4db6c8f9e5c69297cc64b21954c3bb3b0e24fbbfab64674c12a18809c
128K    ./c28ef25042c0781303a4d4ce9b7af0a4274c823bd3b24e46f96f5a0734467e9a
44K     ./e1435a44026a62d77256995e909d2283d599fe540bd0af60a18df8d7ae23db95
44K     ./eb2fbf65dd028e23cd2009fe81a6d99cc8da9e949ac7d1b1e6c74ee8410ae866
40K     ./f62642f5b84215e553b4f22e1dedd2b60ef27b2f85662e82444d5d0359aaa7f1
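As you can see, when a container is removed its folder is removed with it, so the log files of more than ten gigabytes each were deleted as well.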
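If a container cannot be removed and recreated right away, a commonly used stopgap is to empty its log file in place. I did not do this here, and as far as I know it is not an officially documented procedure, so treat it as an assumption that it is safe for your Docker version. The function name truncate_log is my own.

```shell
#!/bin/sh
# truncate_log is a hypothetical helper name.
# usage: truncate_log FILE
# Empties FILE without deleting it.
truncate_log() {
    log=$1
    # Refuse anything that is not an existing regular file
    [ -f "$log" ] || { echo "not a regular file: $log" >&2; return 1; }
    # ": >" truncates to zero length and keeps the same inode, so a process
    # that already has the file open can keep writing to it.
    : > "$log"
}
```

Deleting the file with rm instead would not free the space while the Docker daemon still holds it open. Truncating only buys time: without max-size and max-file the log grows back.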
References
https://blog.csdn.net/weixin_30248399/article/details/99491342