Operations Inspection Reference Manual

1. Inspection Checklist

[Image: inspection checklist]

2. Inspection Reference

2.1 CentOS Inspection

1> Identity authentication: make sure root is the only account with UID 0. Any other account with UID 0 should be deleted or assigned a new UID. To check:

awk -F: '($3 == 0) { print $1 }' /etc/passwd | grep -v '^root$'
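The same check can be wrapped in a small function so that an inspection script can act on the result. This is only a minimal sketch: the function names list_extra_uid0 and check_uid0, and the optional file argument, are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical helper: print every account other than root whose UID is 0.
# The optional argument lets the check run against a copy of the passwd file.
list_extra_uid0() {
    local file=${1:-/etc/passwd}
    awk -F: '$3 == 0 && $1 != "root" { print $1 }' "$file"
}

# Hypothetical wrapper: report the finding and return non-zero if any exist.
check_uid0() {
    local extra
    extra=$(list_extra_uid0 "$@")
    if [ -n "$extra" ]; then
        echo "Accounts other than root with UID 0:"
        echo "$extra"
        return 1
    fi
    echo "root is the only UID 0 account"
}
```

Calling check_uid0 without arguments inspects /etc/passwd.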

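2> Identity authentication: password complexity. Check the minimum password length and whether passwords must mix several character classes.

Edit /etc/security/pwquality.conf: set minlen (the minimum password length) to a value between 9 and 32, and set minclass (how many of the four character classes, that is lowercase letters, uppercase letters, digits and special characters, a password must contain) to 3 or 4. For example: minlen=10 minclass=3.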
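An inspection script could verify these two settings roughly as follows. This is a sketch under assumptions: the helper names get_setting, in_range and check_pwquality are invented for illustration, and the parser only understands plain "key = value" lines.

```shell
#!/bin/bash
# Hypothetical helper: read one "key = value" setting from a
# pwquality.conf-style file. Comment lines are skipped and the last
# uncommented occurrence is printed.
get_setting() {
    local key=$1 file=$2
    awk -F= -v k="$key" '
        /^[ \t]*#/ { next }
        { name = $1; gsub(/[ \t]/, "", name) }
        name == k { val = $2; gsub(/[ \t]/, "", val) }
        END { print val }
    ' "$file"
}

# Hypothetical helper: succeed when $1 is an integer in the range [$2, $3].
in_range() {
    [[ $1 =~ ^[0-9]+$ ]] && (( 10#$1 >= $2 && 10#$1 <= $3 ))
}

check_pwquality() {
    local file=${1:-/etc/security/pwquality.conf} rc=0 minlen minclass
    minlen=$(get_setting minlen "$file")
    minclass=$(get_setting minclass "$file")
    if in_range "$minlen" 9 32; then
        echo "minlen=$minlen OK"
    else
        echo "minlen=${minlen:-unset} NOT OK (expected 9-32)"; rc=1
    fi
    if in_range "$minclass" 3 4; then
        echo "minclass=$minclass OK"
    else
        echo "minclass=${minclass:-unset} NOT OK (expected 3 or 4)"; rc=1
    fi
    return $rc
}
```

Run check_pwquality with no argument to test the live file; the return code is non-zero when either value is missing or out of range.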

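3> Identity authentication: make sure password reuse is restricted. Preventing users from reusing recent passwords lowers the risk of password-guessing attacks.

In /etc/pam.d/password-auth and /etc/pam.d/system-auth, append the remember parameter, with a value between 5 and 24, to the end of the "password sufficient pam_unix.so" line. Leave the rest of the line unchanged; for example, just append remember=5.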
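To confirm the setting during an inspection, a script could look for the remember value on the pam_unix.so line. The sketch below is illustrative only; get_remember and check_remember are hypothetical names, and include directives in the PAM files are not followed.

```shell
#!/bin/bash
# Hypothetical helper: print the first remember= value found on an
# uncommented "password ... pam_unix.so" line of a PAM configuration file.
get_remember() {
    awk '
        /^[ \t]*#/ { next }
        $1 == "password" && /pam_unix\.so/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^remember=[0-9]+$/) { print substr($i, 10); exit }
        }
    ' "$1"
}

# Check every file given as an argument; return non-zero if any file fails.
check_remember() {
    local file value rc=0
    for file in "$@"; do
        value=$(get_remember "$file")
        if [ -n "$value" ] && [ "$value" -ge 5 ] && [ "$value" -le 24 ]; then
            echo "$file: remember=$value OK"
        else
            echo "$file: remember=${value:-unset} NOT OK (expected 5-24)"; rc=1
        fi
    done
    return $rc
}
```

Typical use would be: check_remember /etc/pam.d/password-auth /etc/pam.d/system-auth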

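4> Service configuration: make sure the SSH LogLevel is set to INFO so that logins and logouts are recorded.

Edit /etc/ssh/sshd_config and set (uncomment) the parameter as follows: LogLevel INFO

5> Service configuration: make sure SSH MaxAuthTries is set to a value between 3 and 6. A low MaxAuthTries value reduces the risk of a successful brute-force attack on the SSH server.

In /etc/ssh/sshd_config, remove the leading # from the MaxAuthTries line and set the maximum number of failed authentication attempts to a value between 3 and 6; 4 is recommended: MaxAuthTries 4

6> Service configuration: forbid SSH logins for accounts with empty passwords.

Edit /etc/ssh/sshd_config and set PermitEmptyPasswords to no: PermitEmptyPasswords no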

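7> Service configuration: set an SSH idle timeout, which lowers the risk of an unauthorized person taking over another user's SSH session.

Edit /etc/ssh/sshd_config, set ClientAliveInterval to a value between 300 and 900 (5 to 15 minutes), and set ClientAliveCountMax to a value between 0 and 3. Note that on OpenSSH 8.2 and later, ClientAliveCountMax 0 disables connection termination altogether, so use a value of at least 1 on those versions.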
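The SSH items above can be checked together by reading the configuration file. The following is a rough sketch, not a complete parser: sshd_value, in_range and check_sshd_config are hypothetical names, Match blocks and Include files are ignored, and a keyword that is not set explicitly is reported as a finding even where the built-in default would be acceptable.

```shell
#!/bin/bash
# Hypothetical helper: print the value of a keyword from an sshd_config-style
# file. Keywords are case-insensitive and sshd uses the first value it reads,
# so the first uncommented match is printed.
sshd_value() {
    awk -v k="$1" '
        /^[ \t]*#/ { next }
        tolower($1) == tolower(k) { print $2; exit }
    ' "$2"
}

# Hypothetical helper: succeed when $1 is an integer in the range [$2, $3].
in_range() {
    [[ $1 =~ ^[0-9]+$ ]] && (( 10#$1 >= $2 && 10#$1 <= $3 ))
}

# Print one line per deviation; return non-zero if there is any.
check_sshd_config() {
    local file=${1:-/etc/ssh/sshd_config} rc=0 v
    v=$(sshd_value LogLevel "$file")
    [ "$v" = "INFO" ] || { echo "LogLevel=${v:-unset}, expected INFO"; rc=1; }
    v=$(sshd_value MaxAuthTries "$file")
    in_range "$v" 3 6 || { echo "MaxAuthTries=${v:-unset}, expected 3-6"; rc=1; }
    v=$(sshd_value PermitEmptyPasswords "$file")
    [ "$v" = "no" ] || { echo "PermitEmptyPasswords=${v:-unset}, expected no"; rc=1; }
    v=$(sshd_value ClientAliveInterval "$file")
    in_range "$v" 300 900 || { echo "ClientAliveInterval=${v:-unset}, expected 300-900"; rc=1; }
    v=$(sshd_value ClientAliveCountMax "$file")
    in_range "$v" 0 3 || { echo "ClientAliveCountMax=${v:-unset}, expected 0-3"; rc=1; }
    return $rc
}
```

On a live host, sshd -T prints the effective configuration including defaults, which may be a more reliable source than parsing the file.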

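8> Security audit: make sure the rsyslog service is enabled so that logs are recorded for auditing.

ps -ef|grep -v grep|grep rsyslog

Run the following commands to enable and start the rsyslog service: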

systemctl enable rsyslog
systemctl start rsyslog

9> Review suspicious login records

Command: lastlog

Check whether the list of recently logged-in IP addresses contains any suspicious entries:

last -f /var/log/wtmp
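Scanning the list by eye gets tedious on busy hosts. As a sketch, the output of last -i (which prints source IP addresses) can be summarized per address and compared with a list of known addresses. The function names and the allowlist file format (one IP address per line) are assumptions for illustration.

```shell
#!/bin/bash
# Hypothetical helper: read "last -i" style lines on stdin and print how many
# logins came from each source IP address, most frequent first.
# Only lines whose third field looks like an IPv4 address are counted;
# 0.0.0.0 (local logins) is skipped.
count_login_ips() {
    awk '$3 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ && $3 != "0.0.0.0" { n[$3]++ }
         END { for (ip in n) print n[ip], ip }' | sort -k1,1nr -k2,2
}

# Hypothetical helper: print the source IPs that are not listed in the
# allowlist file given as the first argument.
list_unknown_ips() {
    local allowlist=$1
    count_login_ips | awk '{ print $2 }' | grep -v -x -F -f "$allowlist" || true
}
```

For example: last -i -f /var/log/wtmp | count_login_ips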

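10> Firewall

Check status: systemctl status firewalld
Start: systemctl start firewalld
Restart: systemctl restart firewalld
Stop: systemctl stop firewalld

# Check whether a port is open
firewall-cmd --query-port=8080/tcp
# Open port 80
firewall-cmd --permanent --add-port=80/tcp
# Remove a port
firewall-cmd --permanent --remove-port=8080/tcp
# Rules changed with --permanent only take effect after a reload
firewall-cmd --reload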

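11> High-risk vulnerabilities

List the packages that can be upgraded: yum check-update

Upgrade packages: yum upgrade

Note the difference between update and upgrade:

yum -y update upgrades all installed packages to the latest available versions, including the kernel. Packages that have been obsoleted by other packages are only replaced when obsoletes processing is enabled in yum.conf.

yum -y upgrade is the same as yum update run with the --obsoletes flag, so it also replaces obsoleted packages. It upgrades the kernel as well; to hold the kernel back, exclude it explicitly, for example: yum -y upgrade --exclude='kernel*'

Ubuntu:
sudo apt update: only refreshes the package lists from the configured sources and reports how many installed packages have updates available; it does not install anything.

sudo apt upgrade: upgrades the installed packages. To upgrade a single package, use sudo apt install --only-upgrade <package name>.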

12> Verify server time synchronization
yum install ntp

/etc/rc.d/init.d/ntpd start

ntpdate 203.117.180.36

crontab -e:

*/1 * * * * /usr/sbin/ntpdate 203.117.180.36 > /dev/null 2>&1

/etc/init.d/crond restart

Also open UDP port 123 in the firewall so that ntpdate can work properly.

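13> Other notes

Never use a software name such as mysql, nginx or php as the system hostname, because it makes the output of ps -ef|grep nginx misleading when you check service status.

When the network is unreliable, FTP is the preferred way to transfer files and data.

Before running an UPDATE statement, always run a SELECT for the affected rows first and save the result, so that the original values can be restored if the change turns out to be wrong.

Back up a configuration file before modifying it. When changing an option, comment out the existing line first, then add the new option and its parameters. Check whether the file uses a semicolon (;) or a hash sign (#) for comments.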
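The backup step is easy to forget, so it can help to wrap it in a tiny helper. This is a sketch; backup_file and the .bak.<timestamp> naming are assumptions, not an established convention of this manual.

```shell
#!/bin/bash
# Hypothetical helper: copy a file to <file>.bak.<timestamp> before editing it
# and print the name of the backup. cp -p preserves mode and timestamps.
backup_file() {
    local file=$1 backup
    [ -f "$file" ] || { echo "backup_file: $file not found" >&2; return 1; }
    backup="$file.bak.$(date +%Y%m%d%H%M%S)"
    cp -p "$file" "$backup" && echo "$backup"
}
```

For example: backup_file /etc/ssh/sshd_config && vi /etc/ssh/sshd_config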

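2.2 WebLogic Middleware Inspection Procedure

1. Default WebLogic console address, user name and password

http://ip:8001/console   (log in with the weblogic user name and its password)

2. Check the WebLogic version: java weblogic.version
3. Log in to the console and review the deployed applications
4. Check the currently applied WebLogic patches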
cd /u01/Middleware/OPatch
./opatch lspatches -jre /u01/jdk1.7.0_80/jre

3. Inspection Scripts (adapt them to your environment before use)

3.1 Server Inspection Shell Script

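    #!/bin/bash
    # author: spirits

    #*********** CPU check *************
    echo "`date '+%Y-%m-%d %H:%M:%S'` Database server hardware inspection started..."

    # Overwrite the scratch files so that leftovers from an interrupted run
    # cannot shift the line numbers used below.
    top -bn 6 > top

    # NOTE: "%id" matches the older top layout ("99.0%id,"). Newer versions
    # print "99.0 id,", which needs a different pattern and field number.
    grep -n "%id" top > newtop

    grep -n "zombie" top > insisttop

    # Idle percentage from the 4th, 5th and 6th samples
    top1=`cat  newtop   | awk '{print $5}' | sed -n 4p | sed 's/%//g' |sed 's/id,//g'`
    top2=`cat  newtop   | awk '{print $5}' | sed -n 5p | sed 's/%//g' |sed 's/id,//g'`
    top3=`cat  newtop   | awk '{print $5}' | sed -n 6p | sed 's/%//g' |sed 's/id,//g'`

    # Zombie count from the 2nd sample
    top4=`cat insisttop | awk '{print $10}' | sed -n 2p | sed 's/%//g' |sed 's/id,//g'`

    #echo "top4:$top4"

    if [ "${top4:-0}" -gt 0 ]
    then
        echo "`date '+%Y-%m-%d %H:%M:%S'` Zombie processes were found on the collection and processing server; the inspection script will kill their parent processes automatically. To verify manually, run top and then ps -A -ostat,ppid,pid,cmd | grep -e '^[Zz]' to confirm that the zombie processes are gone."  >> ./newreport.txt

        # A zombie cannot be killed itself. Column 2 is the PPID, so this kills
        # the parent process so that the zombie gets reaped. Review before use.
        ps -A -o stat,ppid,pid,cmd | grep -e '^[Zz]' | awk '{print $2}' | xargs -r kill -9
    else
        echo "`date '+%Y-%m-%d %H:%M:%S'` No zombie processes on the collection and processing server; it is running normally."
    fi

    # Keep only the integer part of each idle percentage
    a=${top1%.*}
    b=${top2%.*}
    c=${top3%.*}

    echo "top1: $a"
    echo "top2: $b"
    echo "top3: $c"

    if [ "$a" -lt 20 ]&&[ "$b" -lt 20 ]&&[ "$c" -lt 20 ] ; then
        echo  "`date '+%Y-%m-%d %H:%M:%S'` Database server CPU usage is abnormal: the idle values reported by top are $top1, $top2, $top3, all below the reference value 20. Please handle this promptly!" >> ./newreport.txt
    else
        echo "CPU usage is normal!"
    fi

    rm -rf top
    rm -rf newtop
    rm -rf insisttop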
      
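    #*************** Memory check ***********
    # NOTE: row 3 of "free -g" is the "-/+ buffers/cache" row only in older
    # procps versions; on newer versions row 3 is Swap, so adjust the row there.
    free1=`free -g | awk '{print $4}' | sed -n 3p | sed 's/%//g' |sed 's/t//g'`

    total=`free -g | awk '{print $2}' | sed -n 2p | sed 's/%//g' |sed 's/t//g'`

    # canshu: fraction of total memory that should stay free
    canshu=0.2

    tempd=`echo $total $canshu |awk '{print $1*$2}'`

    # biaozhun: threshold in GB, integer part only
    biaozhun=${tempd%.*}

    if [ "$free1" -le "${biaozhun:-0}" ] ; then
        echo "`date '+%Y-%m-%d %H:%M:%S'` Database server memory usage is too high: free -g reports $free1 GB free, which is less than or equal to the reference value $biaozhun. Please handle this promptly!" >> ./newreport.txt
    else
        echo "Memory usage is normal!"
    fi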
      
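    #************** File system check **********
    # -P keeps each file system on one line; only the first five are checked.
    df1=`df -hP | awk '{print $5}' | sed -n 2p | sed 's/%//g'`
    df2=`df -hP | awk '{print $5}' | sed -n 3p | sed 's/%//g'`
    df3=`df -hP | awk '{print $5}' | sed -n 4p | sed 's/%//g'`
    df4=`df -hP | awk '{print $5}' | sed -n 5p | sed 's/%//g'`
    df5=`df -hP | awk '{print $5}' | sed -n 6p | sed 's/%//g'`

    # Missing rows (fewer than five file systems) count as 0
    if [ "${df1:-0}" -gt 90 ]||[ "${df2:-0}" -gt 90 ]||[ "${df3:-0}" -gt 90 ]||[ "${df4:-0}" -gt 90 ]||[ "${df5:-0}" -gt 90 ] ; then
        echo "`date '+%Y-%m-%d %H:%M:%S'` Database server disk usage is too high! df -h reports $df1, $df2, $df3, $df4, $df5 and the reference value is 90. At least one value exceeds it; please handle this promptly!" >> ./newreport.txt
    else
        echo "Disk usage is normal!"
    fi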
      
    #********************* Disk I/O performance check ***************
    # The line numbers used below depend on the number of devices iostat reports.
    iostat -x 2 5 > iostat.txt
      
    scvtm1=" `cat  iostat.txt  | awk '{print $11}' | sed -n 16p | sed 's/%//g' `"  
      
    scvtm2="` cat  iostat.txt  | awk '{print $11}' | sed -n 17p | sed 's/%//g'`"  
      
    scvtm3="` cat  iostat.txt  | awk '{print $11}' | sed -n 18p | sed 's/%//g'`"  
      
    scvtm4="` cat  iostat.txt  | awk '{print $11}' | sed -n 19p | sed 's/%//g'`"  
      
    scvtm13="` cat  iostat.txt  | awk '{print $11}' | sed -n 25p | sed 's/%//g'`"  
      
    scvtm6=" `cat  iostat.txt  | awk '{print $11}' | sed -n 26p | sed 's/%//g' `"  
      
    scvtm7="` cat  iostat.txt  | awk '{print $11}' | sed -n 27p | sed 's/%//g'`"  
      
    scvtm8="` cat  iostat.txt  | awk '{print $11}' | sed -n 28p | sed 's/%//g'`"  
      
    scvtm9="` cat  iostat.txt  | awk '{print $11}' | sed -n 34p | sed 's/%//g'`"  
      
    scvtm10="` cat  iostat.txt  | awk '{print $11}' | sed -n 35p | sed 's/%//g'`"  
      
    scvtm11="` cat  iostat.txt  | awk '{print $11}' | sed -n 36p | sed 's/%//g'`"  
      
    scvtm12="` cat  iostat.txt  | awk '{print $11}' | sed -n 37p | sed 's/%//g'`"  
      
      
      
    util1="`cat  iostat.txt  | awk '{print $12}' | sed -n 16p | sed 's/%//g'`"  
      
    util2="` cat  iostat.txt  | awk '{print $12}' | sed -n 17p | sed 's/%//g'`"  
      
    util3="` cat  iostat.txt  | awk '{print $12}' | sed -n 18p | sed 's/%//g'`"  
      
    util4="` cat  iostat.txt  | awk '{print $12}' | sed -n 19p | sed 's/%//g'`"  
      
    util5="` cat  iostat.txt  | awk '{print $12}' | sed -n 25p | sed 's/%//g'`"  
      
    util6=" `cat  iostat.txt  | awk '{print $12}' | sed -n 26p | sed 's/%//g' `"  
      
    util7="` cat  iostat.txt  | awk '{print $12}' | sed -n 27p | sed 's/%//g'`"  
      
    util8="` cat  iostat.txt  | awk '{print $12}' | sed -n 28p | sed 's/%//g'`"  
      
    util9="` cat  iostat.txt  | awk '{print $12}' | sed -n 34p | sed 's/%//g'`"  
      
    util10="` cat  iostat.txt  | awk '{print $12}' | sed -n 35p | sed 's/%//g'`"  
      
    util11="` cat  iostat.txt  | awk '{print $12}' | sed -n 36p | sed 's/%//g'`"  
      
    util12="` cat  iostat.txt  | awk '{print $12}' | sed -n 37p | sed 's/%//g'`"  
      
    #***********1/2/3/4****************  
      
    maxa=`echo "$scvtm1 $scvtm2 $scvtm3 $scvtm4" | awk '{for(i=1;i<=NF;i++)$i>a?a=$i:a}END{print a}'`  
      
    #*************13/6/7/8/**************  
      
    maxb=`echo "$scvtm13 $scvtm6 $scvtm7 $scvtm8" | awk '{for(i=1;i<=NF;i++)$i>a?a=$i:a}END{print a}'`  
      
    #*************************9/10/11/12******************  
      
    maxc=`echo "$scvtm9 $scvtm10 $scvtm11 $scvtm12" | awk '{for(i=1;i<=NF;i++)$i>a?a=$i:a}END{print a}'`  
      
    #********************util1/2/3/4**********************  
      
    maxd=`echo "$util1 $util2 $util3 $util4" | awk '{for(i=1;i<=NF;i++)$i>a?a=$i:a}END{print a}'`  
      
      
    #**********************util5/6/7/8*******************  
      
    maxe=`echo "$util5 $util6 $util7 $util8" | awk '{for(i=1;i<=NF;i++)$i>a?a=$i:a}END{print a}'`  
      
    #***********************util9/10/11/12***************  
      
    maxf=`echo "$util9 $util10 $util11 $util12" | awk '{for(i=1;i<=NF;i++)$i>a?a=$i:a}END{print a}'`  
      
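    #****************** Evaluate ************************
    # Keep only the integer part of each maximum; empty values count as 0
    m=${maxa%.*}; m=${m:-0}
    n=${maxb%.*}; n=${n:-0}
    h=${maxc%.*}; h=${h:-0}
    k=${maxd%.*}; k=${k:-0}
    l=${maxe%.*}; l=${l:-0}
    o=${maxf%.*}; o=${o:-0}

    if [ $m -ge 15 ]&&[ $k -ge 99 ]&&[ $k -lt 100 ]&&[ $n -ge 15 ]&&[ $l -ge 99 ]&&[ $l -lt 100 ]&&[ $h -ge 15 ]&&[ $o -ge 99 ]&&[ $o -lt 100 ]
    then
        echo "`date '+%Y-%m-%d %H:%M:%S'` Database server disk I/O is a bottleneck. Please handle this promptly!" >> ./newreport.txt
    else
        echo "Disk I/O is normal!"
    fi

    rm -rf ./iostat.txt

    #********************************* Network connectivity check **********************

    # Packet loss percentage from the ping summary line
    network1=`ping -s 4096 -c 5  135.0.51.15 | awk '{print $6}' | sed -n 9p | sed 's/%//g' |sed 's/t//g'`

    # If the summary line cannot be parsed (for example, no replies at all),
    # treat it as 100% loss instead of failing the numeric test.
    case "$network1" in ''|*[!0-9]*) network1=100 ;; esac

    if [ "$network1" -gt 0 ]
    then
        echo "`date '+%Y-%m-%d %H:%M:%S'` The network between the database server and the target IP is unstable: ping reports $network1% packet loss, above the reference value 0. The system is at risk; please handle this promptly!"  >> ./newreport.txt
    else
        echo "Network connectivity is normal!"
    fi

    echo "`date '+%Y-%m-%d %H:%M:%S'` Database server hardware inspection finished!"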

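3.2 Automated Inspection Script for Multiple Servers

How the script works:

1> All servers are on the same LAN and can all reach one another.

2> Pick one server with relatively good performance, or with a light workload, as the inspection server.

3> The inspection server inspects the other servers and stores the results locally.

4> Each server's result file is named after the date and the server's IP address, and at the end all results are compressed into one archive.

5> Maintenance staff only need to fetch this archive periodically and review the results, instead of logging in to every server and typing the same commands. The script relies on the TELNET service, so every inspected server must have TELNET enabled. Keep in mind that Telnet sends the login name and password in clear text.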
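Telnet transmits credentials unencrypted, so the same collection loop could also be driven over SSH. The sketch below is not the script used in this manual; it only illustrates the idea under assumptions: collect_hosts is a hypothetical name, the config file uses the same "name=ip" lines, key-based SSH login is assumed to be set up already, and the command list is shortened.

```shell
#!/bin/bash
# Hypothetical sketch: run a fixed set of commands on every host listed as
# "name=ip" in a config file and store one result file per host.
# The third argument is the command used to reach a host; it defaults to ssh.
collect_hosts() {
    local config=$1 outdir=$2 runner=${3:-ssh}
    local stamp line ip
    stamp=$(date +%Y-%m-%d)
    mkdir -p "$outdir/$stamp" || return 1
    while IFS= read -r line; do
        [ -z "$line" ] && continue
        ip=${line#*=}                       # text after the first "="
        # stdin is redirected so the remote command cannot swallow the
        # remaining lines of the config file.
        "$runner" "$ip" 'free -k; df -k; ps -ef | grep java' \
            > "$outdir/$stamp/$ip-$stamp.txt" 2>&1 < /dev/null
    done < "$config"
}
```

For example: collect_hosts /home/check/config/linuxconfig.txt /home/check/result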

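#!/bin/bash
LANG=en
# Split the output of date into positional parameters:
# $1=weekday $2=month $3=day $4=time
set `date`
path="/home/check"
# Make sure the log directory exists before tee writes to it
mkdir -p "$path/log"
echo "start running" | tee -a  $path/log/$1-$2-$3.log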
if [ -d /home/check/result/$1-$2-$3 ];
 then
   echo ''
else
mkdir -p /home/check/result/$1-$2-$3
echo `date +"%Y/%m/%d-%H:%M:%S"` "create " "$1-$2-$3" "directory success "|tee -a $path/log/$1-$2-$3.log
fi
echo `date +"%Y/%m/%d-%H:%M:%S"` "starting reading linuxconfig.txt " |tee -a $path/log/$1-$2-$3.log
cat "$path"/config/linuxconfig.txt| while read line;
do
ip=`echo $line |cut -d '=' -f2`
echo `date +"%Y/%m/%d-%H:%M:%S"` "check LINUX " $ip " starting " |tee -a $path/log/$1-$2-$3.log
(
sleep 1
#echo account
 echo root

sleep 1
#echo password
 echo root

sleep 3
echo "free -k"
echo ""
echo "df -k"
echo ""

#memory_used_rate
echo "ps -ef| grep java"
echo ""
echo "netstat -an|egrep -n '80|22|21|23|9043|9044|45331|45332|39194|19195'"
echo ""
#echo "ifconfig -a "
echo  "/sbin/ip ad"
echo ""

echo " tail -2000  /var/log/messages | grep -v snmp |grep  -i  error "
echo ""
echo "/bin/dmesg  |grep -i error"
echo ""

echo "top -n1|sed -n '1,5p'"
echo ""
# vmstat has to be sent before exit, otherwise the session closes first
echo "/usr/bin/vmstat  1 3"
echo ""
echo "exit"

sleep 5
)|telnet $ip >/home/check/result/$1-$2-$3/$ip-$1-$2-$3-$4.txt
echo `date +"%Y/%m/%d-%H:%M:%S"` "check LINUX " $ip " end" |tee -a $path/log/$1-$2-$3.log
echo "" | tee -a $path/log/$1-$2-$3.log
done
echo `date +"%Y/%m/%d-%H:%M:%S"` "end reading linuxconfig.txt  " |tee -a $path/log/$1-$2-$3.log
 
echo `date +"%Y/%m/%d-%H:%M:%S"` "starting reading AIXconfig.txt " | tee -a $path/log/$1-$2-$3.log
cat "$path"/config/AIXconfig.txt| while read line;
do
ip=`echo $line |cut -d '=' -f2`
echo `date +"%Y/%m/%d-%H:%M:%S"` "check IBM AIX " $ip " starting " |tee -a $path/log/$1-$2-$3.log
(
sleep 1
#echo account
 echo root

sleep 1
#echo password
 echo root
sleep 5
echo ""
#echo "df -k"
 echo "df -g"
echo ""

#memory_used_rate
echo "ps -ef| grep java"
echo ""
echo "netstat -an|egrep -n '80|22|21|23|9043|9044|45331|45332|39194|19195'"
echo ""
echo "ifconfig -a"
echo ""
echo "topas"
echo "exit"
sleep 5
)|telnet $ip >/home/check/result/$1-$2-$3/$ip-$1-$2-$3-$4.txt
echo `date +"%Y/%m/%d-%H:%M:%S"` "check IBM AIX " $ip " end " |tee -a $path/log/$1-$2-$3.log
echo "" | tee -a $path/log/$1-$2-$3.log
done
echo `date +"%Y/%m/%d-%H:%M:%S"` "end reading AIXconfig.txt " | tee -a $path/log/$1-$2-$3.log
zip -r /home/check/result/$1-$2-$3/$1-$2-$3.zip /home/check/result/$1-$2-$3/*
echo "End running "

3.3 Linux Inspection Script

#!/bin/bash
# Daily host information inspection

IPADDR=$(ifconfig eth0 | grep '\<inet\>' | awk '{print $2}')
# If PATH is not set properly, many commands cannot be found when the script runs from cron
export PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin
source /etc/profile

[ $(id -u) -gt 0 ] && echo "Please run this script as root!" && exit 1
centosVersion=$(awk '{print $(NF-1)}' /etc/redhat-release)
VERSION="2018.08.28"

# Logging
LOGPATH="$PROGPATH/var/log/HostDailyCheck"
[ -e $LOGPATH ] || mkdir -p $LOGPATH
RESULTFILE="$LOGPATH/HostDailyCheck-$IPADDR-`date +%Y%m%d`.txt"
 
# Global variables for the report
report_DateTime=""    # date
report_Hostname=""    # hostname
report_OSRelease=""    # OS release
report_Kernel=""    # kernel
report_Language=""    # language/encoding
report_LastReboot=""    # last boot time
report_Uptime=""    # uptime (days)
report_CPUs=""    # number of CPUs
report_CPUType=""    # CPU type
report_Arch=""    # CPU architecture
report_MemTotal=""    # total memory (MB)
report_MemFree=""    # free memory (MB)
report_MemUsedPercent=""    # memory usage %
report_DiskTotal=""    # total disk capacity (GB)
report_DiskFree=""    # free disk space (GB)
report_DiskUsedPercent=""    # disk usage %
report_InodeTotal=""    # total inodes
report_InodeFree=""    # free inodes
report_InodeUsedPercent=""    # inode usage %
report_IP=""    # IP address
report_MAC=""    # MAC address
report_Gateway=""    # default gateway
report_DNS=""    # DNS
report_Listen=""    # listening ports
report_Selinux=""    # SELinux
report_Firewall=""    # firewall
report_USERs=""    # users
report_USEREmptyPassword=""   # users with no password set
report_USERTheSameUID=""      # users sharing the same UID
report_PasswordExpiry=""    # password expiry (days)
report_RootUser=""    # root users
report_Sudoers=""    # sudo grants
report_SSHAuthorized=""    # SSH trusted hosts
report_SSHDProtocolVersion=""    # SSH protocol version
report_SSHDPermitRootLogin=""    # whether root may log in remotely
report_DefunctProsess=""    # number of zombie processes
report_SelfInitiatedService=""    # number of services enabled at boot
report_SelfInitiatedProgram=""    # number of programs started at boot
report_RuningService=""           # number of running services
report_Crontab=""    # number of scheduled tasks
report_Syslog=""    # logging service
report_SNMP=""    # SNMP
report_NTP=""    # NTP
report_JDK=""    # JDK version
function version(){
    echo ""
    echo ""
    echo "System inspection script: Version $VERSION"
}
 
function getCpuStatus(){
    echo ""
    echo ""
    echo "############################ CPU check #############################"
    Physical_CPUs=$(grep "physical id" /proc/cpuinfo| sort | uniq | wc -l)
    Virt_CPUs=$(grep "processor" /proc/cpuinfo | wc -l)
    CPU_Kernels=$(grep "cores" /proc/cpuinfo|uniq| awk -F ': ' '{print $2}')
    CPU_Type=$(grep "model name" /proc/cpuinfo | awk -F ': ' '{print $2}' | sort | uniq)
    CPU_Arch=$(uname -m)
    echo "Physical CPUs: $Physical_CPUs"
    echo " Logical CPUs: $Virt_CPUs"
    echo "Cores per CPU: $CPU_Kernels"
    echo "    CPU model: $CPU_Type"
    echo "     CPU arch: $CPU_Arch"
    # Report fields
    report_CPUs=$Virt_CPUs    # number of CPUs
    report_CPUType=$CPU_Type  # CPU type
    report_Arch=$CPU_Arch     # CPU architecture
}
 
function getMemStatus(){
    echo ""
    echo ""
    echo "############################ Memory check ############################"
    if [[ $centosVersion < 7 ]];then
        free -mo
    else
        free -h
    fi
    # Report fields
    MemTotal=$(grep MemTotal /proc/meminfo| awk '{print $2}')  #KB
    MemFree=$(grep MemFree /proc/meminfo| awk '{print $2}')    #KB
    let MemUsed=MemTotal-MemFree
    MemPercent=$(awk "BEGIN {if($MemTotal==0){printf 100}else{printf \"%.2f\",$MemUsed*100/$MemTotal}}")
    report_MemTotal="$((MemTotal/1024))""MB"        # total memory (MB)
    report_MemFree="$((MemFree/1024))""MB"          # free memory (MB)
    report_MemUsedPercent="${MemPercent}%"          # memory usage %
}
 
function getDiskStatus(){
    echo ""
    echo ""
    echo "############################ Disk check ############################"
    df -hiP | sed 's/Mounted on/Mounted/'> /tmp/inode
    df -hTP | sed 's/Mounted on/Mounted/'> /tmp/disk 
    join /tmp/disk /tmp/inode | awk '{print $1,$2,"|",$3,$4,$5,$6,"|",$8,$9,$10,$11,"|",$12}'| column -t
    # Report fields
    diskdata=$(df -TP | sed '1d' | awk '$2!="tmpfs"{print}') #KB
    disktotal=$(echo "$diskdata" | awk '{total+=$3}END{print total}') #KB
    diskused=$(echo "$diskdata" | awk '{total+=$4}END{print total}')  #KB
    diskfree=$((disktotal-diskused)) #KB
    diskusedpercent=$(echo $disktotal $diskused | awk '{if($1==0){printf 100}else{printf "%.2f",$2*100/$1}}') 
    inodedata=$(df -iTP | sed '1d' | awk '$2!="tmpfs"{print}')
    inodetotal=$(echo "$inodedata" | awk '{total+=$3}END{print total}')
    inodeused=$(echo "$inodedata" | awk '{total+=$4}END{print total}')
    inodefree=$((inodetotal-inodeused))
    inodeusedpercent=$(echo $inodetotal $inodeused | awk '{if($1==0){printf 100}else{printf "%.2f",$2*100/$1}}')
    report_DiskTotal=$((disktotal/1024/1024))"GB"   # total disk capacity (GB)
    report_DiskFree=$((diskfree/1024/1024))"GB"     # free disk space (GB)
    report_DiskUsedPercent="$diskusedpercent""%"    # disk usage %
    report_InodeTotal=$((inodetotal/1000))"K"       # total inodes
    report_InodeFree=$((inodefree/1000))"K"         # free inodes
    report_InodeUsedPercent="$inodeusedpercent""%"  # inode usage %
 
}
 
function getSystemStatus(){
    echo ""
    echo ""
    echo "############################ System check ############################"
    if [ -e /etc/sysconfig/i18n ];then
        default_LANG="$(grep "LANG=" /etc/sysconfig/i18n | grep -v "^#" | awk -F '"' '{print $2}')"
    else
        default_LANG=$LANG
    fi
    # Switch to an English locale so that the command output parsed below is predictable
    export LANG="en_US.UTF-8"
    Release=$(cat /etc/redhat-release 2>/dev/null)
    Kernel=$(uname -r)
    OS=$(uname -o)
    Hostname=$(uname -n)
    SELinux=$(/usr/sbin/sestatus | grep "SELinux status: " | awk '{print $3}')
    LastReboot=$(who -b | awk '{print $3,$4}')
    uptime=$(uptime | sed 's/.*up \([^,]*\), .*/\1/')
    echo "               OS: $OS"
    echo "          Release: $Release"
    echo "           Kernel: $Kernel"
    echo "         Hostname: $Hostname"
    echo "          SELinux: $SELinux"
    echo "Language/encoding: $default_LANG"
    echo "     Current time: $(date +'%F %T')"
    echo "        Last boot: $LastReboot"
    echo "           Uptime: $uptime"
    # Report fields
    report_DateTime=$(date +"%F %T")  # date
    report_Hostname="$Hostname"       # hostname
    report_OSRelease="$Release"       # OS release
    report_Kernel="$Kernel"           # kernel
    report_Language="$default_LANG"   # language/encoding
    report_LastReboot="$LastReboot"   # last boot time
    report_Uptime="$uptime"           # uptime (days)
    report_Selinux="$SELinux"
    # Restore the original locale
    export LANG="$default_LANG"
}
function getServiceStatus(){
    echo ""
    echo ""
    echo "############################ Service check ############################"
    echo ""
    if [[ $centosVersion > 7 ]];then
        conf=$(systemctl list-unit-files --type=service --state=enabled --no-pager | grep "enabled")
        process=$(systemctl list-units --type=service --state=running --no-pager | grep ".service")
        # Report fields
        report_SelfInitiatedService="$(echo "$conf" | wc -l)"       # number of services enabled at boot
        report_RuningService="$(echo "$process" | wc -l)"           # number of running services
    else
        # The Chinese alternatives in the patterns match localized command output
        conf=$(/sbin/chkconfig | grep -E ":on|:启用")
        process=$(/sbin/service --status-all 2>/dev/null | grep -E "is running|正在运行")
        # Report fields
        report_SelfInitiatedService="$(echo "$conf" | wc -l)"       # number of services enabled at boot
        report_RuningService="$(echo "$process" | wc -l)"           # number of running services
    fi
    echo "Service configuration"
    echo "---------------------"
    echo "$conf"  | column -t
    echo ""
    echo "Running services"
    echo "----------------"
    echo "$process"
 
}
 
function getAutoStartStatus(){
    echo ""
    echo ""
    echo "############################ Autostart check ##########################"
    conf=$(grep -v "^#" /etc/rc.d/rc.local| sed '/^$/d')
    echo "$conf"
    # Report fields
    report_SelfInitiatedProgram="$(echo "$conf" | wc -l)"    # number of programs started at boot
}
 
function getLoginStatus(){
    echo ""
    echo ""
    echo "############################ Login check ############################"
    last | head
}
 
function getNetworkStatus(){
    echo ""
    echo ""
    echo "############################ Network check ############################"
    if [[ $centosVersion < 7 ]];then
        /sbin/ifconfig -a | grep -v packets | grep -v collisions | grep -v inet6
    else
        #ip a
        for i in $(ip link | grep BROADCAST | awk -F: '{print $2}');do ip add show $i | grep -E "BROADCAST|global"| awk '{print $2}' | tr '\n' ' ' ;echo "" ;done
    fi
    GATEWAY=$(ip route | grep default | awk '{print $3}')
    DNS=$(grep nameserver /etc/resolv.conf| grep -v "#" | awk '{print $2}' | tr '\n' ',' | sed 's/,$//')
    echo ""
    echo "Gateway: $GATEWAY "
    echo "    DNS: $DNS"
    # Report fields
    IP=$(ip -f inet addr | grep -v 127.0.0.1 |  grep inet | awk '{print $NF,$2}' | tr '\n' ',' | sed 's/,$//')
    MAC=$(ip link | grep -v "LOOPBACK\|loopback" | awk '{print $2}' | sed 'N;s/\n//' | tr '\n' ',' | sed 's/,$//')
    report_IP="$IP"            # IP address
    report_MAC=$MAC            # MAC address
    report_Gateway="$GATEWAY"  # default gateway
    report_DNS="$DNS"          # DNS
}
 
function getListenStatus(){
    echo ""
    echo ""
    echo "############################ Listening port check ############################"
    TCPListen=$(ss -ntul | column -t)
    echo "$TCPListen"
    # Report fields
    report_Listen="$(echo "$TCPListen"| sed '1d' | awk '/tcp/ {print $5}' | awk -F: '{print $NF}' | sort | uniq | wc -l)"
}
 
function getCronStatus(){
    echo ""
    echo ""
    echo "############################ Scheduled task check ########################"
    Crontab=0
    for shell in $(grep -v "/sbin/nologin" /etc/shells);do
        for user in $(grep "$shell" /etc/passwd| awk -F: '{print $1}');do
            crontab -l -u $user >/dev/null 2>&1
            status=$?
            if [ $status -eq 0 ];then
                echo "$user"
                echo "--------"
                crontab -l -u $user
                let Crontab=Crontab+$(crontab -l -u $user | wc -l)
                echo ""
            fi
        done
    done
    # System-wide cron files
    find /etc/cron* -type f | xargs -I{} ls -l {} | column  -t
    let Crontab=Crontab+$(find /etc/cron* -type f | wc -l)
    # Report fields
    report_Crontab="$Crontab"    # number of scheduled tasks
}
function getHowLongAgo(){
    # Work out how long ago a given timestamp was
    datetime="$*"
    [ -z "$datetime" ] && echo "Invalid argument: getHowLongAgo() $*"
    Timestamp=$(date +%s -d "$datetime")    # convert to a Unix timestamp
    Now_Timestamp=$(date +%s)
    Difference_Timestamp=$(($Now_Timestamp-$Timestamp))
    days=0;hours=0;minutes=0;
    sec_in_day=$((60*60*24));
    sec_in_hour=$((60*60));
    sec_in_minute=60
    while (( $(($Difference_Timestamp-$sec_in_day)) > 1 ))
    do
        let Difference_Timestamp=Difference_Timestamp-sec_in_day
        let days++
    done
    while (( $(($Difference_Timestamp-$sec_in_hour)) > 1 ))
    do
        let Difference_Timestamp=Difference_Timestamp-sec_in_hour
        let hours++
    done
    echo "$days days $hours hours ago"
}
 
function getUserLastLogin(){
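    # Get the time of a user's most recent login, including the year.
    # Unfortunately last does not print the year; it only offers
    # "last -t YYYYMMDDHHMMSS" to list logins up to a given time. So we take the
    # crude route: compare the user's login count up to today with the count up
    # to New Year's Day of this year (then of last year, the year before, ...).
    # If the two counts differ, the most recent login happened in that year.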
    username=$1
    : ${username:="`whoami`"}
    thisYear=$(date +%Y)
    oldesYear=$(last | tail -n1 | awk '{print $NF}')
    while(( $thisYear >= $oldesYear));do
        loginBeforeToday=$(last $username | grep $username | wc -l)
        loginBeforeNewYearsDayOfThisYear=$(last $username -t $thisYear"0101000000" | grep $username | wc -l)
        if [ $loginBeforeToday -eq 0 ];then
            echo "never logged in"
            break
        elif [ $loginBeforeToday -gt $loginBeforeNewYearsDayOfThisYear ];then
            lastDateTime=$(last -i $username | head -n1 | awk '{for(i=4;i<(NF-2);i++)printf"%s ",$i}')" $thisYear" # format example: Sat Nov 2 20:33 2015
            lastDateTime=$(date "+%Y-%m-%d %H:%M:%S" -d "$lastDateTime")
            echo "$lastDateTime"
            break
        else
            thisYear=$((thisYear-1))
        fi
    done
 
}
 
function getUserStatus(){
    echo ""
    echo ""
    echo "############################ User check ############################"
    # Last modification time of /etc/passwd
    pwdfile="$(cat /etc/passwd)"
    Modify=$(stat /etc/passwd | grep Modify | tr '.' ' ' | awk '{print $2,$3}')
 
    echo "/etc/passwd last modified: $Modify ($(getHowLongAgo $Modify))"
    echo ""
    echo "Privileged users"
    echo "----------------"
    RootUser=""
    for user in $(echo "$pwdfile" | awk -F: '{print $1}');do
        if [ $(id -u $user) -eq 0 ];then
            echo "$user"
            RootUser="$RootUser,$user"
        fi
    done
    echo ""
    echo "User list"
    echo "---------"
    USERs=0
    echo "$(
    echo "User UID GID HOME SHELL LastLogin"
    for shell in $(grep -v "/sbin/nologin" /etc/shells);do
        for username in $(grep "$shell" /etc/passwd| awk -F: '{print $1}');do
            userLastLogin="$(getUserLastLogin $username)"
            echo "$pwdfile" | grep -w "$username" |grep -w "$shell"| awk -F: -v lastlogin="$(echo "$userLastLogin" | tr ' ' '_')" '{print $1,$3,$4,$6,$7,lastlogin}'
        done
        let USERs=USERs+$(echo "$pwdfile" | grep "$shell"| wc -l)
    done
    )" | column -t
    echo ""
    echo "Users with no password set"
    echo "--------------------------"
    USEREmptyPassword=""
    for shell in $(grep -v "/sbin/nologin" /etc/shells);do
            for user in $(echo "$pwdfile" | grep "$shell" | cut -d: -f1);do
            r=$(awk -F: '$2=="!!"{print $1}' /etc/shadow | grep -w $user)
            if [ ! -z $r ];then
                echo $r
                USEREmptyPassword="$USEREmptyPassword,"$r
            fi
        done    
    done
    echo ""
    echo "Users sharing the same UID"
    echo "--------------------------"
    USERTheSameUID=""
    UIDs=$(cut -d: -f3 /etc/passwd | sort | uniq -c | awk '$1>1{print $2}')
    for uid in $UIDs;do
        echo -n "$uid";
        r=$(awk -F: 'ORS="";$3=='"$uid"'{print ":",$1}' /etc/passwd)
        echo "$r"
        echo ""
        USERTheSameUID="$USERTheSameUID$uid$r,"
    done
    # Report fields
    report_USERs="$USERs"    # users
    report_USEREmptyPassword=$(echo $USEREmptyPassword | sed 's/^,//') 
    report_USERTheSameUID=$(echo $USERTheSameUID | sed 's/,$//') 
    report_RootUser=$(echo $RootUser | sed 's/^,//')    # privileged users
}
 
function getPasswordStatus {
    echo ""
    echo ""
    echo "############################ Password check ############################"
    pwdfile="$(cat /etc/passwd)"
    echo ""
    echo "Password expiry"
    echo "---------------"
    result=""
    for shell in $(grep -v "/sbin/nologin" /etc/shells);do
        for user in $(echo "$pwdfile" | grep "$shell" | cut -d: -f1);do
            get_expiry_date=$(/usr/bin/chage -l $user | grep 'Password expires' | cut -d: -f2)
            if [[ $get_expiry_date = ' never' || $get_expiry_date = 'never' ]];then
                printf "%-15s never expires\n" $user
                result="$result,$user:never"
            else
                password_expiry_date=$(date -d "$get_expiry_date" "+%s")
                current_date=$(date "+%s")
                diff=$(($password_expiry_date-$current_date))
                let DAYS=$(($diff/(60*60*24)))
                printf "%-15s expires in %s days\n" $user $DAYS
                result="$result,$user:$DAYS days"
            fi
        done
    done
    report_PasswordExpiry=$(echo $result | sed 's/^,//')
 
    echo ""
    echo "Password policy"
    echo "---------------"
    grep -v "#" /etc/login.defs | grep -E "PASS_MAX_DAYS|PASS_MIN_DAYS|PASS_MIN_LEN|PASS_WARN_AGE"
 
}
 
function getSudoersStatus(){
    echo ""
    echo ""
    echo "############################ Sudoers check #########################"
    conf=$(grep -v "^#" /etc/sudoers| grep -v "^Defaults" | sed '/^$/d')
    echo "$conf"
    echo ""
    # Report fields
    report_Sudoers="$(echo "$conf" | wc -l)"
}
 
function getInstalledStatus(){
    echo ""
    echo ""
    echo "############################ Installed software check ############################"
    rpm -qa --last | head | column -t 
}
 
function getProcessStatus(){
    echo ""
    echo ""
    echo "############################ Process check ############################"
    if [ $(ps -ef | grep defunct | grep -v grep | wc -l) -ge 1 ];then
        echo ""
        echo "Zombie processes";
        echo "----------------"
        ps -ef | head -n1
        ps -ef | grep defunct | grep -v grep
    fi
    echo ""
    echo "Top 10 by memory usage"
    echo "----------------------"
    echo -e "PID %MEM RSS COMMAND
    $(ps aux | awk '{print $2, $4, $6, $11}' | sort -k3rn | head -n 10 )"| column -t 
    echo ""
    echo "Top 10 by CPU usage"
    echo "-------------------"
    top b -n1 | head -17 | tail -11
    # Report fields
    report_DefunctProsess="$(ps -ef | grep defunct | grep -v grep|wc -l)"
}
 
function getJDKStatus(){
    echo ""
    echo ""
    echo "############################ JDK check #############################"
    java -version 2>/dev/null
    if [ $? -eq 0 ];then
        java -version 2>&1
    fi
    echo "JAVA_HOME=\"$JAVA_HOME\""
    # Report fields
    report_JDK="$(java -version 2>&1 | grep version | awk '{print $1,$3}' | tr -d '"')"
}
function getSyslogStatus(){
    echo ""
    echo ""
    echo "############################ syslog check ##########################"
    echo "Service status: $(getState rsyslog)"
    echo ""
    echo "/etc/rsyslog.conf"
    echo "-----------------"
    cat /etc/rsyslog.conf 2>/dev/null | grep -v "^#" | grep -v "^\\$" | sed '/^$/d'  | column -t
    # Report fields
    report_Syslog="$(getState rsyslog)"
}
function getFirewallStatus(){
    echo ""
    echo ""
    echo "############################ Firewall check ##########################"
    # Firewall status, rules, etc.
    if [[ $centosVersion < 7 ]];then
        /etc/init.d/iptables status >/dev/null  2>&1
        status=$?
        if [ $status -eq 0 ];then
                s="active"
        elif [ $status -eq 3 ];then
                s="inactive"
        elif [ $status -eq 4 ];then
                s="permission denied"
        else
                s="unknown"
        fi
    else
        s="$(getState iptables)"
    fi
    echo "iptables: $s"
    echo ""
    echo "/etc/sysconfig/iptables"
    echo "-----------------------"
    cat /etc/sysconfig/iptables 2>/dev/null
    # Report fields
    report_Firewall="$s"
}
 
function getSNMPStatus(){
    # SNMP service status, configuration, etc.
    echo ""
    echo ""
    echo "############################ SNMP check ############################"
    status="$(getState snmpd)"
    echo "Service status: $status"
    echo ""
    if [ -e /etc/snmp/snmpd.conf ];then
        echo "/etc/snmp/snmpd.conf"
        echo "--------------------"
        cat /etc/snmp/snmpd.conf 2>/dev/null | grep -v "^#" | sed '/^$/d'
    fi
    # Report fields
    report_SNMP="$(getState snmpd)"
}
 
function getState(){
    if [[ $centosVersion < 7 ]];then
        if [ -e "/etc/init.d/$1" ];then
            if [ `/etc/init.d/$1 status 2>/dev/null | grep -E "is running|正在运行" | wc -l` -ge 1 ];then
                r="active"
            else
                r="inactive"
            fi
        else
            r="unknown"
        fi
    else
        #CentOS 7+
        r="$(systemctl is-active $1 2>&1)"
    fi
    echo "$r"
}
 
function getSSHStatus(){
    # sshd service status, configuration, trusted hosts, etc.
    echo ""
    echo ""
    echo "############################ SSH check #############################"
    # Check trusted hosts
    pwdfile="$(cat /etc/passwd)"
    echo "Service status: $(getState sshd)"
    Protocol_Version=$(cat /etc/ssh/sshd_config | grep Protocol | awk '{print $2}')
    echo "SSH protocol version: $Protocol_Version"
    echo ""
    echo "Trusted hosts"
    echo "-------------"
    authorized=0
    for user in $(echo "$pwdfile" | grep /bin/bash | awk -F: '{print $1}');do
        authorize_file=$(echo "$pwdfile" | grep -w $user | awk -F: '{printf $6"/.ssh/authorized_keys"}')
        authorized_host=$(cat $authorize_file 2>/dev/null | awk '{print $3}' | tr '\n' ',' | sed 's/,$//')
        if [ ! -z $authorized_host ];then
            echo "$user grants \"$authorized_host\" passwordless access"
        fi
        let authorized=authorized+$(cat $authorize_file 2>/dev/null | awk '{print $3}'|wc -l)
    done
 
    echo ""
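    echo "Is remote root login allowed"
    echo "----------------------------"
    # Only an uncommented PermitRootLogin line counts; take the first one
    config=$(grep -E '^[[:space:]]*PermitRootLogin[[:space:]]' /etc/ssh/sshd_config | head -n1)
    if [ -z "$config" ];then
        PermitRootLogin="yes"  # not set explicitly: assume the default allows remote root login
    else
        PermitRootLogin=$(echo $config | awk '{print $2}')
    fi
    echo "PermitRootLogin $PermitRootLogin"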
 
    echo ""
    echo "/etc/ssh/sshd_config"
    echo "--------------------"
    cat /etc/ssh/sshd_config | grep -v "^#" | sed '/^$/d'
 
    #报表信息
    report_SSHAuthorized="$authorized"    #SSH信任主机
    report_SSHDProtocolVersion="$Protocol_Version"    #SSH协议版本
    report_SSHDPermitRootLogin="$PermitRootLogin"    #允许root远程登录
}
function getNTPStatus(){
    # NTP service status, current time, configuration, etc.
    echo ""
    echo ""
    echo "############################ NTP check #############################"
    if [ -e /etc/ntp.conf ];then
        echo "Service status: $(getState ntpd)"
        echo ""
        echo "/etc/ntp.conf"
        echo "-------------"
        grep -v "^#" /etc/ntp.conf 2>/dev/null | sed '/^$/d'
    fi
    # Report fields
    report_NTP="$(getState ntpd)"
}
 
function uploadHostDailyCheckReport(){
    json="{
        \"DateTime\":\"$report_DateTime\",
        \"Hostname\":\"$report_Hostname\",
        \"OSRelease\":\"$report_OSRelease\",
        \"Kernel\":\"$report_Kernel\",
        \"Language\":\"$report_Language\",
        \"LastReboot\":\"$report_LastReboot\",
        \"Uptime\":\"$report_Uptime\",
        \"CPUs\":\"$report_CPUs\",
        \"CPUType\":\"$report_CPUType\",
        \"Arch\":\"$report_Arch\",
        \"MemTotal\":\"$report_MemTotal\",
        \"MemFree\":\"$report_MemFree\",
        \"MemUsedPercent\":\"$report_MemUsedPercent\",
        \"DiskTotal\":\"$report_DiskTotal\",
        \"DiskFree\":\"$report_DiskFree\",
        \"DiskUsedPercent\":\"$report_DiskUsedPercent\",
        \"InodeTotal\":\"$report_InodeTotal\",
        \"InodeFree\":\"$report_InodeFree\",
        \"InodeUsedPercent\":\"$report_InodeUsedPercent\",
        \"IP\":\"$report_IP\",
        \"MAC\":\"$report_MAC\",
        \"Gateway\":\"$report_Gateway\",
        \"DNS\":\"$report_DNS\",
        \"Listen\":\"$report_Listen\",
        \"Selinux\":\"$report_Selinux\",
        \"Firewall\":\"$report_Firewall\",
        \"USERs\":\"$report_USERs\",
        \"USEREmptyPassword\":\"$report_USEREmptyPassword\",
        \"USERTheSameUID\":\"$report_USERTheSameUID\",
        \"PasswordExpiry\":\"$report_PasswordExpiry\",
        \"RootUser\":\"$report_RootUser\",
        \"Sudoers\":\"$report_Sudoers\",
        \"SSHAuthorized\":\"$report_SSHAuthorized\",
        \"SSHDProtocolVersion\":\"$report_SSHDProtocolVersion\",
        \"SSHDPermitRootLogin\":\"$report_SSHDPermitRootLogin\",
        \"DefunctProsess\":\"$report_DefunctProsess\",
        \"SelfInitiatedService\":\"$report_SelfInitiatedService\",
        \"SelfInitiatedProgram\":\"$report_SelfInitiatedProgram\",
        \"RuningService\":\"$report_RuningService\",
        \"Crontab\":\"$report_Crontab\",
        \"Syslog\":\"$report_Syslog\",
        \"SNMP\":\"$report_SNMP\",
        \"NTP\":\"$report_NTP\",
        \"JDK\":\"$report_JDK\"
    }"
    #echo "$json" 
    curl -H "Content-Type: application/json" -X POST -d "$json" "$uploadHostDailyCheckReportApi" 2>/dev/null
}
 
function check(){
    version
    getSystemStatus
    getCpuStatus
    getMemStatus
    getDiskStatus
    getNetworkStatus
    getListenStatus
    getProcessStatus
    getServiceStatus
    getAutoStartStatus
    getLoginStatus
    getCronStatus
    getUserStatus
    getPasswordStatus
    getSudoersStatus
    getJDKStatus
    getFirewallStatus
    getSSHStatus
    getSyslogStatus
    getSNMPStatus
    getNTPStatus
    getInstalledStatus
}
 
# Run the checks and save the result
check > "$RESULTFILE"
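
The upload function above builds its JSON by pasting shell variables directly into a string, so a value that contains a double quote, a backslash or a line break would yield invalid JSON. The following is a minimal sketch of an escaping step, assuming bash; `json_escape` is a hypothetical helper that is not part of the original script, and it covers only the most common characters.

```shell
# json_escape STRING  (hypothetical helper, bash only)
# Escape backslash, double quote, tab and newline so STRING can be placed
# inside a JSON string value. Other control characters are not handled.
json_escape() {
    local s=$1
    s=${s//\\/\\\\}      # backslash    -> \\
    s=${s//\"/\\\"}      # double quote -> \"
    s=${s//$'\t'/\\t}    # tab          -> \t
    s=${s//$'\n'/\\n}    # newline      -> \n
    printf '%s' "$s"
}
```

A field would then be written as `\"Hostname\":\"$(json_escape "$report_Hostname")\"` when the JSON string is assembled.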

If the inspection server runs Windows, it can be set up as an rsync server by installing cwRsyncServer.

3.4、Shell script for monitoring WebLogic

#!/bin/bash
export CLASSPATH="/opt/Oracle/Middleware/wlserver_10.3/server/lib/weblogic.jar:$CLASSPATH"
export PATH="/usr/java/jdk1.6.0_45/bin:$PATH"

URL="192.168.222.11:7020"
USER_NAME="weblogic"
PASS_WORD="weblogic1"
DOMAIN_NAME="MedRecDomain"
SERVER_NAME="MedRecAdmSvr"

STATE_ALL=$(java weblogic.Admin -url "$URL" -username "$USER_NAME" -password "$PASS_WORD" get -pretty -mbean "$DOMAIN_NAME:Location=$SERVER_NAME,Name=$SERVER_NAME,Type=ServerRuntime")

# Check WebLogic instance running status
if echo "$STATE_ALL" | grep -q "State: RUNNING"; then
    echo "$URL $DOMAIN_NAME $SERVER_NAME running status is OK"
else
    echo "$URL $DOMAIN_NAME $SERVER_NAME running status is not OK"
fi

# Check WebLogic instance health status
if echo "$STATE_ALL" | grep -q "State:HEALTH_OK"; then
    echo "$URL $DOMAIN_NAME $SERVER_NAME health status is OK"
else
    echo "$URL $DOMAIN_NAME $SERVER_NAME health status is not OK"
fi

# Check WebLogic instance open sockets number
SOCKET_MAX=200
SOCKET_NOW=$(echo "$STATE_ALL" | awk '/OpenSocketsCurrentCount/{print $2}')
if [ -z "$SOCKET_NOW" ]; then
    echo "$URL $DOMAIN_NAME $SERVER_NAME open sockets number is not OK: fail to get"
else
    if [ "$SOCKET_NOW" -gt "$SOCKET_MAX" ]; then
        echo "$URL $DOMAIN_NAME $SERVER_NAME open sockets number is not OK: $SOCKET_NOW greater than $SOCKET_MAX"
    else
        echo "$URL $DOMAIN_NAME $SERVER_NAME open sockets number is OK: $SOCKET_NOW not greater than $SOCKET_MAX"
    fi
fi
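
The socket check can also be written as small functions that work on the captured text, which lets the parsing be tried out without a running server. This is only a sketch: `get_open_sockets` and `sockets_ok` are hypothetical names, and the sample text in the usage note merely imitates the "Name: value" layout that the script above searches for.

```shell
# get_open_sockets TEXT  (hypothetical helper)
# Print the value of the first OpenSocketsCurrentCount line, if any.
get_open_sockets() {
    printf '%s\n' "$1" | awk '/OpenSocketsCurrentCount/ {print $2; exit}'
}

# sockets_ok TEXT LIMIT  (hypothetical helper)
# Return 0 if the count is not greater than LIMIT, 1 if it is greater,
# and 2 if no numeric count could be extracted.
sockets_ok() {
    local now
    now=$(get_open_sockets "$1")
    case $now in
        ''|*[!0-9]*) return 2 ;;   # missing or non-numeric value
    esac
    [ "$now" -le "$2" ]
}
```

For example, `sockets_ok "$STATE_ALL" "$SOCKET_MAX" || echo "open sockets check failed"` would replace the nested `if` while still telling a missing value (status 2) apart from an exceeded limit (status 1).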

3.5、Compressing and archiving WebLogic logs daily

#!/bin/bash
# Compress AdminServer.log into a dated archive, then empty the live log
LOG_DIR=/app/weblogic/Oracle/Middleware/user_projects/domains/base_domain/bin
TODAY=$(date -u +"%Y%m%d")
/usr/bin/gzip -c "$LOG_DIR/AdminServer.log" > "$LOG_DIR/AdminServer${TODAY}.out.gz"
: > "$LOG_DIR/AdminServer.log"

#!/bin/ksh
# nohup_save.sh: the same procedure for nohup.out
LOG_DIR=/root/Oracle/Middleware/user_projects/domains/base_domain/bin
TODAY=$(date -u +"%Y%m%d")
/usr/bin/gzip -c "$LOG_DIR/nohup.out" > "$LOG_DIR/nohup${TODAY}.out.gz"
: > "$LOG_DIR/nohup.out"
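
Both scripts follow the same pattern of compressing a copy and then truncating the live file, so the pattern can be captured once. The sketch below is an illustration under stated assumptions, not the original author's script: `rotate_log` is a hypothetical helper, and the retention clean-up with a default of 30 days is an addition that the scripts above do not have.

```shell
# rotate_log FILE [KEEP_DAYS]  (hypothetical helper)
# Compress FILE to FILE.YYYYMMDD.gz, truncate FILE in place, and delete
# archives of the same log older than KEEP_DAYS days (default 30, an assumption).
rotate_log() {
    local log=$1 keep=${2:-30} today
    [ -f "$log" ] || { echo "no such log: $log" >&2; return 1; }
    today=$(date -u +%Y%m%d)
    gzip -c "$log" > "$log.$today.gz" || return 1
    : > "$log"    # truncate in place so the writing process keeps its file handle
    find "$(dirname "$log")" -name "$(basename "$log").*.gz" -mtime +"$keep" -exec rm -f {} +
}
```

As in the scripts above, lines written between the `gzip` step and the truncation are lost, so this is best scheduled for a quiet period.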

3.6、WebLogic status monitoring script

echo "======================================welcome=============================================="
echo "====                                                                                 ======"
echo "====      This script monitors the running state of a WebLogic domain. It covers     ======"
echo "====                Server, Thread, Request, JDBC state and Sockets                  ======"
echo "====                Edit url, username and password before use                       ======"
echo "====                created by xxx at 2009-03-30                                     ======"
echo "==========================================================================================="
url=t3://***.***.***.***.***:8082
username=weblogic
password=123456

while true
do
echo "==============================WebLogic domain name==========================================================="

java -cp /bea/weblogic81/server/lib/weblogic.jar weblogic.Admin -url "$url" -username "$username" -password "$password" GET -pretty -type Server -property Parent | awk '/^\t/' | awk 'NR==1{print $2}'

echo "=============================================================================================================="
echo "========================================Current idle threads=================================================="
java -cp /bea/weblogic81/server/lib/weblogic.jar weblogic.Admin -url "$url" -username "$username" -password "$password" GET -pretty -type ExecuteQueueRuntime -property ExecuteThreadCurrentIdleCount -property ExecuteThreadTotalCount -property ServicedRequestTotalCount

echo "=============================================================================================================="

echo "======================================= Server IP and port ===================================================="
java -cp /bea/weblogic81/server/lib/weblogic.jar weblogic.Admin -url "$url" -username "$username" -password "$password" GET -pretty -type Server -property Name -property ListenAddress -property ListenPort

echo "=============================================================================================================="
echo "=======================================Server state and OpenSocketsCurrentCount================================"
java -cp /bea/weblogic81/server/lib/weblogic.jar weblogic.Admin -url "$url" -username "$username" -password "$password" GET -pretty -type ServerRuntime -property State -property Server -property OpenSocketsCurrentCount

echo "=============================================================================================================="
echo "============================JDBC connection pool initial and maximum capacity, and its target servers=========="
java -cp /bea/weblogic81/server/lib/weblogic.jar weblogic.Admin -url "$url" -username "$username" -password "$password" GET -pretty -type JDBCConnectionPool -property Name -property InitialCapacity -property MaxCapacity -property Targets
echo "=============================================================================================================="
sleep 60
done
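
The separator lines in the script above are typed by hand, which makes them easy to get wrong and hard to keep the same width. A small helper can generate them instead. This is a sketch only: `banner` is a hypothetical name, and the default width of 80 columns is an arbitrary choice.

```shell
# banner TITLE [WIDTH]  (hypothetical helper)
# Print TITLE centred in a line of "=" characters, WIDTH columns wide (default 80).
# A TITLE longer than WIDTH is printed as-is. Lengths are counted per character,
# so the alignment is only exact for single-width (ASCII) titles.
banner() {
    local title=$1 width=${2:-80} pad left right
    pad=$(( width - ${#title} ))
    if [ "$pad" -lt 0 ]; then
        pad=0
    fi
    left=$(( pad / 2 ))
    right=$(( pad - left ))
    printf '%s%s%s\n' \
        "$(printf '%*s' "$left" '' | tr ' ' '=')" \
        "$title" \
        "$(printf '%*s' "$right" '' | tr ' ' '=')"
}
```

Inside the loop, `banner " Current idle threads " 110` would then replace the corresponding hand-written `echo` line.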

Monitoring software such as Hyperic HQ and Jennifer can also be used.

3.7、Python script for monitoring WebLogic

###############################################################################
# created on 2013-07-09
#
# used to get weblogic server runtime information
# wls_ver: weblogic 10.3.5.0
###############################################################################
import os
import java
from java.util import Date
from java.text import SimpleDateFormat

###############################################################################
# parameters define
###############################################################################
username='weblogic'
password='isp902isp'
url='t3://10.200.36.210:17101'
LOOPS=3
IntervalTime=30000  # milliseconds to sleep between two samples
FILEPATH="e:/logs/"
newline = "\n"
###############################################################################
# define functions
###############################################################################
def WriteToFile(ServerName, SubModule, LogString, LSTARTTIME, FILENAME):
    # Append one CSV row; a new file gets the header line first
    if SubModule == "ServerCoreInfo":
        HeadLineInfo = "DateTime,ServerName,ExecuteThreadIdleCount,StandbyThreadCount,ExecuteThreadTotalCount,busythread,HoggingThreadCount"
    elif SubModule == "DataSourceInfo":
        HeadLineInfo = "DateTime,ServerName,DataSourceName,ActiveConnectionsCurrentCount,CurrCapacity,WaitingForConnectionCurrentCount,WaitingForConnectionTotal"

    if not os.path.exists(FILENAME):
        print "log file does not exist, creating it."
        f = open(FILENAME, "a+")
        f.write(HeadLineInfo + newline)
        f.write(LSTARTTIME + "," + ServerName + "," + LogString + newline)
        f.close()
    else:
        f = open(FILENAME, "a+")
        f.write(LSTARTTIME + "," + ServerName + "," + LogString + newline)
        f.close()

def getCurrentTime():
    s=SimpleDateFormat("yyyyMMdd HHmmss")
    currentTime=s.format(Date())
    return currentTime

def GetJdbcRuntimeInfo():
    # Print and log the connection pool figures of every data source on every server
    domainRuntime()
    servers = domainRuntimeService.getServerRuntimes()
    print ' ******************DATASOURCE CONNECTION POOL RUNTIME INFORMATION*******'
    for server in servers:
        print 'SERVER: ' + server.getName()
        ServerName=server.getName()
        jdbcRuntime = server.getJDBCServiceRuntime()
        datasources = jdbcRuntime.getJDBCDataSourceRuntimeMBeans()
        for datasource in datasources:
            ds_name=datasource.getName()
            print('-Data Source: ' + datasource.getName() + ', Active Connections: ' + repr(datasource.getActiveConnectionsCurrentCount()) + ', CurrCapacity: ' + repr(datasource.getCurrCapacity())+' , WaitingForConnectionCurrentCount: '+repr(datasource.getWaitingForConnectionCurrentCount())+' , WaitingForConnectionTotal: '+str(datasource.getWaitingForConnectionTotal()))
            FILENAME=FILEPATH + ServerName +"_"+ ds_name + "_"+ "DataSourceInfo" +".csv"
            LSTARTTIME=getCurrentTime()
            dsLogString=ds_name +','+ str(datasource.getActiveConnectionsCurrentCount())+','+repr(datasource.getCurrCapacity())+','+str(datasource.getWaitingForConnectionCurrentCount())+','+str(datasource.getWaitingForConnectionTotal())
            WriteToFile(ServerName, "DataSourceInfo", dsLogString, LSTARTTIME, FILENAME)

def GetThreadRuntimeInfo():
    # Print and log the thread pool figures of every server
    domainRuntime()
    servers=domainRuntimeService.getServerRuntimes()
    print ' ******************SERVER QUEUE THREAD RUNTIME INFORMATION***************'
    for server in servers:
        print 'SERVER: ' + server.getName()
        ServerName=server.getName()
        threadRuntime=server.getThreadPoolRuntime()
        hoggingThreadCount = str(threadRuntime.getHoggingThreadCount())
        idleThreadCount = str(threadRuntime.getExecuteThreadIdleCount())
        standbycount = str(threadRuntime.getStandbyThreadCount())
        threadTotalCount = str(threadRuntime.getExecuteThreadTotalCount())
        busythread=str(threadRuntime.getExecuteThreadTotalCount()-threadRuntime.getStandbyThreadCount()-threadRuntime.getExecuteThreadIdleCount()-1)
        print ('-Thread :' + 'idleThreadCount:' + idleThreadCount+' ,standbycount:'+standbycount+' , threadTotalCount: '+threadTotalCount+' , hoggingThreadCount:'+hoggingThreadCount+' ,busythread:'+busythread)
        FILENAME=FILEPATH + ServerName +"_"+ "ServerCoreInfo" +".csv"
        LSTARTTIME=getCurrentTime()
        serLogString=idleThreadCount+','+standbycount+','+threadTotalCount+','+busythread+','+hoggingThreadCount
        WriteToFile(ServerName, "ServerCoreInfo", serLogString, LSTARTTIME, FILENAME)
###############################################################################
############ main
###############################################################################
if __name__ == '__main__':
    from wlstModule import *#@UnusedWildImport
print 'starting the script ....'
connect(username,password, url)
try:
    for i in range(LOOPS):
        GetThreadRuntimeInfo()
        GetJdbcRuntimeInfo()
        java.lang.Thread.sleep(IntervalTime)
except Exception, e:
    print e
    dumpStack()
    raise
disconnect()

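Save the script above as CollectRuntime.py and place it in the WebLogic installation directory D:\weblogic\bea\wlserver_10.3\common\bin. Edit the WebLogic user name, password, IP, port and log path in CollectRuntime.py so that they match your environment. Open a CMD window, change to D:\weblogic\bea\wlserver_10.3\common\bin, and run wlst.cmd CollectRuntime.py
The CSV files are generated automatically under the log path and contain the HoggingThread and busyThread figures.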
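
Once a few samples have been collected, the CSV can be summarised with standard tools. The sketch below prints the peak busythread value of one ServerCoreInfo file. `max_busy_threads` is a hypothetical helper, and it assumes the file has been copied to a machine with a POSIX shell; the column position follows the header line that the script writes.

```shell
# max_busy_threads CSV_FILE  (hypothetical helper)
# Print the highest busythread value in a ServerCoreInfo CSV.
# busythread is column 6 according to the header written by the script above.
# Prints nothing if the file has no numeric data rows.
max_busy_threads() {
    awk -F, 'NR > 1 && $6 ~ /^-?[0-9]+$/ {
                 if (!seen || $6 + 0 > max) { max = $6 + 0; seen = 1 }
             }
             END { if (seen) print max }' "$1"
}
```

For example, `max_busy_threads AdminServer_ServerCoreInfo.csv` reports the busiest sample of that server; the file name here only follows the `ServerName_ServerCoreInfo.csv` pattern used by the script.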

3.8、Automated Oracle inspection script

#!/bin/bash 
# 
#    NAME 
#      report_oracle_inspection.sh   2016-09-30
# 
#    DESCRIPTION 
#      collecting the DB info 
# 
#    NOTES 
#      sh  report_oracle_inspection.sh 
# 
#    MODIFIED   (yyyy-mm-dd) 
#    


 
echo 'Instance Health Data' 
echo '================================================' 
echo "The current database is $ORACLE_SID" 
echo "The current running processes for $ORACLE_SID are" 
echo '================================================' 
ps -ef|grep "$ORACLE_SID"|grep -v grep 
 
sqlplus -S /nolog <<EOF      
 connect / as sysdba 
set feedback off 
set heading off 
select '00.instance information' from dual; 
select '================================================' from dual; 
set linesize 1000 
set pagesize 1000 
set heading on 
select * from v\$instance; 
 
set heading off 
select '01:database created date and archive type' from dual; 
select '================================================' from dual; 
set heading on  
Select Created, Log_Mode From V\$Database; 
 
set heading off 
select '1.ulimit oracle' from dual; 
select '================================================' from dual; 
!ulimit -a 
 
set heading off 
select '2.installed production option' from dual; 
select '================================================' from dual; 
set linesize 1000 
set pagesize 1000 
set heading on 
select * from v\$option; 
 
set heading off 
select '3.used production option' from dual; 
select '================================================' from dual; 
set linesize 1000 
set pagesize 1000 
col COMP_NAME for a40 
set heading on 
select COMP_ID, COMP_NAME, VERSION,STATUS from   dba_registry; 
 
set heading off 
select '4.spfile' from dual; 
select '================================================' from dual; 
show parameter spfile 
 
set heading off 
select '5.not default parameter' from dual; 
select '================================================' from dual; 
col name for a40 
col value for a40 
set heading on 
select name,value from v\$parameter where isdefault='FALSE'; 
 
set heading off 
select '6.control file' from dual; 
select '================================================' from dual; 
show parameter control_files 
 
set heading off 
select '7.backup control file' from dual; 
select '================================================' from dual; 
alter database backup controlfile to trace; 
 
set heading off 
select '8.log file' from dual; 
select '================================================' from dual; 
set linesize 1000 
set pagesize 1000 
set heading on 
select group#,thread#,bytes/1024/1024 size_MB , members, archived,status from 
v\$Log; 
 
set heading off 
select '9.log file members' from dual; 
col MEMBER for a40 
select '================================================' from dual; 
set heading on 
select * From v\$logfile order by 1; 
 
set heading off 
select '10.Archive log' from dual; 
select '================================================' from dual; 
Archive log list  
 
select '11.data file' from dual; 
select '================================================' from dual; 
set heading on 
select count(*),sum(bytes)/1024/1024/1024 ||'G'  max_G from v\$datafile; 
 
SELECT trunc(sum(sum_m-sum_free_m)/1024,2)||'G' used_G 
FROM (  
SELECT tablespace_name,sum(bytes)/1024/1024 AS sum_m FROM dba_data_files 
where tablespace_name not like 'UNDO%' GROUP BY tablespace_name) df, 
(SELECT tablespace_name, 
sum(bytes)/1024/1024 AS sum_free_m 
FROM dba_free_space GROUP BY tablespace_name ) fs 
where df.tablespace_name=fs.tablespace_name; 
 
set heading off 
select '12.data file location' from dual; 
select '================================================' from dual; 
set heading on 
select t1.TABLESPACE_NAME,t1.FILE_ID, t1.bytes/1024/1024 
SIZE_MB,t1.AUTOEXTENSIBLE AUT,t2.status,t1.FILE_NAME 
from dba_data_files t1,v\$datafile t2 
where t1.file_id=t2.file#; 
 
set heading off 
select '13-1.temp data file' from dual; 
select '================================================' from dual; 
set heading on 
select FILE_NAME,FILE_ID,TABLESPACE_NAME,BYTES/1024/1024 
byte_MB,status,AUTOEXTENSIBLE from sys.dba_temp_files; 
 
set heading off 
select '13-2.temp tablespace' from dual; 
select '================================================' from dual; 
set heading on 
col file_name for a30 
col  byte_MB for a20 
col  cached_MB for a20 
SELECT d.file_name,  v.status, TO_CHAR((d.bytes / 1024 / 1024), '99999990.000') 
byte_MB,  
TO_CHAR(NVL(t.bytes_cached, 0) / 1024 / 1024, '99999990.000') cached_MB, 
d.autoextensible, d.increment_by, d.maxblocks  
FROM sys.dba_temp_files d, v\$temp_extent_pool t, v\$tempfile v  
WHERE (t.file_id (+)= d.file_id) AND (d.tablespace_name = 'TEMP') AND (d.file_id 
= v.file#); 
 
set heading off 
select '14.system tablespace' from dual; 
select '================================================' from dual; 
set heading on 
select owner,segment_type,segment_name from dba_segments where owner not 
in('SYS','SYSTEM','MDSYS','ORDSYS','OUTLN','WMSYS') and 
tablespace_name='SYSTEM' order by 1; 
exit 
EOF 
 
ora_version=`sqlplus -S '/ as sysdba' <<EOF 
set head off 
select version from v\\\$instance; 
exit; 
EOF` 
 
echo $ora_version 
 
if [ `echo $ora_version|awk -F"." '{print $1}'` -ne 8 ] 
then 
sqlplus -S /nolog <<EOF 
conn / as sysdba 
set linesize 1000 
set pagesize 1000 
set heading off 
select '15.tablespace fragmentation and free' from dual; 
select '================================================' from dual; 
col TABLESPACE_NAME for a30 
col FREE_PCT for a20 
set heading on 
SELECT df.TABLESPACE_NAME,FILES, extent_management ,sum_m as 
TOTAL_SIZE,--sum(largest) as "MAXFREE_MB",  
sum_free_m as "FREE_MB",to_char(100*sum_free_m/sum_m, '999.99') AS 
FREE_PCT--,sum(blocks) as "FREE_EXTENTS" 
FROM ( SELECT tablespace_name,count(file_id) as files ,sum(bytes)/1024/1024 AS 
sum_m FROM dba_data_files GROUP BY tablespace_name) df, 
(SELECT tablespace_name,--max(bytes)/1024/1024 largest, 
sum(bytes)/1024/1024 AS sum_free_m --,count(blocks) as blocks 
FROM dba_free_space GROUP BY tablespace_name ) fs,(select 
tablespace_name,extent_management from dba_tablespaces) ts 
where df.tablespace_name=fs.tablespace_name and 
fs.tablespace_name=ts.tablespace_name;  
exit; 
EOF 
 
else 
sqlplus -S /nolog <<EOF 
conn / as sysdba 
set linesize 1000 
set pagesize 1000 
select '16.tablespace fragmentation and free (8i)' from dual; 
select '================================================' from dual; 
col TABLESPACE_NAME for a30 
col FREE_PCT for a20 
set heading on 
SELECT df.TABLESPACE_NAME,FILES, sum_m as TOTAL_SIZE,--sum(largest) as 
"MAXFREE_MB",  
sum_free_m as "FREE_MB",to_char(100*sum_free_m/sum_m, '999.99') AS 
FREE_PCT--,sum(blocks) as "FREE_EXTENTS" 
FROM ( SELECT tablespace_name,count(file_id) as files ,sum(bytes)/1024/1024 AS 
sum_m FROM dba_data_files GROUP BY tablespace_name) df, 
(SELECT tablespace_name,--max(bytes)/1024/1024 largest, 
sum(bytes)/1024/1024 AS sum_free_m --,count(blocks) as blocks 
FROM dba_free_space GROUP BY tablespace_name ) fs,(select tablespace_name from 
dba_tablespaces) ts 
where df.tablespace_name=fs.tablespace_name and 
fs.tablespace_name=ts.tablespace_name;  
exit; 
EOF 
fi 


sqlplus -S /nolog <<EOF 
conn / as sysdba 
set linesize 1000 
set pagesize 1000 
set heading off 
select '17.object list' from dual; 
select '================================================' from dual; 
col OBJECT_TYPE for a20 
set heading on 
select owner,replace(object_type,' ','_') as OBJECT_TYPE,count(*) from 
dba_objects where 
owner not in ('SYS','SYSTEM') group by owner,object_type order by 
owner,object_type; 
 
set heading off 
select '18.invalid objects' from dual; 
select '================================================' from dual; 
col OBJECT_NAME for a40 
col OBJECT_TYPE for a20 
set heading on 
select OWNER,OBJECT_NAME,replace(OBJECT_TYPE,' ','_') as 
OBJECT_TYPE,STATUS,TIMESTAMP from dba_objects where status='INVALID'; 
 
set heading off 
select '19.dblinks' from dual; 
select '================================================' from dual; 
col DB_LINK for a40 
col OWNER for a10 
col HOST for a20 
set heading on 
select * from dba_db_links; 
 
set heading off 
select '20.indexes' from dual; 
select '================================================' from dual; 
set heading on 
select * From dba_indexes where BLEVEL>4; 
 
set heading off 
select '21.dba role' from dual; 
select '================================================' from dual; 
set heading on 
select grantee,granted_role from dba_role_privs where granted_role='DBA'; 
 
set heading off 
select '22.sysdba role' from dual; 
select '================================================' from dual; 
set heading on 
SELECT * FROM v\$pwfile_users order by username; 

set head off 
select '2-performance' from dual; 
select '================================================================================================' from dual; 
select '2-1.buffer cache hit ratio:(Higher than 80% is ok, high value does not always mean good performance)' from dual; 
select '================================================' from dual; 
set head on 
select (1 - (sum(decode(name, 'physical reads', value, 0)) / (sum(decode(name, 'db block gets', value, 0)) + 
sum(decode(name, 'consistent gets', value, 0))))) * 100  "Hit Ratio" from v\$sysstat;  

set head off 
select '2-2.data dictionary hit ratio:should >98%' from dual; 
select '================================================' from dual; 
set head on 
select (1 - (sum(getmisses) / sum(gets))) * 100 "Hit Ratio" from v\$rowcache;  

set head off 
select '2-3.library cache hit ratio:(Should be kept over 90%, otherwise there might be too much reparse)' from dual; 
select '================================================' from dual; 
set head on 
select sum(pins) / (sum(pins) + sum(reloads)) * 100 "Hit Ratio" from v\$librarycache;  

set head off 
select '2-4.memory sort ratio:should >98%' from dual; 
select '================================================' from dual; 
set head on 
select a.value "Disk Sorts", b.value "Memory Sorts", round((100 * b.value) / 
decode((a.value + b.value), 0, 1, (a.value + b.value)), 2) "Pct Memory Sorts" 
from v\$sysstat a, v\$sysstat b where a.name = 'sorts (disk)' and b.name = 'sorts (memory)';

set head off 
select '2-5.memory top 10 sql read ratio:should <5%' from dual; 
select '================================================' from dual; 
set head on 
select sum(pct_bufgets) 
from (select rank() over(order by buffer_gets desc) as rank_bufgets, to_char(100 * ratio_to_report(buffer_gets) over(), '999.99') pct_bufgets from v\$sqlarea) 
where rank_bufgets < 11;  

set heading off 
select '' from dual; 
select '2-6.Top 10 Wait Event (Time unit:Hundredths of a second, IO operations should be common wait event)' from dual; 
select '================================================' from dual; 
set heading on 
column event format a30 
select * from (select event,total_waits,time_waited, average_wait from v\$system_event where  
event not like 'SQL*Net%' and event not like '%ipc%' order by total_waits desc) where  rownum<11;  

set head off 
select '2-7.memory top 10 sql' from dual; 
select '================================================' from dual; 
set head on  
set serveroutput on size 1000000 
declare 
top10 number; 
text1 varchar2(4000); 
x number; 
len1 number; 
cursor c1 is 
select buffer_gets, substr(sql_text, 1, 4000) from v\$sqlarea 
order by buffer_gets desc; 
begin  
dbms_output.put_line('------------' || ' ' || '-------------------'); 
open c1; 
for i in 1 .. 10 loop 
fetch c1 into top10, text1; 
exit when c1%notfound; 
dbms_output.put_line('------------top sql No.' ||i||'-------------------');
dbms_output.put_line(rpad(to_char(top10), 9) ); 
len1 := length(text1); 
x := 1; 
while len1 > x - 1 loop 
dbms_output.put_line('  ' || substr(text1, x, 65)); 
x := x + 65; 
end loop; 
end loop; 
close c1; 
end; 
/  

set head off 
select '2-8.IO information' from dual; 
select '================================================' from dual; 
set head on 
Select phyrds,phywrts,d.name from v\$datafile d,v\$filestat f where f.file#=d.file# order by d.name;  

set head off 
select '2-9.full table scan' from dual; 
select '================================================' from dual; 
set head on 
Select name,value value1 from v\$sysstat where name like '%table scan%';  

set head off 
select '3-1.sys and system security' from dual; 
select '================================================' from dual; 
set head on 
select username "User(s) with Default Password!",ACCOUNT_STATUS from dba_users where password in 
('E066D214D5421CCC',  -- dbsnmp 
'24ABAB8B06281B4C',  -- ctxsys 
'72979A94BAD2AF80',  -- mdsys 
'C252E8FA117AF049',  -- odm 
'A7A32CD03D3CE8D5',  -- odm_mtr 
'88A2B2C183431F00',  -- ordplugins 
'7EFA02EC7EA6B86F',  -- ordsys 
'4A3BA55E08595C81',  -- outln 
'F894844C34402B67',  -- scott 
'3F9FBD883D787341',  -- wk_proxy 
'79DF7A1BD138CF11',  -- wk_sys 
'7C9BA362F8314299',  -- wmsys 
'88D8364765FCE6AF',  -- xdb 
'F9DA8977092B7B81',  -- tracesvr 
'9300C0977D7DC75E',  -- oas_public
'A97282CE3D94E29E',  -- websys 
'AC9700FD3F1410EB',  -- lbacsys 
'E7B5D92911C831E1',  -- rman 
'AC98877DE1297365',  -- perfstat 
'66F4EF5650C20355',  -- exfsys 
'84B8CBCA4D477FA3',  -- si_informtn_schema 
'D4C5016086B2DC6A',  -- sys 
'D4DF7931AB130E37')  -- system 
; 
exit 
EOF 
 
 
cd $ORACLE_HOME/network/admin/ 
echo '3-2.listener configure' 
echo 'listener.ora================================================' 
cat listener*.ora 
sleep 2;  
echo 'sqlnet.ora=================================================' 
cat sqlnet*.ora 
sleep 2;  
echo 'tnsnames.ora================================================' 
cat tnsnames*.ora 
sleep 2;  
 
echo '3-3.controlfile dump============================================' 
ora_dump=`sqlplus -S '/ as sysdba' <<EOF 
set head off 
select value 
from v\\\$parameter 
where name='user_dump_dest'; 
exit; 
EOF` 
 
cd $ora_dump 
ls -lt|head -n 2|tail -n 1|awk '{print $9}'|xargs cat 
sleep 2; 
 
echo '3-4.Alert Log ORA- Warning Error============================================' 
ora_background_dump=`sqlplus -S '/ as sysdba' <<EOF 
set head off 
select value 
from v\\\$parameter 
where name='background_dump_dest'; 
exit; 
EOF` 
 
cd $ora_background_dump 
tail -10000 alert_$ORACLE_SID.log|grep ORA- 
sleep 2; 
 
echo '3-5.Alert Log size============================================' 
ls -l alert_$ORACLE_SID.log 
 
echo '3-6.listener.log size============================================' 
lsnrctl status|grep listener.log|awk '{print $4}'|xargs ls -l 
 
echo '3-7.crontab info============================================' 
crontab -l 
 
echo '3-8.Alert Log  tail 20000 nums============================================' 
tail -20000 alert_$ORACLE_SID.log 
 
SYSTEM=`uname -s` 
export SYSTEM 
 
echo '4.machine information============================================' 
 
if [ "$SYSTEM" = "Linux" ]; then 
echo "----------------host name----------------" 
hostname 
echo "" 
echo "----------------id----------------" 
id 
echo "" 
echo '--- Current uptime,users and load averages ---' 
uptime 
echo "" 
echo "----------------CPU number----------------" 
cat /proc/cpuinfo 
sleep 1; 
echo "" 
echo "----------------memory info----------------" 
cat /proc/meminfo 
sleep 1; 
echo "" 
echo "----------------disk info----------------" 
df -k 
sleep 1; 
echo "" 
echo "----------------kernel parameter----------------" 
cat /etc/sysctl.conf 
sleep 1; 
echo "" 
echo "----------------os level----------------" 
lsb_release -a 
sleep 1; 
echo "" 
echo "----------------product type----------------" 
dmidecode |grep Product 
sleep 1; 
echo "" 
echo "----------------CPU memory usage----------------" 
vmstat 5 5 
sleep 1; 
echo "" 
echo "----------------top info----------------" 
top -b -d 1 -n 20 
sleep 5; 
top -b -d 1 -n 20  
sleep 5; 
top -b -d 1 -n 20  
sleep 5; 
top -b -d 1 -n 20  
sleep 5; 
top -b -d 1 -n 20  
 
elif [ "$SYSTEM" = "SunOS" ]; then 
echo "----------------host name----------------" 
hostname 
echo "" 
echo "----------------id----------------" 
id 
echo "----------------CPU,memory number----------------" 
/usr/platform/sun4u/sbin/prtdiag -v 
echo "----------------os level----------------" 
cat /etc/release 
echo "----------------Kernel parameter----------------" 
/usr/sbin/sysdef |grep SHM 
/usr/sbin/sysdef |grep SEM 
cat /etc/system 
echo "----------------disk info----------------" 
df -k 
echo "----------------IP info----------------" 
ifconfig -a 
sleep 1; 
 
elif [ "$SYSTEM" = "AIX" ]; then 
echo "----------------host name----------------" 
hostname 
echo "" 
echo "----------------id----------------" 
id 
echo "" 
echo "----------------machine plat----------------" 
uname -M 
echo "" 
echo "----------------CPU,memory number----------------" 
prtconf 
sleep 2; 
echo "" 
echo "----------------disk info----------------" 
df -k 
echo "" 
echo "----------------os level----------------" 
oslevel -r 
echo "" 
echo "----------------kernel parameter----------------" 
lsattr -El sys0 
echo "" 
echo "----------------HACMP----------------" 
lslpp -l |grep cluster 
echo "" 
echo "----------------network parameter----------------" 
no -a 
echo "" 
echo "----------------CPU memory usage----------------" 
vmstat 5 5 
sleep 1; 
echo "" 
echo "----------------IP info----------------" 
ifconfig -a 
sleep 1; 
echo "" 
echo "----------------view cluster----------------" 
lssrc -g cluster 
sleep 1; 
echo "" 
echo "----------------view VG----------------" 
lsvg 
sleep 1; 
echo "" 
 
 
elif [ "$SYSTEM" = "HP-UX" ]; then 
echo "----------------host name----------------" 
hostname 
echo "" 
echo "----------------id----------------" 
id 
echo "" 
echo "----------------machine plat----------------" 
model 
echo "" 
echo "----------------CPU,memory number----------------" 
machinfo 
sleep 2; 
echo "" 
echo "----------------disk info----------------" 
bdf 
echo "" 
echo "----------------os level----------------" 
uname -r 
echo "" 
echo "----------------HACMP----------------" 
lslpp -l |grep cluster 
echo "" 
echo "----------------network parameter----------------" 
no -a 
echo "" 
echo "----------------CPU memory usage----------------" 
vmstat 5 5 
sleep 1; 
sar -du 5 5 
echo "" 
echo "----------------IP info----------------" 
ifconfig -a 
 
else 
echo "Unsupported system: $SYSTEM" 
fi
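
Every branch above hinges on the value of $SYSTEM. The following is a minimal sketch of the same dispatch written with case, assuming SYSTEM comes from `uname -s` (the assignment is outside this excerpt). The helper name disk_cmd_for is hypothetical, and the Linux entry is an assumption because the start of that branch is not shown here.

```shell
#!/bin/bash
# Hypothetical helper: map an OS name (as printed by `uname -s`) to the
# disk-usage command used in the matching branch of the script above.
disk_cmd_for() {
    case "$1" in
        Linux|SunOS|AIX) echo "df -k" ;;   # Linux entry is an assumption
        HP-UX)           echo "bdf" ;;
        *)               echo "unsupported"; return 1 ;;
    esac
}

SYSTEM=$(uname -s)
echo "disk command for $SYSTEM: $(disk_cmd_for "$SYSTEM")"
```

A case statement keeps each platform on one line and makes the fallback explicit, which is easier to extend than a chain of elif tests.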

3.9、Automated server inspection script

#!/bin/bash

login_info=$1

gather_server_ip=$2

gather_server_password=$3

grep_ip=`ifconfig | grep -oE "([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}" | head -n 1`

GatherPath="/tmp/GatherLogDirectory"

CheckScriptPath="/tmp/CheckScript"

if [ $# -ne 3 ]; then

echo -e "Wrong number of parameters!\n"

echo -e "Usage: $0 login_info gather_server_ip gather_server_password\n"

echo -e "For example: $0 IpAndPassword.txt $grep_ip gather_server_password\n"

exit 1

fi

if [ ! -d "$GatherPath" ];then

mkdir -p "$GatherPath"

echo "The log's path is: $GatherPath"

fi

cat $login_info | while read line

do

server_ip=`echo "$line" | awk '{print $1}'`

server_password=`echo "$line" | awk '{print $2}'`

login_server_command="ssh -o StrictHostKeyChecking=no root@$server_ip"

scp_gather_server_checksh="scp checksh.sh root@$server_ip:$CheckScriptPath"

/usr/bin/expect <<EOF

set timeout 20

spawn $login_server_command

expect {
"*yes/no" { send "yes\r"; exp_continue }

"*password:" { send "$server_password\r" }

}

expect "Permission denied, please try again." {exit}

expect "#" { send "mkdir $CheckScriptPath\r"}

expect eof

exit

EOF

/usr/bin/expect <<EOF

set timeout 20

spawn $scp_gather_server_checksh

expect {
"*yes/no" { send "yes\r"; exp_continue }

"*password:" { send "$server_password\r" }

}

expect "Permission denied, please try again." {exit}

expect "Connection refused" {exit}

expect "100%"

expect eof

exit

EOF

/usr/bin/expect <<EOF

set timeout 60

spawn $login_server_command

expect {
"*yes/no" { send "yes\r"; exp_continue }

"*password:" { send "$server_password\r" }

}

expect "Permission denied, please try again." {exit}

expect "#" { send "cd $CheckScriptPath;./checksh.sh $gather_server_ip $gather_server_password\r"}

expect eof

exit

EOF

done

checksh.sh

#!/bin/bash

########################################################################################

#Function:

#This script checks the system's information, the disks' information, performance, etc. of the

#server

#

#Author:

#By Jack Wang

#

#Company:

#ShaanXi Great Wall Information Co.,Ltd.

########################################################################################

########################################################################################

#

#GatherServerIpAddress is the IP address of the server that gathers the check logs

#GatherServerPassword is the password of the server that gathers the check logs

#

########################################################################################

GatherServerIpAddress=$1

GatherServerPassword=$2

########################################################################################

#GetTheIpCommand is a command that you can get the IP address

########################################################################################

GetTheIpCommand=`ifconfig | grep -oE "([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}" | head -n 1`

########################################################################################

#LogName is the name of the log: the first IP address, "-", and the date (YYYYmmdd)

########################################################################################

LogName=`ifconfig | grep -oE "([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}" | head -n 1`-`date +%Y%m%d`

########################################################################################

#

#GatherLogPath is a path that collecting log path

#LocalServerLogPath is local log path

#

########################################################################################

GatherServerLogPath="/tmp/GatherLogDirectory"

LocalServerLogPath="/tmp/LocalServerLogDirectory"

########################################################################################

#LinuxOsInformation is a function used to collect the OS's information

########################################################################################

LinuxOsInformation(){
Hostname=`hostname`

UnameA=`uname -a`

OsVersion=`cat /etc/issue | sed "2,4d"`

Uptime=`uptime|awk '{print $3}'|awk -F "," '{print $1}'`

ServerIp=`ifconfig|grep "inet"|sed "2,4d"|awk -F ":" '{print $2}'|awk '{print $1}'`

ServerNetMask=`ifconfig|grep "inet"|sed "2,4d"|awk -F ":" '{print $4}'|awk '{print $1}'`

ServerGateWay=`netstat -r|grep "default"|awk '{print $2}'`

SigleMemoryCapacity=`dmidecode|grep -P -A5 "Memory\s+Device"|grep "Size"|grep -v "Range"|grep "[0-9]"|awk -F ":" '{print $2}'|sed "s/^[ \t]*//g"`

MaximumMemoryCapacity=`dmidecode -t 16|grep "Maximum Capacity"|awk -F ":" '{print $2}'|sed "s/^[ \t]*//g"`

NumberOfMemorySlots=`dmidecode -t 16|grep "Number Of Devices"|awk -F ":" '{print $2}'|sed "s/^[ \t]*//g"`

MemoryTotal=`cat /proc/meminfo|grep "MemTotal"|awk '{printf("MemTotal:%1.0fGB\n",$2/1024/1024)}'|awk -F ":" '{print $2}'`

PhysicalMemoryNumber=`dmidecode|grep -A16 "Memory Device"|grep "Size:"|grep -v "No Module Installed"|grep -v "Range Size:"|wc -l`

ProductName=`dmidecode|grep -A10 "System Information"|grep "Product Name"|awk -F ":" '{print $2}'|sed "s/^[ \t]*//g"`

SystemCPUInfomation=`cat /proc/cpuinfo|grep "name"|cut  -d: -f2|awk '{print "*"$1,$2,$3,$4}'|uniq -c|sed "s/^[ \t]*//g"`

echo -e "Hostname|$Hostname\nUnamea|$UnameA\nOsVersion|$OsVersion\nUptime|$Uptime\nServerIp|$ServerIp\nServerNetMask|$ServerNetMask\nServerGateWay|$ServerGateWay\nSigleMemoryCapacity|$SigleMemoryCapacity\nMaximumMemoryCapacity|$MaximumMemoryCapacity\nNumberOfMemorySlots|$NumberOfMemorySlots\nMemoryTotal|$MemoryTotal\nPhysicalMemoryNumber|$PhysicalMemoryNumber\nProductName|$ProductName\nSystemCPUInformation|$SystemCPUInfomation"

}

PerformanceInfomation (){
CPUIdle=`top -d 2 -n 1 -b|grep "C[Pp][Uu]"|grep id|awk '{print $5}'|awk -F "%" '{print $1}'`

CPUloadAverage=`top -d 2 -n 1 -b|grep "load average:"|awk -F ":" '{print $5}'|sed "s/^[ \t]*//g"`

ProcessNumbers=`top -d 2 -n 1 -b|grep "Tasks"|awk -F "[: ,]" '{print $3}'`

ProcessRunning=`top -d 2 -n 1 -b|grep "Tasks"|awk -F "[: ,]" '{print $8}'`

ProcessSleeping=`top -d 2 -n 1 -b|grep "Tasks"|awk -F "[: ,]" '{print $11}'`

ProcessStoping=`top -d 2 -n 1 -b|grep "Tasks"|awk -F "[: ,]" '{print $16}'`

ProcessZombie=`top -d 2 -n 1 -b|grep "Tasks"|awk -F "[: ,]" '{print $21}'`

UserSpaceCPU=`top -d 2 -n 1 -b|grep "C[Pp][Uu]"|head -1|awk -F "[: ,%]" '{print $4}'`

SystemSpaceCPU=`top -d 2 -n 1 -b|grep "C[Pp][Uu]"|head -1|awk -F "[: ,%]" '{print $8}'`

ChangePriorityCPU=`top -d 2 -n 1 -b|grep "C[Pp][Uu]"|head -1|awk -F "[: ,%]" '{print $12}'`

WaitingCPU=`top -d 2 -n 1 -b|grep "C[Pp][Uu]"|head -1|awk -F "[: ,%]" '{print $19}'`

HardwareIRQCPU=`top -d 2 -n 1 -b|grep "C[Pp][Uu]"|head -1|awk -F "[: ,%]" '{print $23}'`

SoftwareIRQCPU=`top -d 2 -n 1 -b|grep "C[Pp][Uu]"|head -1|awk -F "[: ,%]" '{print $27}'`

MemUsed=`top -d 2 -n 1 -b|grep "Mem"|awk -F "[: ,]" '{print $11}'|tr -d "a-zA-Z"|awk '{printf("%dM\n",$1/1024)}'`

MemFreeP=`top -d 2 -n 1 -b|grep "Mem"|awk -F "[: ,]" '{print $16}'|tr -d "a-zA-Z"|awk '{printf("%dM\n",$1/1024)}'`

MemBuffersP=`top -d 2 -n 1 -b|grep "Mem"|awk -F "[: ,]" '{print $22}'|tr -d "a-zA-Z"|awk '{printf("%dM\n",$1/1024)}'`

CacheCachedP=`top -d 2 -n 1 -b|grep "Swap"|awk -F "[: ,]" '{print $24}'|tr -d "a-zA-Z"|awk '{printf("%dM\n",$1/1024)}'`

CacheTotal=`top -d 2 -n 1 -b|grep "Swap"|awk -F "[: ,]" '{print $4}'|tr -d "a-zA-Z"|awk '{printf("%dM\n",$1/1024)}'`

CacheUsed=`top -d 2 -n 1 -b|grep "Swap"|awk -F "[: ,]" '{print $14}'|tr -d "a-zA-Z"|awk '{printf("%dM\n",$1/1024)}'`

CacheFree=`top -d 2 -n 1 -b|grep "Swap"|awk -F "[: ,]" '{print $18}'|tr -d "a-zA-Z"|awk '{printf("%dM\n",$1/1024)}'`

echo -e "CPUIdle|$CPUIdle\nCPUloadAverage|$CPUloadAverage\nProcessNumbers|$ProcessNumbers\nProcessRunning|$ProcessRunning\nProcessSleeping|$ProcessSleeping\nProcessStoping|$ProcessStoping\nProcessZombie|$ProcessZombie\nUserSpaceCPU|$UserSpaceCPU\nSystemSpaceCPU|$SystemSpaceCPU\nChangePriorityCPU|$ChangePriorityCPU\nWaitingCPU|$WaitingCPU\nHardwareIRQCPU|$HardwareIRQCPU\nSoftwareIRQCPU|$SoftwareIRQCPU\nMemUsed|$MemUsed\nMemFreeP|$MemFreeP\nMemBuffersP|$MemBuffersP\nCacheCachedP|$CacheCachedP\nCacheTotal|$CacheTotal\nCacheUsed|$CacheUsed\nCacheFree|$CacheFree\n"

}

OprateSystemSec () {
echo "======================UserLogin======================"

w

echo "======================FileUsed======================="

df -ah

echo "======================dmesgError====================="

dmesg | grep error

echo "======================dmesgFail======================"

dmesg | grep Fail

echo "======================BootLog========================"

grep -v "OK" /var/log/boot.log | sed "1,6d"

echo "======================route -n======================="

route -n

echo "======================iptables -L===================="

iptables -L

echo "======================netstat -lntp=================="

netstat -lntp

echo "======================netstat -antp=================="

netstat -antp

echo "======================netstat -s====================="

netstat -s

echo "======================last==========================="

last

echo "======================du -sh /etc/==================="

du -sh /etc/

echo "======================du -sh /boot/=================="

du -sh /boot/

echo "======================du -sh /dev/==================="

du -sh /dev/

echo "======================df -h=========================="

df -h

echo "======================mount | column -t=============="

mount | column -t

}

TopAndVmstat(){
top -d 2 -n 1 -b

vmstat 1 10

}

CheckGatherLog(){
if [ -f "$LocalServerLogPath/$GetTheIpCommand.log" ];then

rm -rf $LocalServerLogPath/$GetTheIpCommand.log

fi

if [ ! -d "$LocalServerLogPath" ];then

mkdir -p "$LocalServerLogPath"

fi

if [ ! -f "$LocalServerLogPath/$GetTheIpCommand.log" ];then

touch $LocalServerLogPath/$GetTheIpCommand.log

LinuxOsInformation>>$LocalServerLogPath/$GetTheIpCommand.log

PerformanceInfomation>>$LocalServerLogPath/$GetTheIpCommand.log

OprateSystemSec>>$LocalServerLogPath/$GetTheIpCommand.log

TopAndVmstat>>$LocalServerLogPath/$GetTheIpCommand.log

fi

}

CheckGatherLog

SCP_LOG_TO_GATHER_SERVER="scp $LocalServerLogPath/$GetTheIpCommand.log root@$GatherServerIpAddress:$GatherServerLogPath"

/usr/bin/expect <<EOF

set timeout 50

spawn $SCP_LOG_TO_GATHER_SERVER

expect {
"*yes/no" { send "yes\r"; exp_continue }

"*password:" { send "$GatherServerPassword\r" }

}

expect "100%"

expect eof

EOF

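Save the first script above as shellsh.sh and the second as checksh.sh in the same directory, then create a file.txt with one "IP password" pair per line, for example:
192.168.182.143 123456

Then run it with the host list plus the IP address and password of the gathering server (the script requires all three arguments): ./shellsh.sh file.txt gather_server_ip gather_server_password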
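
The host list is read line by line, so a blank line or a stray header would be treated as a server. The following is a minimal sketch of a more defensive reader, assuming the "IP password" layout described above. The function name parse_hosts and the demo values are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Print "ip|password" for every usable line of a host list.
# Blank lines, "#" comments and lines whose first field is not a
# dotted-quad IPv4 address (such as an "IP password" header) are skipped.
parse_hosts() {
    local ip password
    while read -r ip password _; do
        [ -z "$ip" ] && continue
        case "$ip" in \#*) continue ;; esac
        if printf '%s\n' "$ip" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
            printf '%s|%s\n' "$ip" "$password"
        fi
    done < "$1"
}

# Demo with a throwaway file (sample values only).
demo=$(mktemp)
printf 'IP password\n192.168.182.143 123456\n\n# comment\n' > "$demo"
parse_hosts "$demo"
rm -f "$demo"
```

Reading the two fields directly with read avoids spawning awk twice per line and sidesteps the quoting pitfalls of passing $1 and $2 through the shell.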

3.10、Linux server inspection script

#!/bin/bash
##############################################################
#This script collects resource usage, service process status and related information from servers.
#############################################################


TIME=`date +"%Y-%m-%d-%H-%M"`

#black:30 red:31 green:32 yellow:33 blue:34 purple:35 darkgreen:36

#Color helpers
RED(){
    val=$1
    echo -e "\033[31m ${val} \033[0m"
}

GREEN(){
    val=$1
    echo -e "\033[32m ${val} \033[0m"
}
YELLOW(){
    val=$1
    echo -e "\033[33m ${val} \033[0m"
}
BLUE(){
    val=$1
    echo -e "\033[34m ${val} \033[0m"
}

PURPLE(){
    val=$1
    echo -e "\033[35m ${val} \033[0m"
}

DARKGREEN(){
    val=$1
    echo -e "\033[36m ${val} \033[0m"
}

###########################################################
#Run a command on the remote host over ssh
commd(){
ssh -o "StrictHostKeyChecking no" $IP  "$cmd" < /dev/null
}

###########################################################
#Load resource thresholds and host groups from gather.conf

eval $(/bin/grep disksize  gather.conf)
eval $(/bin/grep cpusize  gather.conf)
eval $(/bin/grep memsize  gather.conf)
eval $(/bin/grep swapsize  gather.conf)
eval $(/bin/grep dropsize  gather.conf)
eval $(/bin/grep ntpsize   gather.conf)

eval $(/bin/grep vss gather.conf)
eval $(/bin/grep vss_ip gather.conf)
eval $(/bin/grep ntp1_server_ip gather.conf)
eval $(/bin/grep ntp1_client gather.conf)
eval $(/bin/grep ntp2_server_ip gather.conf)
eval $(/bin/grep ntp2_client gather.conf)
eval $(/bin/grep sarip gather.conf)
eval $(/bin/grep mass gather.conf)
eval $(/bin/grep ipvsadm gather.conf)
eval $(/bin/grep losssize gather.conf)

eval $(/bin/grep bo  gather.conf)
eval $(/bin/grep bo_ip  gather.conf)
eval $(/bin/grep cdn  gather.conf)
eval $(/bin/grep cdn_ip  gather.conf)
eval $(/bin/grep VSS  gather.conf)
eval $(/bin/grep VSS_IP  gather.conf)
eval $(/bin/grep lvs  gather.conf)
eval $(/bin/grep lvs_ip  gather.conf)
eval $(/bin/grep portal  gather.conf)
eval $(/bin/grep portal_ip  gather.conf)
##########################################################



DISK(){
    cmd='df -h'
    disk=`commd $IP $cmd |awk '{if (NF==6){print $5","$6}else if (NF==5){print $4","$5}}'|grep -v -E "已用%|挂载点|shm|boot"`
    for data in $disk
    do
        valnum=`echo $data|awk -F[,%] '{print $1}'`
        diskname=`echo $data|awk -F[,%] '{print $3}'`
        if [ $valnum -gt $disksize ] ;then
            RED "($diskname),$valnum%"
        else
            echo "($diskname),$valnum%"
        fi
    done
}
CPU(){
    cmd='top -b n 1'
    CPUVAL=`commd $IP $cmd|grep "Cpu(s)"|awk -F, '{print $1,$2,$4}'`
    cpuval=`echo $CPUVAL|awk -F[:%] -v cpusizes=$cpusize '{if($2>cpusizes){print 0}else {print 1}}'`
    if [ -n "$CPUVAL" ];then
        if [ $cpuval -eq 0 ];then
            RED "$CPUVAL"
        else
            echo "$CPUVAL"
        fi
    fi
}


MEM(){
    cmd='free -m'
    mem=`commd $IP $cmd|grep "Mem"|awk '{sum=$2-$4-$6-$7}END{printf "%d\n", (sum/$2*100)}'`
    if [ -n "$mem" ];then
        if [ $mem -gt $memsize ];then
            RED "MEM: $mem%"
        else
            echo "MEM: $mem%"
        fi
    fi
}

SWAP(){
    cmd='free -m'
    swap=`commd $IP $cmd|grep "Swap"|awk '{if($2>0){printf "%d\n",($3/$2*100)}else{print 0}}'`
    if [ -n "$swap" ];then
        if [ $swap -gt $swapsize ];then
            RED "SWAP: $swap%"
        else
            echo "SWAP: $swap%"
        fi
    fi
}
NET(){
    cmd="ifconfig|grep -E 'eth|bond'"
    network=`commd $IP $cmd|grep -E '^eth|^bond'|awk '{print $1}'`
    for net in $network
    do
        net=`echo $net|grep -v ":"`
        if [ "$net" != "" ];then
            cmd="ifconfig $net"
            DROPVAL=`commd $IP $PASSWD $cmd|grep "RX packets"|awk /dropped/'{print $4}'`
            dropval=`echo $DROPVAL|awk -F: '{print $2}'`
            cmd="ethtool $net|grep 'Link'"
            UPDOWN=`commd $IP $PASSWD $cmd|grep 'detected'|awk '{print $NF}'`
            statval=`echo $UPDOWN|sed 's/\r//g'`
            if [ "$statval" == "no" ] || [ $dropval -gt $dropsize ] ;then
                RED "$NAME,$IP,$net,$DROPVAL,$UPDOWN"
            else
                echo "$NAME,$IP,$net,$DROPVAL,$UPDOWN"
            fi
        fi
    done
}

GATH(){
for IP in $1
do
    NAME=`cat iplist.txt|grep "\<$IP\>"|awk '{print $3}'`
    sip=`echo $IP|awk -F. '{print $1"."$2"."$3}'`
    cmd="ip add|grep $sip|awk '{print \$2}'|uniq|wc -l"
    num=`commd $IP $cmd`
    NAMES=`echo $NAME|grep 'CDN'`
    if [ "${NAMES}" != "" ];then
        if [ $num -ge 4 ]; then
            for role in $2
            do
                cmd="ps uax|grep ngod|grep $role|grep -v 'grep'"
                val=`commd $IP $cmd|grep "$role"|grep -v "grep"`
                if [ "$val" == "" ];then
                    RED "$NAME,$IP,$role,process not found!"
                else
                    echo "$NAME,$IP,$role,process running!"
                fi
            done
        else
            for role in $2
            do
                roles=`echo $role|grep -E "rti|csi|cls"`
                if [ "$roles" == "" ] ;then
                cmd="ps uax|grep ngod|grep $role|grep -v 'grep'"
                                val=`commd $IP $cmd|grep "$role"|grep -v "grep"`
                                if [ "$val" == "" ];then
                                        RED "$NAME,$IP,$role,process not found!"
                                else
                                        echo "$NAME,$IP,$role,process running!"
                                fi
                fi
                        done
        fi
    else
        if [ $num -ge 2 ]; then
                        for role in $2
                        do
                                cmd="ps uax|grep $role|grep -v 'grep'"
                                val=`commd $IP $cmd|grep "?"`
                                if [ "$val" == "" ];then
                                        RED "$NAME,$IP,$role,process not found!"
                                else
                                        echo "$NAME,$IP,$role,process running!"
                                fi
                        done
                else
                        echo "$NAME,$IP,standby host"
                fi
    fi
done
}

GATHS(){
for IP in $1
do
    NAME=`cat iplist.txt|grep "\<$IP\>"|awk '{print $3}'`
        for role in $2
        do
            cmd="ps aux|grep ngod|grep $role|grep -v 'grep'"
                val=`commd $IP $cmd|grep "$role"|grep -v "grep"`
                if [ "$val" == "" ];then
                    echo "$NAME,$IP,$role,process not found!"
                else
                        echo "$NAME,$IP,$role,process running!"
                fi
        done
done
}

NTP(){
    ntp1(){
    if [ "$ntp1_server_ip" != "" ] && [ "$ntp1_client" != "" ] ;then
        for IP in ${ntp1_client[@]}
        do
            NAME=`cat iplist.txt|grep "\<$IP\>"|awk '{print $3}'`
            cmd="ntpdate -u ${ntp1_server_ip}"
            ntpval=`commd $IP $cmd|awk /offset/'{print $10}'`
            ntpnum=`echo $ntpval|awk -v ntpsizes=$ntpsize '{if ($1>ntpsizes){print  0}else {print 1}}'`
            if [ $ntpnum -eq 0 ] ;then
                RED "$NAME,$IP,$ntpval"
            else
                echo "$NAME,$IP,$ntpval"
            fi
        done
    fi
    }
    ntp2(){
    if [ "$ntp2_server_ip" != "" ] && [ "$ntp2_client" != "" ] ;then
        for IP in ${ntp2_client[@]}
        do
            NAME=`cat iplist.txt|grep "\<$IP\>"|awk '{print $3}'`
            cmd="ntpdate -u ${ntp2_server_ip}"
            ntpval=`commd $IP $cmd|awk /offset/'{print $10}'`
            ntpnum=`echo $ntpval|awk -v ntpsizes=$ntpsize '{if ($1>ntpsizes){print  0}else {print 1}}'`
            if [ $ntpnum -eq 0 ] ;then
                RED "$NAME,$IP,$ntpval"
            else
                echo "$NAME,$IP,$ntpval"
            fi
        done
    fi
}
ntp2
ntp1
}

PING(){
    network(){
        cmd="ping -I $1 -c 5 $2"
        loss=`commd $IP $cmd|grep " packets"|awk '{print $6}'`
        lossval=`echo $loss|awk -F% '{print $1}'`
        if [ $lossval -gt $losssize ] ;then
            RED "$NAME,$IP,$1,$loss"
        else
            echo "$NAME,$IP,$1,$loss"
        fi
    }
    cat iplist.txt|grep -v "\#" |while read file
    do
        IP=`echo $file|awk '{print $1}'`
        NAME=`echo $file|awk '{print $3}'`
        cmd='route -n|grep -E "bond1|bond0"'
        val=`commd $IP $cmd|grep "\<UG\>"|awk '{print $2,$NF}'|sort|uniq`
        net_num=`echo $val|awk '{print NF}'`
        if [ $net_num -eq 2 ];then
            gw=`echo $val|awk '{print $1}'`
            netname=`echo $val|awk '{print $2}'`
            network $netname  $gw 
        elif [ $net_num -eq 4 ] ; then
            gw1=`echo $val|awk '{print $1}'`
            gw2=`echo $val|awk '{print $3}'`
            netname1=`echo $val|awk '{print $2}'`
            netname2=`echo $val|awk '{print $4}'`
            network $netname1 $gw1
            network $netname2 $gw2
        fi
    done
    
}

SAR(){

    if [ "$sarip" != "" ];then
        sar(){
            NAME=`cat iplist.txt|grep "\<$IP\>"|awk '{print $3}'`
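            # Assumed command, taken from the -s entry in Usage; the source line breaks off after the word export.
            cmd="sar -n DEV 2 2"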
            GREEN "*********************************${NAME},${IP}*******************************************"
            commd $IP $cmd|grep -E "平均时间:|Average"
        }
        for IP in ${sarip[@]}
        do
            sar 
        done
    fi
}
MASS(){

    if [ "$mass" != "" ];then
        Date=`date +%Y-%m-%d_%H-%M-%S`
        ipfile='disk_mass.csv'
        WGET()
        {
                wget -O diskstatus.xml "http://${ip}:${port}/cmd?cmdname=GetSysInfo" >/dev/null 2>&1
        }
        echo "AREA,IPADDR,MASSSTATUS"
        for serverinfo in `cat ${ipfile}|grep "$mass"`
        do
                area=`echo ${serverinfo}|cut -d, -f1`
                ip=`echo ${serverinfo}|cut -d, -f2`
                port=`echo ${serverinfo}|cut -d, -f3`
                if ( ping -c 1 $ip >/dev/null );then
                            WGET
                        if [ `grep -c ok diskstatus.xml` -eq 1 ];then
                                allsize=`awk -F "<SysAllSize>" '{print $2}' diskstatus.xml  |awk -F "</" '{print $1}'`
                                usesize=`awk -F "<SysUsedSize>" '{print $2}' diskstatus.xml  |awk -F "</" '{print $1}'`
                                stat='ok'
                                let per=usesize*100/allsize
                        else
                                stat='bad'
                        fi
                else
                        stat='bad'
                fi
            if [ "$stat" = "ok" ];then
            echo "${area},${ip},${per}%"
        elif [ "$stat" = "bad" ] ;then
            RED "${area},${ip},error"
        fi
    done
    rm -rf diskstatus.xml
    fi
}

IPVSADM(){
    if [ "$ipvsadm" != "" ];then
        for IP in ${ipvsadm[@]}
        do
            NAME=`cat iplist.txt|grep "\<$IP\>"|awk '{print $3}'`
            sip=`echo $IP|awk -F. '{print $1"."$2"."$3}'`
            cmd="ip add|grep -c $sip"
            num=`commd $IP $cmd`
            if [ $num -gt 1 ];then
                cmd='ipvsadm -ln'
                GREEN "*********************************${NAME},${IP}************************************"
                commd $IP $cmd|grep "Route"
            else
                cmd='/etc/init.d/keepalived status'
                BLUE "**********************************${NAME},${IP}************************************"
                commd $IP $cmd|grep "keepalived"
            fi
        done
    fi
    
}

main(){
RED "####IP#########|################CPU#############|##MEM#####|####SWAP###|########################DISK###############################"
cat iplist.txt|grep -v "\#"|while read file
do
    IP=`echo $file|awk '{print $1}'`
    NAME=`echo $file|awk '{print $3}'`
    disk=`DISK|xargs`
    cpu=`CPU`
    mem=`MEM`
    swa=`SWAP`
    echo "$NAME,$IP:|  $cpu | $mem | $swa  | DISK:$disk"
done 
}

WNETS(){
cat iplist.txt|grep -v "#"|while read file
do
    IP=`echo $file|awk '{print $1}'`
    NAME=`echo $file|awk '{print $3}'`
    NET 
done    
}

ips=${vss_ip[@]};gath=${vss[@]}
bos=${bo[@]};bo_ips=${bo_ip[@]}
cdns=${cdn[@]};cdn_ips=${cdn_ip[@]}
VSSS=${VSS[@]};VSS_IPS=${VSS_IP[@]}
lvss=${lvs[@]};lvs_ips=${lvs_ip[@]}
portals=${portal[@]};portal_ips=${portal_ip[@]}

#RED GREEN YELLOW BLUE PURPLE DARKGREEN

Usage(){
    echo """$0
    -h    help.
    -z    cpu,mem,swap...
    -n    ntp
    -p    ping
    -s    sar -n DEV 2 2
    -m    MASS
    -i    IPVSADM
    -w    NETWORK
    -j    application processes
    -a    all of the above."""
}
if [ $# -eq 0 ];then
    Usage
fi
while getopts ":hznpsmiwja" opt
do
    case $opt in 
        "h")
        Usage
        exit -1
            ;;
        "z")
            main
            #TIME
            ;;
        "n")
            NTP
            ;;
        "p")
            PING
            ;;
        "s")
            SAR
            ;;
        "m")
            MASS
            ;;
        "i")
            IPVSADM
            ;;
        "w")
            WNETS
            ;;
        "j")
            GATH "$ips" "$gath"
            GATHS "$bo_ips" "$bos"
            GATH "$cdn_ips" "$cdns"
            GATHS "$VSS_IPS" "$VSSS"
            GATH "$lvs_ips" "$lvss"
            GATHS "$portal_ips" "$portals"
            ;;
        "a")
            main
            #TIME
            GREEN "NTP******************************************************************************"
            NTP
            YELLOW "PING******************************************************************************"
            PING
            SAR
            PURPLE "MASS******************************************************************************"
            MASS
            IPVSADM
            DARKGREEN "NETWORK******************************************************************************"
            WNETS
            YELLOW "PROCESSES******************************************************************************"
            GATH "$ips" "$gath"
            GATHS "$bo_ips" "$bos"
            GATH "$cdn_ips" "$cdns"
            GATHS "$VSS_IPS" "$VSSS"
            GATH "$lvs_ips" "$lvss"
            GATHS "$portal_ips" "$portals"
            ;;
        *)
        echo "Unknown option; run with -h for usage"
        exit -1
            ;;
    esac
done
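
The DISK function above compares each mount point's usage with the disksize threshold from gather.conf. The following is a minimal local sketch of that comparison, assuming `df -P` style input on stdin. The function name flag_disks and the threshold of 80 are hypothetical, and mount points containing spaces are not handled.

```shell
#!/bin/bash
# Read `df -P` style text on stdin and print "(mount),usage%" for every
# filesystem; entries above the threshold get a leading "ALERT ".
flag_disks() {
    local limit=$1
    awk -v limit="$limit" 'NR > 1 {
        use = $5; sub(/%/, "", use)          # strip the percent sign
        prefix = (use + 0 > limit) ? "ALERT " : ""
        printf "%s(%s),%s%%\n", prefix, $6, use
    }'
}

df -P | flag_disks 80
```

Using `df -P` keeps every filesystem on a single line, so the field positions stay fixed and the NF==5/NF==6 special-casing for wrapped lines is not needed.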

3.11、Linux server inspection script

#!/bin/bash
#Define a function that checks the operating system version (returns 0 on el6)
NUM_VERSION=$(uname -r)
function Check_OS(){
[[ $NUM_VERSION =~ el6 ]] && return 0||return 1
}

echo "######CPU usage######"
CPU_HARDWARE=$(cat /proc/cpuinfo | grep name |cut -f2 -d: | uniq -c)
CPU_NUMBER=$(cat /proc/cpuinfo | grep name |cut -f2 -d: | uniq -c | awk '{print $1}')
CPU_LOAD=$(uptime | awk '{for(i=6;i<=NF;i++) printf $i""FS;print ""}')
CPU_LOAD_NUMBER=$(uptime | awk -F"load average:" '{print $2}' | awk -F"," '{print $1}' | awk -F"." '{print $1}' |sed 's/^[ \t]*//g')
CPU_UTILIZ=$(top -bn 1 | grep "Cpu(s)")
if [[ $CPU_LOAD_NUMBER -lt $CPU_NUMBER ]]
 then
  CPU_STATUS=OK
 else
  CPU_STATUS=abnormal
fi
echo "$CPU_STATUS("$CPU_HARDWARE,$CPU_LOAD,$CPU_UTILIZ")"
echo -e
echo -e

echo "######Disk usage######"
IFS=$'\n'
for i in `df -hP | sed 1d | awk '{print $(NF-1)"\t"$NF"\t"$(NF-2)}'`
do 
 DISK_UTILIZ=$(echo $i |awk  '{print $1}')
 MOUNT_DISK=$(echo $i |awk  '{print $2}')
 DISK_FREE=$(echo $i |awk  '{print $3}')
 if [[ $(echo $DISK_UTILIZ | sed s/%//g) -gt 70 ]]
   then
    echo "abnormal (usage of $MOUNT_DISK is high at $DISK_UTILIZ, please check)"
   else
    continue
 fi
done
unset IFS
echo -e
echo "Disk usage details:"
df -hP | sed 1d | awk '{print $NF" partition: free space "$(NF-2)", usage "$(NF-1)}'
UMAIL_DIR=$(cat /usr/local/u-mail/config/custom.conf | grep "mailroot" | awk -F"=" '{print $2}' | sed 's/^[ \t]*//g')
echo "Mail data is stored in "$UMAIL_DIR
echo -e
echo -e

echo "######Memory usage######"
Check_OS
RESULT=$?
if [ ${RESULT} -eq 0 ]
 then
  MEM_SUM_NUM=$(free -m | grep "Mem:" | awk -F" " '{print $2}')
  MEM_SURPLUS_NUM=$(free -m | grep "Mem:" | awk '{for(i=4;i<=NF;i++) print $i""FS;}' | awk '{a+=$1}END{print a}')
  MEM_SUM=$(free -m | grep "Mem:" | awk -F" " '{print $2"M"}')
  MEM_SURPLUS=$(free -m | grep "Mem:" | awk '{for(i=4;i<=NF;i++) print $i""FS;}' | awk '{a+=$1}END{print a"M"}')
  MEM_USED=$(echo $(($MEM_SUM_NUM-$MEM_SURPLUS_NUM)))
  PERCENT=$(printf "%d%%" $(($MEM_USED*100/$MEM_SUM_NUM)))
  PERCENT_NUM=$(echo $PERCENT|sed s/%//g)
   if [[ $PERCENT_NUM -lt 70 ]]
    then
     MEM_STATUS=OK
    else
     MEM_STATUS=abnormal
   fi
  echo "$MEM_STATUS (total memory $MEM_SUM, free memory $MEM_SURPLUS, memory usage $PERCENT)"
 else
  MEM_SUM_NUM7=$(free -m | grep "Mem:" | awk -F" " '{print $2}')
  MEM_SURPLUS_NUM7=$(free -m | grep "Mem:" | awk -F" " '{print $4}')
  MEM_SUM7=$(free -m | grep "Mem:" | awk -F" " '{print $2"M"}')
  MEM_SURPLUS7=$(free -m | grep "Mem:" | awk -F" " '{print $4"M"}')
  MEM_USED7=$(echo $(($MEM_SUM_NUM7-$MEM_SURPLUS_NUM7)))
  PERCENT7=$(printf "%d%%" $(($MEM_USED7*100/$MEM_SUM_NUM7)))
  PERCENT_NUM7=$(echo $PERCENT7|sed s/%//g)
   if [[ $PERCENT_NUM7 -lt 70 ]]
    then
     MEM_STATUS=OK
    else
     MEM_STATUS=abnormal
   fi
  echo "$MEM_STATUS (total memory $MEM_SUM7, free memory $MEM_SURPLUS7, memory usage $PERCENT7)"
fi 
echo -e
echo -e

echo "######OS version and mail system version######"
OS_VERSION=$(cat /etc/redhat-release)
UMAILAPP_VERSION=$(rpm -qa | grep umail_app | awk -F"." '{print $1"."$2"."$3}')
UMAILWEB_VERSION=$(rpm -qa | grep umail_webmail | awk -F"." '{print $1"."$2"."$3}')
echo $OS_VERSION,$UMAILAPP_VERSION,$UMAILWEB_VERSION
echo -e
echo -e

echo "######Basic system operation check######"
SSH_SUM=$(grep -c "authentication failure" /var/log/secure)
SSH_DIY=500
if [ "$SSH_SUM" -gt "$SSH_DIY" ]
 then
  echo "Someone may be trying to guess your root password, please check"
 else
  echo "OK"
fi
echo -e
echo -e

echo "######Suspicious processes or backdoors######"
echo "OK" 
echo -e
echo -e

echo "######Antivirus and firewall status######"
Check_OS
RESULT=$?
if [ ${RESULT} -eq 0 ]
 then
  /etc/init.d/iptables status 1>/dev/null 2>&1
  RESULT_IPTABLES=$?
  if [ ${RESULT_IPTABLES} -eq 0 ]
   then
    echo "The built-in OS firewall is running"
   else
    echo "The built-in OS firewall is not running"
  fi
 else
  systemctl status firewalld.service 1>/dev/null 2>&1
  RESULT_FIREWALLD=$?
  if [ ${RESULT_FIREWALLD} -eq 0 ]
   then
    echo "The built-in OS firewall is running"
   else
    echo "The built-in OS firewall is not running"
  fi  
fi
Check_OS
RESULT=$?
if [ ${RESULT} -eq 0 ]
 then
  ps -ef | grep umail_clamd | grep -v grep 1>/dev/null 2>&1
  RESULT_CLAMD6=$?
  /etc/init.d/umail_clamd status 1>/dev/null 2>&1
  RESULT_CLAMDSTATUS6=$?
   if [ ${RESULT_CLAMD6} -eq 0 ] && [ ${RESULT_CLAMDSTATUS6} -eq 0 ]
    then
     echo "CLAMD antivirus is installed"
   else
     echo "Antivirus is not installed or failed to start"
   fi
 else
  ps -ef | grep umail_clamd | grep -v grep 1>/dev/null 2>&1
  RESULT_CLAMD7=$?
  systemctl status umail_clamd.service 1>/dev/null 2>&1
  RESULT_CLAMDSTATUS7=$?
   if [ ${RESULT_CLAMD7} -eq 0 ] && [ ${RESULT_CLAMDSTATUS7} -eq 0 ]
    then
     echo "CLAMD antivirus is installed"
    else
     echo "Antivirus is not installed or failed to start"
   fi
fi
echo -e
echo -e

echo "######Uptime######"
LINETIME=$(uptime | awk -F"up" '{print $2}' | awk -F",  load average" '{print $1}')
echo "The server has been up for"$LINETIME
echo -e
echo -e

echo "######HTTP service######"
APACHE6_STATUS=$(/etc/init.d/umail_apache status 1>/dev/null 2>&1) 
NGINX6_STATUS=$(/etc/init.d/umail_nginx status 1>/dev/null 2>&1)
APACHE7_STATUS=$(systemctl status umail_apache.service 1>/dev/null 2>&1)
NGINX7_STATUS=$(systemctl status umail_nginx.service 1>/dev/null 2>&1)
APACHE_PROC=$(ps -ef | grep "/usr/local/u-mail/service/apache/bin/httpd" | grep -v grep 1>/dev/null 2>&1)
NGINX_PROC=$(ps -ef | grep "/usr/local/u-mail/service/nginx/sbin/nginx" | grep -v grep 1>/dev/null 2>&1)
Check_OS
RESULT=$?
if [ ${RESULT} -eq 0 ]
 then
  /etc/init.d/umail_apache status 1>/dev/null 2>&1
  RESULT_APACHE6=$?
  /etc/init.d/umail_nginx status 1>/dev/null 2>&1
  RESULT_NGINX6=$?
  ps -ef | grep "/usr/local/u-mail/service/apache/bin/httpd" | grep -v grep 1>/dev/null 2>&1
  RESULT_APACHEPROC6=$?
  ps -ef | grep "/usr/local/u-mail/service/nginx/sbin/nginx" | grep -v grep 1>/dev/null 2>&1
  RESULT_NGINXPROC6=$?
  if [ ${RESULT_APACHE6} -eq 0 ] && [ ${RESULT_NGINX6} -eq 0 ] && [ ${RESULT_APACHEPROC6} -eq 0 ] && [ ${RESULT_NGINXPROC6} -eq 0 ]
   then
    echo "HTTP service started successfully"
   else
    echo "HTTP service failed to start"
  fi
 else
  systemctl status umail_apache.service 1>/dev/null 2>&1
  RESULT_APACHE7=$?
  systemctl status umail_nginx.service 1>/dev/null 2>&1
  RESULT_NGINX7=$?
  ps -ef | grep "/usr/local/u-mail/service/apache/bin/httpd" | grep -v grep 1>/dev/null 2>&1
  RESULT_APACHEPROC7=$?
  ps -ef | grep "/usr/local/u-mail/service/nginx/sbin/nginx" | grep -v grep 1>/dev/null 2>&1
  RESULT_NGINXPROC7=$?
  if [ ${RESULT_APACHE7} -eq 0 ] && [ ${RESULT_NGINX7} -eq 0 ] && [ ${RESULT_APACHEPROC7} -eq 0 ] && [ ${RESULT_NGINXPROC7} -eq 0 ]
   then
    echo "HTTP service started successfully"
   else
    echo "HTTP service failed to start"
   fi
fi
echo -e
echo -e

echo "######SMTP service######"
Check_OS
RESULT=$?
if [ ${RESULT} -eq 0 ]
 then
  netstat -anltp | grep LISTEN | grep -E ":25[[:space:]]" 1>/dev/null 2>&1
  RESULT_SMTP=$?
  /etc/init.d/umail_postfix status 1>/dev/null 2>&1
  RESULT_POSTFIX=$?
  if [ ${RESULT_SMTP} -eq 0 ] && [ ${RESULT_POSTFIX} -eq 0 ]
   then
    echo "SMTP服务启动成功"
   else
    echo "SMTP服务启动不成功"
  fi
 else
  netstat -anltp | grep LISTEN | grep -E ":25[[:space:]]" 1>/dev/null 2>&1
  RESULT_SMTP7=$?
  systemctl status umail_postfix.service 1>/dev/null 2>&1
  RESULT_POSTFIX7=$?
  if [ ${RESULT_SMTP7} -eq 0 ] && [ ${RESULT_POSTFIX7} -eq 0 ]
   then
    echo "SMTP服务启动成功"
   else
    echo "SMTP服务启动不成功"
  fi
fi
echo -e
echo -e

echo "######POP服务######"
Check_OS
RESULT=$?
if [ ${RESULT} -eq 0 ]
 then
  netstat -anltp | grep LISTEN | grep -E ":110[[:space:]]" 1>/dev/null 2>&1
  RESULT_POP=$?
  /etc/init.d/umail_dovecot status 1>/dev/null 2>&1
  RESULT_POPPROC=$?
  if [ ${RESULT_POP} -eq 0 ] && [ ${RESULT_POPPROC} -eq 0 ]
   then
    echo "POP服务启动成功"
   else
    echo "POP服务启动不成功"
  fi
 else
  netstat -anltp | grep LISTEN | grep -E ":110[[:space:]]" 1>/dev/null 2>&1
  RESULT_POP7=$?
  systemctl status umail_dovecot.service 1>/dev/null 2>&1
  RESULT_POPPROC7=$?
  if [ ${RESULT_POP7} -eq 0 ] && [ ${RESULT_POPPROC7} -eq 0 ]
   then
    echo "POP服务启动成功"
   else
    echo "POP服务启动不成功"
  fi
fi
echo -e
echo -e

echo "######IMAP服务######"
Check_OS
RESULT=$?
if [ ${RESULT} -eq 0 ]
 then
  netstat -anltp | grep LISTEN | grep -E ":143[[:space:]]" 1>/dev/null 2>&1
  RESULT_IMAP=$?
  /etc/init.d/umail_dovecot status 1>/dev/null 2>&1
  RESULT_IMAPPROC=$?
  if [ ${RESULT_IMAP} -eq 0 ] && [ ${RESULT_IMAPPROC} -eq 0 ]
   then
    echo "IMAP服务启动成功"
   else
    echo "IMAP服务启动不成功"
  fi
 else
  netstat -anltp | grep LISTEN | grep -E ":143[[:space:]]" 1>/dev/null 2>&1
  RESULT_IMAP7=$?
  systemctl status umail_dovecot.service 1>/dev/null 2>&1
  RESULT_IMAPPROC7=$?
  if [ ${RESULT_IMAP7} -eq 0 ] && [ ${RESULT_IMAPPROC7} -eq 0 ]
   then
    echo "IMAP服务启动成功"
   else
    echo "IMAP服务启动不成功"
  fi
fi
echo -e
echo -e

echo "######收发测试(web和客户端)######"
echo "正常"
echo -e
echo -e

echo "######管理后台功能测试######"
echo "正常"
echo -e
echo -e

echo "######反垃圾反病毒测试######"
echo "正常"
echo -e
echo -e

echo "######是否有密码泄露导致群发垃圾邮件现象######"
SMTP_SUM=$(cat /usr/local/u-mail/app/log/smtp.log | grep "from:" | awk -F " " '{ print $6 }' | sed 's/<//g' | sed 's/>,//g' | sort | uniq -c | sort -rn |sed 's/^[ \t]*//g' |head -n 1 | awk -F" " '{print $1}')
SMTP_USER=$(cat /usr/local/u-mail/app/log/smtp.log | grep "from:" | awk -F " " '{ print $6 }' | sed 's/<//g' | sed 's/>,//g' | sort | uniq -c | sort -rn |sed 's/^[ \t]*//g' |head -n 1 | awk -F" " '{print $2}')
SMTP_DIY=500
if [ "${SMTP_SUM:-0}" -gt "$SMTP_DIY" ]
 then
  echo "当天外发邮件数量最大的"$SMTP_USER"用户超过"$SMTP_DIY"封,请确认"
 else
  echo "正常"
fi
echo -e
echo -e


Sample output:

[root@localhost ~]# sh check_umail.sh 
######CPU使用情况######
正常( 2 Intel(R) Xeon(R) CPU E5606 @ 2.13GHz,1 user, load average: 0.06, 0.02, 0.00 ,Cpu(s): 2.1%us, 0.8%sy, 0.2%ni, 96.5%id, 0.3%wa, 0.0%hi, 0.2%si, 0.0%st)

######磁盘使用情况######
磁盘具体使用情况:
/分区剩余空间38G 使用率20%
/dev/shm分区剩余空间1.9G 使用率1%
/boot分区剩余空间425M 使用率7%
/home分区剩余空间434G 使用率38%
邮件数据存储在/home/mailbox

######内存使用情况######
正常(总内存大小3952M,剩余内存大小3028M,内存使用率23%)

######操作系统版本和邮件系统版本######
CentOS release 6.9 (Final),umail_app-2.2.44-2,umail_webmail-1.6.69-1

######系统基本操作是否正常######
正常

######是否有可疑进程或后门######
正常

######是否安装杀毒软件防火墙######
操作系统自带防火墙已开启
已安装CLAMD杀毒软件

######开机时长######
服务器开机时间为 33 days, 6:29, 1 user

######HTTP服务######
HTTP服务启动成功

######SMTP服务######
SMTP服务启动成功

######POP服务######
POP服务启动成功

######IMAP服务######
IMAP服务启动成功

######收发测试(web和客户端)######
正常

######管理后台功能测试######
正常

######反垃圾反病毒测试######
正常

######是否有密码泄露导致群发垃圾邮件现象######
正常
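
The last check in the script above finds the sender with the most "from:" entries in smtp.log and compares that count with the SMTP_DIY threshold of 500. The same counting logic can be sketched in Python 3 as follows. The sample log lines are invented for illustration, since the real smtp.log layout is not shown here; the only assumption carried over from the script is that the sender is the sixth whitespace-separated field, wrapped in angle brackets and followed by a comma. The function names top_sender and exceeds_threshold are hypothetical.

```python
from collections import Counter


def top_sender(lines):
    """Return (address, count) for the most frequent sender, or (None, 0)."""
    counts = Counter()
    for line in lines:
        if "from:" not in line:
            continue
        fields = line.split()
        if len(fields) < 6:
            continue
        # Same cleanup as the sed calls in the script: drop "<" and ">,".
        sender = fields[5].replace("<", "").replace(">,", "")
        counts[sender] += 1
    if not counts:
        return None, 0
    return counts.most_common(1)[0]


def exceeds_threshold(lines, limit=500):
    """True when the busiest sender has more than `limit` messages."""
    _, count = top_sender(lines)
    return count > limit


if __name__ == "__main__":
    # Hypothetical log lines, not taken from a real smtp.log.
    sample = [
        "2024-01-01 10:00:00 smtp[1]: A1 from: <alice@example.com>, size=120",
        "2024-01-01 10:00:05 smtp[1]: A2 from: <bob@example.com>, size=300",
        "2024-01-01 10:00:09 smtp[1]: A3 from: <alice@example.com>, size=98",
    ]
    print(top_sender(sample))
    print(exceeds_threshold(sample, limit=1))
```

Unlike the shell version, an empty or missing match yields a count of 0 instead of an empty string, so the comparison never fails on empty input.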

3.12、Enterprise server inspection

#!/bin/bash
 
function system(){
echo "#########################系统信息#########################"
OS_TYPE=`uname`
OS_VER=`cat /etc/redhat-release`
OS_KER=`uname -a|awk '{print $3}'`
OS_TIME=`date +%F_%T`
OS_RUN_TIME=`uptime |awk '{print $3}'|awk -F, '{print $1}'`
OS_LAST_REBOOT_TIME=`who -b|awk '{print $2,$3}'`
OS_HOSTNAME=`hostname`
 
echo "    系统类型:$OS_TYPE"
echo "    系统版本:$OS_VER"
echo "    系统内核:$OS_KER"
echo "    当前时间:$OS_TIME"
echo "    运行时间:$OS_RUN_TIME"
echo "最后重启时间:$OS_LAST_REBOOT_TIME"
echo "    本机名称:$OS_HOSTNAME"
}
function network(){
 
echo "#########################网络信息#########################"
INTERNET=(`ifconfig|grep ens|awk -F: '{print $1}'`)
for((i=0;i<`echo ${#INTERNET[*]}`;i++))
do
  OS_IP=`ifconfig ${INTERNET[$i]}|head -2|grep inet|awk '{print $2}'`
  echo "      本机IP:${INTERNET[$i]}:$OS_IP"
done
curl -I http://www.baidu.com &>/dev/null
if [ $? -eq 0 ]
then echo "    访问外网:成功"
else echo "    访问外网:失败"
fi
}
 
function hardware(){
 
echo "#########################硬件信息#########################"
CPUID=`grep "physical id" /proc/cpuinfo |sort|uniq|wc -l`
CPUCORES=`grep "cores" /proc/cpuinfo|sort|uniq|awk -F: '{print $2}'`
CPUMODE=`grep "model name" /proc/cpuinfo|sort|uniq|awk -F: '{print $2}'`
 
echo "     CPU数量: $CPUID"
echo "     CPU核心:$CPUCORES"
echo "     CPU型号:$CPUMODE"
 
MEMTOTAL=`free -m|grep Mem|awk '{print $2}'`
MEMFREE=`free -m|grep Mem|awk '{print $7}'`
 
echo "  内存总容量: ${MEMTOTAL}MB"
echo "剩余内存容量: ${MEMFREE}MB"
 
disksize=0
swapsize=`free|grep Swap|awk {'print $2'}`
partitionsize=(`df -T|sed 1d|egrep -v "tmpfs|sr0"|awk {'print $3'}`)
for ((i=0;i<`echo ${#partitionsize[*]}`;i++))
do
disksize=`expr $disksize + ${partitionsize[$i]}`
done
disktotal=$(( (disksize + swapsize) / 1024 / 1024 ))
 
echo "  磁盘总容量: ${disktotal}GB"
 
diskfree=0
swapfree=`free|grep Swap|awk '{print $4}'`
partitionfree=(`df -T|sed 1d|egrep -v "tmpfs|sr0"|awk '{print $5}'`)
for ((i=0;i<`echo ${#partitionfree[*]}`;i++))
do
diskfree=`expr $diskfree + ${partitionfree[$i]}`
done
 
freetotal=$(( (diskfree + swapfree) / 1024 / 1024 ))
 
echo "剩余磁盘容量:${freetotal}GB"
}
 
 
function secure(){
echo "#########################安全信息#########################"
 
countuser=(`last|grep "still logged in"|awk '{print $1}'|sort|uniq`)
for ((i=0;i<`echo ${#countuser[*]}`;i++))
do echo "当前登录用户:${countuser[$i]}"
done
  
md5sum -c --quiet /opt/passwd.db &>/dev/null
if [ $? -eq 0 ]
then echo "    用户异常:否"
else echo "    用户异常:是"
fi
}
 
function chksys(){
system
network
hardware
secure
}

# Run all checks
chksys
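
The secure function assumes that /opt/passwd.db already holds a checksum baseline in md5sum format; the script itself never creates it. A minimal Python 3 sketch of creating and verifying such a baseline follows. Which files the baseline should cover is not stated in the original (watching /etc/passwd is an assumption suggested by the file name), and md5_of, write_baseline and verify_baseline are hypothetical helper names.

```python
import hashlib


def md5_of(path):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_baseline(paths, db_path):
    """Write 'MD5  PATH' lines (two spaces), the layout md5sum -c reads."""
    with open(db_path, "w") as db:
        for path in paths:
            db.write("%s  %s\n" % (md5_of(path), path))


def verify_baseline(db_path):
    """Return the files whose current MD5 differs from the baseline."""
    changed = []
    with open(db_path) as db:
        for line in db:
            line = line.rstrip("\n")
            if not line:
                continue
            expected, path = line.split("  ", 1)
            try:
                if md5_of(path) != expected:
                    changed.append(path)
            except OSError:
                # A missing or unreadable file also counts as a change.
                changed.append(path)
    return changed
```

Calling write_baseline(["/etc/passwd"], "/opt/passwd.db") once on a known-good system (as root) would produce a file that the md5sum -c check in the script can read; running md5sum /etc/passwd > /opt/passwd.db gives the same layout.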

3.13、Emailing the daily server check results on a schedule

#!/bin/bash
# Server check script
# Load the regular user's environment variables
source /home/jack/.bash_profile
list=/home/jack/shell/monitor/serverlist
log=/home/jack/shell/monitor/logs/check_$(date +%F).log
subject="服务器日常巡检结果"
# Clear the mail queue if it contains anything
if [ "$(/usr/bin/sudo ls /var/spool/mqueue/ | wc -l)" -gt 0 ]; then
    sudo rm -rf /var/spool/mqueue/*
fi
# Start a fresh log with a timestamp
date | sed 's@CST@@g' > "$log"
# Each serverlist line holds a server name followed by its IP;
# reading both together avoids a second grep that could match several IPs
while read -r name ip; do
    [ -z "$ip" ] && continue
    if ping -c 4 "$ip" >/dev/null 2>&1; then
        echo "$name 检测正常!" >> "$log"
    else
        echo "$name 检测失败!" >> "$log"
    fi
done < "$list"
# Mail the check results
/bin/mail -s "$subject" n3h3aaaaa@163.com < "$log"

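The parameters in /etc/mail.rc are set as follows (replace the right-hand sides with your own values):

set from=mailbox address
set smtp=SMTP server address
set smtp-auth-user=mailbox user name
set smtp-auth-password=mailbox password
set smtp-auth=login

smtp-auth=login selects the login method. The serverlist file holds one server per line:

server name      server IP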
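
Each serverlist line pairs a server name with its IP, and the script pings every IP and reports the result by name. A Python 3 sketch of that loop is given below, under the assumption that the two columns are separated by whitespace. The names parse_serverlist, ping and check_servers, the injectable is_up callable, and the English OK/FAILED wording are all hypothetical; the default checker shells out to the system ping -c 4, as the script does.

```python
import subprocess


def parse_serverlist(text):
    """Parse 'NAME IP' lines into a list of (name, ip) tuples."""
    servers = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank or incomplete lines.
        if len(parts) >= 2:
            servers.append((parts[0], parts[1]))
    return servers


def ping(ip):
    """Return True when 'ping -c 4 IP' exits with status 0."""
    result = subprocess.run(
        ["ping", "-c", "4", ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def check_servers(servers, is_up=ping):
    """Return one report line per server."""
    report = []
    for name, ip in servers:
        state = "OK" if is_up(ip) else "FAILED"
        report.append("%s (%s): %s" % (name, ip, state))
    return report
```

Passing a different is_up callable (for example a TCP connect test) changes the probe without touching the parsing or reporting code.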

3.14、Python script for web server inspection

#!/usr/bin/env python
# coding=utf-8
#----------------------------------------------------------
# Name:         Web server inspection script
# Purpose:      Monitor several web servers and send an email as soon as a problem appears
# Version:      1.0
# Created:      2013-06-04
# Copyright:    (c) LEO 2013
# Python:       2.4/2.7
#----------------------------------------------------------
from smtplib import SMTP
from email import MIMEText
from email import Header
from datetime import datetime
import httplib

# Servers to check: host, port, resource path
web_servers = [('192.168.1.254', 80, 'index.html'),
               ('www.xxx.com', 80, 'index.html'),
               ('114.114.114.114', 9000, '/main/login.html'),
               ]

# SMTP host, account, password, recipients, mail subject
smtpserver = 'smtp.163.com'
sender = 'xxxx@xxx.com'
password = 'password'
# Placeholders: replace with the recipients' addresses
receiver = ('收件人1','收件人2')
subject = u'WEB服务器告警邮件'
From = u'Web服务器'
To = u'服务器管理员'

# Location of the log file
error_log = '/tmp/web_server_status.txt'

def send_mail(context):
    '''Send the alert email.'''
    # Build the message headers
    header = Header.Header
    msg = MIMEText.MIMEText(context,'plain','utf-8')
    msg['From'] = header(From)
    msg['To'] = header(To)
    msg['Subject'] = header(subject + '\n')
    # Connect to the SMTP server and send the message
    smtp = SMTP(smtpserver)
    smtp.login(sender, password)
    smtp.sendmail(sender, receiver, msg.as_string())
    smtp.close()

def get_now_date_time():
    '''Return the current date and time as a string.'''
    now = datetime.now()
    return str(now.year) + "-" + str(now.month) + "-" \
        + str(now.day) + " " + str(now.hour) + ":" \
        + str(now.minute) + ":" + str(now.second)

def check_webserver(host, port, resource):
    '''Check the status of one web server.'''
    if not resource.startswith('/'):
        resource = '/' + resource
    # Defined up front so the finally clause never sees an unbound name
    connection = None
    try:
        try:
            connection = httplib.HTTPConnection(host, port)
            connection.request('GET', resource)
            response = connection.getresponse()
            status = response.status
            content_length = response.length
        except Exception:
            return False
    finally:
        if connection is not None:
            connection.close()
    if status in [200,301] and content_length != 0:
        return True
    else:
        return False

if __name__ == '__main__':
    logfile = open(error_log,'a')
    problem_server_list = []
    for host in web_servers:
        host_url = host[0]
        check = check_webserver(host_url, host[1], host[2])
        if not check:
            temp_string = 'The Server [%s] may appear problem at %s\n' % (host_url,get_now_date_time())
            print >> logfile, temp_string
            problem_server_list.append(temp_string)
    logfile.close()
    # A non-empty problem_server_list means some server has a problem, so send the email
    if problem_server_list:
        send_mail(''.join(problem_server_list))
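
The script above targets Python 2 (httplib, print >>). For readers on Python 3, the status check can be sketched with the standard http.client module. This is an illustrative port rather than the author's code; the timeout parameter and its default of 5 seconds are additions that the original does not have.

```python
import http.client


def check_webserver(host, port, resource, timeout=5):
    """Return True when GET on the resource answers 200 or 301
    and the reported body length is not zero."""
    if not resource.startswith("/"):
        resource = "/" + resource
    connection = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        connection.request("GET", resource)
        response = connection.getresponse()
        status = response.status
        content_length = response.length
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, or a malformed reply.
        return False
    finally:
        connection.close()
    return status in (200, 301) and content_length != 0


if __name__ == "__main__":
    # Hypothetical target; replace with a host from web_servers.
    print(check_webserver("127.0.0.1", 80, "index.html", timeout=2))
```

As in the original, a response without a known length (for example a chunked reply) is treated as non-empty, because only a length of exactly zero is rejected.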

3.15、Python script for automated Linux server inspection


#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Filename: ssh.py
# Log in to servers automatically and carry out the inspection
import os
import sys
import paramiko

# Set the default string encoding (Python 2 only)
reload(sys)
sys.setdefaultencoding('utf-8')

# Log in with key-based authentication and write the inspection results to a dedicated directory
def login_by_pubkey(serverHost,serverPort,userName,keyFile):
    # open() does not expand "~", so expand it explicitly
    known_host = os.path.expanduser("~/.ssh/known_hosts")
    ssh = paramiko.SSHClient()
    ssh.load_system_host_keys(known_host)
    # Accept unknown host keys by default; an "untrusted host" exception may still be raised
    ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())

    print 'Connecting host %s......' % serverHost
    # key_filename is the private key file of the key pair
    ssh.connect(serverHost,serverPort,username = userName,key_filename = os.path.expanduser(keyFile))
    print 'Connect host %s success' % serverHost

    # The ~/xunjian directory must already exist
    fname = os.path.expanduser('~/xunjian/result_%s' % serverHost)
    f = open(fname,'w')
    # Run system commands and collect their output
    stdin, stdout, stderr = ssh.exec_command('df -h')
    f.write('step1:check disk:\n')
    for line in stdout.readlines():
        if len(line) > 0:
            print line
            f.write(line)
    vmstat_stdin,vmstat_stdout,vmstat_stderr = ssh.exec_command('vmstat 2 10')
    f.write('step2:check system:\n')
    for line in vmstat_stdout.readlines():
        if len(line) > 0:
            f.write(line)
    # "top 10" is not a valid filter; head -10 keeps the first ten matching processes
    process_stdin,process_stdout,process_stderr = ssh.exec_command('ps aux | grep java | grep -v grep | head -10')
    f.write('step3:check process:\n')
    for line in process_stdout.readlines():
        if len(line) > 0:
            f.write(line)
    # Close the file and the ssh connection
    f.close()
    ssh.close()
    print 'say bye to host %s' % serverHost
    # Generate a screenshot file (implemented in Java; calls a local Java class and depends on commons-io.jar)
    print 'generate image file of %s' % serverHost
    try:
        java_cmd = '/usr/bin/env java -cp commons-io-2.1.jar:img.jar com.*.*.*.CeateCheckPic %s' % fname
        os.system(java_cmd)
    except Exception, e:
        print 'error when generate image file of %s : %s' % (serverHost,e)
    finally:
        print '===generate image file of %s over===' % serverHost

def login_by_prikey():
    pass

if __name__ == '__main__':
    # For several servers, add one entry per server to this list; in real use, replace the placeholders with the ip, port, username and key path
    ips = ['#ip#,#port#,#user#,#pubkey_path#']

    for ip in ips:
        host,port,user,path = ip.split(',')

        print '==========start %s============' % host
        login_by_pubkey(host,int(port),user,path)
        print '>>>>>>>>>>end %s<<<<<<<<<<<<<<' % host
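
The ips list above packs each server into one comma-separated string of host, port, user and key path. A small Python 3 sketch of parsing several such entries is shown below. The Target tuple, the parse_targets name and the sample hosts and key paths are hypothetical; expanding "~" in the key path is an added convenience, not something the original does.

```python
import os
from collections import namedtuple

Target = namedtuple("Target", "host port user key_path")


def parse_targets(entries):
    """Turn 'host,port,user,key_path' strings into Target tuples."""
    targets = []
    for entry in entries:
        host, port, user, key_path = [part.strip() for part in entry.split(",")]
        # The port becomes an int; "~" in the key path is expanded.
        targets.append(Target(host, int(port), user, os.path.expanduser(key_path)))
    return targets


if __name__ == "__main__":
    # Hypothetical hosts and key paths, for illustration only.
    ips = [
        "192.168.1.10,22,ops,~/.ssh/id_rsa",
        "192.168.1.11,2222,ops,~/.ssh/id_rsa",
    ]
    for target in parse_targets(ips):
        print(target.host, target.port, target.user, target.key_path)
```

An entry with the wrong number of fields raises ValueError at parse time, before any connection is attempted.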