Disk space being eaten up by web server access logs

Asked 2 years ago, Updated 2 years ago, 44 views

I was looking into why our site had become inaccessible and found that the disk was full, so cache files could no longer be created.

When I searched for large files, the culprit turned out to be the nginx access log. For now I have deleted some other files to free up space.

This made me realize for the first time that it would be bad to leave things as they are, but I don't know a good way to deal with it. I can't simply stop keeping logs, so what should I do in a case like this?

Also, a slightly off-topic question: do you do anything so that you notice when disk space is running out?

Environment

CentOS 6.5
nginx

centos nginx

2022-09-29 22:02

2 Answers

In my personal experience, a dedicated log server is set up just to accumulate logs.
Disk pressure on the web server is then relieved by periodically sending the logs to the log server, for example over FTP from a cron job. (I don't know how much log volume you have, but daily is usually fine.)

If you compress the logs with gzip before the FTP transfer, you also reduce the amount of data the server has to send.
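
As a concrete illustration, here is a minimal sketch of such a daily transfer job. The host name, account, and paths are placeholders rather than values from the question, it assumes logrotate has already produced rotated files ending in .log.1, and curl is used here to do the FTP upload.

#!/bin/sh
# /etc/cron.daily/ship-nginx-logs -- sketch only; host, credentials and
# paths are placeholders, not values from the question.
LOGDIR=/var/log/nginx
DEST=ftp://logserver.example.com/nginx/
STAMP=$(date +%Y%m%d)

for f in "$LOGDIR"/*.log.1; do
    [ -f "$f" ] || continue                   # skip if nothing has been rotated yet
    out=/tmp/$(basename "$f").$STAMP.gz
    gzip -c "$f" > "$out"                     # compress before transfer
    curl -T "$out" --user loguser:logpass "$DEST" && rm -f "$out"
done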

Error logs, access logs, and so on can be used to understand user trends, so I recommend setting a retention policy and operating them properly.
Also, if you have a separate DB server, it is better to consolidate the logs of all those servers on the log server, not just nginx's.

As for the recommended configuration, the log server basically only stores files, so a low-spec machine is fine. (Unless you want to do big-data analysis or data mining on it.)

For HDD capacity, estimate it from current operations: the service writes xx gigabytes of logs per day and you want to keep, say, a month's worth, so size the disk accordingly (for example, 2 GB per day kept for 30 days is roughly 60 GB, plus some headroom).

Bloated access logs are often caused by crawlers from search engines and by bot accesses doing simple URL probing.

The former are simply crawlers run by Google, Yahoo! and other search engines. Their user agents are distinctive, so one option is to suppress log output for those user agents rather than blocking them. (If you block the access itself, your site will no longer show up in searches.)
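
For example, here is a sketch of suppressing log output for well-known crawlers in nginx. The user-agent patterns are only examples, and the if= parameter of access_log requires nginx 1.7.0 or later.

# http{} context of nginx.conf -- a sketch; the bot patterns are only examples
map $http_user_agent $loggable {
    ~*(googlebot|bingbot|yahoo|slurp)  0;    # skip logging for known crawlers
    default                            1;
}

# then, in the relevant server{} block:
access_log  /var/log/nginx/access.log  combined  if=$loggable;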

The latter are either direct attacks by third parties or security-hole scans preceding an attack. These requests usually do not carry a distinctive user agent, and they typically end with a 404 or 403 response.

For capacity monitoring, it is a good idea to deploy server monitoring software such as Munin, Zabbix, or Cacti.

These can be configured in detail, for example to email the administrator when disk usage exceeds 90 percent.
Disk usage is just one aspect: they also cover CPU utilization, liveness monitoring of httpd and the various DB servers, and process monitoring.
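
If you want the 90-percent email alert before a full monitoring suite is in place, a minimal cron-based sketch along the same lines might look like this; the threshold, mount point, and mail address are examples, not values from the question.

#!/bin/sh
# /etc/cron.hourly/diskcheck -- sketch only; threshold, mount point and
# mail address are examples.
THRESHOLD=90
USAGE=$(df -P / | awk 'NR==2 { sub("%", "", $5); print $5 }')
if [ "$USAGE" -ge "$THRESHOLD" ]; then
    df -hP | mail -s "Disk usage on $(hostname) is ${USAGE}%" admin@example.com
fi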

For example, in my case I use Zabbix for day-to-day monitoring of the MySQL database servers and httpd servers: it tracks disk space, CPU utilization, and memory usage, and mails me a warning when a threshold is reached.

The official websites provide templates for various purposes, and related books can be found in bookstores.


2022-09-29 22:02

First, rotate the logs and manage them by generation. Most UNIX-like operating systems ship with this functionality as standard; on CentOS it is logrotate.

For example, the following /etc/logrotate.d/nginx rotates the logs weekly, compresses them, and keeps five generations.

/var/log/nginx/*.log {
    weekly
    # keep five rotated generations, compressed
    rotate 5
    compress
    missingok
    notifempty
    sharedscripts
    postrotate
        # signal nginx to reopen its log files after rotation
        kill -USR1 `cat /var/run/nginx.pid` || true
    endscript
}
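
Rather than waiting for the next scheduled run, you can check the configuration with logrotate's debug mode (it only prints what would happen) and then force a rotation once it looks right:

logrotate -d /etc/logrotate.d/nginx     # dry run, shows what would be done
logrotate -f /etc/logrotate.d/nginx     # force a rotation now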

As for disk-space monitoring, the first step is to have the server's status emailed to you daily. You should also check the authentication logs regularly. logwatch is the standard tool for this: run it regularly from cron and it automatically summarizes the logs and sends them by email. The report also tells you how much disk space is left.
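
As a concrete example (the mail address is a placeholder): on CentOS, installing logwatch typically also sets up a daily cron entry, and you can generate a one-off report by hand like this.

yum install logwatch
# mail yesterday's summary to the given (placeholder) address
logwatch --detail Low --range yesterday --mailto admin@example.com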


2022-09-29 22:02
