This solution is far from perfect and could be improved, automated, or extended in many ways. It took me only 10 minutes to implement, though, and cut my web traffic by roughly 80%.

I simply use the ngx_http_map_module, which allows me to have a variable depend on the value of another variable.

map $http_user_agent $blocked_user_agent {
  default 0;
  ~*amazonbot 1;
  ~*openai 1; 
  ~*chatgpt 1; 
  ~*gptbot 1; 
  ~*claudebot 1; 
}

The nginx expression ~*term is a case-insensitive regular-expression match, so it matches the string anywhere in the user agent. This map goes somewhere in the http section of my /etc/nginx/nginx.conf.
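
For orientation, here is a minimal sketch of where the map sits in the config (the include line is just what a typical Debian-style setup looks like; adjust to your own layout):

http {
    ....

    # Turn the User-Agent header into a simple 0/1 flag.
    map $http_user_agent $blocked_user_agent {
        default 0;
        ~*amazonbot 1;
        # ... the other bot patterns from above ...
    }

    include /etc/nginx/sites-enabled/*;
}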

I simply checked my nginx access logs to find the most common crawlers by user agent.

To then apply and test the filter, I run a simple if statement before serving any files or proxying in the server block. The error code I picked is 450, but really any code can be used.

For example:

server {
    ....
    location / {
        if ($blocked_user_agent) {
            return 450; # Blocked by Windows Parental Controls
        }
        try_files ...;
    }
}
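
The same check works in front of a proxied application; here is a minimal sketch, assuming a backend listening on 127.0.0.1:8080 (the address and path are just placeholders):

server {
    ....
    location /app/ {
        if ($blocked_user_agent) {
            return 450; # same flag from the map above
        }
        proxy_pass http://127.0.0.1:8080; # hypothetical backend
    }
}

To check that it works, a request with a spoofed user agent, e.g. curl -A "GPTBot" against your site, should come back with a 450.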