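We often use Apache (httpd) URL rewriting to serve PHP scripts under pseudo-static URLs, giving them a .html suffix or a short URL. Sometimes we also want to cache those .html pages (whose content is actually generated by the PHP program behind the rewrite rule), for example for 30 minutes.
While testing nginx caching in front of Apache's rewritten URLs, we found that nginx would not cache them: proxy_cache had no effect. The most direct test, as used at www.ctohome.com, is to watch the access log on the origin server (the one running the PHP program). No matter how often we requested a pseudo-static .html page, every request still reached the origin server, which means nothing was being cached.
For example, checking the response headers with curl -I http://www.ctohome.com/FuWuQi/6a/680.html gives: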
HTTP/1.1 200 OK
Server: nginx/1.11.4
Date: Mon, 03 Oct 2016 02:19:47 GMT
Content-Type: text/html; charset=utf-8
Connection: keep-alive
X-Powered-By: PHP/5.3.3
Cache-control: private
Set-Cookie: ECS_ID=abd942b0af8a00074455befa7e9672a765cdcd85; path=/
Set-Cookie: ECS[visit_times]=1; expires=Mon, 02-Oct-2017 18:19:47 GMT; path=/
Set-Cookie: ECS[history]=127; expires=Tue, 01-Nov-2016 18:19:47 GMT
Vary: Accept-Encoding
X-Cache: www.CTOhome.com-Nginx-Cached
X-Cache-Status: MISS
The X-Cache: www.CTOhome.com-Nginx-Cached header shows that our cache configuration file is in effect, but X-Cache-Status: MISS shows that the page was not served from the cache. What is going on?
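For reference, headers like the two above are typically produced by add_header lines in the cache configuration. The following is only a minimal sketch of such a setup, not the site's actual configuration: the zone name html_cache, the cache path /var/cache/nginx/html, the sizes, and the origin address 127.0.0.1:8080 are all assumptions made for illustration.

```nginx
# http context: where cached files are stored, plus a shared-memory zone for keys.
# Path, zone name, and sizes are illustrative assumptions.
proxy_cache_path /var/cache/nginx/html levels=1:2 keys_zone=html_cache:10m
                 max_size=1g inactive=60m;

server {
    listen 80;
    server_name www.ctohome.com;

    location ~ \.html$ {
        proxy_pass http://127.0.0.1:8080;   # hypothetical Apache/PHP origin
        proxy_cache html_cache;
        proxy_cache_valid 200 30m;          # keep successful responses for 30 minutes

        # Marker header showing this cache configuration was applied
        add_header X-Cache "www.CTOhome.com-Nginx-Cached";
        # HIT, MISS, EXPIRED, BYPASS, ... as decided by nginx for this request
        add_header X-Cache-Status $upstream_cache_status;
    }
}
```

With a setup like this, a page that is cached correctly should report MISS on the first request and HIT on the following ones within the validity period.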
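Cause:
After some investigation, it turned out that the PHP program sends Cache-Control and Set-Cookie response headers, and by default nginx does not cache responses that carry them.
Solution:
In the nginx cache configuration, ignore the Cache-Control and Set-Cookie headers: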
proxy_ignore_headers Cache-Control Set-Cookie;
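In context, the directive might sit in a location block like the sketch below. It assumes a cache zone named html_cache has already been declared with proxy_cache_path and that the origin listens on 127.0.0.1:8080; both are placeholders. The proxy_hide_header line is an extra assumption on my part: once Set-Cookie is ignored for caching purposes, the cookie of whichever visitor filled the cache would otherwise be replayed to everyone, so the sketch strips it from the response.

```nginx
location ~ \.html$ {
    proxy_pass http://127.0.0.1:8080;   # hypothetical origin
    proxy_cache html_cache;             # assumed zone name

    # Cache even though the origin sends "Cache-Control: private" and Set-Cookie
    proxy_ignore_headers Cache-Control Set-Cookie;

    # With Cache-Control ignored, an explicit lifetime is needed for anything to be stored
    proxy_cache_valid 200 30m;

    # Assumption: these pages need no per-visitor cookie, so do not
    # hand one visitor's cached Set-Cookie to all the others
    proxy_hide_header Set-Cookie;
}
```

If the pages do depend on per-visitor cookies, they are probably not good candidates for a shared cache in the first place.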
Official explanation:
By default, NGINX respects the Cache‑Control headers from origin servers. It does not cache responses with Cache‑Control set to Private, No‑Cache, or No‑Store or with Set‑Cookie in the response header. NGINX only caches GET and HEAD client requests. You can override these defaults as described in the answers below.
NGINX does not cache responses if proxy_buffering is set to off. It is on by default.
Can Cache-Control headers be ignored? Yes, with the proxy_ignore_headers directive. For example, with this configuration:
location /images/ {
    proxy_cache my_cache;
    proxy_ignore_headers Cache-Control;
    proxy_cache_valid any 30m;
    ...
}
NGINX ignores the Cache-Control header for everything under /images/. The proxy_cache_valid directive enforces an expiration for the cached data and is required if ignoring Cache-Control headers. NGINX does not cache files that have no expiration.
Can NGINX cache responses that carry a Set-Cookie header? Yes, with the proxy_ignore_headers directive, as discussed in the previous answer.
Can NGINX cache POST requests? Yes, with the proxy_cache_methods directive:
proxy_cache_methods GET HEAD POST;
This example enables caching of POST requests.
Can NGINX cache dynamic content? Yes, provided the Cache-Control header allows for it. Caching dynamic content for even a short period of time can reduce load on origin servers and databases, which improves time to first byte, as the page does not have to be regenerated for each request.
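In further testing, CTOHOME found that when different browsers request the same file, even within the cache validity period, they are not all served the cached copy: each browser triggers its own request to the origin. That is a nasty trap, because it effectively means the site's content has to be cached once per visitor browser, and the cache loses most of its value.
Why? Most likely, different browsers (and different versions of the same browser) send different request headers to nginx, so nginx treats them as different requests.
What to do? In principle, we just need to ignore the differences between browsers.
Here is the fix: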
proxy_ignore_headers Cache-Control Set-Cookie Vary;
Due to support for the Vary header, NGINX can cache multiple versions of the same content (e.g. different encodings) and serve the appropriate version to each client, e.g. gzipped and uncompressed.
Based on the Vary: Accept-Encoding header that's there, I would guess that Edge and Opera send different "Accept-Encoding" headers for the request. For example, one may simply send "gzip" while the other sends "gzip, deflate". Those are technically different Accept-Encoding request headers.
If you know that the origin won't send meaningfully different encodings that won't work between browsers you can add:
proxy_ignore_headers Vary;
You already have the proxy_ignore_headers, so you can probably just add to that.
Since all major browsers support gzip, the risk is likely very low. However, WebP is negotiated through the Accept request header rather than Accept-Encoding, so ignoring Vary could create surprising results for some images if the origin can serve WebP and varies its response on Accept.
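One way to lower the encoding risk described above is to make the cached copy encoding-neutral. The sketch below is an assumption of mine, not something taken from the quoted answers: it asks the origin for an uncompressed response, stores a single copy regardless of Vary, and lets nginx compress for each client that accepts gzip. The zone name html_cache and the origin address are placeholders.

```nginx
location ~ \.html$ {
    proxy_pass http://127.0.0.1:8080;   # hypothetical origin
    proxy_cache html_cache;             # assumed zone, declared via proxy_cache_path
    proxy_cache_valid 200 30m;

    # Store one copy per URL instead of one per Accept-Encoding variant
    proxy_ignore_headers Cache-Control Set-Cookie Vary;

    # Ask the origin for an uncompressed body so the stored copy suits every client
    proxy_set_header Accept-Encoding "";

    # Compress on the way out, only for clients that advertise gzip support
    gzip on;
}
```

The trade-off is that nginx spends some CPU compressing on every response instead of caching a compressed variant.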
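CTOHOME's ultimate speed-up: let the browser cache pages it has already visited, so that it does not even send a 304 revalidation request. In other words, when you type the URL and press Enter, or click a link you have already visited, the browser sends nothing to the server, skips even the 304 check, and reads the cached file straight from your computer. In the HTTP status view of Firefox and similar debugging add-ons, such a request is shown as status 200, cached. To achieve this, first adjust the response headers:
Cache-control: private => change to Cache-Control: max-age=1800
Set-Cookie: => remove it. That is, pages that should be cached must not call setcookie() in the PHP program; if a cookie really has to be set, do it with JavaScript
Remove the ETag header and add X-Accel-Expires: 1800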
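The header changes above are made in the PHP program. If the origin cannot be modified, a roughly equivalent result can probably be reached on the nginx side. The sketch below is such an alternative under stated assumptions, not the method described above; the origin address is a placeholder.

```nginx
location ~ \.html$ {
    proxy_pass http://127.0.0.1:8080;   # hypothetical origin

    # Keep the origin's "Cache-Control: private", cookies, and ETag
    # from reaching the browser
    proxy_hide_header Cache-Control;
    proxy_hide_header Set-Cookie;
    proxy_hide_header ETag;

    # Adds "Expires" and "Cache-Control: max-age=1800" to the response,
    # so the browser may reuse its copy for 30 minutes without asking
    expires 30m;
}
```

X-Accel-Expires, by contrast, is a header the origin sends to nginx to set the proxy cache lifetime; nginx consumes it and does not pass it on to the browser.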
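Now for the debug test. First request page A while watching the origin server's log: there is only one access record, which is correct because nginx has cached the page. Then watch the log of the caching nginx and visit the page again (type the URL and press Enter, or click the already visited link): no new access record appears, not even a 304. The goal is achieved!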
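To make this kind of log watching easier on the caching nginx, the cache decision can be written into the access log. A minimal sketch, assuming a log format named cache_status and a log file /var/log/nginx/cache_status.log (both names are made up for illustration):

```nginx
# http context: a compact format that records the cache decision per request
log_format cache_status '$remote_addr [$time_local] "$request" '
                        '$status $upstream_cache_status';

server {
    # ... existing listen, server_name, and location blocks ...
    access_log /var/log/nginx/cache_status.log cache_status;
}
```

Requests answered from the nginx cache are logged with HIT. Requests answered from the browser's own cache never reach nginx, so they leave no line at all, which matches the observation above.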
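If the headers are not set correctly, every time you type or paste the URL and press Enter, or click an already visited link (not F5 or the refresh button), the status will be 200 or 304. If it is 200 without caching every time, each visit downloads fresh content from the server, which is a poor setup. CTOHOME considers 200 served from cache the ideal setting; it looks like Status Code 200 OK (from cache) in the Chrome developer tools, as in the screenshot below: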
status-code-200-from-cache.png
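CTOHOME also observed that Baidu Yunjiasu (Baidu Cloud Acceleration) and Alibaba Cloud CDN both use the 200-from-cache configuration, so what are you waiting for?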