Support non-escaped referrer in log
Unlike Apache, which escapes non-ASCII characters in the referrer, Caddy writes the referrer as is. Edge seems to send the referrer unescaped, so with Edge and Caddy the log can contain non-ASCII text in the referrer.

For lines that cannot be decoded as ASCII, we use Python's `replace` error handler. In this case the line can still be processed, provided the only decoding problem is the encoding of the referrer.

We don't handle this case as "skip and report ill-formed line", because Python does not provide utilities to do this easily.

Reproduction with Caddy:

```
curl -k http://localhost -H 'Referer: héhé'
```

With Apache, `LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" common`

```
127.0.0.1 - - [28/Feb/2019:10:03:33 +0100] "GET / HTTP/1.1" 200 2046 "h\xc3\xa9h\xc3\xa9" "curl/7.50.1" 4
```

With Caddy, `log / stdout "{remote} {>REMOTE_USER} [{when}] \"{method} {uri} {proto}\" {status} {size} \"{>Referer}\" \"{>User-Agent}\" {latency_ms}"`

```
127.0.0.1 - [28/Feb/2019:10:05:00 +0100] "GET / HTTP/2.0" 200 1950 "héhé" "curl/7.50.1" 4
```
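As an illustration of the `replace` error handler described above, here is a minimal sketch. It is not the code from this change: the helper name `decode_log_line` is hypothetical, and the sample line is the Caddy output from the reproduction, assumed to be written to the log as UTF-8 bytes.

```python
# Sample Caddy log line as raw bytes; the referrer "héhé" is assumed
# to be UTF-8 encoded, so each "é" is the two bytes 0xC3 0xA9.
raw = (
    b'127.0.0.1 - [28/Feb/2019:10:05:00 +0100] "GET / HTTP/2.0" '
    b'200 1950 "h\xc3\xa9h\xc3\xa9" "curl/7.50.1" 4\n'
)


def decode_log_line(raw_line: bytes) -> str:
    """Hypothetical helper: decode as ASCII, substituting U+FFFD for
    every byte that is not valid ASCII instead of raising."""
    return raw_line.decode("ascii", errors="replace")


# Strict decoding fails on the first non-ASCII byte of the referrer.
try:
    raw.decode("ascii")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# With errors="replace", the line is still usable: only the referrer
# field is altered, the other fields are unchanged.
line = decode_log_line(raw)
print(strict_failed)
print(line)
```

The trade-off is that the original referrer text is lost (each offending byte becomes a replacement character), but the rest of the line, such as the status code and user agent, can still be parsed.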