Fixing Apache and UTF8
A while back, I started having problems with the output of Venus, a
planet-like aggregator I use to read a bunch of things. The symptoms
were broken characters for things like apostrophes, quotes and so on
– which rendered the output nearly unusable. I dug into it,
but couldn’t resolve the problem…so I resorted to a bletcherous hack
(cron job to copy the file to my laptop, and view it with file:///...)
and blamed Python 2.
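For reference, here is a minimal sketch of how apostrophes typically end up broken like this. It assumes (this is a guess, not something confirmed in this post) that the file is stored as UTF-8 but gets decoded as Windows-1252 because the client does not see a usable charset declaration. The helper name misread_as_cp1252 is made up for illustration.

```python
def misread_as_cp1252(text: str) -> str:
    """Encode text as UTF-8, then decode those bytes as Windows-1252.

    errors="replace" is used because a few byte values (0x9D, for example)
    have no Windows-1252 mapping and would otherwise raise an error.
    """
    return text.encode("utf-8").decode("cp1252", errors="replace")


if __name__ == "__main__":
    # U+2019 is the curly apostrophe; its UTF-8 bytes are E2 80 99.
    original = "it\u2019s"
    print(original, "->", misread_as_cp1252(original))
```

Plain ASCII text survives this round trip unchanged, which would be consistent with only the apostrophes and quotes looking wrong.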
Today I came across the same problem, this time in a different set of files, and I finally managed to find the answer:
AddCharset UTF-8 .htm .html .js .css
To be clear, I had already:
- made sure that the headers for the file included
Content-Type: text/html;charset=utf-8
- made sure the HTML file had
<meta charset="utf-8">
Weirdly enough, changing that meta tag to:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" >
worked…the apostrophes and such were displayed correctly. But they never showed up in the output when I ran a curl on the URL. Does Apache filter this stuff on the fly?
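The two checks described above (the charset in the Content-Type header, and the charset in the meta tag) can be scripted rather than eyeballed in curl output. This is a rough sketch using only the Python standard library. The function names are hypothetical, and the regular expression is a deliberately simple approximation, not a full HTML parser.

```python
import re
import urllib.request
from email.message import Message
from typing import Optional


def charset_from_content_type(value: str) -> Optional[str]:
    """Return the lower-cased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = value
    return msg.get_content_charset()


def charset_from_meta(html: str) -> Optional[str]:
    """Return the charset named in the first <meta ... charset=...> tag, or None.

    Handles both <meta charset="..."> and the http-equiv form.
    """
    match = re.search(r'<meta[^>]+charset=["\']?([\w-]+)', html, re.IGNORECASE)
    return match.group(1).lower() if match else None


def served_charset(url: str) -> Optional[str]:
    """Fetch url and return the charset the server declares in its headers."""
    with urllib.request.urlopen(url) as response:
        return response.headers.get_content_charset()


if __name__ == "__main__":
    print(charset_from_content_type("text/html;charset=utf-8"))
    print(charset_from_meta('<meta charset="utf-8">'))
```

Calling served_charset() on the URL of an affected page shows what the server actually sends; a result of None means the response carries no charset at all.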
Anyhow…that’s enough encoding debugging for one day. Or possibly a year.