Posted: Sun Jun 08, 2008 5:00 pm
Exactly. The problem here is that the original HTML contains characters that don't exist in the plain 7-bit ASCII set. Things like curved quotes (“) or the Euro sign (€), for example.
When saving to text, the browser is probably saving either to strict 7-bit ASCII (or maybe the 8-bit ISO-8859-1, also known as Latin1), or to the encoding specified by your locale settings. The problem is that whichever encoding it's using doesn't seem to include some of the original characters. Neither ASCII nor ISO-8859-1 has curved quotes or the Euro sign, for instance.
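To make that concrete, here is a small Python sketch that checks whether a given encoding can represent a piece of text. The helper name `can_encode` and the sample string are made up for illustration, and this says nothing about what the browser actually does internally.

```python
def can_encode(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False

# Sample with curved quotes (U+201C, U+201D) and the Euro sign (U+20AC).
sample = "\u201cprice: 5\u20ac\u201d"

results = {enc: can_encode(sample, enc) for enc in ("ascii", "iso-8859-1", "utf-8")}
print(results)
```

Only UTF-8 should come out as True here, which matches the symptoms described above.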
One solution would be to normalize the characters, so that fancy stuff like curved quotes is transformed into more standard characters like the straight double quote ("). This, however, may not be easy to accomplish from the browser.
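If you can get at the text outside the browser, the normalization could be done with a small script. This is only a rough sketch in Python: the replacement table is my own pick (including writing the Euro sign as "EUR"), and `to_plain_ascii` is a hypothetical helper name, so adjust both to taste.

```python
# Hypothetical replacement table: extend it with whatever characters your pages use.
REPLACEMENTS = {
    "\u201c": '"',    # left curved double quote
    "\u201d": '"',    # right curved double quote
    "\u2018": "'",    # left curved single quote
    "\u2019": "'",    # right curved single quote
    "\u20ac": "EUR",  # Euro sign (this spelling is an assumption, not a standard)
}
TABLE = str.maketrans(REPLACEMENTS)

def to_plain_ascii(text):
    """Swap known fancy characters for ASCII ones; anything else non-ASCII becomes '?'."""
    swapped = text.translate(TABLE)
    return swapped.encode("ascii", errors="replace").decode("ascii")

print(to_plain_ascii("\u201cHi\u201d costs 5\u20ac"))
```

Characters that aren't in the table are not silently dropped, they show up as `?`, so you can spot what still needs a mapping.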
As Elderan pointed out, saving as UTF-8 would be another solution - as UTF-8 can by definition encode all Unicode characters. This would mean, however, that you'd need a text editor that understands UTF-8 to read the fancy characters in the text file, but that shouldn't be a problem with virtually any modern text editor.
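As a quick illustration of the UTF-8 route, the Python sketch below writes a string to a temporary file as UTF-8 and reads it back. The function name `round_trip_utf8` is hypothetical, and it only shows that nothing is lost on the way, not how to make Firefox do the saving.

```python
import os
import tempfile

def round_trip_utf8(text):
    """Write text to a temporary UTF-8 file, read it back, return (text_read, bytes_on_disk)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    os.close(fd)
    try:
        # newline="" keeps line endings untouched so the byte count is predictable.
        with open(path, "w", encoding="utf-8", newline="") as f:
            f.write(text)
        size = os.path.getsize(path)
        with open(path, "r", encoding="utf-8", newline="") as f:
            return f.read(), size
    finally:
        os.remove(path)

sample = "\u201cprice: 5\u20ac\u201d"
restored, size = round_trip_utf8(sample)
print(restored == sample, len(sample), size)
```

The file ends up larger than the character count because each curved quote and the Euro sign take three bytes in UTF-8, while the plain ASCII characters still take one byte each.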
Unfortunately I don't really know if there's a way to choose the encoding used when you save text in Firefox, or which encoding it uses to save in the first place. Perhaps asking in the Mozilla forums might help.