There are several locations where a document can state its encoding:

- the Content-Type HTTP header
- the (optional) XML declaration
- the Content-Type meta tag inside the document header
- for HTML5 documents, the charset meta tag

There are probably even more that I've forgotten. In the end, detecting the actual encoding is rather hard.
You really shouldn't do this yourself; use a high-level library for retrieving and parsing HTML content instead. I'm sure such libraries are available even for C++, even if they have to be lifted from a browser environment. :)
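To make the first location in that list concrete, here is a minimal sketch of pulling the charset parameter out of a Content-Type header value. The function name `charsetFromContentType` is made up for illustration, and the matching is deliberately naive (a plain substring search for `charset=`), so treat it as a starting point, not a full header parser.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns the lowercased charset parameter of a
// Content-Type header value, or an empty string if none is declared.
std::string charsetFromContentType(const std::string& headerValue) {
    // Parameter names and charset labels are case-insensitive, so work on a lowercase copy.
    std::string lower(headerValue.size(), '\0');
    std::transform(headerValue.begin(), headerValue.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    const std::string key = "charset=";
    std::size_t pos = lower.find(key);
    if (pos == std::string::npos) return "";
    pos += key.size();

    // The value may be quoted: charset="utf-8"
    if (pos < lower.size() && (lower[pos] == '"' || lower[pos] == '\'')) {
        const char quote = lower[pos];
        const std::size_t end = lower.find(quote, pos + 1);
        if (end == std::string::npos) return "";
        return lower.substr(pos + 1, end - pos - 1);
    }

    // Unquoted value: runs until the next separator or the end of the header.
    const std::size_t end = lower.find_first_of("; \t", pos);
    return lower.substr(pos, end == std::string::npos ? std::string::npos : end - pos);
}
```

For example, `charsetFromContentType("text/html; charset=UTF-8")` yields `utf-8`, while a header without a charset parameter yields an empty string, in which case the document itself has to be inspected.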
Thanks for that info, I didn't know it could appear in so many places. However, it's not very practical for me: I'm creating a native host that lets JS code perform cross-domain HTTP requests and eventually receive the response as plain text. I'd have to parse the entire HTML document and look for those tags just to convert it to readable text, which sounds a bit expensive (run-time wise).
However, I found a nice COM interface that might help: msdn.microsoft.com/en-us/library/aa741001(v=vs.85).aspx – Omer Jun 22 at 15:29
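On the run-time concern: the HTML specification requires the meta encoding declaration to sit within the first 1024 bytes of the document, so a scan for it does not need to parse the whole page. The following is a rough sketch of such a bounded scan; `sniffDeclaredCharset` and its `limit` parameter are hypothetical names, the tag matching is simplistic, and the XML declaration is not handled.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: looks for a declared charset in the first `limit` bytes
// of an HTML body. Returns a lowercased label, or "" if nothing was found.
std::string sniffDeclaredCharset(const std::string& body, std::size_t limit = 1024) {
    // A UTF-8 byte order mark settles the question immediately.
    if (body.compare(0, 3, "\xEF\xBB\xBF") == 0) return "utf-8";

    // Only look at the head of the document, lowercased for case-insensitive matching.
    const std::size_t n = std::min(limit, body.size());
    std::string head(n, '\0');
    std::transform(body.begin(),
                   body.begin() + static_cast<std::string::difference_type>(n),
                   head.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    std::size_t meta = head.find("<meta");
    while (meta != std::string::npos) {
        const std::size_t close = head.find('>', meta);
        if (close == std::string::npos) break;  // tag is cut off by the limit

        // Covers both <meta charset=...> and <meta http-equiv=... content="...; charset=...">.
        const std::string tag = head.substr(meta, close - meta);
        std::size_t pos = tag.find("charset=");
        if (pos != std::string::npos) {
            pos += 8;  // length of "charset="
            if (pos < tag.size() && (tag[pos] == '"' || tag[pos] == '\'')) ++pos;
            const std::size_t end = tag.find_first_of("\"' ;/\t", pos);
            const std::string value =
                tag.substr(pos, end == std::string::npos ? std::string::npos : end - pos);
            if (!value.empty()) return value;
        }
        meta = head.find("<meta", close);
    }
    return "";
}
```

An empty result only means that nothing was declared in the scanned prefix; at that point a statistical detector or a default encoding still has to take over.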
I used DetectInputCodepage from the IMultiLanguage2 interface and it worked great!