RootsWeb.com Mailing Lists
    1. [STATE-COORD-L] Language codes
    2. Robert Sullivan
    3. Yesterday's posting about character entities made me think a few of you might be interested in another subject which can affect the use of your pages: how you set the language and encoding for the page. The rest of you may skip this. :-)

It's considered good practice to specify the primary language for your page. For English, this can be as simple as putting <html lang="en"> at the beginning of all your pages.

I learned about this when I was trying to figure out why some non-English pages wouldn't display properly on my computers. When I'm wearing my computer-installer hat, I put all of Internet Explorer's language options on; we have a lot of residents, students and visitors who like to view pages in their own languages. Some pages - Korean, for instance - would work and some would give me gibberish. Eventually, I realized the ones which displayed the correct alphabet had the proper "kr" encoding.

You can also use meta tags to specify a character set. Many people use windows-1252; I'm not sure how this works for non-Windows users. The iso-8859-1 character set is also common.

I wanted to make my pages as internationally compatible as possible, and after some consultation with a fellow at an Australian university who has done some amazing work with aboriginal poetry (you have to appreciate the technical achievement even if it's not your field), I modified all my XHTML pages to begin like this:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en" dir="ltr">
    <head>
    <title>meaningful title here</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <meta http-equiv="Content-Language" content="en" />

Using both the meta tags and the attributes in the html tag gives you backward compatibility with older browsers.
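
For anyone curious why a wrong or missing charset label produces gibberish, here is a minimal Python sketch. It is only an illustration: the helper name show_as and the sample strings are made up for this example, and decoding bytes in Python is a simplified stand-in for what a browser does when it trusts a charset label. It shows the same bytes read correctly under the matching label and garbled under a mismatched one, and it shows one place where windows-1252 and iso-8859-1 differ.

```python
def show_as(raw: bytes, charset: str) -> str:
    """Decode raw bytes the way a reader would if told `charset`.

    Hypothetical helper for illustration; undecodable bytes become U+FFFD.
    """
    return raw.decode(charset, errors="replace")


# Bytes a server might send for a UTF-8 page containing one accented word.
utf8_page = "café".encode("utf-8")

correct = show_as(utf8_page, "utf-8")             # label matches the bytes
mislabeled = show_as(utf8_page, "windows-1252")   # label does not match

# windows-1252 and iso-8859-1 agree on most bytes, but not on 0x80-0x9F.
euro_byte = "€".encode("windows-1252")            # a single byte, 0x80
euro_cp1252 = show_as(euro_byte, "windows-1252")  # the euro sign
euro_latin1 = show_as(euro_byte, "iso-8859-1")    # an invisible control character

if __name__ == "__main__":
    print(correct)
    print(mislabeled)
    print(repr(euro_cp1252), repr(euro_latin1))
```

The accented letter takes two bytes in UTF-8, so a reader told the page is windows-1252 shows two unrelated characters in its place. That is the same kind of mismatch that turns a whole page of Korean into gibberish.
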
On a scale smaller than a page, ideally you will identify words or phrases which use a language other than the primary one. This is an accessibility issue: in a perfect world, a user listening to your page read aloud would hear those passages read in the correct accent. In the real world, it's not clear how many screen readers are up to this yet, but it's much easier to add this as you create your pages than it is to retrofit them later.

For example, I'm working on a local history with extensive quotations from Dutch documents. I could simply code those paragraphs as <p lang="nl"> or use <span lang="nl"> for phrases. If you're doing a table with side-by-side translations, you'll have to tag the cells on one side. Proper names don't need to be tagged.

The language codes are sometimes, but not always, the same as the country codes. A reference to the 2- and 3-letter codes may be found at:
<http://www.loc.gov/standards/iso639-2/englangn.html>
and it covers anything you are likely to need. If there is ever software which can properly pronounce Latin, Iroquois and Mohawk, my pages will be ready for it. :-)

Hope this helps someone,
Bob Sullivan
NY SC / Schenectady County CC

    05/20/2003 12:50:19