FAQ:Languages and Character Sets

From PGVWiki

Questions about Languages and Character Sets

Why are some of the characters in my GEDCOM not displaying correctly?

PhpGedView defaults to the UTF-8 character set. If characters outside the basic ASCII range (e.g. umlauts, curly quotes, em dashes, Hebrew) are not displayed correctly, re-export your GEDCOM from your genealogy application and make sure you select the UTF-8 option when you export. Then upload the new GEDCOM and re-import it.

If your genealogy software does not support exporting in UTF-8, choose the ANSI option and then convert the GEDCOM to UTF-8 using another program. Instructions are given under "How can I convert my GEDCOM to UTF-8?" below.
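Before re-exporting, it can help to check whether the file you already have is valid UTF-8. The following is a minimal Python sketch, not part of PhpGedView; the function name `looks_like_utf8` is made up for illustration, and the example assumes that "ANSI" means the Windows-1252 code page (`cp1252`).

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# "M\u00fcller" is the surname "Muller" with a u-umlaut as its second letter.
name = "M\u00fcller"
print(looks_like_utf8(name.encode("utf-8")))   # True
print(looks_like_utf8(name.encode("cp1252")))  # False: a lone 0xFC byte is not valid UTF-8
```

A file that contains only plain ASCII passes this check as well, because ASCII is a subset of UTF-8. To test a real file, read it in binary mode and pass the bytes to the function.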

Does PhpGedView support Unicode (UTF-16)?

No. Unfortunately, this is a limitation of PHP that PhpGedView inherits: PHP currently does not support wide character sets (2-byte, or 16-bit, encodings such as UTF-16). However, PhpGedView does support UTF-8, which is the standard character encoding for the Internet. UTF-8 is a multibyte encoding of Unicode, so it lets you display many languages on the same page. Extensive development work has gone into supporting UTF-8 throughout PhpGedView.
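To make the difference between the two encodings concrete, here is a small Python sketch (illustrative only; `encoded_sizes` is a made-up helper name) that counts how many bytes a few characters occupy in each encoding.

```python
from typing import Tuple


def encoded_sizes(ch: str) -> Tuple[int, int]:
    """Return (bytes in UTF-8, bytes in UTF-16) for a single character."""
    return len(ch.encode("utf-8")), len(ch.encode("utf-16-le"))


samples = {
    "Latin capital A": "A",
    "u-umlaut": "\u00fc",
    "Hebrew alef": "\u05d0",
    "em dash": "\u2014",
}
for label, ch in samples.items():
    utf8_len, utf16_len = encoded_sizes(ch)
    print(f"{label}: {utf8_len} byte(s) in UTF-8, {utf16_len} in UTF-16")
```

Plain ASCII characters stay one byte each in UTF-8, which is why an ASCII-only GEDCOM is already valid UTF-8, while every one of these characters takes two bytes in UTF-16.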

What languages does PhpGedView support?

See the About page for more information about language support in PhpGedView.

Note: PhpGedView relies on the support of the PhpGedView user community to supply languages. If you would like to make contributions to the languages, then follow the instructions in the LANGUAGES section of the readme.txt file.

How can I convert my GEDCOM to UTF-8?

Your GEDCOM file should be encoded in the UTF-8 character set, especially if you use special characters. Most of the current commercial packages allow you to specify the character set when you export your GEDCOM. If UTF-8 is not one of the supported options, then you should export your GEDCOM first using the Unicode or Windows character set.

A common encoding option for GEDCOMs is ANSI. PhpGedView will attempt to convert ANSI-encoded GEDCOM files to UTF-8 during the GEDCOM import, but ANSI is a single-byte encoding limited to 256 characters (standard ASCII covers only the first 128), so characters outside that set are usually corrupted when the file is exported. If you have special characters in your GEDCOM file, you will probably want to export to Unicode or a Windows character set and then use another program to convert the file to UTF-8.
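The limitation can be illustrated with a short Python sketch. It assumes that "ANSI" means the Windows-1252 code page (`cp1252`), which is common on Western European and US systems but not universal; `fits_in_ansi` is a hypothetical helper name.

```python
def fits_in_ansi(text: str, codepage: str = "cp1252") -> bool:
    """Return True if every character in text exists in the given code page."""
    try:
        text.encode(codepage)
    except UnicodeEncodeError:
        return False
    return True


print(fits_in_ansi("M\u00fcller"))   # True: u-umlaut is part of cp1252
print(fits_in_ansi("\u05d0\u05d1"))  # False: Hebrew letters are not

# Forcing the export anyway replaces each unsupported character with "?".
print("\u05d0\u05d1".encode("cp1252", errors="replace"))  # b'??'
```

Once characters have been replaced like this in an exported file, no later conversion to UTF-8 can recover them.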

To convert a GEDCOM encoded in another character set to UTF-8, open your GEDCOM file using Windows Notepad. Near the top of the file you should see the line

1 CHAR ANSI

Change this line to read

1 CHAR UTF-8


This will alert PGV and other genealogy programs that the GEDCOM is encoded in the UTF-8 character set.

Then select "File" -> "Save As..." from the menu options. In the "Save As" dialog box select "UTF-8" from the Encoding drop-down list.

Other text editing programs that support the UTF-8 character set, and that you might want to try, are MS Word, WordPerfect, and OpenOffice.

PAF supports the UTF-8 character set at export, so you can also try importing your GEDCOM into PAF and then exporting it back out again.
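If you are comfortable running a script, the same conversion can be sketched in a few lines of Python. This is an illustrative sketch, not a PhpGedView tool: the function name `convert_gedcom_to_utf8` is made up, the source encoding is assumed to be Windows-1252 (`cp1252`), and the header's character-set line is assumed to have the form `1 CHAR ANSI`, as in standard GEDCOM files.

```python
import re


def convert_gedcom_to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode GEDCOM bytes from source_encoding and re-encode them as UTF-8.

    The first "1 CHAR ..." line is rewritten to "1 CHAR UTF-8" so that the
    header matches the new encoding. Line endings are left untouched.
    """
    text = data.decode(source_encoding)
    # [^\r\n]* stops before the line ending, so "\r\n" or "\n" is preserved.
    text = re.sub(r"(?m)^1 CHAR [^\r\n]*", "1 CHAR UTF-8", text, count=1)
    return text.encode("utf-8")


# A tiny in-memory GEDCOM in cp1252; 0xFC is the u-umlaut in that code page.
sample = (
    b"0 HEAD\r\n"
    b"1 CHAR ANSI\r\n"
    b"0 @I1@ INDI\r\n"
    b"1 NAME J\xfcrgen /M\xfcller/\r\n"
    b"0 TRLR\r\n"
)
converted = convert_gedcom_to_utf8(sample)
print(converted.decode("utf-8"))
```

To convert a real file, read it in binary mode, pass the bytes to the function, and write the result to a new file so that the original export is kept as a backup. Pass a different `source_encoding` if your export used another code page.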

Why does US English keep coming back, overriding my personal choices?

The language shown to non-registered visitors is also English, overriding the default language setting. (Please don't tell me I have to change my computer's settings: it took me some time to tune its International Settings to fit my needs on Windows XP.)

It happens for registered users too (even admin), and only on Firefox 3. It seems to work perfectly with Internet Explorer or Firefox 2. I think it's related to the way Firefox handles the language setting. --Cyan 06:21, 8 August 2008 (EST)