DVDPedia - extended characters in html exports

bdge
Bruji Friend
Posts: 14
Joined: Tue Sep 13, 2011 7:23 am
Location: London, England

DVDPedia - extended characters in html exports

Post by bdge »

Hi. I'm on the latest DVDpedia and using the excellent "MyFancyIndex" template to export a bunch of films. The problem is that certain characters, such as the o-acute in Almodovar (Almodóvar) and the o-slash in Soren (Søren), are appearing as Chinese (I think) characters.

If you go to http://oddpupil.org/ilatest you will see what I mean -- check out the rows for "Talk To Her", "Innan frosten (Wallander)", "Forbrydelsen II", "The Killing", "Hesher", "The Story of the Weeping Camel". They all have at least one of Director, Writer or Studio in which there's a diacritic that's mis-rendered.

If you look at the source of the IMDb pages where the information originally comes from, both they and my export page are declared as UTF-8. However, in my page the characters are not represented as HTML entities (e.g. "&oslash;"); each one is a single literal character. So I suppose something within DVDpedia's HTML export engine, or the MyFancyIndex template, has converted the entity to a single-character representation using the wrong code page?

Any ideas how to work around it?

Many thanks in advance.
Nora
Site Admin
Posts: 2155
Joined: Sun Jul 04, 2004 5:03 am

Re: DVDPedia - extended characters in html exports

Post by Nora »

Try retyping those characters in DVDpedia and then export the collection again. I've just tried to reproduce the problem here using "Forbrydelsen" as an example (love that show! :)), but when downloading the data from IMDb the writer doesn't actually come along for this particular entry. So I'm guessing you either did an advanced search on another site or added "Søren Sveistrup" manually.

If retyping the character in DVDpedia doesn't fix the problem, could you please send me a .dcard file for that entry so I can check it out here? To create a dcard, select the entry in DVDpedia's table view and drag it out to your Desktop. The .dcard file will automatically be created and you simply have to attach it to the email. You'll find our email on the support page.

Re: DVDPedia - extended characters in html exports

Post by bdge »

Hi again Nora, I have tried to copy and paste that character into the Writers field from a variety of places (here, random web pages, a text editor) and it does a weird thing. It treats the paste as if I've pressed comma, then inserts the extended character on its own as if I've pressed comma again, and leaves the cursor ready to start creating another tag.

To clarify, the screenshot below shows what happened when I did the following:

1. Click just to the right of the 'Per Daumiller' tag
2. Type 'Test'
3. Press Cmd-V to paste in the o-slash (ø)

I was expecting of course to see 'Testø'.

Whether this is related to the original problem I have no idea. This has distracted me from it a bit! I'll keep investigating.


Image

Re: DVDPedia - extended characters in html exports

Post by bdge »

More info -- no matter what I do, there is still some kind of mistranslation going on. Even after pasting the correct character into the name and re-exporting as HTML, it still comes out wrong, e.g. "Pedro Almodóvar" comes out as "Pedro Almod贸var".

Screenshot of the above text here, just so we can be sure we're talking about the same characters! --


Image
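For anyone trying to pin down which "wrong code page" is involved: the sample above is consistent with the UTF-8 bytes of the page being decoded as a Chinese GB-family encoding. The sketch below is only an illustration in Python, not anything DVDpedia does internally; the helper names and the choice of the `gbk` codec are assumptions made for the demonstration.

```python
def misdecode(text: str, wrong_codec: str = "gbk") -> str:
    """Encode text as UTF-8, then decode those bytes with the wrong codec."""
    return text.encode("utf-8").decode(wrong_codec)


def repair(garbled: str, wrong_codec: str = "gbk") -> str:
    """Reverse the mistake: recover the raw bytes, then decode them as UTF-8."""
    return garbled.encode(wrong_codec).decode("utf-8")


if __name__ == "__main__":
    garbled = misdecode("Pedro Almodóvar")
    print(garbled)          # Pedro Almod贸var
    print(repair(garbled))  # Pedro Almodóvar
```

If that is what is happening, the exported file itself may be fine and the fault may lie in how the browser or server decides the page's encoding.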

Re: DVDPedia - extended characters in html exports

Post by Nora »

Can you please send me your Database.dvdpd file as well as your autofill.xml file so I can try to reproduce the problem here with your exact settings and data. You'll find the files in your Home folder under ~/Library/Application Support/DVDpedia/
Please archive the files before sending to save space. (Select them and choose 'Archive' or 'Compress' from the File menu. That'll create a new file called 'Archive.zip' which you can then attach to the email.)

If you're on Lion, the Library folder is hidden by default, so to get to your DVDpedia data folder you have to use the Finder's 'Go' menu, select 'Go to Folder' and copy and paste ~/Library/Application Support/DVDpedia into the window that appears.
Conor
Top Dog
Posts: 5346
Joined: Sat Jul 03, 2004 12:58 pm

Re: DVDPedia - extended characters in html exports

Post by Conor »

1. The paste into a token field is an annoyance we have been trying to change, but we have not been able to get around Apple's behavior for that field. I have a working solution in the current beta version of DVDpedia, but I am not completely happy with it: I would like to create the tokens only when nothing is being edited, and so far the token field will not let me know whether an edit is going on.

2. For the export, make sure the template you are using declares UTF-8 as the character set in the meta section at the top of the file. Check this by opening the template in a text editor rather than viewing it as rendered HTML. The declaration should be:

<meta http-equiv="content-type" content="text/html; charset=utf-8" />

I am planning to update all the templates soon, as there should be no modern browser that is not UTF-8 compatible in this day and age.
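If opening the template by hand is awkward, the same check can be scripted. This is a rough Python sketch, not part of DVDpedia; the function name `declared_charset` is hypothetical, and the regular expression only looks for a `charset=` value inside a meta tag rather than parsing the HTML properly.

```python
import re
from typing import Optional

# Matches both <meta charset="utf-8"> and the http-equiv form shown above.
META_CHARSET = re.compile(r"<meta[^>]*charset\s*=\s*[\"']?([\w-]+)", re.IGNORECASE)


def declared_charset(html_text: str) -> Optional[str]:
    """Return the charset named in the first matching meta tag, or None."""
    match = META_CHARSET.search(html_text)
    return match.group(1).lower() if match else None


if __name__ == "__main__":
    sample = '<meta http-equiv="content-type" content="text/html; charset=utf-8" />'
    print(declared_charset(sample))  # utf-8
```

A result of `None`, or anything other than `utf-8`, would suggest the template is the place to look first.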

If that does not solve the export issue, do please send us your database as Nora mentioned as well as letting us know what template you are using for your export.
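As a possible stopgap in the meantime, the exported HTML could be post-processed so that every non-ASCII character becomes a numeric character reference, which displays the same whatever encoding the browser assumes. The following is a minimal Python sketch under that assumption; `to_numeric_refs` is a hypothetical helper, not a DVDpedia feature, and it would need to be run on the exported file after each export.

```python
def to_numeric_refs(html_text: str) -> str:
    """Replace each non-ASCII character with a reference such as &#248;."""
    return html_text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    print(to_numeric_refs("Søren Sveistrup"))  # S&#248;ren Sveistrup
    print(to_numeric_refs("Pedro Almodóvar"))  # Pedro Almod&#243;var
```

This only hides the symptom; the charset declaration is still worth fixing.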

Re: DVDPedia - extended characters in html exports

Post by Conor »

I forgot to mention the most important thing: Forbrydelsen is a must-have in any crime collection. Eagerly waiting for season two to be on sale with English subtitles.