Entering Titles That Have Characters?

eupnea63355
March 31, 2009, 01:09 AM posted in General Discussion

Is it possible to use WLCP with lessons that have characters in the url, such as the most excellent Poems With Pete? Thanks.

andrew_c
March 31, 2009, 04:05 AM

It should.

The problem is that in Windows, to the best of my knowledge, you cannot type Chinese characters in the console (the black box that pops up when running WLCP).

Therefore, you should copy and paste the lesson name into lessons.txt in order to download such lessons. 

Please share any difficulties you have.

eupnea63355
May 02, 2009, 12:20 AM

I stopped here to ask about the colon in cutting-a-frog, read the comment on EditPad Lite, downloaded it, and IT WORKS!!! Thanks so much you guys for sharing your discoveries.

Again, Andrew, the most wonderful thing about your program, for me, is being able to download all of those little mp3's. Loading them up into my player and letting them play at random really gives my brain a good workout. This aural comprehension opportunity is just great for someone like me who has little other opportunity to experience the language. So thanks again. A ga-zillion thanks! (ESL people, ga-zillion is slang for a huge number)

andrew_c
March 31, 2009, 03:25 PM

That's helpful information, but I need to clarify a few things to help.

Does it still ask you to type the name of a lesson, as if there is no lessons.txt file at all, or is something else happening? After you run it, is there a welovechinesepodlog.txt file produced somewhere that you could send me to take a look at? My email is andrew.corrigan at gmail dot com.

eupnea63355
March 31, 2009, 04:25 PM

It is asking me to type in the name of a lesson, as if there is no lessons.txt file. Thanks for asking!

eupnea63355
April 06, 2009, 05:01 PM

Hi, Andrew. I just noticed you asked me to send you the log file. Sorry I missed that note! However, I do have success with the lessons.txt file that I save as a .txt file, using filenames that do not have characters. If a filename has characters, it changes the hanzi to ???'s.

So, the encoding seems to be a problem. I'll send the log file along, but that's basically what you'll see. Thanks!
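One plausible way hanzi end up as ???'s is that the file gets written in a legacy single-byte ("ANSI") code page that has no Chinese characters, so each one is swapped for a question mark. A minimal Python 3 sketch of that effect follows. The function name save_as_ansi, the cp1252 code page, and the sample text are assumptions for illustration, not anything taken from WLCP or from the log file.

```python
# -*- coding: utf-8 -*-
# Sketch only: imitates a save into a non-Unicode code page.
# "cp1252" (Western European) is an assumed example code page.

def save_as_ansi(text, codepage="cp1252"):
    """Encode text into a legacy code page, replacing anything it cannot hold."""
    return text.encode(codepage, errors="replace")

sample = "中文"  # hypothetical two-character lesson name

ansi_bytes = save_as_ansi(sample)    # each hanzi becomes b"?"
utf8_bytes = sample.encode("utf-8")  # UTF-8 keeps the characters intact

print(ansi_bytes)                            # b'??'
print(utf8_bytes.decode("utf-8") == sample)  # True
```

Once the characters have been replaced this way, the original text cannot be recovered from the file, so the lesson names would need to be pasted in again and saved in a Unicode encoding.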

derek
April 16, 2009, 11:01 AM

I just had this problem trying to download an advanced lesson (谋杀案). I deleted lessons.txt and added the lesson again, but WLCP still reported no lessons found. I did some investigation and found that older versions of Python had a lot of problems with Unicode support, so I installed Python 2.6 and tried again. This time it worked fine.

My guess is that there is a bug in older versions of Python, probably in the codecs module and/or the string strip() method, which sometimes returns null for a Unicode string.

Thoughts?

andrew_c
April 16, 2009, 12:24 PM

Hi Derek, if it works, that's good enough. I don't really understand Unicode, to be honest.

Eupnea, sorry I never replied. It's been on my to-do list; I've just been busy.

derek
April 16, 2009, 08:02 PM

Hi Andrew,

It could also be platform specific. I was using WinXP when the problem occurred, but have never had this problem on Linux.

Python 3.0 is now available as a stable production release. I will install this version and do some more testing when I have time.

andrew_c
April 17, 2009, 12:24 AM

I doubt WLCP is compatible with 3.0.

eupnea63355
March 31, 2009, 02:17 PM

Thank you, Andrew. I have made a very nice lessons.txt file and went to an area where there is a "real" internet connection, but I cannot get the program to read, or find, the .txt file. I put it in all the WLCP directories, trying each one, and none work. I had been running WLCP from a shortcut on the desktop, so I tried to see if it would run from the actual file location. That didn't work either. Any ideas? (By the way, the trip out of the "sticks" was worth it, as I still downloaded lots of lessons individually. So much faster than at home. I still have my lessons.txt file waiting with Pete's poems.)

To be clear, I made the file in Notepad and saved it as Unicode.

derek
April 21, 2009, 03:30 AM

Andrew,

You're right. I tried Python 3.0.1, but WLCP would not run. I removed 3.0.1, reinstalled 2.5, and tried downloading the advanced lesson (谋杀案) again. This time it worked!

I will do some more investigation to see whether this is really a Python issue or is system-related.

sfrrr
April 26, 2009, 01:43 AM

Derek--did you ever nail down what causes CPod to reject unicode characters in lessons.txt? I'm still stumped.

derek
April 27, 2009, 04:59 AM

Hi sfrrr,

After much tedious experimenting it seems that the problem has nothing to do with Python or WLCP. I have only been able to replicate the problem on WinXP by doing the following:

1. Created a new lessons.txt with "Notepad" and added some advanced lessons. Started WLCP, all downloaded with no problems.

2. Opened lessons.txt with "Wordpad", added some new advanced lessons, then saved as a unicode text file. Got a warning prompt saying that all text formatting would be lost, clicked on OK to save anyway. Started WLCP, NO LESSONS FOUND!

I have not been able to replicate the problem on Linux, so I think it is a Windows/Wordpad/Notepad UTF-8 encoding issue.

If you are using WinXP/Vista, then a possible fix is to download a Chinese text editor such as NJStar, delete lessons.txt, create a new one with the Chinese editor, and save it as a UTF-8 text file.

This should save lessons.txt with the correct encoding.

Good luck!
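For anyone who wants to check how an editor actually saved lessons.txt, a small Python 3 sketch along these lines can look at the first bytes of the file for a byte order mark (BOM). The function name sniff_encoding is hypothetical, and the check is only a rough guide: a file with no BOM could be plain UTF-8 or a legacy code page, and UTF-32 is not handled.

```python
import codecs

def sniff_encoding(raw):
    """Guess an encoding from a leading byte order mark, or return None."""
    if raw.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"   # UTF-8 with a BOM
    if raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
        return "utf-16"      # Python's utf-16 codec reads the BOM itself
    return None              # no BOM: cannot tell from the first bytes alone

def sniff_file(path):
    """Read a file as raw bytes and report the BOM-based guess."""
    with open(path, "rb") as f:
        return sniff_encoding(f.read())
```

Calling sniff_file("lessons.txt") on a file that reports "utf-16" would be consistent with the symptom described above, since those bytes are not valid UTF-8 text.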

sfrrr
April 27, 2009, 09:34 PM

Derek--it sounds so reasonable. I'm about to try it. I'll let you know. And thanks.

derek
April 28, 2009, 12:17 AM

Some more info I found on unicode support on WinXP/Vista:

1. Files saved from Wordpad as Unicode text files are stored as UTF16 Little Endian.

2. According to some of the experts on Unicode, Notepad does not provide full support for it. Using Notepad may sometimes give unexpected results.
No matter what encoding you saved a text file with, Notepad will always 'try to guess' when you open the file.

WLCP opens lessons.txt as a UTF-8 file, so Wordpad and Notepad are not recommended if you are downloading advanced or media lessons. Use an editor that can explicitly save text files as UTF-8.

The two editors I use for writing Chinese are OpenOffice.org Writer and NJStar Word Processor. Both have options for saving as UTF-8.
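If a lessons.txt has already been saved as UTF-16, an alternative to retyping it in another editor is to re-encode it. Here is a minimal Python 3 sketch of that idea. The names to_utf8 and convert_file are hypothetical, and dropping the BOM is an assumption, since the thread does not say whether WLCP tolerates one.

```python
import codecs

def to_utf8(raw):
    """Return the same text as plain UTF-8 bytes with no BOM."""
    if raw.startswith(codecs.BOM_UTF16_LE) or raw.startswith(codecs.BOM_UTF16_BE):
        text = raw.decode("utf-16")      # BOM tells the codec the byte order
    else:
        text = raw.decode("utf-8-sig")   # plain UTF-8, with or without a BOM
    return text.encode("utf-8")

def convert_file(path):
    """Rewrite a text file in place as UTF-8."""
    with open(path, "rb") as f:
        raw = f.read()
    converted = to_utf8(raw)  # decode first, so a bad file is never truncated
    with open(path, "wb") as f:
        f.write(converted)
```

A file that is neither UTF-16 with a BOM nor valid UTF-8 raises UnicodeDecodeError and is left untouched. Keeping a backup copy before running convert_file("lessons.txt") is still a sensible precaution.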

sfrrr
April 28, 2009, 09:21 PM

Derek--I dislike the interface of NJStar--seems very old-fashioned to me. So I use EditPadLite (free to individuals) to edit and save in one of several Unicode formats--in this case UTF8. Worked like a charm. Thank you, thank you.

derek
April 29, 2009, 03:28 AM

sfrrr,

Happy to help out. Just tried EditPadLite, a very good editor. Must be time to retire NJStar, I've been using it for so many years and never bothered to try anything else. Nice one, thanks.

sfrrr
April 20, 2009, 07:38 PM

I installed Python 2.6 and, so far, I still can't get WLCP to recognize my lessons.txt. I'm still experimenting.

Sandra