fix

matthiask

February 03, 2010, 05:38 AM posted in General Discussion

Hi, to fix WLCP after the latest site changes:

In file wlcpod_netops.py:

search for "signout" and replace it with "Sign out".

In file wlcpod_parser.py:

search for the line:

get_title = re.compile('<h1><a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')

and replace it with:

get_title = re.compile('<h1>\s*<a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')

That's all, folks.
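
For anyone wondering what the extra \s* buys you, here is a minimal sketch comparing the old and new get_title patterns. The page fragment is invented for illustration (the real ChinesePod markup may differ), and the patterns are written as raw strings so the backslashes survive:

```python
import re

# Original pattern: expects <a href...> immediately after <h1>.
old_get_title = re.compile(r'<h1><a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')
# Patched pattern: tolerates whitespace between <h1> and <a href...>.
new_get_title = re.compile(r'<h1>\s*<a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')

# Hypothetical page fragment, made up for this example only.
sample_html = '<h1>\n  <a href="/lessons">Elementary</a> - Ordering Coffee\n</h1>'

old_match = old_get_title.search(sample_html)
new_match = new_get_title.search(sample_html)

print(old_match)  # the old pattern finds nothing when a newline follows <h1>
print(new_match.group('level'), '|', new_match.group('name'))
```

If the old pattern returns None here while the new one matches, that is the same situation the parser runs into once the site adds whitespace after the <h1> tag.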

Tal

February 03, 2010, 11:52 AM

I'm unable to find the line you quote for wlcpod_parser.py. I searched for it using Wordpad, which says it's not there.

I couldn't find the other one (the wlcpod_netops.py "signout") either, until I remembered I'd edited that file as ajaja recently suggested. I replaced the edited file with the original and found "Signout" (not "signout"). Changed it anyway.

Needless to say it's still not working for me.
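
Since the capitalization in the file can differ from what is quoted in the thread, a case-insensitive search avoids the "Signout" versus "signout" confusion. A minimal sketch, assuming a made-up line of source text (the real contents of wlcpod_netops.py are not shown in this thread) and a hypothetical helper name find_marker:

```python
import re

def find_marker(text, marker):
    """Return every spelling of `marker` found in `text`, ignoring case."""
    return re.findall(re.escape(marker), text, flags=re.IGNORECASE)

# Hypothetical file contents, for illustration only.
sample = "if 'Signout' in page:\n    logged_in = True\n"

hits = find_marker(sample, "signout")
print(hits)

# Replace whichever spelling is present with the new text.
patched = re.sub(re.escape("signout"), "Sign out", sample, flags=re.IGNORECASE)
print(patched)
```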

davidchchang

September 11, 2010, 07:30 PM

The URL for the log-in page has moved. Change "https" to "http" (so that the URL becomes http://chinesepod.com/accounts/signin) in wlcpod_netops.py and you will be able to download lessons again.

Zhaoyang

February 03, 2010, 02:09 PM

Also not fixed for me. It logs in, asks for a lesson name, and then gives this:

Downloading lesson HTML data...

Traceback (most recent call last):
  File "./wlcpod.py", line 443, in <module>
    main()
  File "./wlcpod.py", line 263, in main
    title_and_level, lesson_desc, base_url, xml, vocab_mp3s, dia_mp3s, exp_mp3s = lesson_xml(lesson_name, vocab_trad, vocab_simp, dia_trad, exp_trad, disc_trad)
  File "/home/foo/Chinese/chinesepod/wlcpod/wlcpod_parser.py", line 250, in lesson_xml
    title, desc, base_url = lesson_params(disc_html)
  File "/home/foo/Chinese/chinesepod/wlcpod/wlcpod_parser.py", line 235, in lesson_params
    title = (lambda x: '%s (%s)' % (x.group('name'), x.group('level')))(get_title.search(html))
  File "/home/foo/Chinese/chinesepod/wlcpod/wlcpod_parser.py", line 235, in <lambda>
    title = (lambda x: '%s (%s)' % (x.group('name'), x.group('level')))(get_title.search(html))
AttributeError: 'NoneType' object has no attribute 'group'
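
The AttributeError means get_title.search(html) returned None, i.e. the pattern did not match the page, and the lambda then called .group() on None. A minimal sketch of a more explicit check, assuming the patched pattern from the first post; the function name lesson_title and the sample fragment are hypothetical, not part of WLCP:

```python
import re

# Patched pattern from the first post, written as a raw string.
get_title = re.compile(r'<h1>\s*<a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')

def lesson_title(html):
    """Build 'name (level)' or fail with a readable message (hypothetical helper)."""
    match = get_title.search(html)
    if match is None:
        # search() returns None when nothing matches; report that directly
        # instead of letting .group() fail on None.
        raise ValueError("lesson title not found; the page layout may have changed")
    return '%s (%s)' % (match.group('name'), match.group('level'))

# Hypothetical page fragment, for illustration only.
sample_html = '<h1> <a href="/lessons">Elementary</a> - Ordering Coffee </h1>'
title = lesson_title(sample_html)
print(title)
```

With a check like this, a changed page layout shows up as a clear message rather than a traceback ending in 'NoneType' object has no attribute 'group'.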

Zhaoyang

February 03, 2010, 02:28 PM

Added the backslash. Fixed, fixed, fixed!

matthiask, I'll remember you. Thank you.

Tal

February 04, 2010, 12:36 AM

OK then, great work matthiask, it's working again. Thank you very much!

I'm still using ajaja's edit of the netops file though.

matthiask

February 04, 2010, 02:51 AM

One more: CPOD also changed the expansion page. To fix the XML file, search for break_expansion in wlcpod_parser.py and replace the next two lines with this:

break_expansion = re.compile(r'(<h3>.*?</h3>)|(<button.*?</button>.*?</div>.*?</p>)', flags = re.DOTALL)
break_expansion_sentence = re.compile(r'(<h3>(?P<exp>.*?)</h3>)|(<button.*?url=(?P<mp3>.*?\.mp3).*?</button>.*?\s*(?P<words>.*?)\s*<br />.*?(</p>|<span\ id=.*?>(?P<idiomatic>.*?)</span>))', flags = re.DOTALL)
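
To illustrate why re.DOTALL matters in these patterns, here is a minimal sketch that runs break_expansion with and without the flag. The HTML fragment is invented for the example and is only an assumption about how an expansion block might be laid out across several lines:

```python
import re

pattern = r'(<h3>.*?</h3>)|(<button.*?</button>.*?</div>.*?</p>)'

with_dotall = re.compile(pattern, flags=re.DOTALL)
without_dotall = re.compile(pattern)

# Hypothetical expansion fragment spread over several lines.
sample_html = (
    '<h3>Expansion word</h3>\n'
    '<p>\n'
    '<button onclick="play()">play</button>\n'
    '<div class="x"></div>\n'
    'A sentence.<br />\n'
    'Translation.</p>\n'
)

chunks_dotall = [m.group(0) for m in with_dotall.finditer(sample_html)]
chunks_plain = [m.group(0) for m in without_dotall.finditer(sample_html)]

# With DOTALL, '.' also matches newlines, so the multi-line <button>...</p>
# block is captured as one chunk in addition to the heading.
print(len(chunks_dotall), len(chunks_plain))
```

Without the flag, only the single-line heading matches in this sample, because '.' stops at each newline.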

andrew_c

February 04, 2010, 03:20 AM

matthiask, got your PM. If you have a Google account I can add you as an administrator on code.google.com/p/wlcp, or if you just send me a zip file containing the update I'd be happy to post it for you.

matthiask

February 03, 2010, 02:07 PM

Tal, just search for get_title; the line should be near the beginning of the file. There is a formatting error here on the site that drops backslashes. In the changed line, every s that is followed by * needs a backslash in front of it:

get_title = re.compile('<h1>\s*<a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')

Tal

February 04, 2010, 02:01 PM

Anyone else finding the expansion fix still has a problem? (Namely, only 1 sentence from each group of 3 is processed?)

matthiask

February 04, 2010, 02:51 PM

You are right: only one sentence is processed, and there are some other artifacts. I have to look into it once more :( Sorry, WLCP is a new piece of software for me, and trying to understand it while fixing it is not always optimal.

daniel70

February 04, 2010, 03:07 PM

Hey Matthiask,

By coincidence I've been working on this at the same time. If you want to send me a PM with an email address, I can email my py files to you, you can diff them with what you've got, and take whatever is useful. My version addresses this issue and a vocab issue. I'm not a Python coder though, so I apologize in advance for coding style, etc.

dreiundzwanzig

February 24, 2010, 08:14 PM

Does this fix still work?

I'm still getting an error and WLCP will automatically shut down - even after performing all of your changes.

davidchchang

April 16, 2010, 04:16 AM

dreiundzwanzig: It seems that the login form's action and input fields have been updated, so those need to be updated in wlcpod_netops.py as well.

If you search for "logged_in_page = get_html" (without quotes) and replace that line with:

logged_in_page = get_html(u'https://chinesepod.com/accounts/signin', data = urllib.urlencode({u'email':email, u'password':password, u'authenticity_token':authenticity_token}))

you should be able to log in and download lessons if you've made matthiask's changes above. Remember to put a tab in front of that pasted line if there isn't one, because Python is picky about whitespace.
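
For reference, a minimal sketch of what the urlencode call produces for the three form fields. This uses Python 3's urllib.parse.urlencode (the WLCP code above uses the Python 2 spelling urllib.urlencode); the function name build_login_body and all values are placeholders, and get_html is WLCP's own helper, not reproduced here:

```python
from urllib.parse import urlencode  # Python 3; Python 2 has urllib.urlencode

def build_login_body(email, password, authenticity_token):
    """Encode the login form fields as a URL-encoded POST body (hypothetical helper)."""
    return urlencode({
        'email': email,
        'password': password,
        'authenticity_token': authenticity_token,
    })

# Placeholder credentials and token, for illustration only.
body = build_login_body('user@example.com', 'secret', 'abc123')
print(body)
```

The resulting string is what gets passed as the data argument, so special characters such as the @ in the email address are percent-encoded before being sent.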

tbarrett

May 10, 2010, 08:06 AM

Hi guys, new WLCP user here. Just applied all the fixes mentioned above, and it seems to be running without any issues. Thanks for all your hard work!

matthiask

February 04, 2010, 07:13 AM

andrew, sent you another one ;)