fix

matthiask
February 03, 2010, 05:38 AM posted in General Discussion

Hi, to fix WLCP for the latest site changes:

in file wlcpod_netops.py

search "signout", replace it with "Sign out"

in file wlcpod_parser.py

search line:

get_title = re.compile('<h1><a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')

replace with

get_title = re.compile('<h1>\s*<a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')
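A minimal sketch of why the extra \s* matters, assuming the site now puts whitespace between <h1> and the link. The sample markup is made up for illustration and is not copied from the real page; the two patterns are the ones quoted above, written as raw strings.

```python
import re

# Old and new patterns from the post, written as raw strings.
old_title = re.compile(r'<h1><a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')
new_title = re.compile(r'<h1>\s*<a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')

# Hypothetical markup: a line break and indentation follow <h1>.
sample_html = '<h1>\n    <a href="/lessons/example">Elementary</a> - Example Lesson\n</h1>'

old_match = old_title.search(sample_html)  # no match: whitespace after <h1>
new_match = new_title.search(sample_html)  # matches: \s* absorbs the whitespace

print(old_match)
print(new_match.group('level'), '|', new_match.group('name'))
```

If the old pattern finds nothing, search() returns None, which is what later triggers the AttributeError in lesson_params.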

 

That's all, folks.

Tal
February 03, 2010, 11:52 AM

I'm unable to find the line you quote for the wlcpod_parser.py. I've searched for that line using Wordpad, it says it's not there.

I couldn't find the other one (the wlcpod_netops.py "signout") either, until I remembered I'd edited that file as ajaja recently suggested. I replaced the edited file with the original and found "Signout", (not "signout".) Changed it anyway.

Needless to say it's still not working for me.

davidchchang
September 11, 2010, 07:30 PM

The URL for the log-in page has moved. Change "https" to "http" (so that the URL becomes http://chinesepod.com/accounts/signin) in wlcpod_netops.py and you will be able to download lessons again.

Zhaoyang
February 03, 2010, 02:09 PM

Also not fixed for me. It logs in, asks for a lesson name, and then gives this:

Downloading lesson HTML data...

Traceback (most recent call last):
  File "./wlcpod.py", line 443, in <module>
    main()
  File "./wlcpod.py", line 263, in main
    title_and_level, lesson_desc, base_url, xml, vocab_mp3s, dia_mp3s, exp_mp3s = lesson_xml(lesson_name, vocab_trad, vocab_simp, dia_trad, exp_trad, disc_trad)
  File "/home/foo/Chinese/chinesepod/wlcpod/wlcpod_parser.py", line 250, in lesson_xml
    title, desc, base_url = lesson_params(disc_html)
  File "/home/foo/Chinese/chinesepod/wlcpod/wlcpod_parser.py", line 235, in lesson_params
    title = (lambda x: '%s (%s)' % (x.group('name'), x.group('level')))(get_title.search(html))
  File "/home/foo/Chinese/chinesepod/wlcpod/wlcpod_parser.py", line 235, in <lambda>
    title = (lambda x: '%s (%s)' % (x.group('name'), x.group('level')))(get_title.search(html))
AttributeError: 'NoneType' object has no attribute 'group'
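For anyone puzzled by this error: get_title.search(html) returns None when the pattern does not match the page, and calling .group() on None raises the AttributeError. Below is a rough sketch of a more explicit check. The helper name lesson_title is hypothetical and is not part of WLCP; only the pattern comes from this thread.

```python
import re

# Pattern from the fix in this thread, written as a raw string.
get_title = re.compile(r'<h1>\s*<a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')


def lesson_title(html):
    """Return 'name (level)', or raise a descriptive error if the pattern misses.

    Hypothetical helper for illustration only.
    """
    match = get_title.search(html)
    if match is None:
        # search() found nothing, so the page layout probably changed.
        raise ValueError('lesson title not found; the page layout may have changed')
    return '%s (%s)' % (match.group('name'), match.group('level'))
```

With a check like this, a layout change on the site would show up as a readable message instead of a 'NoneType' traceback.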

Zhaoyang
February 03, 2010, 02:28 PM

Added the backslash. Fixed, fixed, fixed!

matthiask, I'll remember you. Thank you.

Tal
February 04, 2010, 12:36 AM

OK then, great work matthiask, it's working again. 非常感谢!

I'm still using ajaja's edit of the netops file though.

matthiask
February 04, 2010, 02:51 AM

One more: CPOD also changed the expansion page. To fix the XML file, search for break_expansion in wlcpod_parser.py and replace the next two lines with this:

break_expansion = re.compile(r'(<h3>.*?</h3>)|(<button.*?</button>.*?</div>.*?</p>)', flags = re.DOTALL)
break_expansion_sentence = re.compile(r'(<h3>(?P<exp>.*?)</h3>)|(<button.*?url=(?P<mp3>.*?\.mp3).*?</button>.*?\s*(?P<words>.*?)\s*<br />.*?(</p>|<span\ id=.*?>(?P<idiomatic>.*?)</span>))', flags = re.DOTALL)

andrew_c
February 04, 2010, 03:20 AM

matthiask, got your PM. If you have a Google account I can add you as an administrator on code.google.com/p/wlcp, or if you just send me a zip file containing the update I'd be happy to post it for you.

matthiask
February 03, 2010, 02:07 PM

Tal, just search for get_title. The line should be near the beginning of the file. There is a formatting error here on the site: the changed line should have a backslash in front of the bolded s.

get_title = re.compile('<h1>\s*<a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')
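To see why that one backslash matters, here is a small sketch using only the start of the pattern. With the backslash, \s* means "any whitespace"; without it, s* means "zero or more letter s", which cannot absorb a line break. The sample markup is hypothetical.

```python
import re

# Start of the pattern with and without the backslash in front of the s.
with_backslash = re.compile(r'<h1>\s*<a href')
without_backslash = re.compile(r'<h1>s*<a href')

# Hypothetical markup with a line break after <h1>.
sample_html = '<h1>\n<a href="/lessons/example">'

found_with = with_backslash.search(sample_html) is not None
found_without = without_backslash.search(sample_html) is not None

print(found_with, found_without)
```

Writing the pattern as a raw string (r'...') also keeps Python itself from interpreting the backslash.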

Tal
February 04, 2010, 02:01 PM

Anyone else finding the expansion fix still has a problem? (Namely, only 1 sentence from each group of 3 is processed?)

matthiask
February 04, 2010, 02:51 PM

You are right: only one sentence is processed, and there are some other artifacts. I have to look into it once more :( Sorry, WLCP is a new piece of software for me. Trying to understand it while fixing it is not always optimal.

daniel70
February 04, 2010, 03:07 PM

Hey Matthiask,

By coincidence I've been working on this at the same time. If you want to send me a PM with an email address, I can email my py files to you; you can diff them with what you've got and take whatever is useful. My version addresses this issue and a vocab issue. I'm not a Python coder though, so I apologize in advance for coding style, etc.

dreiundzwanzig
February 24, 2010, 08:14 PM

Does this fix still work?

I'm still getting an error and WLCP will automatically shut down - even after performing all of your changes.

davidchchang
April 16, 2010, 04:16 AM

dreiundzwanzig: It seems that the login form's action and input fields have been updated, so those need to be updated in wlcpod_netops.py as well.

If you search for "logged_in_page = get_html" (without quotes) and replace that line with:

logged_in_page = get_html(u'https://chinesepod.com/accounts/signin', data = urllib.urlencode({u'email':email, u'password':password, u'authenticity_token':authenticity_token}))

you should be able to log in and download lessons if you've made matthiask's changes above. Remember to put a tab in front of that pasted line if there isn't one, because Python is picky about whitespace.
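For reference, here is a sketch of what the urlencode call produces from those three form fields. It uses the Python 3 spelling (urllib.parse.urlencode), whereas WLCP itself uses Python 2's urllib.urlencode. The helper name build_login_body and the sample values are made up for illustration.

```python
from urllib.parse import urlencode, parse_qs


def build_login_body(email, password, authenticity_token):
    """Encode the three form fields named in the post as a POST body.

    Hypothetical helper; the field names come from the line quoted above.
    """
    return urlencode({
        'email': email,
        'password': password,
        'authenticity_token': authenticity_token,
    })


# Made-up values, only to show how special characters are escaped.
body = build_login_body('user@example.com', 'p@ss word', 'abc123')
fields = parse_qs(body)  # decode again to check the round trip

print(body)
```

The encoded string is what gets passed as the data argument, so characters such as @ and spaces in the password are escaped before being sent.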

tbarrett
May 10, 2010, 08:06 AM

Hi guys, new WLCP user here. Just applied all the fixes mentioned above, and it seems to be running without any issues. Thanks for all your hard work!

matthiask
February 04, 2010, 07:13 AM

andrew, sent you another one ;)