fix
matthiask
February 03, 2010, 05:38 AM, posted in General Discussion
Hi, to fix WLCP for the latest changes:
in file wlcpod_netops.py
search "signout", replace it with "Sign out"
in file wlcpod_parser.py
search for the line:
get_title = re.compile('<h1><a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')
and replace it with:
get_title = re.compile('<h1>\s*<a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')
That's all, folks.
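To see why the extra \s* matters, here is a minimal sketch comparing the old and new patterns. The sample HTML is made up for illustration: it assumes the site now emits whitespace between <h1> and the link, and it is not copied from a real lesson page.

```python
import re

# Old and new patterns from the post above, written as raw strings.
old_title = re.compile(r'<h1><a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')
new_title = re.compile(r'<h1>\s*<a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')

# Hypothetical markup with a newline and indentation after <h1>.
sample = '<h1>\n  <a href="/lessons?level=newbie">Newbie</a> - Ordering Tea </h1>'

old_match = old_title.search(sample)  # no match: old pattern needs <a right after <h1>
new_match = new_title.search(sample)  # matches: \s* absorbs the whitespace

if new_match is not None:
    print('%s (%s)' % (new_match.group('name'), new_match.group('level')))
```

When the old pattern fails, search() returns None, and calling .group() on that result is what raises the AttributeError reported in this thread.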
davidchchang
September 11, 2010, 07:30 PM
The URL for the log-in page has moved. Change "https" to "http" (so that the URL becomes http://chinesepod.com/accounts/signin) in wlcpod_netops.py and you will be able to download lessons again.
Zhaoyang
February 03, 2010, 02:09 PM
Also not fixed for me. It logs in, asks for a lesson name, and then gives this:
Downloading lesson HTML data...
Traceback (most recent call last):
  File "./wlcpod.py", line 443, in <module>
    main()
  File "./wlcpod.py", line 263, in main
    title_and_level, lesson_desc, base_url, xml, vocab_mp3s, dia_mp3s, exp_mp3s = lesson_xml(lesson_name, vocab_trad, vocab_simp, dia_trad, exp_trad, disc_trad)
  File "/home/foo/Chinese/chinesepod/wlcpod/wlcpod_parser.py", line 250, in lesson_xml
    title, desc, base_url = lesson_params(disc_html)
  File "/home/foo/Chinese/chinesepod/wlcpod/wlcpod_parser.py", line 235, in lesson_params
    title = (lambda x: '%s (%s)' % (x.group('name'), x.group('level')))(get_title.search(html))
  File "/home/foo/Chinese/chinesepod/wlcpod/wlcpod_parser.py", line 235, in <lambda>
    title = (lambda x: '%s (%s)' % (x.group('name'), x.group('level')))(get_title.search(html))
AttributeError: 'NoneType' object has no attribute 'group'
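The AttributeError above means get_title.search(html) returned None, i.e. the pattern did not match the downloaded page. Below is a minimal sketch of a more defensive version of that step. The function name format_title and the error message are my own and not part of WLCP; the pattern is the corrected one from this thread.

```python
import re

get_title = re.compile(r'<h1>\s*<a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')

def format_title(html):
    """Return 'name (level)', or raise a readable error if the title is not found."""
    match = get_title.search(html)
    if match is None:
        # search() returns None when nothing matches; report that explicitly
        # instead of failing later with "'NoneType' object has no attribute 'group'".
        raise ValueError('lesson title not found; the page markup may have changed')
    return '%s (%s)' % (match.group('name'), match.group('level'))
```

With a check like this, the next site change would produce a message pointing at the title pattern rather than a bare traceback.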
Zhaoyang
February 03, 2010, 02:28 PM
Added the backslash. Fixed, fixed, fixed!
matthiask, I'll remember you. Thank you.
Tal
February 04, 2010, 12:36 AM
OK then, great work matthiask, it's working again. Thank you very much!
I'm still using ajaja's edit of the netops file though.
matthiask
February 04, 2010, 02:51 AM
One more: CPOD also changed the expansion page. To fix the XML file, search for break_expansion in wlcpod_parser.py and replace the next two lines with this:
break_expansion = re.compile(r'(<h3>.*?</h3>)|(<button.*?</button>.*?</div>.*?</p>)', flags = re.DOTALL)
break_expansion_sentence = re.compile(r'(<h3>(?P<exp>.*?)</h3>)|(<button.*?url=(?P<mp3>.*?\.mp3).*?</button>.*?\s*(?P<words>.*?)\s*<br />.*?(</p>|<span\ id=.*?>(?P<idiomatic>.*?)</span>))', flags = re.DOTALL)
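Both patterns rely on re.DOTALL so that .*? can run across line breaks in the page source. Here is a small sketch of the difference using the break_expansion pattern; the HTML fragment is a made-up stand-in, not the real expansion page.

```python
import re

pattern = r'(<h3>.*?</h3>)|(<button.*?</button>.*?</div>.*?</p>)'
with_dotall = re.compile(pattern, flags=re.DOTALL)
without_dotall = re.compile(pattern)

# Hypothetical fragment: the sentence paragraph sits on a different line
# from the button, so the second alternative must match across a newline.
sample = (
    '<h3>Tea</h3>\n'
    '<div><button url=a.mp3>play</button></div>\n'
    '<p>Sentence one.<br />Translation.</p>\n'
)

chunks = [m.group(0) for m in with_dotall.finditer(sample)]
chunks_no_flag = [m.group(0) for m in without_dotall.finditer(sample)]

print(len(chunks), len(chunks_no_flag))
```

On this fragment the DOTALL version finds both the heading and the button-plus-sentence chunk, while the version without the flag finds only the heading.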
andrew_c
February 04, 2010, 03:20 AM
matthiask, got your PM. If you have a Google account I can add you as an administrator on code.google.com/p/wlcp, or if you just send me a zip file containing the update I'd be happy to post it for you.
matthiask
February 03, 2010, 02:07 PM
Tal, just search for get_title. The line should be near the beginning of the file. There is a formatting error here on the site: the changed line should have a backslash in front of the bolded s.
get_title = re.compile('<h1>\s*<a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')
Tal
February 04, 2010, 02:01 PM
Anyone else finding the expansion fix still has a problem? (Namely, only 1 sentence from each group of 3 is processed?)
matthiask
February 04, 2010, 02:51 PM
You are right: only one sentence, plus some other artifacts. I have to look into it once more :( Sorry, WLCP is a new piece of software for me. Trying to understand it while fixing it is not always optimal.
daniel70
February 04, 2010, 03:07 PM
Hey Matthiask,
By coincidence I've been working on this at the same time. If you want to send me a PM with an email address, I can email my py files to you, you can diff them with what you've got, and take whatever is useful. My version addresses this issue, and a vocab issue. I'm not a Python coder though, so I apologize in advance for coding style, etc.
dreiundzwanzig
February 24, 2010, 08:14 PM
Does this fix still work?
I'm still getting an error and WLCP shuts down automatically, even after making all of your changes.
davidchchang
April 16, 2010, 04:16 AM
dreiundzwanzig: It seems that the login form's action and input fields have been updated, so those need to be updated in wlcpod_netops.py as well.
If you search for "logged_in_page = get_html" (without quotes) and replace that line with:
logged_in_page = get_html(u'https://chinesepod.com/accounts/signin', data = urllib.urlencode({u'email':email, u'password':password, u'authenticity_token':authenticity_token}))
you should be able to log in and download lessons if you've made matthiask's changes above. Remember to put a tab in front of that pasted line if there isn't one, because Python is picky about whitespace.
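For reference, the data argument in that line is just a URL-encoded form body. This is a minimal sketch of what urlencode produces, written for Python 3, where the function lives in urllib.parse (the thread's code is Python 2, where it is urllib.urlencode). The credentials and token are placeholders, and no request is sent.

```python
from urllib.parse import urlencode

# Placeholder values; in WLCP these come from the user and the sign-in page.
fields = [
    ('email', 'user@example.com'),
    ('password', 'secret'),
    ('authenticity_token', 'abc123'),
]

# urlencode percent-escapes reserved characters such as '@' and joins pairs with '&'.
body = urlencode(fields)
print(body)
```

A list of pairs is used here only to keep the field order fixed; a dict works as well.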
tbarrett
May 10, 2010, 08:06 AM
Hi guys, new WLCP user here. Just applied all the fixes mentioned above, and it seems to be running without any issues. Thanks for all your hard work!
matthiask
February 04, 2010, 07:13 AM
andrew, sent you another one ;)
Tal
February 03, 2010, 11:52 AM
I'm unable to find the line you quote for wlcpod_parser.py. I've searched for that line using WordPad, and it says it's not there.
I couldn't find the other one (the wlcpod_netops.py "signout") either, until I remembered I'd edited that file as ajaja recently suggested. I replaced the edited file with the original and found "Signout" (not "signout"). Changed it anyway.
Needless to say it's still not working for me.