fix
matthiask
February 03, 2010 at 05:38 AM posted in General Discussion
Hi, to fix wlcp for the latest changes:
in file wlcpod_netops.py
search "signout", replace it with "Sign out"
in file wlcpod_parser.py
search line:
get_title = re.compile('<h1><a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')
replace with
get_title = re.compile('<h1>\s*<a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')
That's all, folks.
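For anyone wondering why the extra \s* matters, here is a minimal sketch comparing the old and new get_title patterns quoted above. The sample heading markup is an assumption made up for illustration (the real ChinesePod page may differ); the only point is that whitespace between <h1> and <a href defeats the old pattern.

```python
import re

# Old and new patterns as quoted in the post (raw strings keep \s intact).
old_title = re.compile(r'<h1><a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')
new_title = re.compile(r'<h1>\s*<a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')

# Hypothetical heading markup: whitespace now follows the <h1> tag.
sample = '<h1>\n  <a href="/lessons/level">Elementary</a> - Ordering Tea </h1>'

old_match = old_title.search(sample)  # no match: old pattern needs <h1><a back to back
new_match = new_title.search(sample)  # matches: \s* absorbs the newline and spaces

if new_match is not None:
    print('%s (%s)' % (new_match.group('name'), new_match.group('level')))
```

If the old pattern returns no match on your saved lesson page while the new one does, the edit was applied correctly.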
tbarrett
May 10, 2010 at 08:06 AM
Hi guys, new WLCP user here. Just applied all the fixes mentioned above, and it seems to be running without any issues. Thanks for all your hard work!
davidchchang
April 16, 2010 at 04:16 AM
dreiundzwanzig: It seems that the login form's action and input fields have been updated, so those need to be updated in wlcpod_netops.py as well.
If you search for "logged_in_page = get_html" (without quotes) and replace that line with:
logged_in_page = get_html(u'https://chinesepod.com/accounts/signin', data = urllib.urlencode({u'email':email, u'password':password, u'authenticity_token':authenticity_token}))
you should be able to log in and download lessons if you've made matthiask's changes above. Remember to put a tab in front of that pasted line if there isn't one, because Python is picky about whitespace.
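As a rough sketch of what that replacement line sends, the snippet below builds the same form body without touching the network. The helper name build_login_body and the sample values are hypothetical, and where authenticity_token comes from is not shown here; WLCP itself runs on Python 2, so the import falls back to the Python 2 location.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode        # Python 2, which WLCP uses


def build_login_body(email, password, authenticity_token):
    """Return the URL-encoded form body for the sign-in POST (hypothetical helper)."""
    return urlencode({
        'email': email,
        'password': password,
        'authenticity_token': authenticity_token,
    })


# Placeholder credentials and token, for illustration only.
body = build_login_body('user@example.com', 'secret', 'token123')
print(body)
```

The field names email, password and authenticity_token are the ones from the line above; if the sign-in form changes again, these keys are the first thing to check.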
dreiundzwanzig
February 24, 2010 at 08:14 PM
Does this fix still work?
I'm still getting an error and WLCP will automatically shut down - even after performing all of your changes.
daniel70
February 04, 2010 at 03:07 PM
Hey Matthiask,
By coincidence I've been working on this at the same time. If you want to send me a pm with an email address, I can email my py files to you, you can diff them with what you've got, and take whatever is useful. My version addresses this issue, and a vocab issue. I'm not a python coder though, so I apologize in advance for coding style, etc.
matthiask
February 04, 2010 at 02:51 PM
You are right: only one sentence, plus some other artifacts. I have to look into it once more :( Sorry, WLCP is a new piece of software for me. Trying to understand it while fixing it is not always optimal.
Tal
February 04, 2010 at 02:01 PM
Anyone else finding the expansion fix still has a problem? (Namely, only 1 sentence from each group of 3 is processed?)
andrew_c
February 04, 2010 at 03:20 AM
matthiask, got your pm. If you have a Google account I can add you as an administrator on code.google.com/p/wlcp, or if you just send me a zip file containing the update I'd be happy to post it for you.
matthiask
February 04, 2010 at 02:51 AM
One more: CPOD also changed the expansion page. To fix the XML file, search for break_expansion in wlcpod_parser.py and replace the next two lines with this:
break_expansion = re.compile(r'(<h3>.*?</h3>)|(<button.*?</button>.*?</div>.*?</p>)', flags = re.DOTALL)
break_expansion_sentence = re.compile(r'(<h3>(?P<exp>.*?)</h3>)|(<button.*?url=(?P<mp3>.*?\.mp3).*?</button>.*?\s*(?P<words>.*?)\s*<br />.*?(</p>|<span\ id=.*?>(?P<idiomatic>.*?)</span>))', flags = re.DOTALL)
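To see how these two patterns are meant to cooperate, here is a small sketch run against made-up markup. The sample HTML is purely an assumption (the real expansion page is not shown in this thread), so treat it as an illustration of the splitting and extraction steps rather than proof that the patterns fit the live site.

```python
import re

break_expansion = re.compile(
    r'(<h3>.*?</h3>)|(<button.*?</button>.*?</div>.*?</p>)', flags=re.DOTALL)
break_expansion_sentence = re.compile(
    r'(<h3>(?P<exp>.*?)</h3>)|(<button.*?url=(?P<mp3>.*?\.mp3).*?</button>.*?\s*'
    r'(?P<words>.*?)\s*<br />.*?(</p>|<span\ id=.*?>(?P<idiomatic>.*?)</span>))',
    flags=re.DOTALL)

# Hypothetical expansion markup: one heading followed by one sentence block.
sample = (
    '<h3>ni hao</h3>'
    '<div><button url=http://example.com/a.mp3>play</button> '
    'ni hao ma <br />How are you?</div></p>'
)

# Step 1: split the page into heading chunks and sentence chunks.
chunks = [m.group(0) for m in break_expansion.finditer(sample)]

# Step 2: pull the named fields out of each chunk.
heading = break_expansion_sentence.search(chunks[0])
sentence = break_expansion_sentence.search(chunks[1])

print(heading.group('exp'))
print(sentence.group('mp3'), sentence.group('words'))
```

Running the same two steps on a saved copy of a real expansion page is a quick way to check how many sentence chunks actually come back.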
Tal
February 04, 2010 at 12:36 AM
OK then, great work matthiask, it's working again. 非常感谢! (Thank you very much!)
I'm still using ajaja's edit of the netops file though.
Zhaoyang
February 03, 2010 at 02:28 PM
Added the backslash. Fixed, fixed, fixed!
matthiask, I'll remember you. Thank you.
Zhaoyang
February 03, 2010 at 02:09 PM
Also not fixed for me. It logs in, asks for a lesson name, and then gives this:
Downloading lesson HTML data...
Traceback (most recent call last):
  File "./wlcpod.py", line 443, in <module>
    main()
  File "./wlcpod.py", line 263, in main
    title_and_level, lesson_desc, base_url, xml, vocab_mp3s, dia_mp3s, exp_mp3s = lesson_xml(lesson_name, vocab_trad, vocab_simp, dia_trad, exp_trad, disc_trad)
  File "/home/foo/Chinese/chinesepod/wlcpod/wlcpod_parser.py", line 250, in lesson_xml
    title, desc, base_url = lesson_params(disc_html)
  File "/home/foo/Chinese/chinesepod/wlcpod/wlcpod_parser.py", line 235, in lesson_params
    title = (lambda x: '%s (%s)' % (x.group('name'), x.group('level')))(get_title.search(html))
  File "/home/foo/Chinese/chinesepod/wlcpod/wlcpod_parser.py", line 235, in <lambda>
    title = (lambda x: '%s (%s)' % (x.group('name'), x.group('level')))(get_title.search(html))
AttributeError: 'NoneType' object has no attribute 'group'
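That AttributeError means get_title.search(html) returned None, i.e. the pattern did not match the page at all. A minimal sketch of a more defensive version follows; the function name lesson_title and the error message are hypothetical and not part of WLCP.

```python
import re

get_title = re.compile(r'<h1>\s*<a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')


def lesson_title(html):
    """Return 'name (level)', or raise a descriptive error if nothing matched."""
    match = get_title.search(html)
    if match is None:
        # search() returns None on failure; calling .group() on it is what
        # produces "'NoneType' object has no attribute 'group'".
        raise ValueError('lesson title not found; the page markup may have changed')
    return '%s (%s)' % (match.group('name'), match.group('level'))


# Hypothetical heading markup, for illustration only.
print(lesson_title('<h1><a href="x">Newbie</a> - Hello</h1>'))
```

With a check like this, the next site change would at least report which pattern stopped matching instead of crashing inside the lambda.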
matthiask
February 03, 2010 at 02:07 PM
Tal, just search for get_title. The line should be near the beginning of the file. There is a formatting error here on the site: the changed line should have a backslash in front of each s that is followed by * (so it reads \s*).
get_title = re.compile('<h1>\s*<a href.*?>(?P<level>.*?)</a> -\s*(?P<name>.*?)\s*</h1>')
Tal
February 03, 2010 at 11:52 AM
I'm unable to find the line you quote for the wlcpod_parser.py. I've searched for that line using Wordpad, it says it's not there.
I couldn't find the other one (the wlcpod_netops.py "signout") either, until I remembered I'd edited that file as ajaja recently suggested. I replaced the edited file with the original and found "Signout" (not "signout"). Changed it anyway.
Needless to say it's still not working for me.
davidchchang
September 11, 2010 at 07:30 PM
The URL for the log-in page has moved. Change "https" to "http" (so that the URL becomes http://chinesepod.com/accounts/signin) in wlcpod_netops.py and you will be able to download lessons again.