Declaring SRS Bankruptcy

SF_Rachel
August 13, 2012, 10:12 PM posted in General Discussion

I’ve been using SRS via Anki / Ankidroid for a couple of years now; it’s an integral part of my daily study routine. Even on days that I don’t study, I plug away at my flashcards. It really helps with momentum. Over time though, I’ve become concerned about the growing size of my card deck; Anki doesn’t really seem to want to handle large data files (for instance, the online service only supports decks < 10MB; mine is now 3X or 4X that). Every week I see increasing hints that catastrophic data corruption is in my future.

 

Now, I’ve long suspected that my deck is so large because Anki retains far more review data than is really necessary. That is, most of the data in my deck file is the review history of each card, not the card data itself. Anki retains each card’s entire lifetime history. It seems to me that the most recent 3 reviews + most recent single interval ought to provide enough predictive power to calculate a useful next interval. I have no idea how other SRS platforms handle this, but I do suspect Anki’s approach is not the most elegant or data efficient.
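
As a rough illustration of that idea, here is a minimal sketch of a scheduler that computes the next interval from a small per-card state instead of the full review history. It is loosely modeled on the published SM-2 family of algorithms and is not Anki's actual scheduler; the `CardState` fields, the `review` function, and the 0-5 grade scale are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class CardState:
    """Everything this sketch stores per card (hypothetical layout)."""
    interval_days: float = 0.0  # most recent interval
    ease: float = 2.5           # multiplier applied to mature cards
    reps: int = 0               # consecutive successful reviews


def review(state: CardState, grade: int) -> CardState:
    """Return the next state after a review graded 0 (blackout) to 5 (easy)."""
    if grade < 3:
        # Failed card: start the repetition sequence over, keep the ease.
        return CardState(interval_days=1.0, ease=state.ease, reps=0)

    # Nudge the ease up for easy answers and down for hesitant ones.
    penalty = (5 - grade) * (0.08 + (5 - grade) * 0.02)
    ease = max(1.3, state.ease + 0.1 - penalty)

    if state.reps == 0:
        interval = 1.0
    elif state.reps == 1:
        interval = 6.0
    else:
        interval = state.interval_days * ease
    return CardState(interval_days=interval, ease=ease, reps=state.reps + 1)


if __name__ == "__main__":
    card = CardState()
    for grade in (4, 4, 4, 5):
        card = review(card, grade)
        print(round(card.interval_days, 1), round(card.ease, 2))
```

Nothing here depends on old reviews, so a scheduler of this kind only needs a few numbers per card. Whether the full history is worth keeping for statistics is a separate question.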

 

That aside, I really do love Anki, and have finally even gotten it configured so it displays cards exactly the way I want (it displays the hanzi really big, the pinyin is there but in a tiny font that is hard to see. That way my eye doesn’t go right to the pinyin, while I have the safety net of knowing it’s there). I hate to abandon my deck too – I’m a little bit OCD and certainly a completist, and I do love that Anki has some nice data analysis features that let me see things like predicting my card load for the future, counting my Hanzi, etc. Nevertheless, I think I’m on the verge of declaring SRS bankruptcy and starting over on a new platform.

 

bababardwan
August 13, 2012, 11:12 PM

Am I the only one who can see the start of Rachel's comment on the community page, but can't see anything when coming into this post?

SF_Rachel
August 14, 2012, 12:43 AM

Witchcraft again, I suppose. The original post must be "there" since we can see it from the front page, but I can't see it here either. Thanks for pointing it out, baba. Here are the high points.

I’ve been using SRS via Anki / Ankidroid for a couple of years now; it’s an integral part of my daily study routine. Even on days that I don’t study, I plug away at my flashcards. It really helps with momentum. Over time though, I’ve become concerned about the growing size of my card deck; Anki doesn’t really seem to want to handle large data files (for instance, the online service only supports decks < 10MB; mine is now 3X or 4X that). Every week I see increasing hints of catastrophic data corruption in my future.

Now, I’ve long suspected that my deck is so large because Anki retains far more review data than is really necessary. That is, most of the data in my deck file is the review history of each card, not the card data itself. Anki retains each card’s entire lifetime history. It seems to me that the most recent 3 reviews + most recent single interval ought to provide enough predictive power to calculate a useful next interval. I have no idea how other SRS platforms handle this, but I do suspect Anki’s solution is not the most elegant or data efficient.
   
Just the same, I really do love Anki, and have finally even gotten it configured so it displays cards exactly the way I want (it shows me the hanzi really big, the pinyin is there, but in a tiny font that is hard to see. That way my eye doesn’t go right to the pinyin while I have the safety net of knowing it’s there). I hate to abandon my deck too – I’m a little bit OCD and certainly a completist, and I do love that Anki has some nice data analysis features that let me see things like predicting my card load for the future, counting my Hanzi, etc. Nevertheless, I think I’m on the verge of declaring SRS bankruptcy and starting over on a new platform.

1. Any long-time SRS users out there with data management ideas to share?
2. Any thoughts from people who have used other SRS platforms such as Pleco? I’m especially interested in anyone who has tried more than one platform and can compare and contrast!
3. Anyone with any experience switching platforms have ideas on minimizing disruption to my study routine?

pretzellogic

At first, I thought you meant that the company that creates Anki was going bankrupt....

I use pleco. It's ok. I need desperately to do a better job of using Pleco as my daily review tool. But for whatever reason, Pleco is uncompelling for me to use for review. But others have posted here raving about Pleco as the greatest thing since sex was invented. I'd say 3 out of 4 Cpod users that use Pleco rave positively about Pleco. But that's just my opinion/casual observation.

SF_Rachel

pretz, could be you're just not a flashcard person. There's no reason flashcarding needs to really be at the center of anyone's review regimen. For me personally, I'm just someone who's masochistic enough to value a strict daily dose of drudgery. Esp. w/SRS, flashcarding for me is like drinking 8 glasses of water a day: I know I don't really have to do it, but I've established the habit now and doing it every day just seems so gosh-darned virtuous.

bodawei

'drinking 8 glasses of water a day: I know I don't really have to do it, but I've established the habit now and doing it every day just seems so gosh-darned virtuous.'

Living in China might break you of the habit. :)

SF_Rachel

LOL! I'm actually a 可口可乐 girl though (in which there is no virtue, I know). The water thing was just 比喻。

bodawei

'The water thing was just 比喻'

Oh that's a relief, I had visions of you overdoing things.

Actually even bottled water can be a problem in China - the cheap stuff is sold as 矿的 ('mineral water') which it turns out can give you kidney problems, particularly in children. I have just revised our order to the picturesquely named 怡宝的 (plain water).

babyeggplant
August 14, 2012, 01:40 AM

Can you just start another deck? I have one deck for sentences, one for writing, and another for Thai. I have my writing deck suspended because I recently transferred everything over to Skritter. I wasn't sure how to transfer the scheduling info and I'm not even sure that you can, so I just started over. It took me a couple of days to get through everything, but I'm caught up now. As far as trying other platforms, I haven't. I'd try to keep everything in one place, but I'd like to hear what others have to say.

Also, I'm a bit curious about your deck! My sentences deck has about 1,700 cards and is less than 6MB. You must have an enormous number of cards.

SF_Rachel

I'll answer your last question first since it kind of bears on everything else for me.

My deck is 15,000+ cards. Only about 300 of that is phrases connecting nouns with measure words (and those are "recall" only). The rest are words and short fixed phrases for which I have both recall and recognition cards (so a total of about 7500 facts). I've built that deck gradually and organically (about 60-70 new facts per week) over the past 2+ years.
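
Those figures hang together, as a quick back-of-the-envelope check shows. The numbers are the ones quoted above; the function name and the 52-weeks-per-year simplification are assumptions made for the example.

```python
FACTS = 7500                 # facts in the deck, as quoted above
RATES_PER_WEEK = (60, 70)    # new facts added per week, as quoted above


def years_to_build(facts: int, per_week: int) -> float:
    """Years needed to add `facts` facts at a steady weekly rate (52 weeks/year)."""
    return facts / per_week / 52


for rate in RATES_PER_WEEK:
    print(f"{rate} facts/week: {years_to_build(FACTS, rate):.1f} years")
```

Both rates land a little over two years, which matches the "2+ years" the deck has been growing.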

With a deck that old, most of the cards are mature, and my average interval is now over 365 days. However, on any given day more than 80% of my card load is cards with current intervals of less than 60 days. These days I'm reviewing about 280 cards daily (in doses over the course of the day).

I make a lot of mistakes (about 15% failure rate), but that's the theoretical beauty of SRS, isn't it? It keeps track for me and patiently keeps trying to help me learn the ones I get wrong. So I kind of "need" the little ego boost I get at the beginning of each day where the first two dozen cards or so are very mature, very easy, and satisfyingly announce that I'll next see them in 4 years. :-)

Anyway, with a deck like that you can probably see why taking my current facts and starting over with them is impractical. Not only would the initial load nearly kill me, but the subsequently unnaturally high rate of "immature" cards (that I had gotten right!) would still keep my card load up way too high for months.

I'm interested by what you say about writing. I don't think it would work for me given the way I've set myself up over time to treat flashcarding as something I can do while only half-paying attention (watching TV, standing in line) but it's intriguing just the same.

babyeggplant

Wow! That definitely is a lot of cards. You must have a massive vocabulary. I only do recognition, so that cuts down a lot. At my current rate, it would take me about 20 more years to reach 10,000 cards.

SF_Rachel

I heard Jenny say once in a long-ago podcast that when she was learning English her target was 20 new words EVERY DAY. I was inspired to try it, but found that I couldn't sustain that. 10 new words per day (with permission to fail) is my max target.

For my Chinese vocabulary to ever be half as good as Jenny's English (if you listen to her word choice, her English vocabulary is astoundingly large and nuanced for a non-native speaker not living in the language) is still an extremely aspirational long-term goal.

root
August 14, 2012, 04:36 AM

I've used Pleco on a daily basis for about 12 months, and Anki for about a month. The biggest draw for Pleco was the automatic voice reading (with the add-on). The biggest drawback of Pleco is the difficulty of syncing; it's rather cumbersome.

In the end, I had to switch to a pay service, iknow.jp; it's the best of all worlds. They have lots of pre-made content, super nice mobile apps, and excellent website apps. All content has not only word recordings, but also pictures (which help much more than I expected) and example sentences for each word (with full sound). It's a little more than SRS: there is no self-scoring, it's either a multiple-choice quiz or typing the pinyin (for one word, or a whole sentence). Full-sentence pinyin typing makes a big difference, too, to my surprise.

I know, Anki and Pleco can also be configured to use quizzes, and combine several types of quizzes (sound/hanzi, meaning/pinyin, hanzi/pinyin, etc...) into one review session, but it always seemed like the investment to get your own content for sentences, with images, and MP3 would be rather huge. (If anyone has a method to enter CPod vocab/sentence MP3 into Anki, I would really really love to hear!)

As a side note, I usually end up breaking up decks into no more than 200 characters each; I feel bigger does not really help with efficiency. This way the size stays manageable and I still get enough of a random selection for each daily review.
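
Splitting a big list this way is easy to script. A minimal sketch, assuming the cards are already in a plain Python list; `split_deck` is a hypothetical helper, not something Anki or Pleco provides.

```python
def split_deck(cards: list, max_size: int = 200) -> list:
    """Split `cards` into consecutive sub-decks of at most `max_size` items."""
    if max_size < 1:
        raise ValueError("max_size must be at least 1")
    return [cards[i:i + max_size] for i in range(0, len(cards), max_size)]


if __name__ == "__main__":
    deck = [f"card-{n}" for n in range(450)]
    print([len(sub) for sub in split_deck(deck)])
```

The last sub-deck simply holds whatever is left over, so no card is dropped.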

As a PS: if no one has a way to enter CPod MP3s into Anki, I really would love to see a collaboration with iKnow, similar to Skritter, instead of developing home-grown flashcards here, but that's a different discussion...

babyeggplant

I only review sentences in Anki, and remembering is easier when I have audio. The easiest way I've found is to open the dialogue file in Audacity and export the sentence or sentences that I'm adding. I also have the dialogue tab open and copy and paste the sentence right from there. When I don't have the MP3 right at my fingertips (if I'm watching something on YouTube or listening to random sentences found in the CPod glossary), I have even used my computer's own mic to record something with Audacity. Sure, the quality is pretty horrible. I have quite a few audio files that have someone talking, sneezing, or a phone ringing in the background. Poor audio is still better than none for me.
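
For anyone who would rather script the cutting step than do it by hand in Audacity, here is a minimal sketch using only Python's standard `wave` module. It assumes the dialogue has first been converted to an uncompressed WAV file (the standard library cannot read MP3) and that you already know the start and end times of the sentence; `export_clip` and the file names are hypothetical.

```python
import wave


def export_clip(src_path: str, dst_path: str, start_s: float, end_s: float) -> float:
    """Copy the audio between start_s and end_s (in seconds) into a new WAV file.

    Returns the length of the exported clip in seconds.
    """
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        total = src.getnframes()
        # Clamp both ends to the actual length of the recording.
        start = min(max(int(start_s * rate), 0), total)
        end = min(max(int(end_s * rate), start), total)
        src.setpos(start)
        frames = src.readframes(end - start)
        params = src.getparams()

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)    # same channels, sample width, and rate
        dst.writeframes(frames)  # the frame count in the header is fixed on close
    return (end - start) / rate


if __name__ == "__main__":
    import sys
    if len(sys.argv) == 5:
        src, dst, start, end = sys.argv[1:]
        print(export_clip(src, dst, float(start), float(end)))
```

Finding the timestamps is still manual work, so this only saves the export step.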

SF_Rachel

Root -- Interesting, thanks. Personally, I find that directing, creating, and maintaining my own content is more than half of the learning exercise for me. I've never understood the attraction of using decks that other people create. But as I've said other places before, I tend to find drudgery satisfying.

I'll also be the first to say that I know that I SHOULD be using flashcards for whole sentences, not individual words. (Sigh), vocabulary acquisition has sort of become my crack. How could something that's so wrong FEEL SO RIGHT?

I do all my full-sentence work with a pencil and paper, for which I set aside a couple of hours a week. I agree, it makes a ton of difference. Largely, I suspect, because you're constantly reinforcing the high-frequency stuff so that it's impossible to forget it. The moderate frequency and low frequency stuff doesn't stick so well that way, but everything gradually accumulates over time.

root

Yea, I also do love the simplicity of a simple context-free flashcard deck: quick to review, can do 100 or so cards very quickly. I did that for about a year with Pleco. With Pleco, entering is pretty easy: just add directly from the dictionary on the device, no need for the PC at all. The Anki iPhone app is a bit more limited; it's not too easy to add new items, afaik, so that's another plus for Pleco.

Both Anki and Pleco do make it a bit more difficult to attach context to cards, which is where I find iKnow is much stronger. Every word fact can have 6 different quiz modes (pairs of sound, pinyin, hanzi, and translation), and on top of that it will show you one out of several example sentences when you get the answer right (with a picture and an MP3). In addition, it's also useful to type pinyin for these example sentences from hearing the sound, using the same content. The progress feels much slower, but I am just hoping that context is important, or so they keep telling us here at CPod :)

As far as entering data being half of the learning -- also fully agree. You can enter your own facts on iKnow no problem; the biggest effort is finding MP3s for words and sentences, which is why I mostly used pre-made "top 3000" content up until now. In the first 1000, the most frequent characters pretty much overlap with my CPod-learnt material, so it wasn't a problem. In the second 1000 now, it's getting a bit divergent, so I am looking for an automatic way to export daily CPod vocab and expansion MP3s to a folder. Making your own content would solve this issue, but the effort of grabbing 20-40 sentence MP3s every day with Audacity still scares me a little bit, don't know about you.
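
On the Anki side, at least, the bulk-entry half of this can be scripted: Anki can import tab-separated text files, and a field containing `[sound:filename.mp3]` is treated as an audio reference. Below is a minimal sketch that builds such a file. The field order (hanzi, pinyin, English, audio), the `to_anki_tsv` helper, and the sample row are all assumptions made for the example, and the audio files themselves would still have to be placed wherever Anki looks for media.

```python
import csv
import io


def to_anki_tsv(rows) -> str:
    """Build tab-separated import text from (hanzi, pinyin, english, mp3_name) tuples."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    for hanzi, pinyin, english, mp3_name in rows:
        # [sound:...] is the tag Anki uses to attach audio to a field.
        writer.writerow([hanzi, pinyin, english, f"[sound:{mp3_name}]"])
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [("你好", "nǐ hǎo", "hello", "nihao.mp3")]  # made-up sample row
    with open("cpod_import.txt", "w", encoding="utf-8", newline="") as handle:
        handle.write(to_anki_tsv(sample))
```

This does nothing about getting the vocab and MP3s out of CPod in the first place, which is the harder half of the wish.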

My ideal situation is that the learning happens mostly by doing CPod lessons, with the vocab and expansion content magically being added to my SRS deck, with the same high-context review modes as I've gotten used to on iKnow... Yea, I know, that's a big want. The learning from entering would be replaced by the learning from the lesson, and all would be well, no overhead :) It just seems too good of a match to pass up, both CPod and the other place emphasize lexical, context based learning, and full sentences, with the only difference being CPod has a lot of content but barely any way to access it, and iKnow has great apps, but no teaching lessons...