How many Chinese words do you know?

RJ
September 02, 2012, 07:40 PM posted in General Discussion

How many Chinese words do you know? Find out.

http://www.zhtoolkit.com/apps/wordtest//index.php

SF_Rachel
September 02, 2012, 11:02 PM

Quantification tools are always fun. This one is more fun than most (for nerds like me, anyway), since they tell you something about the methodology and offer a chance to download words used in your test. 

I'd be interested to know if anyone thinks that their estimate is accurate. My score came back as slightly more than double what I really think my word count is. I'm a pretty hardcore self-quantifier, so I'd be very surprised if I were underestimating myself this much. It could be that I scored several items as "known" where I correctly guessed the meaning based on my knowledge of the characters alone, but still, this seems like a big difference.
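To get a feel for how much a few lucky guesses could matter, here is a rough sketch of how a sampling test might scale up its result. This is only an assumption about the general idea: the site's actual formula isn't given in this thread, and the band sizes, sample sizes, and the `estimate_vocab` name are all made up for illustration.

```python
# Hypothetical sketch: estimate vocabulary size from a sample drawn per
# frequency band. Not the site's actual method; all numbers are invented.

def estimate_vocab(bands):
    """bands: list of (band_size, words_sampled, words_marked_known)."""
    total = 0.0
    for band_size, sampled, known in bands:
        if sampled == 0:
            continue  # no sample from this band, so it contributes nothing
        # Assume the known fraction of the sample holds for the whole band.
        total += band_size * known / sampled
    return round(total)

# Three invented frequency bands: common, mid-frequency, rare.
honest = [(1000, 20, 20), (4000, 20, 15), (15000, 20, 3)]
# Same test, but two rare words were marked "known" from character guesses.
with_guesses = [(1000, 20, 20), (4000, 20, 15), (15000, 20, 5)]

print(estimate_vocab(honest))        # 6250
print(estimate_vocab(with_guesses))  # 7750
```

With these made-up numbers, each sampled word in the rare band stands in for 750 words, so two guessed items add 1,500 to the estimate. A handful of guesses in the big bands could plausibly account for a large gap.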

One thing I found a little troubling was that one of my sample words was the transliteration of a Western given name (比尔 for "Bill") around the center of the rankband. I'm not sure what the justification for counting Western transliterations at all would be. But if they are to be counted, my instinct says they'd belong in a pretty advanced rankband, not the middle.

RJ

Here's one for English. Pretty interesting to compare.

http://testyourvocab.com/

bodawei

33,600 English words, 8,685 Chinese words. I reckon that I am possibly about level with an ambitious 8 year old Chinese kid or a very lazy 11 year old, but even that is a troublesome measure because I have a different vocab set to most kids.

I tend to agree that the test over-estimates your count of Chinese words but I am not able to say why. I am reminded of that game - you get one eight or nine letter English word and you have to make 35 or 40 words out of the letters, only using each letter at most once, no plurals. With any dozen Chinese characters you should be able to make hundreds of words, so that would suggest that your word count should be way above your 'character' count.

None of this seems very scientific, but good fun - thanks for posting RJ.

guolan

Hmmm. When I chose the option to test using words, I also scored way above my expectations, and decided the test does overestimate one's vocabulary level. But, when I chose the option to test using characters, I scored way below what I expected, and decided I must focus more on recognizing characters out of context. I found it surprising that the test says I know over 10 times as many words as I know characters.

SF_Rachel
September 02, 2012, 11:35 PM

Here's a site that helps you quantify the number of characters you know.

http://chineselevel.com/

The methodology is different since all the content is provided in the context of reading a couple of short essays, and you just mark out the words you don't know. That not only seems reasonable -- this is how the real world works, after all -- but is also more likely to keep you honest as you test yourself (in context, I found myself very unlikely to say "I almost know that word"). At least for me.

bodawei

Rachel - how does this one work? (Silly question?) There was one short very easy reading on the home page - say I could read it all. I don't see how it quantifies the characters I know. Missing something. There are only three tabs - the others are 'research findings' and T-shirts. What did you think of the research findings by the way - I was surprised he said that he didn't know how to tell what a Chinese word is. He lacks the certainty of a native Chinese speaker? :)

SF_Rachel

Don't worry, I had a similar problem when I first encountered this. If you say "too easy, know them all" it doesn't LOOK like it, but the page refreshes super fast and a new text appears for you to read. There are a total of three texts to get through. Assuming you know all the words and mark none as "unknown" then the algorithm extrapolates that you would have the vocab of a native speaker. But if you mark words out as unknown, then the algorithm counts and evaluates, and reduces the estimate.

Regarding "what is a Chinese word" I actually think it's a somewhat valid question. The question comes down to how wide a grey band there is between compound words and short phrases (hint: in Chinese it's a very wide grey area). Think of German, which tends to compound a lot: "Geschwindigkeitsbegrenzung" is a single word, but why isn't it Geschwindigkeits (speed's) Begrenzung (limit), two words? Because that's just how German is.

English has, I think, a pretty wide grey area because (at least in the American varieties) we tend to argue about spelling compound words. I used to be responsible for maintaining the style manual for my marketing department, and we'd get into real knock-down drag-outs over words like setup. Set up? Set-up? I think we ended up compromising that "set up" is a verb, "setup" is a noun, and "set-up" is an adjective.

For instance, in the test referenced here, he chose to count 现代社会 (modern society) as one word, not two words 现代 (modern times) 社会 (society). When you're taking the test, you can't say you know 社会 but don't recognize 现代. So his choices about how to parse the text mattered.
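Here's a toy sketch of why the parsing choice matters. It uses greedy longest-match segmentation with two tiny word lists. That technique and the `segment` and `known_fraction` names are my own assumptions for illustration; I don't know how the site actually splits its texts.

```python
# Toy illustration only: greedy forward longest-match segmentation with two
# hypothetical word lists. Not the test's actual parser.

def segment(text, lexicon, max_len=4):
    """Split text into the longest lexicon words, scanning left to right."""
    words = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when nothing longer matches.
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words

def known_fraction(words, known):
    """Share of segmented words that the reader marks as known."""
    return sum(w in known for w in words) / len(words)

fine = {"现代", "社会"}          # treats 现代社会 as two words
coarse = fine | {"现代社会"}     # treats 现代社会 as one word
reader_knows = {"社会"}          # knows 社会 but not 现代

print(segment("现代社会", fine))    # ['现代', '社会']
print(segment("现代社会", coarse))  # ['现代社会']
print(known_fraction(segment("现代社会", fine), reader_knows))    # 0.5
print(known_fraction(segment("现代社会", coarse), reader_knows))  # 0.0
```

Same reader, same four characters, but the score goes from half credit to none depending on where the word boundaries are drawn.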

John did a post on it on Sinosplice donkey's years ago.

bodawei

Rachel - thanks for the lowdown on the test. Maybe I need to refresh it, I stayed on the page quite a while; is it possible I didn't notice a change in text!? Yes ... :)

On word counts .. I suppose so .. but I expected him to be something of an expert, there are rules. I wouldn't think that a knowledge of either German or English helps much, this is Chinese.

I did a reading course and you were set reading tests and you either got the word count right or wrong. This is China. There is something of a probability question when you get ambiguous sentences but I wouldn't think this happens with great frequency, and there is always a most likely option. Fun when it does happen, as in John's example on Sinosplice (thanks for that) - but his example kind of proves my point, there is a single most likely interpretation.

现代社会 is I think two words, but it is quite a while since I looked at this. I take your point that using his rules (one word) you had to understand the whole expression to 'pass' - something that knocked my score around.

guolan

Taking this test was more fun than the other tests! It says I know far fewer words than the test listed above, but I think it's probably more accurate. At any rate, it was enjoyable to see what meaning I could glean from the passages it gave me.

root
September 03, 2012, 12:39 PM

Cute. Both Chinese tests suggest I know waaay more words than I thought myself :( Wonder if results would be different but for the self-scoring. Oh well, but English vocab is 5x bigger than Chinese, so, yay!?

Purrfecdizzo
September 03, 2012, 01:29 PM

Interesting and enjoyable evaluation, thanks for the link.

George