Expand your vocab - no new character!

goulnik
April 03, 2008, 04:48 PM posted in General Discussion

I finally got around to doing something I have wanted to do for a long time: starting from a list of characters I know, prepare a list of all words containing those characters.

I needed a dictionary file for this, so for now this little script uses CEDICT, but that can easily be changed. As a trial, I used the recent Upper Intermediate lesson Saved by the Gong - Maths, with comments. From 200+ characters, it generates over 5600 definitions, i.e. all the words in the CEDICTionary that are combinations of those 200 characters.

It's all JavaScript, using a flat dictionary text file, so performance may not be impressive. Feel free to download / modify.
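For anyone who wants to modify the script, here is a minimal sketch of reading one entry from a flat CEDICT file. It is not goulnik's code. It assumes the usual CEDICT line layout, `Traditional Simplified [pinyin] /definition/definition/`, and the function name `parseCedictLine` is made up for illustration.

```javascript
// Hypothetical helper (not from the original script): parse one CEDICT line.
// Assumed layout: "Traditional Simplified [pin1 yin1] /def 1/def 2/"
// Returns null for comment lines (starting with '#') and lines that do not match.
function parseCedictLine(line) {
  if (line.startsWith('#')) return null;
  const m = /^(\S+)\s+(\S+)\s+\[([^\]]*)\]\s+\/(.*)\/\s*$/.exec(line);
  if (!m) return null;
  return {
    traditional: m[1],
    simplified: m[2],
    pinyin: m[3],
    definitions: m[4].split('/'), // one array item per definition
  };
}
```

Calling it on a dictionary line gives an object with the two headword forms, the pinyin and the list of definitions, which is easier to filter or reformat than the raw text.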

gesang
April 04, 2008, 08:33 AM

hi goulniky, this is a very useful idea. and it sounds like you did so much work!! i already searched the internet for something like this! but unfortunately i am not able to use your script. i typed several common characters to try it, but nothing happened. what could i have done wrong?

marcelbdt
June 03, 2008, 09:33 PM

This seems to be an interesting toy, but when I tried to load the dictionary, Firefox (v 2.0.0.14 under Windows) froze and stopped responding. I waited about 5 minutes before I killed it. Should I have been more patient?

goulnik
April 04, 2008, 09:31 AM

oops, filename bug out of the way, should work now with Firefox (but possibly not with IE).

gesang
April 04, 2008, 10:13 AM

it worked!! thank you!!! it's amazing. i keep a document in "words" where i just type any new character i learn, sorted in passages from a-z by pinyin but with no translation... i use this to paste into the "character training quiz" applet on www.mdbg.net . i came to about 650 characters... i put them in your script after loading your dictionary. got about 7000 matches!!! wow, now i have a word document with 191 pages of new vocabulary to study!! :-) this tool is really great! i sometimes wonder why people judge your chinese progress by how many characters you already know... take the word 生意 for example. how can you guess it means business by translating the single characters... or 马上 ! i will come back to your script later for sure!! thanks again, 格桑

RJ
April 04, 2008, 10:24 AM

This is true vocab, I agree. Going back to how many characters do you need to read a newspaper, I think the real question is how many words do you know.

gesang
April 04, 2008, 10:36 AM

yes, and this is why i love those kinds of chinese words where the meaning of the combined characters is absolutely logical... like 唱片, 一月, 二月..., 工课 and so on! ;-)

goulnik
April 04, 2008, 09:14 AM

gesang, what you need to do is:
1) make sure dictionary data is loaded into the large text box at bottom right. The easiest way is to click on [load], but you could also copy and paste any word list in there.
2) paste text into the box in the middle left, typically the list of characters and/or words you already know. This is basically free text, no specific constraint.
3) click on [scan]. It will run every character in your vocab text (2) against the longer definition list (1). The result will appear in the text window at bottom left.
The [clr] and [sel] buttons are for clearing / selecting content from the text boxes underneath.
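The [scan] step above can be sketched roughly as follows. This is an illustration, not the script's actual code: the function names `knownCharacters` and `scan` are made up, and it assumes that a dictionary line counts as a match when one of its headword forms (the tokens before the `[pinyin]` part) is written entirely with characters from the vocab text, which is how I read "no new character".

```javascript
// Hypothetical helper: collect the distinct Han characters found in free text.
function knownCharacters(text) {
  const known = new Set();
  for (const ch of text) {
    if (/\p{Script=Han}/u.test(ch)) known.add(ch);
  }
  return known;
}

// Hypothetical sketch of [scan]: keep the dictionary lines whose headword
// (any form listed before "[pinyin]") uses only known characters.
function scan(vocabText, dictionaryText) {
  const known = knownCharacters(vocabText);
  return dictionaryText.split(/\r?\n/).filter((line) => {
    if (line.trim() === '' || line.startsWith('#')) return false; // skip blanks and comments
    const headwords = line.split('[')[0].trim().split(/\s+/);
    return headwords.some((word) => {
      const chars = [...word];
      return chars.length > 0 && chars.every((ch) => known.has(ch));
    });
  });
}
```

Building the set of known characters once and testing each headword against it keeps the work proportional to the size of the dictionary, rather than rescanning the whole dictionary text for every character.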

gesang
April 04, 2008, 11:43 AM

i took the first page from my new "goulniky-wordlist" and chose all the words where the combination, to me (sometimes more, sometimes less), is logical or a kind of transcription of the meaning...:

爱国 [ài guó] patriotic/love of country/patriotism
爱国者 [ài guó zhě] patriot
爱好 [ài hào] to like/to be fond of/to be keen on/interest/hobby
爱好者 [ài hào zhě] lover (of art, sports, etc)/amateur/enthusiast/fan
爱护 [ài hù] cherish/treasure/take good care of
爱人 [ài ren] spouse/husband/wife/sweetheart
爱上 [ài shàng] to fall in love with; be in love with
安定 [ān dìng] stable/quiet/settled/stabilize/maintain/stabilized/calm and orderly
安定化 [ān dìng huà] stabilization
安全 [ān quán] safe/secure/safety/security
安全带 [ān quán dài] seat belt
安全网 [ān quán wǎng] safety net
安全问题 [ān quán wèn tí] safety issue/security issue
安心 [ān xīn] feel at ease/be relieved/set one's mind at rest/keep one's mind on

great isn't it? love that!! I absolutely recommend goulniky's script!!!!!!!!!!!

furyougaijin
April 04, 2008, 11:55 AM

@RJBerki & gesang My impression is that there is no linear relationship between the number of words you know, the number of characters you know and the ease with which you can read a newspaper. I do agree that on the lower end (the more frequent words) it is often impossible to guess the word meaning from the individual characters. However, on the high end (the far less frequent words) it becomes much easier and hence your character knowledge comes into play. I'm only speculating based on my personal experience, but a study of this should be relatively straightforward and I wouldn't be surprised if someone somewhere has actually performed it...

RJ
April 04, 2008, 01:20 PM

furyougaijin, your comment makes sense. I understand what you are saying and feel that is probably correct. I have been wanting to ask you something. You made a comment once that learning 6000 characters didn't help your ability to develop speaking or listening skills (I hope I haven't misquoted you). While I see continuous improvement in my ability to read and write, I still struggle with speech. I just can't bring things to the surface fast enough. My refresh rate and reaction time are just too slow. I am trying to change the way I study so I can focus on speech for a while. Surely the vocab you acquired from learning 6000 characters helped you with speech? Your answer may help me decide what changes to make. It's almost like reading and writing are a separate, albeit related, project. Your thoughts? -RJ

furyougaijin
April 07, 2008, 05:28 PM

@RJBerki You haven't misquoted me. I don't think learning the characters has done a lot for my listening or speaking skills. It has done lots for my reading ability - in fact, if there is a written text that I don't understand now it is usually because I'm not thinking hard enough or have forgotten something that I SHOULD have known. However, that is based purely on recognition of graphic forms. Recognising words from their sound only is a completely different ballgame. I am occasionally able to guess a word I hear even if I have never seen it written before. It usually happens when (a) there is a lot of context and (b) when the word components have relatively unique pinyin readings - the ones I can immediately associate with a meaning, i.e., not a 'shi', a 'ji' or a 'yi'... Speech is the same: I can make myself understood by writing down several characters to express a particular idea or concept - but whether or not these particular characters actually constitute a real word in modern Chinese is a random guess (fishes or plants are nearly a 100% hit though). If I try to 'read out' these characters, the response is most often a blank face. So, once again, as I have NOT done any systematic vocabulary study alongside my character study, it has done nothing for my speaking abilities and almost nothing for my comprehension of the spoken language. But hey - it's done loads for other areas so you don't hear me complaining...

davulf
June 03, 2008, 08:17 PM

Whenever I open up CEDICT, I see all kinds of funky characters instead of Simplified. Anyone know why?

 

ä?œä?? ä?œä?? [zuo4 zhu3] /(v) decide/(v) give support or back sb/

 

This also shows the same thing if I'm using QQ or whatnot. I hate it.
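A possible explanation, offered as a guess rather than a diagnosis: the sample above (each character turning into three symbols starting with "ä") is consistent with UTF-8 text being displayed in a single-byte Western encoding such as Latin-1 / Windows-1252. The sketch below reproduces that effect in JavaScript with the standard `TextEncoder` / `TextDecoder` classes; the function names are made up for illustration, and it assumes a runtime whose `TextDecoder` supports the `latin1` label (browsers and default Node.js builds do).

```javascript
// Hypothetical helper: simulate the damage by decoding UTF-8 bytes
// with a single-byte Western encoding.
function misreadAsLatin1(text) {
  const utf8Bytes = new TextEncoder().encode(text); // TextEncoder always emits UTF-8
  return new TextDecoder('latin1').decode(utf8Bytes);
}

// Hypothetical helper: decode the same bytes with the right encoding.
function readAsUtf8(utf8Bytes) {
  return new TextDecoder('utf-8').decode(utf8Bytes);
}
```

For 作主, `misreadAsLatin1` returns six symbols beginning with "ä", while `readAsUtf8` on the same bytes gives back 作主. If that is what is happening here, opening the file in an editor or browser with the encoding set to UTF-8 should show the characters.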

gesang
April 04, 2008, 10:48 AM

...on the other hand... you have to study the vocab beforehand anyway... when i listen to chinesepod audio and a new word is explained by looking at the meaning of the single characters, i think, oh, that's logical... but i think i often would not notice cases like this when i try to read an unknown chinese text...