A Script To Easily Search Traditional Chinese Words With CC-CEDICT In Linux

Last updated: February 15, 2017 at 21:55
The CC-CEDICT Chinese-English Dictionary is incredibly useful, but at first it is a little difficult to search effectively and efficiently. Since it is a text file, you could open it in a text editor and simply search with CTRL+F, but that is quite an inefficient way to search. As of this writing, December 2016, there are 114886 entries, so a CTRL+F search may give you many hits that merely contain your word somewhere in the line. You need simple regular expressions to get what you really want. This article gives you a free script you can use to search the CC-CEDICT easily, efficiently, and effectively.

(Note: This script will only work if you are searching Traditional Chinese Characters. If you want to search Simplified Chinese Characters, read my article here about how to search Simplified Chinese Characters with the CC-CEDICT dictionary.)

1. Bash Shell Script For Searching Traditional Chinese Words With CC-CEDICT

Below is the raw code for the script. If you would prefer to download the whole script as a file, you can download it here.

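#This script reads in Chinese input and assigns it to a variable
#called chinese_words,
#then grep searches the dictionary and outputs the definition.
#This script only works with Traditional Chinese Characters / Words.
echo "Enter the Traditional Chinese Character or Word that you want to search:"
#-r keeps any backslashes in the input exactly as typed
read -r chinese_words
#^ anchors the match to the start of the line, and the trailing space ends the
#Traditional Chinese field, so only entries for exactly that word match
grep -- "^$chinese_words " current_cc-cedict.txt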
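
To see why the pattern in the script starts with ^ and ends with a space, here is a small sketch that compares an unanchored search with the anchored one. It builds a three-line sample file in the CC-CEDICT layout (Traditional, Simplified, [pinyin], /definitions/). Only the first sample line is taken from the output shown in this article; the other two are made up for illustration and may not match real dictionary entries.

```shell
# Build a tiny sample file in the CC-CEDICT line layout:
#   Traditional Simplified [pinyin] /definition/.../
# Only the first line comes from this article; the other two are made-up samples.
sample=$(mktemp)
cat > "$sample" <<'EOF'
如果 如果 [ru2 guo3] /if/in case/in the event that/
如果說 如果说 [ru2 guo3 shuo1] /sample entry that only starts with the same characters/
假如 假如 [jia3 ru2] /sample entry whose definition mentions 如果/
EOF

word="如果"

# Unanchored: counts every line that contains the word anywhere
unanchored=$(grep -c -- "$word" "$sample")

# Anchored, as in the script: the line must start with the word,
# followed by the space that separates it from the Simplified form
anchored=$(grep -c -- "^$word " "$sample")

echo "unanchored hits: $unanchored"
echo "anchored hits:   $anchored"
rm -f "$sample"
```

On this sample the unanchored search counts all three lines, while the anchored search counts only the entry for the word itself. That is the same difference you see between CTRL+F and the script.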

Next you need to actually download the CC-CEDICT file. You can download a version of the dictionary here. Download the text file version.

Currently, the CC-CEDICT file I have is called cc-cedict_2017_01_08.txt.
Rename the file to current_cc-cedict.txt, because that is the name the script searches.
On the command line, you can rename it with:
mv cc-cedict_2017_01_08.txt current_cc-cedict.txt
Or you could just rename it manually using your GUI.

Download both of those files (the script, named start_chinese.sh, and the dictionary) into the same folder, cd into that folder, and then run the script with:
bash start_chinese.sh

For this article, I am going to assume you downloaded the script and dictionary file into a folder called cc-cedict inside the ~/Downloads folder. Specifically they will be downloaded in ~/Downloads/cc-cedict/

How To Use The Script

Here is an example of how to use the script. Open a terminal and type:

cd ~/Downloads/cc-cedict/
bash start_chinese.sh
Enter the Traditional Chinese Character or Word that you want to search:
如果
#it will output
如果 如果 [ru2 guo3] /if/in case/in the event that/

That is it ☺
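
One thing to keep in mind: the script passes whatever you type to grep as a regular expression, so characters such as . or * are treated as wildcards instead of literal text. If you ever need a strictly literal match, one possible alternative is to compare the first field of each line with awk. This is only a sketch, and lookup_exact is a name I made up for it; it is not part of the script above.

```shell
# lookup_exact WORD FILE: print the entries whose first field
# (the Traditional form) is exactly WORD, with no regular-expression
# interpretation of WORD.
# lookup_exact is a made-up helper name for this sketch.
lookup_exact() {
    awk -v w="$1" '$1 == w' "$2"
}

# Usage, assuming the dictionary file name used in this article:
#   lookup_exact 如果 current_cc-cedict.txt
```

Because awk splits each line on spaces, the first field is the Traditional form, which is the same thing the ^ and trailing space pick out in the grep pattern.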

If you are interested in learning about bash scripting, you can read about it here.

Here is a screenshot of another example using the script.

2. Create A Bash Alias For Conveniently Searching The CC-CEDICT Many Times

If you use the script for searching Traditional Chinese Characters often, as I do, you can create an alias in bash, to make the command super easy to use over and over again without having to cd into the folder where the script is and run bash start_chinese.sh every time.

For this alias, we will assume the script start_chinese.sh and the CC-CEDICT file called current_cc-cedict.txt are in ~/Downloads/cc-cedict/

There are 2 ways to set up the .bash_aliases file.
You can download my alias file, or you can manually set up your .bash_aliases file yourself using a text editor.

Way 1: Set Up The Alias By Downloading Alias File

Download my alias file here into your home folder.

Then rename the file bash_aliases.txt as .bash_aliases
by typing:
mv bash_aliases.txt .bash_aliases
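
Be careful if you already have a .bash_aliases file, because the mv command above will replace it. As a hedged sketch of a safer route, the helper below appends the downloaded file to an existing one and only moves it when there is nothing to overwrite. The name install_aliases is made up for this example.

```shell
# install_aliases SRC DEST: add the aliases in SRC to DEST.
# If DEST already exists, append SRC to it and then delete SRC;
# otherwise simply move SRC into place.
# install_aliases is a made-up helper name for this sketch.
install_aliases() {
    src=$1
    dest=$2
    if [ -e "$dest" ]; then
        # Keep the existing aliases and add the downloaded ones after them
        cat "$src" >> "$dest" && rm -- "$src"
    else
        mv -- "$src" "$dest"
    fi
}

# Usage, assuming the file was downloaded into your home folder:
#   install_aliases ~/bash_aliases.txt ~/.bash_aliases
```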

Way 2: Set Up The Alias Manually With A Text Editor

Now for setting up the bash alias. Do the following commands:
cd ~
#Next use your favorite text editor, and create a file called .bash_aliases
#my favorite is vim, so in this example, I use vim as the text editor.
vim .bash_aliases
#now inside the file copy the following line
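alias zhongguostart='(cd ~/Downloads/cc-cedict/ && bash start_chinese.sh)'
(The parentheses run the two commands in a subshell, so your terminal stays in its current folder even when the search finds nothing. If your .bash_aliases file already has lines in it, just add the above line below them.)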
And finally save the file.
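
The alias above changes into the dictionary folder and then runs the script, which prompts you for the word. If you would rather pass the word directly on the command line, a shell function is one possible alternative. This is only a sketch: the function name zhsearch and the variable CEDICT_FILE are made up here, and the default path assumes the folder used in this article.

```shell
# Hypothetical variable: where the dictionary lives.
# Defaults to the folder assumed in this article unless already set.
CEDICT_FILE="${CEDICT_FILE:-$HOME/Downloads/cc-cedict/current_cc-cedict.txt}"

# zhsearch WORD: print the entries whose Traditional form is exactly WORD.
# zhsearch is a made-up name for this sketch.
zhsearch() {
    if [ -z "$1" ]; then
        echo "usage: zhsearch TRADITIONAL_WORD" >&2
        return 2
    fi
    # Same pattern as start_chinese.sh, but with the full path to the
    # dictionary, so there is no need to cd anywhere
    grep -- "^$1 " "$CEDICT_FILE"
}
```

If you put these lines in your .bash_aliases file, you could then type something like zhsearch 電腦 from any folder.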

Testing the Alias

Now close all your terminals and open a new one. If you type zhongguostart in any terminal, it will prompt you to input the Traditional Chinese Characters, and once you do, it will output the definition.
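
If the alias is not found in a new terminal, the likely reason is that your ~/.bashrc does not load ~/.bash_aliases. The default ~/.bashrc on Debian and Ubuntu does load it, but I am assuming that here and other distributions may differ. The sketch below checks for it; bashrc_loads_aliases is a made-up helper name, and it only looks for a mention of the file, so treat the result as a hint.

```shell
# Report whether a bashrc-style file mentions .bash_aliases at all.
# With no argument it checks ~/.bashrc.
# bashrc_loads_aliases is a made-up helper name for this sketch.
bashrc_loads_aliases() {
    grep -q 'bash_aliases' "${1:-$HOME/.bashrc}" 2>/dev/null
}

if bashrc_loads_aliases; then
    echo "~/.bashrc mentions .bash_aliases, so new terminals should pick up the alias"
else
    echo "~/.bashrc does not mention .bash_aliases; you could add this line to ~/.bashrc:"
    echo '[ -f ~/.bash_aliases ] && . ~/.bash_aliases'
fi

# To load the alias into a terminal that is already open, run:
#   . ~/.bash_aliases
```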

Example:
zhongguostart
Enter the Traditional Chinese Character or Word that you want to search:
電腦
電腦 电脑 [dian4 nao3] /computer/CL:臺|台[tai2]/

Here is a screenshot with that example

I called this alias zhongguostart, but you can call it anything you want. Just change the name in the .bash_aliases file.

What did you think of this article? Do you like the script and alias? Let’s discuss it in the comments below.
