Village DB: About
Welcome to the Roots Village Database, a digitization of the information from the Index of Clan Names By Villages published by the American Consulate General in Hong Kong in the 1970s. Originally used to investigate immigration fraud, this data is now valuable for genealogy research.
The data here comes from the Index of Clan Names By Villages. There are four books, one each for Toishan, Sunwui, Hoiping, and Chungshan. Posted here is the note from the reprint edition, along with the introductions from the original four volumes:
- Note for the Reprint Edition
- Introduction - Toishan Index
- Introduction - Sunwui Index
- Introduction - Hoiping Index
- Introduction - Chungshan Index
Data entry from all four volumes is complete. Please let us know if you find any errors.
Newly added is village data from Yanping/Enping 恩平, from the 恩平县志 (恩平县地方志编纂委员会编 2004. 北京市: 方志出版社). Thanks to Patrick Chew for digitizing this data.
Thanks to Him Mark Lai for wanting this to happen in the first place; to Beatrice Yu and Tony Tong for the early work starting in 2001; and to Andy Fong et al. for hosting the database at its current site.
Frequently Asked Questions
Q: What does "Map Location" in the Heungs mean?
A: According to the introduction of the Index, the map locations are keyed to the grid coordinates of the U.S. Army Map Service Series covering Kwangtung Province. The grid system is MGRS (Military Grid Reference System), but using older grid labels where the second letter is off by ten letters (not including I and O). Thus, "FQ8262" would be "FE8262" in today's MGRS grid (the full coordinates would be 49QFE8262). As a further complication, the map in the original Hoiping book erroneously uses the polyconic grid (with 10,000 yard grid marks), and swaps the horizontal and vertical components. Thus, for Hoiping, "FQ0589" should have been "FQ7473", which translates to 49QFE7473 in today's MGRS grid system.
For your convenience, we have converted each MGRS location to an approximate area viewable on Google Maps. (Please keep in mind that the precision of the original map locations is ±1000 m, and that the street map data Google shows for China is offset from true GPS coordinates, which adds a further margin of error.)
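The grid-label conversion described above can be sketched in a few lines of Python. This is only an illustration, not the code the site runs: the function names are made up, the wraparound within the 20-letter sequence A-V (skipping I and O) is an assumption that agrees with the "FQ8262" example, and the zone prefix "49Q" is simply taken from that example. The Hoiping polyconic-grid correction is not covered here.

```python
# Letter sequence assumed for the second grid letter: A-V, skipping I and O.
ROW_LETTERS = "ABCDEFGHJKLMNPQRSTUV"


def modernize_grid_label(old_label: str) -> str:
    """Shift the second letter of an Index map location by ten places.

    Hypothetical helper: with 20 letters in the sequence, shifting ten
    places forward or backward lands on the same letter.
    """
    label = old_label.strip().upper()
    if len(label) < 2 or label[1] not in ROW_LETTERS:
        raise ValueError(f"unexpected grid label: {old_label!r}")
    index = ROW_LETTERS.index(label[1])
    new_letter = ROW_LETTERS[(index + 10) % len(ROW_LETTERS)]
    return label[0] + new_letter + label[2:]


def full_mgrs(old_label: str, zone: str = "49Q") -> str:
    """Prefix the modernized label with a grid zone (assumed to be 49Q)."""
    return zone + modernize_grid_label(old_label)


print(modernize_grid_label("FQ8262"))
print(full_mgrs("FQ8262"))
```

For Hoiping entries, the label would first have to be corrected by hand (for example "FQ0589" to "FQ7473") before being passed to `full_mgrs`.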
For those of you who are curious, the database runs on a MySQL backend, with Perl CGI scripts for the interface. The Chinese data was entered by volunteers using STC (Standard Telegraph Code), which maps a 4-digit code to a character. There are apparently two different telegraph encodings, one for Taiwan and one for mainland China. The version used in our data is the mainland variety, which can apparently be found in a book entitled 《電報明碼》. Naturally, this book is nowhere to be found (I haven't had the chance to beam over to Hong Kong and search the large bookstores there [update: I have, and it's still nowhere to be found, though I didn't have time to search the big libraries there]), and the various tables on the internet are rife with mistakes. The telegraph data that this database uses is culled mainly from information put together by the Unicode people. This data, combined with a couple of other sources, gives us a telegraph code table of 7977 characters, which still appears to be missing a few. If anyone knows where I might find a more complete table, or the book, please let me know.
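To give a rough idea of how a telegraph code table like this can be put together and used, here is a minimal Python sketch. It is not the site's actual code (which is Perl). It assumes the Unicode-derived data comes as tab-separated lines in the style of the Unihan database, with a field named `kMainlandTelegraph`. The function names are hypothetical, and the two sample entries are included only for illustration and should be checked against a real table.

```python
def parse_telegraph_lines(lines, field="kMainlandTelegraph"):
    """Build a {4-digit code: character} table from Unihan-style lines.

    Each line is assumed to look like: U+4E2D<TAB>kMainlandTelegraph<TAB>0022
    """
    table = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split("\t")
        if len(parts) != 3 or parts[1] != field:
            continue  # skip other fields (e.g. the Taiwan encoding)
        codepoint, _, value = parts
        char = chr(int(codepoint[2:], 16))  # "U+4E2D" -> "中"
        for code in value.split():
            table.setdefault(code, char)  # keep the first character seen
    return table


def decode_stc(codes, table, missing="?"):
    """Turn a space-separated string of 4-digit codes into characters."""
    return "".join(table.get(code, missing) for code in codes.split())


# Illustrative sample only; a real table would have thousands of entries.
sample = [
    "U+4E2D\tkMainlandTelegraph\t0022",
    "U+6587\tkMainlandTelegraph\t2429",
]
table = parse_telegraph_lines(sample)
print(decode_stc("0022 2429", table))
```

Codes missing from the table come back as `?`, which makes gaps like the ones mentioned above easy to spot during data entry.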
Romanizations for the characters are provided in jyutping (Cantonese) and pinyin (Mandarin). Great effort has gone into making sure that (most of) these are correct. Let us know if you run into problems.
Links
The Him Mark Lai Digital Archive
A Toisanese/Szeyap Bibliography
Version History
1.32 - 2020.07.27
- make higher level administrative levels "sticky" when scrolling down the search results table
- add heung references in village notes
- add quick toggle of pinyin/jyutping with control-P/control-J
- add locations for Yanping townships
- show map of heungs for surname searches
- add data for Yanping
- improve pinyin romanizations
- add subvillages
- add some notes
- add area names (Chungshan only)
- data entry is now complete!
- added option to search heungs/subheungs
- added the introduction from each of the four volumes
- show all villages under each heung, grouped by subheung
- use Unicode
- updated surnames list on search page
- added links to Google Maps for map locations
- display pinyin with tone marks, etc.
- fixed some bugs, made some optimizations
- AndyF got access to this site
- added Google Analytics
- added dynamic show/hide of pinyin, STC, etc.
- fixed an unfortunate bug which caused searching of village names not to work at all
- fixed an obscure encoding problem which prevented search of surname Hui (it's actually an obscure MySQL bug, I think); thanks to Warren Huie for alerting me to this problem
- HTML redesign
- long lists now sorted
- added name search to village
- pinyin display
- improved search - searches surnames by chinese character instead of romanization
- village search now shows enclosing County, Area, and Heung
- selecting surnames popup automatically searches; selection "sticks"
- made data entry more reliable (error checks for empty fields, duplicates, etc.)
- improved editing, added deletion of entries
- fixed up some errors in the data
- added rudimentary search
- made title (of browser window) more descriptive
- numbered listings
- made multiple names (aka's) explicit and easier to read in listings
- added surname to Village listings
- first public release, for the first data entry party
last modified 2020 July 28 by Dominic Yu