Cameron D. Campbell 康文林

Family, Social Mobility, and Inequality in China and in Comparative Perspective

Using R to Analyze the CGED-Q JSL Public Release

Chen Jun, an MA student at Central China Normal University (CCNU), has created Chinese-language materials, including a training guide and PowerPoint decks, that explain how to use R to analyze the CGED-Q JSL Public Releases.

Chen Jun created these slides while serving as my TA for a graduate class at CCNU in fall 2022. Students in the class learned about ongoing ‘big data’ studies in Chinese history and then completed final projects in which they used R to analyze the CGED-Q JSL on a topic of their choice. The slides explain how to import the downloaded CGED-Q JSL files into R, create and transform variables, parse strings, and tabulate and graph results.
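As a rough illustration of the kind of workflow the slides cover, here is a minimal base-R sketch. It is not taken from Chen Jun's materials. The toy records, the column names (`year`, `name`, `post`), and the derived variables (`title`, `is_county`) are all hypothetical, and the real JSL files contain Chinese strings and use their own variable names.

```r
# Hypothetical toy records: the column names and values are invented for
# illustration and are NOT the actual CGED-Q JSL variables.
csv_text <- c(
  "year,name,post",
  "1900,Zhang San,Zhixian of Example County",
  "1900,Li Si,Zhifu of Example Prefecture",
  "1905,Wang Wu,Zhixian of Sample County",
  "1905,Zhao Liu,Zhixian of Example County"
)

# Import: write the toy data to a temporary file, then read it back the way
# a downloaded CSV would be read. Declaring the encoding matters once the
# file contains Chinese text.
path <- tempfile(fileext = ".csv")
writeLines(csv_text, path)
jsl <- read.csv(path, stringsAsFactors = FALSE, fileEncoding = "UTF-8")

# Parse a string to create a variable: keep the first word of `post`.
jsl$title <- sub(" .*$", "", jsl$post)

# Transform: flag county-level posts with a fixed-string match.
jsl$is_county <- grepl("County", jsl$post, fixed = TRUE)

# Tabulate: counts of each title by year.
title_by_year <- table(jsl$year, jsl$title)
print(title_by_year)

# Graph: bar chart of records per year, drawn to a temporary PDF so the
# script also runs in a non-interactive session.
pdf(tempfile(fileext = ".pdf"))
barplot(table(jsl$year), xlab = "Year", ylab = "Records")
invisible(dev.off())
```

With real data, the same steps would start from the path of a downloaded JSL file rather than a temporary file, and the string parsing would operate on Chinese characters, for example with `substr()` or regular expressions.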

In summer 2023, he distilled the material in the slides into a training guide with examples.

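Here is his summary, translated from the Chinese:

These materials are a basic R tutorial, consisting of teaching PDFs and R code files. The tutorial covers an introduction to the RStudio interface, creating variables, transforming data, producing tables and graphs, and linking datasets. It draws on the Jinshenlu (缙绅录) database (JSL database, 1900-1912 and 1850-1864), part of the China Government Employee Database-Qing (CGED-Q) compiled by the Lee-Campbell group. The JSL database is a very large database made up of string-type data. At present there is no systematic R tutorial in China aimed at large string-type databases, so this tutorial is a first attempt, offered as a reference for R users and beginners.

The author of the tutorial is Chen Jun (陈俊), an MA student at Central China Normal University. The co-author is Cameron Campbell (康文林), professor at the Hong Kong University of Science and Technology and Central China Normal University. Wei Shengbin (韦圣彬) provided a great deal of help in preparing the tutorial, for which we are grateful. The tutorial is based mainly on the author's experience processing the database, with reference to some online resources (sources are indicated). If any user finds material that is not properly credited, please contact the author to have it removed. The tutorial is a preliminary attempt and has shortcomings in both technique and organization, so criticism and corrections from readers and users are very welcome.

When opening the R code files in RStudio, some users may see garbled characters. To fix this, click File, then Reopen with Encoding, then UTF-8.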

Training Guide

R语言在历史数据库分析中的运用——以《缙绅录》数据库为中心的基础教程 (Using R to Analyze Historical Databases: A Basic Tutorial Centered on the Jinshenlu Database) (PDF)

Slides

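1-Rstudio的界面;导入、查看、编辑数据 (The RStudio interface; importing, viewing, and editing data)

2-字符串处理的基本函数 (Basic functions for string processing)

3-制表 (Tabulation)

4-直方图、散点图、折线图 (Histograms, scatter plots, and line charts)

5-dplyr包在字符串处理中的简单运用 (Simple uses of the dplyr package in string processing)

6-数据集的内外链接 (Inner and outer joins of datasets)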

R Files

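2-字符串处理的基本函数 (Basic functions for string processing) R code file
3-制表 (Tabulation) R code file
4-直方图、散点图、折线图 (Histograms, scatter plots, and line charts) R code file
5-dplyr包在字符串处理中的简单运用 (Simple uses of the dplyr package in string processing) R code file
6-数据集的内外链接 (Inner and outer joins of datasets) R code file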