Cameron D. Campbell 康文林

Family, Social Mobility, and Inequality in China and in Comparative Perspective


Using ‘comparison of means’ to calculate proportions at IPUMS-USA

Posted on November 15, 2011 by camecamp

(I wrote this for the students in my undergraduate lecture course Introduction to Social Demography, who are working with IPUMS-USA for a final project.  I thought it might be of more general interest to others who are using IPUMS-USA.)

We often want to calculate the proportion of people with some characteristic according to the values of two other variables.  The characteristic of interest might be represented by a single value of a categorical variable, by several values of a categorical variable, or even by a range of values of a continuous variable.  We can do this with the same ‘comparison of means’ tab that we use to compute the mean of income, socioeconomic index, or other continuous variables.  We just have to recode the variable that we are interested in into a dichotomous variable that is 1 if the person has the characteristic we are interested in, and 0 otherwise.  The mean of a variable coded this way is the proportion of people who have the characteristic.
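To see why the mean of a 0/1 variable is a proportion, here is a minimal sketch in Python rather than in the IPUMS interface.  The codes and the choice of which values count as "having the characteristic" are made up for illustration.

```python
from statistics import mean

# Made-up codes of a categorical variable for eight people.
codes = [1, 3, 2, 2, 4, 3, 1, 2]

# Hypothetical choice: codes 2 and 3 mean the person has the characteristic.
target_codes = {2, 3}

# Recode to a dichotomous variable: 1 if the code is a target code, else 0.
indicator = [1 if code in target_codes else 0 for code in codes]

# The mean of the 0/1 variable is (number of 1s) / (number of people),
# which is exactly the proportion with the characteristic.
proportion = mean(indicator)
```

Five of the eight made-up cases have a target code, so the mean of the recoded variable is 5/8.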

For example, we might want to calculate the proportion of people who have ever been married, according to year and age group.  By ‘ever been married’, we mean anyone who is currently married, or was married in the past, but is now widowed, separated, or divorced.  In the MARST variable for marital status, that would be anyone who had values 1-5.  The remaining value, 6, corresponds to people who have never been married.

One option is a cross-tabulation in which the column variable is marital status, the row variable is age, and the control variable is year.  We would then add up the percentages of people in statuses 1-5 in each of the resulting tables.  We could recode 1-5 into one category and have the computer do the addition for us, but we would still end up with one table per year and a lot of output to go through.

Alternatively, we could recode marital status into a dichotomous variable that takes on the value of 0 or 1 according to whether someone has ever been married, and then compute the mean of that new variable for different combinations of year and age group.  In the following example, I have set up a ‘comparison of means’ calculation in which the dependent variable is MARST recoded so that all values corresponding to categories where a person is currently married or was married in the past (MARST 1 through 5) are 1, and the never married are 0.  The mean of this variable will be the proportion of people who are married, or were married in the past but are now widowed, separated, or divorced.

In the following screenshot, pay particular attention to the use of recode in the specification of the dependent variable to turn MARST into a dichotomous variable:

[Screenshot 1: proportion_ever_married_by_age_and_year]
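For readers who want to see the logic of this calculation outside the IPUMS interface, here is a rough sketch in Python.  The person records are invented, the ten-year age groups are my own illustrative choice, and the calculation is unweighted, so treat it as a picture of the idea rather than a reproduction of what the website does.

```python
from collections import defaultdict

# Invented person records: (year, age, MARST code). Not real IPUMS data.
people = [
    (1960, 22, 6), (1960, 27, 1), (1960, 24, 1), (1960, 38, 5),
    (2000, 23, 6), (2000, 21, 6), (2000, 36, 4), (2000, 33, 6),
]

def ever_married(marst):
    """Recode MARST: 1-5 (ever married) -> 1, 6 (never married) -> 0."""
    return 1 if 1 <= marst <= 5 else 0

def age_group(age):
    """Ten-year age group labelled by its lower bound (illustrative choice)."""
    return (age // 10) * 10

# Collect the recoded 0/1 values for each (year, age group) cell.
cells = defaultdict(list)
for year, age, marst in people:
    cells[(year, age_group(age))].append(ever_married(marst))

# The mean of the 0/1 values in a cell is the proportion ever married.
proportions = {cell: sum(values) / len(values) for cell, values in cells.items()}
```

Each key of `proportions` is one cell of the year by age group table, and each value is the mean of the recoded variable in that cell.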

 

Below is an example that sets up a calculation of the proportion enrolled in school.  School enrollment is originally coded so that 1 indicates that someone is not enrolled, and 2 indicates that they are enrolled.  We recode 1 to 0 and 2 to 1, so that the mean ends up being the proportion currently enrolled.  Note that for the school enrollment variable, it only makes sense to consider people who are at the right age to be enrolled in school.

[Screenshot 2: enrollment_example]
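The same recode, together with the restriction to people of school age, might look like this in Python.  The records are invented, the age range of 5 to 18 is only an assumption for the sake of the example, and I have assumed that any enrollment code other than 1 or 2 should be left out of the calculation.

```python
# Invented records: (age, school enrollment code). Not real IPUMS data.
records = [
    (4, 1), (6, 2), (10, 2), (15, 2), (17, 1), (18, 2), (25, 1), (40, 1),
]

# Assumed school ages for this example only: 5 through 18 inclusive.
SCHOOL_AGES = range(5, 19)

def enrolled(school):
    """Recode enrollment: 1 (not enrolled) -> 0, 2 (enrolled) -> 1.

    Any other code returns None so that it can be excluded (an assumption).
    """
    return {1: 0, 2: 1}.get(school)

# Keep only people of school age, then drop cases with no usable code.
recoded = [enrolled(school) for age, school in records if age in SCHOOL_AGES]
recoded = [value for value in recoded if value is not None]

# Mean of the 0/1 variable = proportion of school-age people enrolled.
proportion_enrolled = sum(recoded) / len(recoded)
```

Without the age restriction, the adults in the invented data would pull the proportion down, which is why the filter comes before the mean.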

Of course, you could do this with any number of other variables, including variables that were originally numeric or continuous.  In the example below, I have transformed POVERTY so that it is 0 or 1 according to whether the household in which an individual lives is at or below the poverty line.  POVERTY is originally coded as a three digit number that represents the household’s income as a percentage of the poverty line.  100 means that a household is at the poverty line, 001-099 means that a household is below the poverty line, and 101 up to 500 means that a household is above the poverty line.  There are no values above 500 because POVERTY is top-coded: if a household is earning more than 500% of the poverty line, it is just set to 500.  In the specification of the dependent variable, I have used the recode facility to change all values of poverty that are 101 or higher to 0, and all values of 001 to 100 to 1.  The mean of the variable is therefore the proportion of people living in poverty.  Note that the recode excludes 0 because 0 indicates that the value is not available.

[Screenshot 3: poverty_recode_example]

The value in each cell represents the proportion of individuals of the specified race in each year who are in poverty.
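As a sketch of what the POVERTY recode is doing, here is the same logic in Python with invented records.  The race labels "A" and "B" are placeholders rather than IPUMS codes, and the calculation is unweighted.

```python
from collections import defaultdict

# Invented records: (year, race label, POVERTY value). Not real IPUMS data.
records = [
    (1990, "A", 0), (1990, "A", 80), (1990, "A", 250),
    (1990, "B", 100), (1990, "B", 500), (1990, "B", 101),
    (2000, "A", 30), (2000, "A", 99), (2000, "A", 400), (2000, "A", 0),
]

def in_poverty(poverty):
    """Recode POVERTY: 1-100 -> 1 (at or below the line), 101+ -> 0.

    0 means the value is not available, so it returns None and is excluded.
    """
    if poverty == 0:
        return None
    return 1 if poverty <= 100 else 0

# Collect the recoded 0/1 values for each (year, race) cell, skipping N/A.
cells = defaultdict(list)
for year, race, poverty in records:
    value = in_poverty(poverty)
    if value is not None:
        cells[(year, race)].append(value)

# Mean of the 0/1 values in each cell = proportion in poverty.
poverty_rate = {cell: sum(values) / len(values) for cell, values in cells.items()}
```

Leaving the 0 values out matters: if they were recoded to 0 instead, people with no available value would be counted as not in poverty and the proportions would be too low.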
