Cornell Program on Applied Demographics

Census 2020 results

Data and analyses for New York from the data products as they are released over time by the U.S. Census Bureau

  • Introduction
  • Disclosure Avoidance System
  • Geography
  • Apportionment
  • Redistricting (PL94-171)
  • DHC
  • Detailed race information
  • PPMF microdata
  • Data Quality

Introduction

History

The decennial census has been conducted in years ending in "0" since 1790, as required by the U.S. Constitution. Article I, Section 2 states that:

"Representatives and direct Taxes shall be apportioned among the several States which may be included within this Union, according to their respective Numbers . . . The actual Enumeration shall be made within three Years after the first Meeting of the Congress of the United States, and within every subsequent Term of ten Years, in such Manner as they shall by Law direct."

Use of Census Data

Besides determining how many representatives each state gets, Census data is used for many other purposes, for example:

  • Representation in State and Federal legislative chambers
  • Funding based on population counts and characteristics
  • Learn about the counts and characteristics of the residents in a community
  • Planning future resources
  • Business decisions
  • A base for population estimates and projections
  • Source for research in demographic change and the social fabric
  • Denominators for health statistics
  • Etc.

2020 Census data products

After the questionnaires are collected, the Census Bureau goes through a process of verification, unduplication and filling in the gaps. Before releasing the counts, the Census Bureau runs them through the Disclosure Avoidance System to make sure no sensitive personal information is released. The Bureau releases different data products over time, each with a greater level of detail.

  • Apportionment data, release date April 26, 2021. This data contains total counts at the State level and determines the number of representatives in the House of Representatives.
  • Redistricting data (PL94-171), release date August 12, 2021. This data contains total counts, voting age and detail on race/ethnicity down to the Census Block level.
  • Demographic Profiles and Demographic and Housing Characteristics (DHC), release date May 25, 2023. The profile provides an overview of the most important demographic characteristics for most governmental geographic areas (counties, cities, towns, places) and Census tracts. The DHC contains a set of detailed tables with demographic and housing information.
  • Detailed Demographic and Housing Characteristics (DDHC), released in two parts: DDHC-A (totals and age/sex) in September 2023 and DDHC-B (household type and tenure) in August 2024.

Many geographic boundaries are confirmed and updated every ten years. The Geography tab contains links to the current geography boundary files.

Data Quality

During all the steps in the Census operation a lot of data is collected that can be used to quantify how well each process in the operation is going. Some of these metrics are used to measure the efficiency of the processes, and others can inform claims about the quality of the count.

Another way of gauging quality is comparing the Census counts with independent population estimates.

The Data Quality tab contains links to analyses of data quality.

Disclosure Avoidance System (DAS)

Background

The U.S. Census Bureau is required to do an “actual Enumeration” of all the people living in the U.S. every 10 years (U.S. Constitution, Article I, Section 2). The bureau is also required to keep personally identifiable information confidential for 72 years (92 Stat. 915; Public Law 95-416). Title 13, U.S. Code, Section 9, provides the mandate for the bureau to not “use the information furnished under the provisions of this title for any purpose other than the statistical purposes for which it is supplied; or make any publication whereby the data furnished by any particular establishment or individual under this title can be identified; or permit anyone other than the sworn officers and employees of the Department or bureau or agency thereof to examine the individual reports” (13 U.S.C. § 9 (2007)).

The dual requirement for an accurate count and the protection of respondents and their data creates a natural tension: the more accurate (and therefore usable) the reported data is, the easier it may be to identify individual responses. Conversely, the more the raw data is altered before being reported (to protect confidentiality), the less usable the publicly released data is.

Differential Privacy

In past censuses, the Census Bureau also added some uncertainty to the data and obscured some of it as a way to avoid disclosure.

In the second half of the last decade the Census Bureau announced that it would develop new DAS tools for the 2020 Census. These new tools are based on the concept known in scientific and academic circles as “differential privacy.” It is also called “formal privacy” because it provides provable mathematical guarantees, similar to those found in modern cryptography, about the confidentiality protections that can be independently verified without compromising the underlying protections.

“Differential privacy” is based on the cryptographic principle that an attacker should not be able to learn any more about you from the statistics we publish using your data than from statistics that did not use your data. After tabulating the data, we apply carefully constructed algorithms to modify the statistics in a way that protects individuals while continuing to yield accurate results. We assume that everyone’s data are vulnerable and provide the same strong, state-of-the-art protection to every record in our database.

The consequences of the new DAS for data use

Over the last few years, the Census Bureau developed the new DAS, kept stakeholders informed and solicited feedback on several iterations of the system.

In June 2021 the Census Bureau settled on final settings of the system. The acting Director wrote in a blog post about consequences of the DAS on the redistricting data:

With these parameters, some small areas like census blocks may look “fuzzy,” meaning that the data for a particular block may not seem correct. Importantly, our approach yields high quality data as users combine these “fuzzy” blocks to form more significant geographic units like census tracts, cities, voting districts, counties, and American Indian/Alaska Native tribal areas. Our calibration was designed to achieve acceptable quality thresholds for these levels of geography.

So, if you’re looking at block-level data, you may notice situations like the following:

  • Occupancy status doesn’t match population counts. Some blocks may show that the housing units are all occupied, but the population count is zero. Other blocks may show the reverse: the housing units are vacant, but the population count is greater than zero.
  • Children appear to live alone. Some blocks may show a population count for people under age 18 but show no people age 18 and older.
  • Households appear unusually large. For example, you may find blocks with 45 people, but only three housing units.

Though unusual, situations like these in the data help confirm that confidentiality is being protected.

Noise in the block-level data will require a shift in how some data users typically approach using these census data.

Instead of looking for precision in an individual block, we strongly encourage data users to aggregate, or group, blocks together. As blocks are grouped together, the fuzziness disappears. And when you step back with more blocks in view, the details add together and make a sharp picture.

In short: block level data is very fuzzy, but fuzziness should disappear in aggregates. Be careful with interpreting average household size as the system is not optimized to deal with this metric.
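
The effect of aggregation can be illustrated with a small simulation. The sketch below is not the Census Bureau's DAS: it simply adds independent, zero-mean integer noise to made-up block counts and compares the relative error of single blocks with that of groups of blocks. The block sizes, the noise range and the function names are all assumptions chosen for illustration.

```python
import random

def add_noise(count, rng, spread=5):
    """Return the count plus zero-mean integer noise (illustrative, not the real DAS)."""
    return count + rng.randint(-spread, spread)

def mean_relative_error(true_values, noisy_values):
    """Average of |noisy - true| / true over all areas."""
    errors = [abs(n - t) / t for t, n in zip(true_values, noisy_values)]
    return sum(errors) / len(errors)

rng = random.Random(2020)  # fixed seed so the run is repeatable

# Hypothetical true populations for 1,000 blocks (10 to 60 people each).
true_blocks = [rng.randint(10, 60) for _ in range(1000)]
noisy_blocks = [add_noise(b, rng) for b in true_blocks]

# Group the blocks into 10 larger areas of 100 blocks each and sum the counts.
size = 100
true_areas = [sum(true_blocks[i:i + size]) for i in range(0, 1000, size)]
noisy_areas = [sum(noisy_blocks[i:i + size]) for i in range(0, 1000, size)]

block_error = mean_relative_error(true_blocks, noisy_blocks)
area_error = mean_relative_error(true_areas, noisy_areas)

print(f"mean relative error, single blocks:   {block_error:.1%}")
print(f"mean relative error, 100-block areas: {area_error:.1%}")
```

Because positive and negative noise partly cancel when counts are summed while the true total keeps growing, the relative error of the grouped areas comes out much smaller than that of the individual blocks.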

In August 2021 the Census Bureau also released a demonstration product based on the same final settings, but applied to the 2010 Census. This allows for more insight into the amount of noise in different situations.

Impossible and improbable block counts in the 2020 redistricting file compared with 2010

| | 2010 count | 2010 % of all | 2020 count | 2020 % of all |
|---|---|---|---|---|
| Non empty blocks | 250,070 | | 233,182 | |
| Households (occupied houses) and household population | | | | |
| Household population > 0, but occupied houses = 0 | Impossible in 2010 | | 14,276 | 6.1% |
| Household population < occupied houses (persons per household < 1) | | | 5,764 | 2.5% |
| Household population = 0, but occupied houses > 0 | | | 1,834 | 0.8% |
| PPH > 10 | 53 | 0.0% | 4,510 | 1.9% |
| Youth only | | | | |
| Only 0-17 | 21 | 0.0% | 2,808 | 1.2% |
| Without GQ and only 0-17 | 1 | 0.0% | 2,795 | 1.2% |
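
The checks behind the table can be written as a few simple rules. The sketch below is a hypothetical helper, not the code used to produce the table: the function name, argument names and example blocks are assumptions. The source also does not say whether the "persons per household < 1" row includes blocks with zero household population, so the sketch treats those as a separate flag.

```python
def flag_block(household_pop, occupied_units, pop_under_18, pop_18_plus):
    """Return the list of implausible situations that apply to one block."""
    flags = []
    if household_pop > 0 and occupied_units == 0:
        flags.append("population without occupied housing units")
    if household_pop == 0 and occupied_units > 0:
        flags.append("occupied housing units without population")
    if 0 < household_pop < occupied_units:
        flags.append("fewer than 1 person per household")
    if occupied_units > 0 and household_pop / occupied_units > 10:
        flags.append("more than 10 persons per household")
    if pop_under_18 > 0 and pop_18_plus == 0:
        flags.append("only people aged 0-17")
    return flags

# Made-up blocks: (household population, occupied units, under 18, 18 and over)
examples = {
    "block A": (45, 3, 10, 35),
    "block B": (4, 0, 4, 0),
    "block C": (30, 12, 8, 22),
}
for name, values in examples.items():
    print(name, flag_block(*values))
```

Counting how many blocks receive each flag, and dividing by the number of non-empty blocks, gives percentages of the kind shown in the table.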

Geography

External resources:

At the Census Bureau

Elsewhere

Cities, towns and villages

Over the course of a decade, the boundaries of places are subject to change. There are also villages that choose to unincorporate. During the 2010-2020 decade, the village of Kiryas Joel in Orange County split from Monroe Town and created a separate town, Palm Tree Town.

A reference map with the 2020 boundaries of the cities, towns and villages can be found at: pad.human.cornell.edu/maps2020/maps/ReferenceMaps.pdf

Public Use Microdata Areas (PUMA)

Every ten years the Census Bureau delineates PUMAs with input from the states and stakeholders. The Census Bureau defines PUMAs for the tabulation and dissemination of decennial census and American Community Survey (ACS) Public Use Microdata Sample (PUMS) data.

The 2020 PUMA boundaries are presented at: pad.human.cornell.edu/census2020/PUMA2020maps.cfm with a comparison to the 2010 boundaries and PUMA IDs.

Urban areas

Urban areas are defined to identify concentrated areas where people live. They are redrawn every 10 years after the Decennial Census. The delineation process in 2020 differed on several points from the 2010 process.

More information on the 2020 delineation process can be found here. The new boundaries are displayed on TIGERWeb (link in resources) and shapefiles can be downloaded here. There are two files there, one with the 2010 boundaries and one with the 2020 boundaries.

PAD created an atlas of Urban Areas in New York State

Apportionment

Apportionment is the process of dividing the 435 memberships, or seats, in the House of Representatives among the 50 states based on the population figures collected during the decennial census.

Apportionment is based on the number of residents in each state and counts of overseas population that are assigned to each state.

In 2020 the resident population of New York was 20,201,249 (19,378,102 in 2010) and the overseas population that was assigned to New York was 14,502 (42,953 in 2010). The growth of the New York population didn't keep up with the total growth of the US population and the NY share declined. The New York 2020 count fell just short of what was needed to hold on to 27 seats in the House of Representatives.

New York resident population as reported in the 2020 apportionment numbers and previous censuses

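| New York | 2020 | 2010 | 2000 | 1990 | 1980 | 1970 |
|---|---|---|---|---|---|---|
| Resident Population | 20,201,249 | 19,378,102 | 18,976,457 | 17,990,455 | 17,558,072 | 18,236,967 |
| Percent Change | 4.2% | 2.1% | 5.5% | 2.5% | -3.7% | 8.7% |
| Share of US population | 6.1% | 6.3% | 6.7% | 7.2% | 7.8% | 9.0% |
| Number of Seats | 26 | 27 | 29 | 31 | 34 | 39 |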
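
The arithmetic behind these figures is easy to reproduce. The short sketch below uses only numbers quoted in this section: it adds the overseas population to the resident population to get the count used for apportionment, and recomputes the percent change between consecutive censuses. The variable and function names are illustrative.

```python
# Resident population of New York by census year (from the table above).
resident_population = {
    1970: 18_236_967,
    1980: 17_558_072,
    1990: 17_990_455,
    2000: 18_976_457,
    2010: 19_378_102,
    2020: 20_201_249,
}

def percent_change(old, new):
    """Percent change from old to new."""
    return (new - old) / old * 100

# Apportionment uses residents plus the overseas population assigned to the state.
overseas_2020 = 14_502
apportionment_population_2020 = resident_population[2020] + overseas_2020
print(f"2020 apportionment population: {apportionment_population_2020:,}")

years = sorted(resident_population)
changes = {
    later: round(percent_change(resident_population[earlier], resident_population[later]), 1)
    for earlier, later in zip(years, years[1:])
}
for year, change in changes.items():
    print(f"{year}: {change:+.1f}%")
```

The 1970 percent change (8.7%) cannot be recomputed here because the 1960 count is not part of the table.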

Redistricting (PL94-171)

On August 12, 2021, the Census Bureau released the Public Law 94-171 data, better known as the redistricting data. The primary use of this data is drawing new legislative districts that will give everybody equal representation. The August data release also provided the first look at the demographic characteristics of the nation by state, county, city, all the way down to the census block level, including:

  • Race and ethnicity.
  • Population 18 years and over.
  • Occupied and vacant housing units.
  • People living in group quarters like nursing homes, prisons, military barracks and college dorms.

Redistricting data for New York

The data released in August was in a 'legacy' format. We transformed this data and produced a number of Excel workbooks and geodatabases that contain data from the redistricting data and similar data from Census 2000 and Census 2010.

All PL data in Excel format

Excel sheets with all of the raw data:

2020 PL data compared with 2010 and 2000

Excel workbooks for different levels of geography (select number of variables) - Coming soon

* These workbooks contain revisions that were the result of the Count Question Resolution (CQR) program. Only totals are revised, so when comparing totals over time, using the revised numbers is recommended. Especially in 2000 there were quite a few corrections because Group Quarters were originally tabulated in the wrong block. The number of revisions in 2010 was much smaller. Download 2000 CQR revisions. Download 2010 CQR revisions.

Select results

Percent change in county population between 2010 and 2020

Population by Economic region

| Region | Population 4/1/2000 | Population 4/1/2010 | Population 4/1/2020 | Change 2000-2010 (count) | Change 2010-2020 (count) | Change 2000-2010 (%) | Change 2010-2020 (%) |
|---|---|---|---|---|---|---|---|
| New York State | 18,976,821 | 19,378,102 | 20,201,249 | 401,281 | 823,147 | 2.1% | 4.2% |
| Capital District | 1,029,927 | 1,079,207 | 1,106,088 | 49,280 | 26,881 | 4.8% | 2.5% |
| Central NY | 780,716 | 791,939 | 785,114 | 11,223 | -6,825 | 1.4% | -0.9% |
| Finger Lakes | 1,199,588 | 1,217,156 | 1,222,868 | 17,568 | 5,712 | 1.5% | 0.5% |
| Long Island | 2,753,913 | 2,832,882 | 2,921,694 | 78,969 | 88,812 | 2.9% | 3.1% |
| Mid-Hudson | 2,179,189 | 2,290,851 | 2,398,150 | 111,662 | 107,299 | 5.1% | 4.7% |
| Mohawk Valley | 497,935 | 500,155 | 483,358 | 2,220 | -16,797 | 0.4% | -3.4% |
| New York City | 8,008,654 | 8,175,133 | 8,804,190 | 166,479 | 629,057 | 2.1% | 7.7% |
| North Country | 425,859 | 433,193 | 421,694 | 7,334 | -11,499 | 1.7% | -2.7% |
| Southern Tier | 657,297 | 657,909 | 640,036 | 612 | -17,873 | 0.1% | -2.7% |
| Western NY | 1,443,743 | 1,399,677 | 1,418,057 | -44,066 | 18,380 | -3.1% | 1.3% |

Change in population by Economic region for the major race/ethnicity categories

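Change 2010-2020 (count)

| Region | Total | Non Hispanic White | Non Hispanic Black | Non Hispanic Asian | Hispanic |
|---|---|---|---|---|---|
| New York State | 823,147 | -705,340 | -24,835 | 510,135 | 531,110 |
| Capital District | 26,881 | -62,526 | 8,885 | 19,915 | 19,813 |
| Central NY | -6,825 | -53,318 | 5,875 | 6,596 | 10,468 |
| Finger Lakes | 5,712 | -63,538 | 3,366 | 8,541 | 25,511 |
| Long Island | 88,812 | -199,253 | 11,062 | 76,331 | 147,790 |
| Mid-Hudson | 107,299 | -118,424 | 14,022 | 19,951 | 128,248 |
| Mohawk Valley | -16,797 | -45,189 | 1,723 | 4,326 | 8,013 |
| New York City | 629,057 | -3,048 | -84,404 | 345,383 | 154,274 |
| North Country | -11,499 | -28,028 | -465 | 729 | 3,353 |
| Southern Tier | -17,873 | -55,811 | 2,975 | 5,184 | 8,496 |
| Western NY | 18,380 | -76,205 | 12,126 | 23,179 | 25,144 |

Change 2010-2020 (%)

| Region | Total | Non Hispanic White | Non Hispanic Black | Non Hispanic Asian | Hispanic |
|---|---|---|---|---|---|
| New York State | 4.2% | -6.2% | -0.9% | 36.3% | 15.5% |
| Capital District | 2.5% | -6.9% | 12.7% | 69.0% | 46.5% |
| Central NY | -0.9% | -7.9% | 10.8% | 40.1% | 40.8% |
| Finger Lakes | 0.5% | -6.5% | 2.8% | 31.2% | 37.0% |
| Long Island | 3.1% | -10.2% | 4.5% | 50.3% | 33.5% |
| Mid-Hudson | 4.7% | -7.8% | 5.8% | 21.0% | 32.7% |
| Mohawk Valley | -3.4% | -10.2% | 9.9% | 51.5% | 37.1% |
| New York City | 7.7% | -0.1% | -4.5% | 33.6% | 6.6% |
| North Country | -2.7% | -7.2% | -3.2% | 18.2% | 25.3% |
| Southern Tier | -2.7% | -9.5% | 14.1% | 27.5% | 47.0% |
| Western NY | 1.3% | -6.7% | 8.7% | 85.6% | 44.4% |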

Change in population by Economic region for voting age and non voting age

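| Region | Total change (count) | Voting age (18+) change (count) | Non voting age (0-17) change (count) | Total change (%) | Voting age (18+) change (%) | Non voting age (0-17) change (%) |
|---|---|---|---|---|---|---|
| New York State | 823,147 | 1,034,962 | -211,815 | 4.2% | 6.9% | -4.9% |
| Capital District | 26,881 | 42,730 | -15,849 | 2.5% | 5.0% | -6.9% |
| Central NY | -6,825 | 11,758 | -18,583 | -0.9% | 1.9% | -10.4% |
| Finger Lakes | 5,712 | 31,714 | -26,002 | 0.5% | 3.4% | -9.5% |
| Long Island | 88,812 | 149,664 | -60,852 | 3.1% | 6.9% | -9.1% |
| Mid-Hudson | 107,299 | 120,666 | -13,367 | 4.7% | 7.0% | -2.4% |
| Mohawk Valley | -16,797 | -8,251 | -8,546 | -3.4% | -2.1% | -7.9% |
| New York City | 629,057 | 657,026 | -27,969 | 7.7% | 10.3% | -1.6% |
| North Country | -11,499 | -3,351 | -8,148 | -2.7% | -1.0% | -8.6% |
| Southern Tier | -17,873 | -2,596 | -15,277 | -2.7% | -0.5% | -11.1% |
| Western NY | 18,380 | 35,602 | -17,222 | 1.3% | 3.3% | -5.7% |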

Demographic and Housing Characteristics (DHC)

Background

On May 25, 2023, the Census Bureau released the Demographic and Housing Characteristics file (DHC). This product provides detailed demographic and housing characteristics about the nation and local communities. Some of the detailed tables are produced at the block level; other tables with more detail are not produced below the tract or county level. The Census Bureau advises combining blocks to improve accuracy and diminish implausible results.

At the same time the Census Bureau released Demographic Profiles (DP). Demographic Profiles provide an overview of the topics covered in the 2020 Census in one, easy-to-reference table for geographies down to the tract level.

Cornell Program on Applied Demographics Products

Using data from the DHC and DP, Cornell PAD created:

  • Historic age distributions. We built a website that lets you look at age data at the county level and compare it with the age distributions of previous decades.
  • County demographic profiles. We combined the Demographic Profiles for the New York counties in an Excel workbook and added some charts.
  • Demographic profiles for primitive geography. We created Demographic Profiles for the New York primitive geographies in an Excel workbook with some charts. This file contains profiles for counties, sub counties (cities and towns) and incorporated places within towns/remainder of towns.

Initial findings

Our initial findings of this data were:

  • The median age in New York State was 39.0 in 2020, up 1 year from 2010 (38.0).
  • Between 2010 and 2020, the median age in New York State rose 1.3 years for men (36.3 to 37.6) and 1.0 years for women (39.4 to 40.4).
    • Hamilton County remained the oldest county in New York for 2020 with a median age of 57.0 (up from 51.3 in 2010).
    • Still the youngest county in New York State, the median age in Tompkins rose by 1.2 years from 29.8 in 2010 to 31.0 in 2020.
  • Of the 604 U.S. counties with over 100,000 population, Tompkins County NY had the 10th lowest median age in 2020.
  • The old-age dependency ratio in New York State was 20.7 in 2000, 21.1 in 2010, and continued to rise to 26.9 in 2020.
    • Simultaneously, the child dependency ratio continued to decline, going from 39.6 in 2000 to 34.8 in 2010, and reaching 32.4 in 2020.
  • The population under 5 in New York declined by 8.2% between 2010 and 2020, slightly less than the national decline of 8.9%.
    • The population in New York aged 65+ grew by 30.2% from 2010 to 2020, while the 85+ population grew by 13%.
  • In 2020, New York State had the fourth largest populations of both same-sex married (48,442) and same-sex unmarried couples (35,096) in the country.
    • This was the first Decennial Census to show distinct totals for opposite- and same-sex married and unmarried partner households.
  • In 2020, there were about 4.0 million owner-occupied (up from 3.9 million in 2010), 3.8 million renter-occupied (up from 3.4 million in 2010), and 773 thousand vacant housing units (down from 790 thousand in 2010).
  • In 2020, New York was the 8th most diverse state in the U.S. with a diversity index value of 71%.
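
The dependency ratios in the list above relate the size of an age group to the working-age population. The source does not spell out the formula; the sketch below assumes a commonly used definition (people under 18, or 65 and over, per 100 people aged 18 to 64) and uses made-up population counts, so both the age cut-offs and the numbers are assumptions.

```python
def dependency_ratios(under_18, age_18_to_64, age_65_plus):
    """Return (child, old-age) dependency ratios per 100 people aged 18-64."""
    child = under_18 / age_18_to_64 * 100
    old_age = age_65_plus / age_18_to_64 * 100
    return child, old_age

# Hypothetical counts, not taken from the DHC.
child, old_age = dependency_ratios(under_18=4_000, age_18_to_64=12_500, age_65_plus=3_400)
print(f"child dependency ratio:   {child:.1f}")
print(f"old-age dependency ratio: {old_age:.1f}")
```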

Findings from looking at the single year of age distributions:

  • There is severe age heaping (peaks at ages ending in a 0 or a 5), especially in New York City
  • Smaller counties have very jagged patterns, probably caused by Differential Privacy (more jagged than in 2010)
  • Most age distributions are comparable between the DHC and the estimates base, BUT
    • Census counts of young children are lower
    • Census counts of student ages are higher
    • NY State Census counts were slightly lower than in estimates base for ages 45-75
    • The pattern of the 18-22 peak caused by the presence of colleges in a county is suspect in the DHC in some counties. For example, Schoharie County has a sharp peak in the DHC at age 21, but in the previous decade this sharp peak was at age 19, which makes sense with mostly 2-year programs at SUNY Cobleskill. The DHC patterns of the college peaks in Cortland, Tompkins and Otsego counties are also suspect.
    • Age distributions of other Group Quarters need more analyses

Detailed race information

Detailed race by age and sex (Detailed DHC-A and DHC-B)

The 2020 Census asked everyone to provide information about their origin as part of the race and ethnicity questions. These details are considered detailed race information.

In September 2023, the Census Bureau released age and sex tabulations of this detailed race information on data.census.gov and in January 2024 the data was also made available in downloadable summary files. This was followed in August 2024 with information on types of household and tenure by detailed race of the householder.

This data provides great insight into the diversity of the racial/origin backgrounds of the population. It contains population counts and sex-by-age statistics for approximately 1,500 detailed racial and ethnic groups, such as German, Lebanese, Jamaican, Chinese, Native Hawaiian, and Mexican, as well as American Indian and Alaska Native (AIAN) tribes and villages like the Navajo Nation.

There are a few things about this dataset that make it different from other Census products:

  • The number of detailed race categories
    In 2010 the Census Bureau published data on 331 detailed races/tribes and race groups. In this 2020 product there is data on around 1,500 detailed races/tribes and race groups.
  • Limited geography levels
    This data is only available for the nation, the states, counties, tracts, places and Native American Areas.
  • Adaptive design
    The amount of age detail for each geography/race group is dependent on the size of the group in that area. In the comparable product for 2010 very detailed age/sex data was available for all groups with at least 100 people. The table below shows the different thresholds for the number of age categories in this product.
  • Internal inconsistencies
    To limit the disclosure risk the Census Bureau adds some random noise to all the counts for each age/sex cell in this data product. The totals and the numbers of males and females are derived as sums of the detailed counts and are thus consistent, but tracts do not necessarily add up to counties, nor counties to the state. Detailed races also do not necessarily add up to the regional race groups.

Table 1: Group size and details provided

The first four columns give the group size; the last three give the most comprehensive table type produced.

| Detailed groups: nation and state | Detailed groups: substate and AIANNH | Regional groups: nation and state | Regional groups: substate | Age and sex (DDHC-A) | Household type (DDHC-B) | Tenure (DDHC-B) |
|---|---|---|---|---|---|---|
| 0-499 | 22-999 | 0-4,999 | 94-4,999 | Total count only | Total households only | Total households only |
| 500-999 | 1,000-4,999 | 5,000-19,999 | 5,000-19,999 | 4 age categories by sex | 2 categories (Family/Non family) | 3 categories (Own, own free, rent) |
| 1,000-6,999 | 5,000-19,999 | 20,000-149,999 | 20,000-149,999 | 9 age categories by sex | 6 categories | 3 categories |
| 7,000+ | 20,000+ | 150,000+ | 150,000+ | 23 age categories by sex | 8 categories | 3 categories |

Excel file with NY County data on detailed race by sex and age (Detailed DHC-A)

Download NY County data on detailed race

The detailed DHC-A contains a lot of information, but is not easy to use. We created an Excel workbook that contains information on all the detailed and regional groups, but in a slightly different format.

The main differences with the downloadable DDHC-A summary files are:

  • Added similar information from the Demographic and Housing Characteristics (DHC) file on race.
    The Census Bureau used a hierarchical classification where each detailed race is within a regional race group and each regional group is within a race category. For example: the detailed race group "British" is within the regional race group "European" and the regional race group "European" falls within the White race. Adding DHC information puts it all in one place and users can calculate the share of the White population that is British.
  • Aggregated detailed age groups into wider age groups.
    If a group was large enough to publish data for 23 age groups, only those 23 age groups were published. In this Excel sheet these 23 age groups are combined into the 9 age groups and 4 age groups that are used for smaller groups. This means, for example, that all detailed race groups with age detail contain an estimate for the 65+ group, which is one of the 4 age groups. Keep in mind that the aggregation of cells increases the spread of the noise. Adding 4 age groups together, for example, doubles the margin of error (MOE).
  • No age by sex.
    The detailed DHC-A file contains data on age by sex, but not on age alone. The Excel file only contains the sum of the number of males and females for a certain age group.
  • For each detailed race group there is also information on the race and regional group this detailed race belongs to.
    This enables the user to filter for example only the European detailed race groups.
  • Alone and alone or in combination put together.
    The detailed DHC-A contains information on detailed race groups alone, and also on that detailed race group in combination with any other race group. Putting these records together makes it easier to compare them.
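
The remark in the list above that adding 4 age groups doubles the MOE follows from the usual rule for summing independent noise: the squared MOEs add up, so the MOE of a sum of n cells with the same MOE is the square root of n times the cell MOE. The sketch below assumes the noise in the cells is independent and that all cells have the same MOE; the cell MOE value is made up.

```python
import math

def moe_of_sum(moes):
    """MOE of a sum of independent estimates: square root of the sum of squared MOEs."""
    return math.sqrt(sum(m * m for m in moes))

cell_moe = 6.0  # hypothetical MOE of a single age cell
combined = moe_of_sum([cell_moe] * 4)

print(f"MOE of one cell:        {cell_moe}")
print(f"MOE of four cells added: {combined}")
print(f"ratio:                   {combined / cell_moe}")
```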

The Census brief and technical documentation published by the Census Bureau are good places to start if you want to learn more about this data. They also contain information about the suppression of some cells. These suppressed cells have a value of -888,888,888 in the Excel file. Aggregates that contain one of these suppressed cells also got this value.
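
When working with the workbook in code, the suppression value should be treated as missing rather than as a number. The sketch below is one possible way to do that, with hypothetical helper names; it follows the same rule as described above, where an aggregate is suppressed as soon as one of its cells is suppressed.

```python
SUPPRESSED = -888_888_888  # value used for suppressed cells in the Excel file

def to_count(value):
    """Return None for a suppressed cell, otherwise the count unchanged."""
    return None if value == SUPPRESSED else value

def safe_sum(values):
    """Sum counts; the result is None (suppressed) if any input cell is suppressed."""
    counts = [to_count(v) for v in values]
    if any(c is None for c in counts):
        return None
    return sum(counts)

print(safe_sum([10, 20, 5]))           # all cells available
print(safe_sum([10, SUPPRESSED, 5]))   # one suppressed cell
```

Summing the raw values without this check would produce large negative totals that are easy to overlook in charts and pivot tables.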

Excel file with NY County data on household type and tenure by detailed race of the householder (Detailed DHC-B)

Download NY County data from detailed DHC-B

Similarly to the detailed DHC-A, we created an Excel workbook that contains information from the detailed DHC-B on household type and tenure for all the detailed and regional groups, but in a slightly different format. We added the population size from DHC-A to these tables as that is used to determine the provided detail.

The random noise added to avoid disclosure is added to the individual cells and summed to create a total. This is done independently for the tenure tables and the household type tables. This causes two different estimates for the same statistic, but you can see one as an estimate of number of occupied houses and the other as an estimate of the number of households.

We added a third estimate to the table which averages the totals from the tenure and from the household type tables (inverse variance weighted). This estimate of the total has a lower MOE and is thus more accurate, but should only be used if you are only interested in the total number of units.
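
An inverse variance weighted average gives more weight to the estimate with less noise. The sketch below shows the general technique, not necessarily the exact calculation used for the workbook: it assumes the two totals carry independent noise and that their MOEs are expressed at the same confidence level, so that 1/MOE² can serve as the weight. The example totals and MOEs are made up.

```python
def combine_estimates(est_a, moe_a, est_b, moe_b):
    """Inverse-variance weighted mean of two independent estimates, with its MOE."""
    # MOE is proportional to the standard deviation, so 1/MOE^2 works as a weight.
    w_a = 1.0 / moe_a ** 2
    w_b = 1.0 / moe_b ** 2
    estimate = (w_a * est_a + w_b * est_b) / (w_a + w_b)
    moe = (1.0 / (w_a + w_b)) ** 0.5
    return estimate, moe

# Hypothetical totals: one from the tenure table, one from the household type table.
estimate, moe = combine_estimates(est_a=1040, moe_a=30, est_b=1000, moe_b=40)
print(f"combined estimate: {estimate:.1f}")
print(f"combined MOE:      {moe:.1f}")
```

The combined MOE is always smaller than the smaller of the two input MOEs, which is why the third estimate is the most precise of the three.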

Privacy Protected Microdata Files (PPMF)

Background

In August 2024 the Census Bureau released Privacy Protected Microdata Files (PPMF), files with microlevel information about persons in one file and about households and housing units in another. Each record in the person file represents a single person and defines the characteristics of that person (age, sex, race, Hispanic origin, etc.). Each record in the unit file represents a living quarters and the household within it. It contains information on occupancy, tenure, household size, household type, etc.

This PPMF is consistent with the PL94-171 redistricting file and the DHC. This means that the number of records in an area is equal to the published count for that area, and that aggregates of people in a certain age group, race group, etc. are the same as well. So if you count all records in the PPMF that indicate that the person is male with an age between 0 and 4, then you get the same result as published in DHC table P12.

The PPMF however allows users to create tabulations that were not published in the DHC tabulations. Data users might be interested in characteristics of their neighborhood and that neighborhood doesn't line up with Census tracts. Or data users are interested in age groups that are not in table P12, for example 6 to 12 year olds in a school district. The PPMF also allows for cross-tabulations, for example age distributions of married and unmarried partners.
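
A custom tabulation of this kind boils down to filtering person records and counting them. The sketch below counts 6 to 12 year olds in a user-defined area and cross-tabulates them by sex. The records, the field names ("block", "age", "sex") and the block identifiers are all made up for illustration; the real PPMF column names and codes should be taken from the technical documentation.

```python
from collections import Counter

# Hypothetical person records, one dictionary per person.
persons = [
    {"block": "B1", "age": 7, "sex": "F"},
    {"block": "B1", "age": 12, "sex": "M"},
    {"block": "B1", "age": 34, "sex": "F"},
    {"block": "B2", "age": 6, "sex": "M"},
    {"block": "B2", "age": 13, "sex": "F"},
    {"block": "B3", "age": 9, "sex": "M"},
]

# A user-defined area (for example a school district) as a set of blocks.
district_blocks = {"B1", "B2"}

def in_age_range(record, blocks, min_age, max_age):
    """True if the person lives in one of the blocks and is within the age range."""
    return record["block"] in blocks and min_age <= record["age"] <= max_age

selected = [p for p in persons if in_age_range(p, district_blocks, 6, 12)]
total = len(selected)
by_sex = Counter(p["sex"] for p in selected)

print(f"6 to 12 year olds in the district: {total}")
print(f"by sex: {dict(by_sex)}")
```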

Guidance

There are a couple of things to keep in mind while using the PPMF:

  • The persons and units in the files do NOT represent the data as collected!
    The Disclosure Avoidance System (DAS) added noise to almost all the counts and the microdata was generated based on the noise-added block counts. For example, the noise might have changed the total population for a block and the shares of males and females. The PPMF records reflect these noise-added numbers of males and females.
  • Persons do not add up to households.
    Noise was added to all cells in a specific count table without considering the relation between cells. This can cause some inconsistencies, like blocks with household population, but without householders, or more spouses than householders. This also means that there are no links between partners or between children and parents.
  • The person file and the unit file are not consistent with each other.
    The DAS injected noise into the person table and the unit tables separately, and if you mix the two files you will find all kinds of inconsistencies.
  • The Top-Down algorithm used to produce these files is optimized for the published tables, and the accuracy of custom tabulations is not guaranteed.
    The Census Bureau is working on tools for data users to create confidence intervals.
  • The PPMF files are huge.
    With almost 335 million person records and 143 million unit records, the .zip file that combines the two files is around 7GB; unzipped, the files are around 120GB for the person file and over 50GB for the unit file. Specialized software is necessary to filter these files down.
    We created an example extract in MS-Excel for Albany County, NY. Albany County has 303 thousand residents and fits within Excel. Excel's pivot tables are a great tool to get the tabulations you are looking for.
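
Files of this size cannot be opened in a spreadsheet, but they can be filtered by reading one row at a time. The sketch below shows that streaming approach with Python's standard csv module, keeping only the rows for Albany County (state FIPS code 36, county FIPS code 001). It assumes a comma-delimited file with a header row, and the column names "STATE", "COUNTY" and "AGE" are placeholders; the actual layout and column names of the PPMF should be checked in the technical documentation. A small in-memory sample stands in for the real file.

```python
import csv
import io

def filter_rows(source, target, state, county):
    """Copy rows for one county from source to target; return the number of rows kept."""
    reader = csv.DictReader(source)
    writer = csv.DictWriter(target, fieldnames=reader.fieldnames)
    writer.writeheader()
    kept = 0
    for row in reader:  # one row at a time, so memory use stays small
        if row["STATE"] == state and row["COUNTY"] == county:
            writer.writerow(row)
            kept += 1
    return kept

# Tiny made-up sample instead of the 120GB person file.
sample = io.StringIO(
    "STATE,COUNTY,AGE\n"
    "36,001,34\n"
    "36,003,12\n"
    "06,001,50\n"
    "36,001,7\n"
)
extract = io.StringIO()
kept = filter_rows(sample, extract, state="36", county="001")
print(f"rows kept: {kept}")
print(extract.getvalue())
```

With real files, the two StringIO objects would be replaced by files opened with open(path, newline=""), and the codes are compared as text so that leading zeros are preserved.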

Last modified: August 14, 2024