
OECD Statistics Working Papers

The OECD Statistics Working Paper Series, managed by the OECD Statistics and Data Directorate, is designed to make selected studies prepared by Secretariat staff or by outside consultants working on OECD projects available to a wider readership in a timely fashion. The papers are of a technical, methodological or statistical policy nature and relate to statistical work relevant to the organisation. The Working Papers are generally available only in their original language (English or French), with a summary in the other.

Joint Working Papers:

Testing the evidence, how good are public sector responsiveness measures and how to improve them? (with OECD Public Governance Directorate)

Measuring Well-being and Progress in Countries at Different Stages of Development: Towards a More Universal Conceptual Framework (with OECD Development Centre)

Measuring and Assessing Job Quality: The OECD Job Quality Framework (with OECD Directorate for Employment, Labour and Social Affairs)

Forecasting GDP during and after the Great Recession: A contest between small-scale bridge and large-scale dynamic factor models (with OECD Economics Directorate)

Decoupling of wages from productivity: Macro-level facts (with OECD Economics Directorate)

Which policies increase value for money in health care? (with OECD Directorate for Employment, Labour and Social Affairs)

Compiling mineral and energy resource accounts according to the System of Environmental-Economic Accounting (SEEA) 2012 (with OECD Environment Directorate)

English

What is the role of data in jobs in the United Kingdom, Canada, and the United States?

A natural language processing approach

This paper estimates the data intensity of occupations/sectors (i.e. the share of job postings per occupation/sector related to the production of data) using natural language processing (NLP) on job advertisements in the United Kingdom, Canada and the United States. Online job advertisement data collected by Lightcast provide timely and disaggregated insights into labour demand and skill requirements of different professions. The paper makes three major contributions. First, indicators created from the Lightcast data add to the understanding of digital skills in the labour market. Second, the results may advance the measurement of data assets in national account statistics. Third, the NLP methodology can handle up to 66 languages and can be adapted to measure concepts beyond digital skills. Results provide a ranking of data intensity across occupations, with data analytics activities contributing most to aggregate data intensity shares in all three countries. At the sectoral level, the emerging picture is more heterogeneous across countries. Differences in labour demand primarily explain those variations, with professions of low data intensity contributing most to aggregate data intensity in the United Kingdom. Estimates of investment in data, using a sum-of-costs approach and sectoral intensity shares, point to lower levels in the United Kingdom and Canada than in the United States.
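As a rough illustration of the data intensity definition above (the share of job postings per occupation or sector related to the production of data), the following minimal sketch computes such shares from postings that have already been flagged as data-related or not. It is not the paper's method: the function name, the input format and the toy sample are hypothetical, and the NLP step that would produce the flags is assumed to have been run beforehand.

```python
from collections import defaultdict


def data_intensity_by_group(postings):
    """Return the share of data-related postings for each group.

    `postings` is an iterable of (group, is_data_related) pairs, where
    `group` is an occupation or sector label and `is_data_related` is a
    boolean flag. Both the format and the flag are assumptions made for
    this illustration; in practice the flag would come from an NLP
    classification of the job advertisement text.
    """
    totals = defaultdict(int)
    data_related = defaultdict(int)
    for group, is_data_related in postings:
        totals[group] += 1
        if is_data_related:
            data_related[group] += 1
    # Share = data-related postings / all postings, per group.
    return {group: data_related[group] / totals[group] for group in totals}


# Hypothetical toy sample, not taken from the Lightcast data.
sample = [
    ("data analyst", True),
    ("data analyst", True),
    ("data analyst", False),
    ("retail assistant", False),
    ("retail assistant", True),
    ("retail assistant", False),
    ("retail assistant", False),
]

shares = data_intensity_by_group(sample)
for group, share in sorted(shares.items()):
    print(f"{group}: {share:.2f}")
```

Grouping by sector instead of occupation only requires changing the label passed as the first element of each pair.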


Keywords: data asset, natural language processing, data intensity, data economy, job advertisements
JEL:
J21: Labor and Demographic Economics / Demand and Supply of Labor / Labor Force and Employment, Size, and Structure
E01: Macroeconomics and Monetary Economics / General / Measurement and Data on National Income and Product Accounts and Wealth; Environmental Accounts
C80: Mathematical and Quantitative Methods / Data Collection and Data Estimation Methodology; Computer Programs / General
C88: Mathematical and Quantitative Methods / Data Collection and Data Estimation Methodology; Computer Programs / Other Computer Software