LMI For All

Labour Force Survey (Unemployment rates)

  • Submitted by: Luke Bosworth, l.p.bosworth@warwick.ac.uk
  • Submitted on: 04/06/2018
  • Revision: 7

Data File(s)

This data load contains 6 files, which are as follows:

File | Content | Size (bytes)
LFSraw06-10.csv | LFS Data 2006-2010 (raw numbers) | 42,039,261
LFSraw11-17_2DigInd.csv | LFS Data 2011-2017 (raw numbers) (2 digit Industries) | 50,152,547
LFSraw11-17_75Ind.csv | LFS Data 2011-2017 (raw numbers) (75 Industries) | 50,135,417
LFSlabelled06-10.csv | LFS Data 2006-2010 (labelled values) | 175,816,666
LFSlabelled11-17_2DigInd.csv | LFS Data 2011-2017 (labelled values) (2 digit Industries) | 225,948,483
LFSlabelled11-17_75Ind.csv | LFS Data 2011-2017 (labelled values) (75 Industries) | 218,400,457

Source Dataset

These data come from the LFS (Labour Force Survey). The raw data are provided by ONS as quarterly output.

Processing

The data provided here have been manipulated and re-weighted (see WeightP below) to give unemployment (and other) estimates by calendar year for groups of individuals sharing specific characteristics (e.g. gender, age, qualification, etc.).

The data are provided as 2 main groups of files because the occupation classification changed from SOC2000 to SOC2010.

Fields and Columns

LFSraw06-10.csv / LFSlabelled06-10.csv

  • age - Ages (0-74)
  • AgeBands - Age Bands (8)
  • estat - Employment Status (5)
  • Gender - Gender (2)
  • Industry92 - Industry DIVISION in main job (2 digit 1992 classification)
  • Industry07 - Industry DIVISION in main job (2 digit 2007 classification)
  • Industry92L - Industry DIVISION in Last main job (2 digit 1992 classification)*
  • Industry07L - Industry DIVISION in Last main job (2 digit 2007 classification)*
  • NQF5 - Qualifications (0-5)
  • NQF8 - Qualifications (0-8)
  • Occ2Dig2K - Occupations SOC2000 (2 digit level)
  • Occ2Dig2KL - Last Job Occupations SOC2000 (2 digit level)
  • Occ3Dig2KL - Last Job Occupations SOC2000 (3 digit level)
  • Occ4Dig2K - Occupations SOC2000 (4 digit level)
  • Region - UK Regions (12)
  • WeightP - Number of people
  • year - Years (2006-2010)

LFSraw11-17_2DigInd.csv / LFSlabelled11-17_2DigInd.csv

This file has the same content as the file below but for industries at 2 digit level (rather than 75 industries).

  • age - Ages (0-74)
  • AgeBands - Age Bands (8)
  • estat - Employment Status (5)
  • Gender - Gender (2)
  • Industry07 - Industry DIVISION in main job (2 digit 2007 classification)
  • Industry07L - Industry DIVISION in last main job (2 digit 2007 classification)
  • NQF5 - Qualifications (0-5)
  • NQF8 - Qualifications (0-8)
  • Occ2Dig10 - Occupations SOC2010 (2 digit level)
  • Occ2Dig10L - Last job Occupations SOC2010 (2 digit level)
  • Occ3Dig10L - Last job Occupations SOC2010 (3 digit level)
  • Occ4Dig10 - Occupations SOC2010 (4 digit level)
  • Occ4Dig102km - SOC2010 mapped to SOC2000 Main Job Unit Code (4 digit level - 2011 only)
  • Region - UK Regions (12)
  • WeightP - Number of people
  • year - Years (2011-2017)

LFSraw11-17_75Ind.csv / LFSlabelled11-17_75Ind.csv

This file has the same content as the file above but for 75 industries (rather than 2 digit industries).

  • age - Ages (0-74)
  • AgeBands - Age Bands (8)
  • estat - Employment Status (5)
  • Gender - Gender (2)
  • Ind75 - Industry in main job (75)
  • Ind75L - Industry in last main job (75)
  • NQF5 - Qualifications (0-5)
  • NQF8 - Qualifications (0-8)
  • Occ2Dig10 - Occupations SOC2010 (2 digit level)
  • Occ2Dig10L - Last job Occupations SOC2010 (2 digit level)
  • Occ3Dig10L - Last job Occupations SOC2010 (3 digit level)
  • Occ4Dig10 - Occupations SOC2010 (4 digit level)
  • Occ4Dig102km - SOC2010 mapped to SOC2000 Main Job Unit Code (4 digit level - 2011 only)
  • Region - UK Regions (12)
  • WeightP - Number of people
  • year - Years (2011-2017)

Output

When queried, the dataset output provides information on the employment circumstances of the UK population. The source should only be used to identify unemployment rates. Data are provided for individual cases, which have to be aggregated to give the number of people in a particular category or with particular characteristics.
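The aggregation described above can be sketched as a weighted group-by: sum 'WeightP' over all rows that share the chosen characteristics. This is a minimal illustration rather than the API's actual implementation; the function name `aggregate` and the sample rows are hypothetical, and only the column names come from the field lists above.

```python
from collections import defaultdict


def aggregate(rows, keys):
    """Sum WeightP over all rows sharing the same values for `keys`.

    `rows` is an iterable of dicts (e.g. from csv.DictReader);
    `aggregate` is a hypothetical helper, not part of the dataset or API.
    """
    totals = defaultdict(float)
    for row in rows:
        group = tuple(row[k] for k in keys)
        totals[group] += float(row["WeightP"])
    return dict(totals)


# Made-up sample rows, for illustration only.
rows = [
    {"year": "2012", "Gender": "1", "estat": "1", "WeightP": "1200.5"},
    {"year": "2012", "Gender": "1", "estat": "1", "WeightP": "800.0"},
    {"year": "2012", "Gender": "2", "estat": "4", "WeightP": "300.25"},
]
totals = aggregate(rows, ["year", "Gender"])
```

Here `totals` maps each (year, Gender) pair to the number of people it represents.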

Queries and calculations

LFSraw06-10.csv / LFSlabelled06-10.csv Specification

Columns one to fifteen, from 'age' to 'Region', are the characteristics of the people covered by the dataset. The sixteenth column, 'WeightP', gives the number of people in the year (specified in the final column) with the characteristics in columns one to fifteen. 'WeightP' compensates for non-response and grosses up to population estimates.

To get the unemployment rate, divide the number of ILO unemployed (estat=4) by the labour force (the sum over 'estat' 1 through 4), using 'WeightP' as the number of people in each case.
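As a rough sketch of this calculation, the function below sums 'WeightP' for estat=4 and divides by the sum for estat 1 to 4. It assumes the raw files hold 'estat' as integer codes 1 to 5; the function name `unemployment_rate` and the sample figures are hypothetical.

```python
def unemployment_rate(rows):
    """Return the unemployment rate (%) for an iterable of row dicts.

    Assumes raw integer codes in 'estat', with 4 = ILO unemployed and
    1 to 4 together making up the labour force, as described above.
    Returns None when the labour force is empty.
    """
    unemployed = 0.0
    labour_force = 0.0
    for row in rows:
        estat = int(row["estat"])
        weight = float(row["WeightP"])
        if 1 <= estat <= 4:
            labour_force += weight
        if estat == 4:
            unemployed += weight
    if labour_force == 0:
        return None
    return 100.0 * unemployed / labour_force


# Made-up sample rows, for illustration only.
sample = [
    {"estat": "1", "WeightP": "600"},
    {"estat": "2", "WeightP": "200"},
    {"estat": "3", "WeightP": "100"},
    {"estat": "4", "WeightP": "100"},
    {"estat": "5", "WeightP": "500"},  # outside the labour force, ignored
]
rate = unemployment_rate(sample)
```

Rows with estat=5 contribute to neither the numerator nor the denominator.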

When a query involves SOC or Industry, the respective variables for last job need to be used (Occ2Dig2KL, Industry92L for instance) in conjunction with ILO unemployed (estat=4). This is because there are no records for people who are unemployed and have an occupation, so their last job has to be used instead.

LFSraw11-17 / LFSlabelled11-17 (75Ind & 2DigInd).csv Specification

Columns one to fourteen, from 'age' to 'Region', are the characteristics of the people covered by the dataset. The fifteenth column, 'WeightP', gives the number of people in the year (specified in the final column) with the characteristics in columns one to fourteen. 'WeightP' compensates for non-response and grosses up to population estimates.

To get the unemployment rate, divide the number of ILO unemployed (estat=4) by the labour force (the sum over 'estat' 1 through 4), using 'WeightP' as the number of people in each case.

For example, to extract unemployment and labour force levels for females in region 1 in 2012 using the variable 'estat', filter on:

  • year = 2012
  • Gender = 2
  • Region = 1

Then select those that are ILO unemployed (estat=4) and divide by the labour force (estat=1 to 4). The unemployment rate (%) is then 100 × ILO Unemployed / SUM(FT, PT, SE, ILOUnemp), where each term is a sum of 'WeightP'.
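The example above can be sketched in Python using only the standard library. The inline CSV is made-up sample data with just the columns needed here (the real files have more columns and different values), and the use of raw integer codes is an assumption based on the raw files.

```python
import csv
import io

# Made-up sample data; real rows come from LFSraw11-17_*.csv.
SAMPLE_CSV = """year,Gender,Region,estat,WeightP
2012,2,1,1,400
2012,2,1,2,300
2012,2,1,3,50
2012,2,1,4,50
2012,2,1,5,900
2012,1,1,4,70
2011,2,1,4,80
2012,2,2,4,60
"""

unemployed = 0.0
labour_force = 0.0
for row in csv.DictReader(io.StringIO(SAMPLE_CSV)):
    # Keep only females (Gender=2) in Region 1 in 2012.
    if (row["year"], row["Gender"], row["Region"]) != ("2012", "2", "1"):
        continue
    estat = int(row["estat"])
    weight = float(row["WeightP"])
    if 1 <= estat <= 4:  # labour force
        labour_force += weight
    if estat == 4:  # ILO unemployed
        unemployed += weight

rate = 100.0 * unemployed / labour_force
print(unemployed, labour_force, rate)
```

With this sample the levels are 50 unemployed out of a labour force of 800, giving a rate of 6.25% before any rounding.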

When a query involves SOC or Industry, the respective variables for last job need to be used (Occ2Dig10L, Ind75L for instance) in conjunction with ILO unemployed (estat=4). This is because there are no records for people who are unemployed and have an occupation, so their last job has to be used instead.
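A minimal sketch of such a query is shown below: it counts ILO unemployed people by their last occupation. The function name `unemployed_by_last_occupation` is hypothetical, and the sketch assumes that a missing last job appears as an empty string, which the source does not confirm.

```python
from collections import defaultdict


def unemployed_by_last_occupation(rows, occ_field="Occ2Dig10L"):
    """Sum WeightP for ILO unemployed rows (estat=4), grouped by last job.

    Assumption: rows with no recorded last job carry an empty string
    in `occ_field` and are skipped.
    """
    totals = defaultdict(float)
    for row in rows:
        last_job = row.get(occ_field, "")
        if int(row["estat"]) == 4 and last_job != "":
            totals[last_job] += float(row["WeightP"])
    return dict(totals)


# Made-up sample rows: unemployed cases have no current occupation.
rows = [
    {"estat": "4", "Occ2Dig10": "", "Occ2Dig10L": "11", "WeightP": "120"},
    {"estat": "4", "Occ2Dig10": "", "Occ2Dig10L": "11", "WeightP": "80"},
    {"estat": "4", "Occ2Dig10": "", "Occ2Dig10L": "92", "WeightP": "50"},
    {"estat": "1", "Occ2Dig10": "11", "Occ2Dig10L": "", "WeightP": "900"},
]
by_last_job = unemployed_by_last_occupation(rows)
```

Passing `occ_field="Ind75L"` would give the same breakdown by last industry, provided that column is present in the rows.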

Rules for suppressing data or raising warning flags

The rules of thumb used are:

  1. If the numbers employed in a particular category / cell (defined by the 12 regions, gender, status, occupation, qualification and industry (75 categories)) are below 1,000 then a query should return “no reliable data available” and offer to go up a level of aggregation across one or more of the main dimensions (e.g. UK rather than region, some aggregation of industries rather than the 75 level, or SOC 2 digit rather than 4 digit). This information is held in the variable 'WeightP'.
  2. If the numbers employed in a particular category / cell (defined as in 1.) are between 1,000 and 10,000 then a query should return the number but with a flag to say that this estimate is based on a relatively small sample size and if the user requires more robust estimates they should go up a level of aggregation across one or more of the main dimensions (as in 1).
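The two rules above could be applied as in the sketch below. The function name `check_cell`, the constant names and the exact message wording are illustrative; treating exactly 10,000 as falling inside the flagged band is an assumption, since the source only says "between 1,000 and 10,000".

```python
from typing import Optional, Tuple

SUPPRESS_BELOW = 1_000   # rule 1: below this, return no data
FLAG_UP_TO = 10_000      # rule 2: up to this (assumed inclusive), flag


def check_cell(people: float) -> Tuple[Optional[float], str]:
    """Return (value, message) for a cell holding `people` (sum of WeightP)."""
    if people < SUPPRESS_BELOW:
        return None, "no reliable data available"
    if people <= FLAG_UP_TO:
        return people, "estimate based on a relatively small sample size"
    return people, ""


suppressed = check_cell(999)
flagged = check_cell(1_000)
clean = check_cell(10_001)
```

A caller receiving `None` or a non-empty message would then offer the user a higher level of aggregation, as described in rule 1.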

This is done not only for any queries about Employment (including Replacement Demand calculations) but also for Pay and Hours.

In the case of Pay and Hours the API needs to interrogate the part of the database holding the employment numbers to do the checks, as in 1. and 2. above, but then report the corresponding pay or hours values as appropriate.

Rounding of estimates

To avoid false impressions of precision, the API should round the estimates before delivering the answer to any query. In the case of LFS unemployment rates, any number should be rounded to the nearest percentage point.
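One way to round a rate to the nearest percentage point is sketched below. The source does not say how exact halves should be handled; this sketch assumes they round up (half-up), which differs from Python's built-in `round`, where ties go to the nearest even number. The function name `round_rate` is hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_rate(rate_percent: float) -> int:
    """Round an unemployment rate (%) to the nearest whole percentage point.

    Assumption: ties such as 6.5 round up to 7 (half-up).
    """
    exact = Decimal(str(rate_percent))
    return int(exact.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


rounded = round_rate(6.25)
```

For instance, a computed rate of 6.25% would be reported as 6%.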

Closing Notes

data/lfs.txt · Last modified: 2018-06-04 11:48 by Luke Bosworth