LMI For All

Completed Data file specification sheets:

Data File Specification Sheet Template example

:!: This is a template. Please copy the contents into a new page to create your actual spec sheet.

  • Submitted by: David Owen, d.w.owen@warwick.ac.uk
  • Submitted on: April 11, 2013
  • Revision: 1 (the revision number goes up whenever the spec sheet is unclear and needs to be clarified)

Data File(s)

This data load contains 3 files, which are as follows:

File             Content                                  Size
ASHE2013M.csv    generated ASHE data, 2013, males         25 GB
ASHE2013F.csv    generated ASHE data, 2013, females       25 GB
coeff.csv        coefficients used to generate data       2 MB

Source Dataset

This data comes from the ASHE dataset, date of dataset, conditions, warnings, etc.

Processing

Describe briefly steps taken, if any, to convert data from source data into the files you are submitting.

Fields and Columns

ASHE*.csv Files

These files contain the following columns:

Be sure to explain all codings and references here.

  • SOC - the SOC code.
  • gender - gender coding. Refer to coeff.csv and previous ASHE datasets.
  • region - region coding. Refer to coeff.csv and previous ASHE datasets.
  • WeightP - a weight against the sum of observations of this data point.
  • WeightInc - number of observations, to be summed and weighted against WeightP to yield the true earnings figure.

coeff.csv File

… information about coefficients file here …

Output

The output of queries on this dataset is the estimated weekly pay, in pounds sterling, of persons matching the filtering dimensions. Note that the dataset does not contain averages across multiple filtering expressions (for example, males and females combined); these must be calculated in the querying software.

Queries and calculations

Estimated weekly pay for a set of filtering conditions

To obtain the estimated weekly pay for a set of filtering conditions, query by these conditions and then apply the weightings as follows (this is a made-up example!):

  • For all matching observations, multiply the estimated pay by WeightInc, sum these products, and divide the result by the sum of WeightP.
  • In other words, weekly_pay_weighted = (sum(pay * WeightInc) / sum(WeightP))
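The formula above could be sketched in Python roughly as follows. This is only an illustration of the made-up example: the `pay` field appears in the formula but not in the column list, so its name is an assumption, and the rows, SOC code, and gender/region codings below are invented for demonstration.

```python
def weekly_pay_weighted(rows):
    """Return sum(pay * WeightInc) / sum(WeightP) over the given rows.

    Each row is a dict; the "pay" key is an assumed field name taken
    from the formula, not from the documented column list.
    """
    rows = list(rows)
    numerator = sum(r["pay"] * r["WeightInc"] for r in rows)
    denominator = sum(r["WeightP"] for r in rows)
    if denominator == 0:
        raise ValueError("no matching observations (sum of WeightP is zero)")
    return numerator / denominator


# Invented rows; the codings (gender 1, region 1) are hypothetical.
rows = [
    {"SOC": 2136, "gender": 1, "region": 1, "pay": 600.0, "WeightInc": 2.0, "WeightP": 3.0},
    {"SOC": 2136, "gender": 1, "region": 1, "pay": 500.0, "WeightInc": 1.0, "WeightP": 1.0},
    {"SOC": 2136, "gender": 2, "region": 1, "pay": 550.0, "WeightInc": 1.0, "WeightP": 2.0},
]

# Query by the filtering conditions first, then apply the weightings.
matching = [r for r in rows if r["gender"] == 1 and r["region"] == 1]
print(weekly_pay_weighted(matching))  # (600*2 + 500*1) / (3 + 1) = 425.0
```

The filter is applied before the weighting so that both sums run over the same set of observations.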

Please feel free to attach source code here!

Averaging the pay

An example: suppose you want to average the weekly pay across both males and females in London. To do so,

  • calculate the weekly pay for males and the weekly pay for females in London.
  • multiply each figure by the number of observations for its group, add the two products, and divide by the sum of observations for both males and females.
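A minimal sketch of this averaging step, assuming the intended result is an observation-weighted mean of the two group figures. The function name, the pay figures, and the observation counts are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
def combined_weekly_pay(groups):
    """Combine per-group weekly pay into one observation-weighted average.

    groups is an iterable of (weekly_pay, n_observations) pairs.
    """
    groups = list(groups)
    total_observations = sum(n for _, n in groups)
    if total_observations == 0:
        raise ValueError("no observations to average over")
    return sum(pay * n for pay, n in groups) / total_observations


# Invented figures for London: (weekly pay, number of observations).
males = (700.0, 300)
females = (600.0, 100)
print(combined_weekly_pay([males, females]))  # (700*300 + 600*100) / 400 = 675.0
```

Weighting by observations keeps the larger group from being under-represented; a plain mean of the two figures would give 650.0 here instead.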

Closing Notes

data/specsheet_template.txt · Last modified: 2013-04-11 17:03 by David Owen