A First Look at Quantifying Skills Pathways


In this blog, Alex Smith from the West Midlands Combined Authority looks at what data can tell us about where in the education and skills system social mobility is strongest, and where it is weakest.

This blog is an introduction to the WM REDI project “Understanding and Addressing Skills Impacts and Needs”. The project looks at pathways for young people through education and their prospects afterwards, the impacts of addressing disadvantage, and future trends and vulnerabilities.


Given the devastation which the COVID-19 pandemic has wreaked on every stratum of the education and skills system, and the years of technological change that have been telescoped into a few months, it is vital that we understand how well these institutions served our young people to begin with. Viewed through the levelling lens of fairness, this means answering the question:

Where in the education and skills system is social mobility strongest, and where weakest? 

This question is the subject of an ongoing piece of WM REDI research, which this blog post will introduce. As I’ll show below, the huge quantity of administrative data available to researchers and the public makes it possible to address this question quantitatively. The challenge is to corral enough data to link deprivation and disadvantage at every stage, from school readiness for 5-year-olds to the future earnings of our graduates. Data can be cut to define disadvantage in two main ways:

  • Deprivation, as measured by the Indices of Multiple Deprivation (IMD), can be analysed at the quintile level, meaning five equally-sized populations at different levels of deprivation. Findings can then be presented visually in map form. Limits of sample size mean that Lower Super Output Areas (LSOAs), the small geographical areas used for the Census, will need to be grouped together by these deprivation levels, rather than presented individually. In other words, equally deprived children in Birmingham and in Coventry are interchangeable for us, if not for their parents.
  • Eligibility for Free School Meals (FSM). Analysis here will be done both for individual Local Authorities and for the West Midlands region, as defined by our three Local Enterprise Partnerships: Greater Birmingham and Solihull, Coventry and Warwickshire, and the Black Country. For reasons of sample size and the protection of privacy, this is as low as we may go.
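The quintile grouping described in the first bullet can be sketched in a few lines of Python. This is a minimal illustration rather than the project's actual method: the area codes, ranks, and pupil counts are made up, rank 1 is assumed to mean the most deprived area, and the quintiles are formed from equal numbers of areas as a stand-in for equal populations.

```python
from collections import defaultdict


def assign_quintiles(imd_rank_by_lsoa):
    """Map each area code to a quintile, where 1 is the most deprived fifth.

    Assumes rank 1 is the most deprived area. Quintiles hold equal numbers
    of areas, which only approximates equal populations.
    """
    ordered = sorted(imd_rank_by_lsoa, key=imd_rank_by_lsoa.get)
    n = len(ordered)
    return {code: (position * 5) // n + 1 for position, code in enumerate(ordered)}


def pooled_rates(records, quintile_by_lsoa):
    """Pool (area, pupils, pupils reaching standard) records by quintile."""
    totals = defaultdict(lambda: [0, 0])
    for code, pupils, reached in records:
        bucket = totals[quintile_by_lsoa[code]]
        bucket[0] += pupils
        bucket[1] += reached
    return {q: reached / pupils for q, (pupils, reached) in sorted(totals.items())}


# Made-up example: ten areas ranked 1 (most deprived) to 10 (least deprived).
imd_rank_by_lsoa = {f"area_{i:02d}": i for i in range(1, 11)}
quintile_by_lsoa = assign_quintiles(imd_rank_by_lsoa)

# Made-up pupil counts for four of those areas.
records = [
    ("area_01", 20, 10),
    ("area_02", 30, 18),
    ("area_09", 25, 20),
    ("area_10", 25, 21),
]
rates = pooled_rates(records, quintile_by_lsoa)
print(rates)
```

Pooling counts before dividing, rather than averaging area-level percentages, keeps small areas from carrying the same weight as large ones.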

Because deprivation by geography and deprivation by family income, while they do correlate, are two distinct vectors of disadvantage, I will be aiming to assess both of them at every stage of education and skills attainment.

School Readiness

We know that disadvantaged children in the West Midlands are less likely to reach the expected standard in early development, as measured by Early Years Foundation Stage Profile (EYFSP) teacher assessments. These assessments are carried out in the final term of the year in which a child reaches age 5, and look at physical and mental development, independence, and general ability to play, explore, and learn.

Department for Education data allows us to compare school readiness for FSM-eligible pupils to the average pupil, for each local authority in the region. From this, we can see that there is a pronounced inequality in outcomes. In the West Midlands region (the NUTS1 geography, which includes all of Staffordshire, Warwickshire, and Worcestershire), 55% of FSM-eligible pupils reach the expected standard of development, versus 71% of non-FSM pupils. In England, the respective figures are 55% and 73%. The chart below, based on 2017/18 data, contrasts FSM-eligible pupils (yellow) with non-FSM-eligible pupils (blue):

Within the West Midlands region, the greatest inequality by this measure is in Shropshire and Worcestershire, both of which have a 24 percentage point attainment gap between FSM and non-FSM pupils, with fewer than half of FSM pupils reaching the expected standard. Behind these are Herefordshire, Warwickshire, and Staffordshire, with a 19 percentage point gap. Some variability might be expected, as these areas are large and encompass both rural and urban communities. However, there is also significant variation within the metropolitan area, with a gap of only 10 percentage points in Birmingham and Wolverhampton but 17 in Dudley.

For the West Midlands region as a whole, we can also look at the difference between the lowest and highest deciles of the IMD. This indicates similar performance, but slightly less inequality of outcomes, in the West Midlands than the England average. As of 2018, 62% of children from the most deprived areas reach the expected standard, versus 79% in the highest decile. For England as a whole, these figures correspond to 61% and 80%.
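To make the comparison explicit, the gaps implied by the percentages quoted above can be worked out directly. The snippet below is only an arithmetic sketch using those rounded published figures, so the resulting gaps are accurate to about a percentage point; the dictionary layout and names are my own.

```python
# Share of pupils reaching the expected standard, as quoted in the text.
# Each pair is (less advantaged group, more advantaged group), in percent.
figures = {
    "FSM vs non-FSM": {
        "West Midlands": (55, 71),
        "England": (55, 73),
    },
    "lowest vs highest IMD decile": {
        "West Midlands": (62, 79),
        "England": (61, 80),
    },
}

# Gap in percentage points between the two groups, per measure and area.
gaps = {
    measure: {area: higher - lower for area, (lower, higher) in areas.items()}
    for measure, areas in figures.items()
}

for measure, areas in gaps.items():
    for area, gap in areas.items():
        print(f"{measure}, {area}: {gap} percentage points")
```

On both measures the West Midlands gap comes out two points smaller than the England gap, which is the "slightly less inequality" referred to above.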

However, these differences are slight, and the extent of inequality is still large in our region. Were all children in the West Midlands to reach the same level of early childhood development as the 10% most advantaged, an additional 7,794 pupils would have reached the expected standard of development in that year, out of a total cohort of 72,737 pupils. For reference, the different levels of deprivation for which data is available are shown in the map below:

School Performance

The UK Government’s ‘Find and compare schools in England’ dataset for pupil attainment and onward destination is powerful and comprehensive. It makes it possible to assess average marks for pupil assessments by their level of disadvantage and type of school attended, as well as how many free school meal-eligible pupils study at each school.

It does, however, introduce some challenges to the researcher. ‘Disadvantage’ for a pupil in this data refers to whether or not a school would be eligible for Pupil Premium funding for admitting that pupil. Consequently, FSM-eligibility is combined with looked-after children and children of parents serving in the Armed Forces. As datasets for other stages of education tend to consider FSM-eligibility on its own, this makes it more difficult to make comparisons. While IMD is not available in the data, the assumption could be made that the level of deprivation in the Census area in which the school is located is representative of the area where the pupil lives. The validity of this assumption, however, would need to be tested.

Attainment data is based on the Attainment 8 measure and the related Progress 8 measure, which compares the Attainment 8 results a pupil achieves with the results of other pupils with the same prior attainment. Progress 8 is essentially a measure of added value rather than specific grades. Once this research project looks at how disadvantage plays out in the wider skills system, it is likely to revisit attainment data as an explanatory variable. Additionally, the fact that data are available for each individual school makes it possible to understand how much of the influence of material disadvantage on educational attainment is mediated through deprived children not having access to as good a school, and how much in other ways.
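The added-value idea described above can be illustrated with a simplified calculation: each pupil's score is compared with the average score of pupils who started from the same prior attainment band, and the differences are then averaged per school. This is a sketch only. The official DfE methodology is more involved, and the pupils, bands, scores, and school names here are invented.

```python
from collections import defaultdict
from statistics import mean


def value_added(pupils):
    """Return pupil id -> score minus the mean score of the same prior band."""
    scores_by_band = defaultdict(list)
    for pupil in pupils:
        scores_by_band[pupil["prior_band"]].append(pupil["score"])
    band_mean = {band: mean(scores) for band, scores in scores_by_band.items()}
    return {p["id"]: p["score"] - band_mean[p["prior_band"]] for p in pupils}


def school_value_added(pupils, pupil_va):
    """Average the pupil-level value added within each school."""
    va_by_school = defaultdict(list)
    for pupil in pupils:
        va_by_school[pupil["school"]].append(pupil_va[pupil["id"]])
    return {school: mean(values) for school, values in va_by_school.items()}


# Invented pupils: two prior attainment bands, two schools.
pupils = [
    {"id": "a", "school": "School X", "prior_band": "low", "score": 30},
    {"id": "b", "school": "School Y", "prior_band": "low", "score": 40},
    {"id": "c", "school": "School X", "prior_band": "high", "score": 60},
    {"id": "d", "school": "School Y", "prior_band": "high", "score": 70},
]

pupil_va = value_added(pupils)
school_va = school_value_added(pupils, pupil_va)
print(pupil_va)
print(school_va)
```

In this toy data School Y's pupils score five points above similar pupils elsewhere, regardless of whether they started in the low or high band, which is the sense in which the measure isolates added value from raw grades.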

Employment for School Leavers

The onward destinations of pupils in this dataset are one of the most important sources of information this research project will make use of. The data trace the future of past cohorts of pupils leaving every school in the region, at either Key Stage 4 or Key Stage 5. Pupils from disadvantaged and non-disadvantaged backgrounds can be compared in terms of the proportion who go on to attend university, further education, an apprenticeship, or the workplace. We can also estimate the proportion of pupils who become NEET (Not in Employment, Education, or Training). Combined with the further and higher education pupil records discussed later, and estimated earnings for school leavers, we will be able to get a detailed picture of how disadvantage affects future earnings and employment outcomes, and by how much.
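As a rough sketch of the comparison described above, school-level destination counts can be pooled by pupil group and converted into shares. The row layout, school names, and counts below are hypothetical and do not reflect the structure of the published files.

```python
from collections import defaultdict


def destination_shares(rows):
    """Pool (school, group, destination, count) rows into shares per group."""
    counts = defaultdict(lambda: defaultdict(int))
    for _school, group, destination, count in rows:
        counts[group][destination] += count
    shares = {}
    for group, by_destination in counts.items():
        total = sum(by_destination.values())
        shares[group] = {d: n / total for d, n in by_destination.items()}
    return shares


# Hypothetical leaver counts for two schools.
rows = [
    ("School A", "disadvantaged", "higher education", 10),
    ("School A", "disadvantaged", "further education", 20),
    ("School A", "disadvantaged", "apprenticeship", 5),
    ("School A", "disadvantaged", "employment", 10),
    ("School A", "disadvantaged", "NEET", 5),
    ("School B", "disadvantaged", "higher education", 10),
    ("School A", "other", "higher education", 40),
    ("School A", "other", "further education", 30),
    ("School A", "other", "apprenticeship", 10),
    ("School A", "other", "employment", 15),
    ("School A", "other", "NEET", 5),
]

shares = destination_shares(rows)
for group, by_destination in shares.items():
    print(group, {d: round(s, 3) for d, s in by_destination.items()})
```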

Further Education and Apprenticeships

With some assumptions, it is then possible to follow the prospects for students in the West Midlands who leave school to enter further education. The Individualised Learner Record, a collection of student records, tracks core information about a student’s background and achievement in further education, including age, ethnicity, postcode address, institution, and course studied. In addition to this rich data on further education institutions, there are public data on the labour market outcomes for FE students, by region, level, subject, age, and the presence of learning difficulties.

The primary limitation of these datasets is that they do not include the students’ prior education or whether they came from a deprived background. Consequently, it is not possible to know what influence this has on the course they choose to study. Second, no link is made between deprivation and post-further education incomes. However, we do have destinations for school leavers, which will make it possible to estimate how well represented deprived students are in FE. Also, the postcode addresses of learners make it possible to determine how deprived an area they are from, corresponding to our first measure of deprivation, the IMD.
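The postcode link mentioned above amounts to two lookups: postcode to small area, then small area to deprivation level. The sketch below assumes both lookup tables have already been loaded into dictionaries; the postcodes, area codes, and deciles are placeholders, and decile 1 is assumed to be the most deprived.

```python
def normalise_postcode(postcode):
    """Strip whitespace and upper-case so that 'zz1 1aa' matches 'ZZ11AA'."""
    return "".join(postcode.split()).upper()


def learner_quintile(postcode, lsoa_by_postcode, decile_by_lsoa):
    """Return the IMD quintile (1 = most deprived) for a learner's postcode.

    Returns None when the postcode or its area is missing from the lookups,
    so unmatched learners can be counted rather than silently dropped.
    """
    lsoa = lsoa_by_postcode.get(normalise_postcode(postcode))
    if lsoa is None:
        return None
    decile = decile_by_lsoa.get(lsoa)
    if decile is None:
        return None
    return (decile + 1) // 2  # deciles 1-2 -> quintile 1, ..., 9-10 -> quintile 5


# Placeholder lookup tables (not real postcodes or area codes).
lsoa_by_postcode = {"ZZ11AA": "area_A", "ZZ99ZZ": "area_B"}
decile_by_lsoa = {"area_A": 2, "area_B": 9}

print(learner_quintile("zz1 1aa", lsoa_by_postcode, decile_by_lsoa))
print(learner_quintile("ZZ9 9ZZ", lsoa_by_postcode, decile_by_lsoa))
print(learner_quintile("XX0 0XX", lsoa_by_postcode, decile_by_lsoa))
```

Tracking the share of learners who come back as None is a useful check on how far the matched sample can be trusted.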

Higher Education

There are similar challenges for higher education. We know from HESA data on the 2017/18 academic year that we can gauge the outcomes of higher education students by several metrics:

  • Indices of Multiple Deprivation by quintile can be associated, at the level of the UK and its four nations, with employment outcome, including part-time and full-time employment as well as further study, voluntary work, and long and short-term unemployment.
  • Office for Students data shows the impact of FSM eligibility and having been in care on the likelihood of a student staying on at university and graduating, as well as the proportion who attain at least a second class degree. Given that FSM eligibility and experience of care are the two largest markers of what the DfE considers ‘disadvantage’, this will allow us to continue the school-level chain of progression through higher education. Socio-economic background of the parents is also available in these data.
  • At the same geographical level, we can view the relationship between institution, subject studied and salary quartiles. These data are only available where the institution and course subject area combination provides sufficient sample size but should permit some interesting analyses across both variables. Complete quartiles are available for all subjects at the UK level.
  • Salary bands can also be observed by destination region of the student after they graduate. Combining this with graduate retention data could be invaluable.
  • Office for Students data shows the proportion of degree-level apprenticeships that go to students in each POLAR quintile area. POLAR quintiles group geographical areas by the rate at which young people participate in higher education, and may allow us to compare access to university versus degree-level apprenticeships for young people living in different areas.
  • Quintile participation can also be analysed by gender and age, providing at least a cursory sense of the impact of growing up in a deprived area on mature applicants as well as those straight out of secondary school.

The missing link, ultimately, is the connection between deprivation and the specific institution and course of study chosen. This would allow us to break down how much of the impact of deprivation on graduate earnings is mediated through geography, attainment, and course studied. Filling this gap would make it easier for us in the West Midlands to understand where our efforts should be concentrated to produce better outcomes for our young people.

Conclusion – bridging the gaps

On this first pass through available data on education and skills, we have identified three main missing links in publicly available data:

  1. Comparing further education course studied to disadvantage.
  2. Earnings for students who have completed a further education course, by FSM eligibility.
  3. Disadvantage compared to higher education course studied.

The connection between deprivation and choice of further education subject might be illuminated by making use of linked administrative records that show individual students’ paths through each stage of education. The National Pupil Database (NPD) connects disadvantage (DfE definition) to variables such as institution of study, local authority, and educational path through to Level 3, including different combinations of GCSEs, A Levels, and vocational qualifications.

Having by these means linked deprivation to subject studied, the aforementioned data linking FE subject to earnings would make it possible to complete this whole branch of the tree, showing how disadvantage affects outcomes through school readiness, school, and further education through to the labour market.

The remaining link, between deprivation and higher education course studied, is more difficult to address. If and when Longitudinal Educational Outcomes data becomes available in a secure microdata form, it will be possible to trace the journey of individual students from deprived and non-deprived backgrounds through institution, course, and subsequent earnings and employment prospects. In the absence of this data, it may be necessary to partner directly with universities to understand this connection in detail, with reference also to internships and industry placements.

Economic modelling may allow us to look at the economic consequences of the status quo continuing, and of various scenarios in which inequity in skills and education either improves or worsens. It may also let us quantify the economic cost of disadvantage at each stage of education, and identify the main bottlenecks.

Finally, the datasets I have mentioned here are available for different years and a variety of cohorts. Once it has been ascertained in which parts of the education and skills system disadvantage is most pronounced, we will want to see whether the situation is getting better or worse.

This blog was written by Alex Smith, from the West Midlands Combined Authority.  

The views expressed in this analysis post are those of the authors and not necessarily those of City-REDI / WM REDI or the University of Birmingham
