Request ID: FOI-4095-2324
Date published: 25 March 2024
You asked
I am a final year Robotics student and have decided to do my Advanced Project on the spread of congestion across the London Underground as an SIR analogy to the contagion of epidemics.
I will be using the complete averaged 2022 Crowding datasets as the network operating baseline but am struggling to find one-off measurements of the same resolution. I am aiming to use historical crowding and delay data on particular days to create an algorithm that finds a relationship between them.
I am therefore requesting pairs of 2022 datasets which show crowding and delay data, respectively, in the highest resolution possible across all LU and EZL stations/lines. The exact dates of these samples are not important but they would ideally be spread over different days of the week and seasons.
Clarification received 19/02/2024: As requested, I have selected specific dates in 2022 I would like data for:
20th & 27th Jan
18th, 25th & 27th Feb
6th & 27th Mar
12th & 24th May
2nd & 21st-25th June
1st-3rd, 16-18th & 31st July
29th Aug
15th Sept
1st & 2nd Oct
7th Nov
12th Dec
The averaged Crowding data I am using is from files named 'NBT22XXX_Outputs.xlsx' on http://crowding.data.tfl.gov.uk/ Please let me know if you would benefit from any more details.
Further clarification received 27/02/2024: Yes, I am requesting 'delay data' for the same days. I unfortunately haven't come across exactly how historical delays are logged; all I need is data in a similar time resolution to the 'crowding data' that tracks abnormal system operation - either by link or by platform (platform closure, delayed departure, maintenance, etc). Worst case scenario, I can just infer when trains are running behind schedule from the 'Link Frequency' sheet for those days, when provided. But ideally, I would like to be able to quantify the 'health'/performance of a platform by differentiating between disruption types too.
We answered
TfL Ref: FOI-4095-2324
Thank you for your request of 14th February 2024, as clarified on 19th and 27th February 2024, asking for information about crowding and delays on the London Underground. Your request has been considered in accordance with the requirements of the Freedom of Information Act and our information access policy.
Specifically you asked:
“I am requesting pairs of 2022 datasets which show crowding and delay data, respectively, in the highest resolution possible across all LU and EZL stations/lines. The exact dates of these samples are not important but they would ideally be spread over different days of the week and seasons.”
And;
“I have selected specific dates in 2022 I would like data for:
20th & 27th Jan
18th, 25th & 27th Feb
6th & 27th Mar
12th & 24th May
2nd & 21st-25th June
1st-3rd, 16-18th & 31st July
29th Aug
15th Sept
1st & 2nd Oct
7th Nov
12th Dec
The averaged Crowding data I am using is from files named 'NBT22XXX_Outputs.xlsx' on http://crowding.data.tfl.gov.uk/ Please let me know if you would benefit from any more details.”
And;
“Yes, I am requesting 'delay data' for the same days. I unfortunately haven't come across exactly how historical delays are logged; all I need is data in a similar time resolution to the 'crowding data' that tracks abnormal system operation - either by link or by platform (platform closure, delayed departure, maintenance, etc). Worst case scenario, I can just infer when trains are running behind schedule from the 'Link Frequency' sheet for those days, when provided. But ideally, I would like to be able to quantify the 'health'/performance of a platform by differentiating between disruption types too.”
I can confirm that we hold the information you require.
In relation to delay data, please see the first two spreadsheets attached, which contain the best data that can be produced in relation to your request. The “Notes” spreadsheet explains how the data has been compiled. The second spreadsheet provides the data itself (you may need to expand the width of some columns to read the data in full).
In regard to the crowding data, I am afraid that this aspect of your request is largely being refused under section 14 of the Freedom of Information Act due to the disproportionate amount of resource it would take to provide it (see here for more information about the application of section 14: https://ico.org.uk/for-organisations/foi/freedom-of-information-and-environmental-information-regulations/section-14-dealing-with-vexatious-requests/what-does-section-14-1-of-foia-say/#:~:text=Section%2014(1)%20is%20designed,of%20disruption%2C%20irritation%20or%20distress.).
As you can see from the following link - https://ico.org.uk/about-the-ico/media-centre/news-and-blogs/2023/09/information-commissioner-calls-on-public-authorities-to-stop-using-spreadsheets-in-foi-responses/ - the Information Commissioner recently issued an advisory notice to public authorities calling for an immediate end to the use of original source Excel spreadsheets when responding to FOI requests. In response to this, TfL will only provide spreadsheets which have been converted into CSV format when responding to FOI requests. This is a time-consuming process, as each worksheet in an original source spreadsheet has to be re-formatted individually and then checked individually for any hidden data.
In response to your request for crowding data, colleagues have sourced the information, but this has been provided in 27 separate Excel spreadsheets, each containing 12 worksheets (so 324 worksheets that would need converting in total). There is no quick or efficient way of achieving this, and we see no wider public interest in expending the necessary resource to do so. We have tried converting a spreadsheet into PDF format, but this is in itself a very slow process, and renders the data in a format that is virtually impossible to follow. That said, please see the attached folder, which provides the data for one of the 27 spreadsheets.
I am not certain exactly which of the worksheets are relevant to your interest. If it is only a small number (i.e. one or two), then we should be able to re-format the equivalent data for the other 26 spreadsheets without excessive resource being expended on it. If this would be of value to you, please let me know.
Please see the attached information sheet for details of your right to appeal as well as information on copyright and what to do if you would like to re-use any of the information we have disclosed.
Yours sincerely,
David Wells
FOI Case Officer
FOI Case Management Team
General Counsel
Transport for London