Data Analytics Primer
Data analytics encompasses a broad range of techniques and tools for examining data to discover insights, make predictions, and recommend actions based on the findings of analytical models. The purpose of this primer is to inform airports and key partners of the best practices for effectively implementing data analytics to enhance airport operations and improve the passenger experience. It provides a basic overview of the methods and techniques used in data analytics, describes the benefits and challenges of applying data analytics in practice, and provides the steps for implementing data analytics. The primer is intended for airport professionals of all levels who are interested in implementing or improving their data analytics practices.
Organization of the Primer
Data analytics approaches are described in terms of their descriptive, predictive, and prescriptive functions and are implemented using a range of procedures and techniques that vary in complexity to analyze an array of different sources of information.
Descriptive analytics is a process of integrating and summarizing data from multiple sources and communicating information about the data via reporting dashboards and data visualizations.
Predictive analytics involves developing advanced statistical or computational models to predict target outcomes or metrics. Predictive models for forecasting can vary in complexity, including methods such as decision trees, linear regression, machine learning, and artificial intelligence (AI) algorithms.
Prescriptive analytics supports decision-making by recommending a course of action to meet business objectives based on insights gained from descriptive analysis and predictive models.
There is no “one-size-fits-all” data analytics solution for all use cases. Airports may implement data analytics methods and techniques to address a variety of common business problems, such as traffic congestion, tracking passenger load at the airport, increased wait times at security, and issues related to overall customer service.
Valuable insights can be gained from various types of data, including text, images, video, audio, and data from sensor devices that are analyzed using different analytics methods. Text analytics involves the analysis of textual information using standard natural language processing procedures to understand customer sentiment, identify trends in social media data, and extract information from documents. Image analytics involves the process of identifying objects in digital images, classifying images, and measuring the similarity between different images. Video analytics can also be used to track the movement of people and objects across the airport, identify key events, and measure the effectiveness of marketing campaigns. Audio analytics can be used to identify speakers, transcribe speech, and measure the emotional tone of recorded speech. In addition, data collected from various types of sensor devices throughout the airport represent the Internet of Things that can be used to monitor equipment performance, predict failures, and optimize operations.
Independent of the specific use case, airports of different sizes can apply the same steps for implementing data analytics to describe and summarize the data, construct prediction models to forecast future events, and communicate insights from the data for informed decision-making. As complex organizations, airports have data needs and use analytics methods that will change and evolve over time. Airports can benefit from becoming more data-driven organizations as they deploy sophisticated analytical strategies and methods according to their needs and resources.
Airports collect very large amounts of data from a variety of sources across the organization, including passenger information and operational data from divisions such as transportation, security, operations, and concessions (see Table 1). Airport divisions also report budget statistics and financial data on a weekly basis that may be stored as separate spreadsheets. The information collected can be numerical, text data, video data, or social media data that is represented in different file formats. Integrating data from different sources, in various file formats, is time consuming but can be automated to some degree. The San Antonio International Airport Analytics Team is planning to develop an Airport Information Management System that would allow division managers to input data directly into a central reporting system rather than submitting data as separate Excel spreadsheets for analysis. The data resources are processed and entered into a central data warehouse or data lake. Des Moines International Airport is implementing an operational database to combine multiple data resources in a central repository. For example, the airport operational database is the central database or repository for all operational systems and provides flight-related data accurately and efficiently in a real-time environment. The data warehouse or data lake can be queried by members of the analytics or IT team to obtain information needed for further analysis.
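As a rough illustration of the integration step described above, the sketch below maps two feeds in different file formats onto one common record layout. The feeds, field names, and values are hypothetical and are not drawn from any airport's systems.

```python
import csv
import io
import json

# Hypothetical sample feeds; field names and values are illustrative only.
PARKING_CSV = (
    "date,division,revenue_usd\n"
    "2024-01-01,parking,1200.50\n"
    "2024-01-02,parking,980.00\n"
)
CONCESSIONS_JSON = (
    '[{"day": "2024-01-01", "sales": 450.25},'
    ' {"day": "2024-01-02", "sales": 510.75}]'
)


def load_parking(text):
    """Parse the CSV export into records with a common schema."""
    reader = csv.DictReader(io.StringIO(text))
    return [
        {"date": row["date"], "division": row["division"],
         "amount_usd": float(row["revenue_usd"])}
        for row in reader
    ]


def load_concessions(text):
    """Parse the JSON feed and rename its fields to the common schema."""
    return [
        {"date": item["day"], "division": "concessions",
         "amount_usd": float(item["sales"])}
        for item in json.loads(text)
    ]


def integrate(*sources):
    """Combine records from every source, ordered by date then division."""
    combined = [record for source in sources for record in source]
    return sorted(combined, key=lambda r: (r["date"], r["division"]))


records = integrate(load_parking(PARKING_CSV), load_concessions(CONCESSIONS_JSON))

# Simple roll-up that becomes possible once the sources share one schema.
daily_totals = {}
for record in records:
    daily_totals[record["date"]] = daily_totals.get(record["date"], 0.0) + record["amount_usd"]
```

Once each source has its own small loader, adding another division's feed only requires one more function that emits the same three fields.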
Table 1. Example Data Sources and Key Performance Indicators Generated by Airports
Note: The content and figures of the WebResource can be viewed optimally using Chrome, Edge, or Firefox browsers.
Developing key performance indicators (KPIs) is an initial step in the analytics process for identifying important metrics for analysis. At many airports, the analytics team is responsible for tracking and compiling statistics about target data across airport divisions on a regular basis. As part of their strategic business plan, senior leaders at Phoenix Sky Harbor International Airport (PHX) call on airport division managers annually to identify operational or tactical projects with a focus on capital. Measuring and tracking progress on these projects was central in developing the KPIs needed to enhance operational efficiency at the airport. The PHX data analytics team reports to the director of finance (chief financial officer) and works closely with the IT department to track 85 KPIs—with 25 vital statistics for financial reporting—including data on parking, revenue, customer service, rental cars, ride sharing (e.g., Uber), aircraft operations, and mask compliance (during the COVID-19 pandemic). Since 2017, PHX has also conducted 25,000 passenger surveys with departing passengers each year to collect data on customer demographics, travel habits, spending, and parking. Leadership and division managers track KPIs across the airport on a weekly or monthly basis to evaluate progress toward target goals. Targets for the KPIs are revised at the start of the year, with KPIs added and removed as needed, based on a continuing review.
Descriptive analytics involves the process of aggregating and summarizing data to communicate insights to internal stakeholders and end users across the airport with visualizations and reporting dashboards. Reporting dashboards provide screen-readable summaries that display key performance indicators (KPIs) as charts, scales, and metrics, readable at a glance, to indicate progress toward predefined goals. Basic descriptive analytics, data visualizations, and reporting do not necessarily require a large initial investment or sophisticated analytics technology. Early in their analytics journey, airports can start at a small scale by using existing platforms for visualization and reporting (e.g., Excel spreadsheets) and then building up their analytics capabilities over time. The Customer Journey Scorecard is an innovative digital platform developed by the Houston Airport System for reporting information about eight KPIs internally in near real time (i.e., restroom cleanliness, wait times, customs and border processing, traffic, Wi-Fi connectivity, escalator use, water pressure, temperature). The scorecard provided managers and employees with feedback to enhance operations in key areas impacting the passenger experience. More complex reporting systems can be developed according to the airport’s needs and resources. For example, Dublin Airport constructed a suite of reporting dashboards to provide visibility on monthly and annual trends in airline performance for route analysis, punctuality, load factors, aircraft analysis, and throughput (Mullan 2019).
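The core of a descriptive scorecard is a comparison of summarized readings against predefined targets. The following is a minimal sketch of that comparison. The readings and the Wi-Fi target are invented for illustration; the 22-minute security wait goal is the one reported for the Houston Airport System scorecard elsewhere in this primer.

```python
from statistics import mean

# Targets per KPI. Only the 22-minute wait goal comes from the primer;
# the Wi-Fi target is an assumption made for this example.
KPI_TARGETS = {
    "tsa_wait_minutes": {"target": 22.0, "lower_is_better": True},
    "wifi_uptime_pct": {"target": 99.0, "lower_is_better": False},
}

# Hypothetical readings collected over a reporting period.
READINGS = {
    "tsa_wait_minutes": [18.0, 25.0, 21.0, 30.0],
    "wifi_uptime_pct": [99.5, 99.9, 99.7],
}


def summarize(readings, targets):
    """Average each KPI and compare the average with its target."""
    rows = []
    for name, values in readings.items():
        spec = targets[name]
        average = mean(values)
        if spec["lower_is_better"]:
            on_target = average <= spec["target"]
        else:
            on_target = average >= spec["target"]
        rows.append({
            "kpi": name,
            "average": round(average, 2),
            "target": spec["target"],
            "status": "on target" if on_target else "needs attention",
        })
    return rows


scorecard = summarize(READINGS, KPI_TARGETS)
for row in scorecard:
    print(f"{row['kpi']:<20} avg={row['average']:>6} target={row['target']:>5} {row['status']}")
```

The same table of rows could feed a spreadsheet chart or a dashboard tool; the summary logic does not depend on the visualization platform.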
Predictive analytics is a process of developing statistical and computational models to forecast a target outcome based on historical data. Commonly used predictive analytics methods include time series forecasting, linear regression, and multivariate analysis. Airports can utilize predictive analytics models to optimize operations, improve customer satisfaction, and enhance the safety of passengers and employees. For example, a prediction model can forecast passenger demand at the airport at a given time of day based on roadway traffic, parking sales, shuttle buses, or number of bags checked. Machine learning (ML) describes a set of advanced analytics techniques that use statistical or computational algorithms to discover insights and make predictions from data. Unsupervised clustering models are used to group individual observations by their associations to nearby data points (e.g., k-means clustering). In terms of marketing, airports can cluster travelers by their use of products and services (e.g., retail vendors, Wi-Fi usage) and identify potential customers by the similarity of their purchases or other characteristics. By contrast, supervised classification models are trained on existing labeled data points and then used to assign new cases to predetermined categories (e.g., logistic regression, support vector machines, random forests). The John F. Kennedy International Air Terminal worked with an analytics consulting team to develop passenger profiles based on the finding that travelers arriving for very early morning flights showed a different presentation profile at security checkpoints than travelers arriving for flights later in the day. Computationally complex artificial intelligence (AI) algorithms, such as deep learning neural networks, can provide high performance analytic solutions. A limitation of deep learning AI models is that their inner workings are difficult to interpret.
Airports can stay competitive by leveraging the power of ML approaches, although additional development and outreach is necessary for AI systems to become more widely integrated into day-to-day operations.
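To make the simplest of the predictive methods mentioned above concrete, the sketch below fits a one-variable linear regression by ordinary least squares and uses it to forecast passenger demand from parking transactions. The historical figures are invented for illustration, and a production model would normally use several predictors and a statistical package rather than hand-written formulas.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical history: hourly parking transactions and the number of
# passengers later observed at security in the same hour.
parking_sales = [100, 150, 200, 250, 300]
passengers = [420, 610, 790, 1010, 1180]

intercept, slope = fit_line(parking_sales, passengers)


def predict(parking):
    """Forecast passenger demand for a given number of parking transactions."""
    return intercept + slope * parking


print(f"Forecast for 220 parking transactions: {predict(220):.0f} passengers")
```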
Prescriptive analytics is a process that extends the steps of descriptive and predictive analysis, going beyond the questions of “What happened in the past?” and “What is likely to happen in the future?” to provide recommendations about actions for addressing business problems in real time. Typical methods for prescriptive analytics include the use of rule-based systems, heuristics, and model optimization. The information from descriptive analysis and prediction models provides input for prescriptive analytics to support data-based decision-making and achieve key business objectives. For example, prediction models about peak times in passenger volume at security checkpoints provide recommendations to Transportation Security Administration personnel about opening additional security lanes or making scheduling decisions about staffing needs to meet increased demand. Decisions based on prescriptive analysis can increase efficiency and reduce airport costs over time. Prescriptive analytics can also provide data that informs the planning and development of new projects at an airport. For example, Phoenix Sky Harbor International Airport analyzed passenger survey data to identify factors influencing passenger wait times for airport buses and shuttles. The results of the analysis yielded data that led the airport to extend its sky tram service to the car rental area.
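A rule-based recommendation of the kind described above can be very small. The sketch below turns an hourly passenger forecast into a suggested number of open screening lanes. The per-lane capacity, the number of lanes, and the forecast values are all assumptions chosen for the example, not operational figures.

```python
import math

LANE_CAPACITY_PER_HOUR = 150  # assumed passengers one lane can screen per hour
MAX_LANES = 8                 # assumed number of lanes physically available


def recommend_lanes(forecast_passengers, capacity=LANE_CAPACITY_PER_HOUR,
                    max_lanes=MAX_LANES):
    """Rule: open enough lanes to cover forecast demand, within the limit."""
    needed = max(1, math.ceil(forecast_passengers / capacity))
    return min(needed, max_lanes)


# Hypothetical output of a prediction model: passengers per hour.
hourly_forecast = {"05:00": 900, "06:00": 1400, "07:00": 650, "08:00": 300}

plan = {hour: recommend_lanes(p) for hour, p in hourly_forecast.items()}

# Hours where even the maximum number of lanes cannot meet forecast demand.
over_capacity = [
    hour for hour, p in hourly_forecast.items()
    if math.ceil(p / LANE_CAPACITY_PER_HOUR) > MAX_LANES
]
```

Flagging the over-capacity hours separately matters because a capped recommendation on its own would hide the hours in which queues are still expected to build.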
The use cases in this section provide examples of analytics solutions that were implemented by airports interviewed for the case studies to address business problems encountered by passengers in their journey through the airport. Additional descriptions and details are provided in the Case Studies section.
Customer Journey Scorecard: Digital Airport Platform
The Houston Airport System (HAS) developed a Customer Journey Scorecard as a digital platform for internally reporting feedback about current conditions at the airport. The concept combines a business component for scoring performance on foundational passenger needs with data and analytics technical architecture for implementing the platform in Power BI. The scorecard provides descriptive analytics and data visualizations on eight operational key performance indicators (KPIs) that are being tracked and managed globally: roadway traffic, wait times for Transportation Security Administration (TSA) checkpoints across all terminals (with a goal of 22 minutes or less), customs and border patrol processing, restroom cleanliness, Wi-Fi connectivity across terminals, escalator usage, water pressure, and terminal temperature. HAS uses Smart Restrooms where cleanliness is tracked by customer sentiment ratings registered on an iPad (i.e., happy face/sad face) as passengers exit the restroom. A data snapshot is taken every 15 minutes to determine whether restroom cleanliness is above or below expectations, identify any issues for attention (e.g., “wet floor”), and monitor cleaning frequency. The scorecard gives frontline employees feedback on performance in near real time and provides a line of sight from their activities to the passenger experience, to reinforce a sense of ownership and engagement. The idea for the scorecard received widespread support across relevant stakeholders. Coordination among senior leaders, managers, and the IT team converged to implement the project. In addition, HAS adopted a policy of terminal management, with a separate manager responsible for different operations in each terminal (e.g., custodial service, customer experience). The Chief Technology Officer was enthusiastic about the project feasibility and the Data and Applications team pulled the data together to better understand the passenger journey.
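The 15-minute restroom snapshot described above amounts to grouping ratings into time windows and comparing each window with an expectation. The sketch below shows one way this could be done. The ratings, restroom identifier, and 80 percent threshold are assumptions for illustration and do not describe the actual HAS implementation.

```python
from collections import defaultdict
from datetime import datetime

THRESHOLD = 0.8  # assumed minimum share of "happy" ratings per window

# Hypothetical ratings: (timestamp, restroom id, rating)
ratings = [
    ("2024-03-01T09:02:00", "T1-A", "happy"),
    ("2024-03-01T09:05:00", "T1-A", "happy"),
    ("2024-03-01T09:07:00", "T1-A", "happy"),
    ("2024-03-01T09:11:00", "T1-A", "happy"),
    ("2024-03-01T09:14:00", "T1-A", "sad"),
    ("2024-03-01T09:16:00", "T1-A", "sad"),
    ("2024-03-01T09:21:00", "T1-A", "sad"),
    ("2024-03-01T09:29:00", "T1-A", "happy"),
]


def window_start(timestamp):
    """Round a timestamp down to the start of its 15-minute window."""
    moment = datetime.fromisoformat(timestamp)
    return moment.replace(minute=moment.minute - moment.minute % 15,
                          second=0, microsecond=0)


def snapshot(ratings, threshold=THRESHOLD):
    """Return (restroom, window, happy share, meets expectation) per window."""
    counts = defaultdict(lambda: {"happy": 0, "sad": 0})
    for timestamp, restroom, rating in ratings:
        counts[(restroom, window_start(timestamp))][rating] += 1
    report = []
    for (restroom, start), tally in sorted(counts.items()):
        share = tally["happy"] / (tally["happy"] + tally["sad"])
        report.append((restroom, start.strftime("%H:%M"), round(share, 2),
                       share >= threshold))
    return report


report = snapshot(ratings)
```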
Key Takeaways
- The Customer Journey Scorecard is a digital platform for reporting current airport conditions in near real time to improve the passenger experience.
- The scorecard platform provides employees with performance feedback on eight KPIs.
- The scorecard reinforced employee engagement and enhanced operational efficiency.
Monitoring Roadway Traffic to Reduce Congestion
Seattle-Tacoma International Airport (SEA) was experiencing an increase in passenger volume, which resulted in traffic congestion that impacted passenger travel times to the airport. Airport leadership tasked the airport’s analytics team with measuring the level of congestion that was impacting travelers. The team leveraged data from an existing intelligent transportation system (ITS) of cameras and software to monitor traffic on 1.5 miles of the airport roadway and provide travel alerts to internal teams. The analytics team established data pipelines to ingest the camera data and store traffic information in a data warehouse. Traffic engineers were consulted to develop the correct formula for calculating congestion based on the speed, density, and volume of cars on the roadway to predict the number of passengers affected by severe congestion. SEA deploys data visualizations and analytics for predicting travel times internally to operational teams as dashboards in Tableau. The credibility of the data and prediction models is maintained by comparing travel time estimates to data that is scraped from Google Maps. The solution developed by the SEA analytics team added value by using an existing ITS in a new way to get actionable data about travel flows to estimate congestion, alleviate pressure on the airport roadway, and improve the journey of travelers at the airport.
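The sketch below illustrates two ideas from this use case: deriving travel time and flow from roadway measurements, and checking model estimates against an independent reference. It is not the formula developed by SEA's traffic engineers. It uses only the standard traffic-flow relation (flow equals density times speed) and a mean absolute error. The speeds and reference times are invented; the 1.5-mile roadway length is taken from the text.

```python
ROADWAY_MILES = 1.5  # monitored roadway length stated in the text


def travel_time_minutes(speed_mph, length_miles=ROADWAY_MILES):
    """Minutes needed to cover the roadway at a given average speed."""
    return length_miles * 60 / speed_mph


def flow_vehicles_per_hour(density_per_mile, speed_mph):
    """Standard traffic-flow relation: flow = density * speed."""
    return density_per_mile * speed_mph


def mean_absolute_error(predicted, reference):
    """Average absolute gap between model estimates and reference values."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


# Hypothetical average speeds derived from camera data (mph).
observed_speeds = [30.0, 15.0, 10.0]
predicted = [travel_time_minutes(s) for s in observed_speeds]

# Hypothetical travel times for the same periods from an external source.
reference = [3.5, 6.0, 10.0]

mae = mean_absolute_error(predicted, reference)
print(f"Mean absolute error: {mae:.2f} minutes")
```

Tracking an error measure like this over time gives a simple, reportable indicator of whether the travel time estimates remain credible.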
Key Takeaways
- The analytics team leveraged data from an existing ITS camera system to measure congestion.
- Data visualizations and reporting dashboards display expected travel times for internal teams.
- The airport sends traffic alerts in real time to alleviate congestion on the roadway.
Tracking Passenger Movements to Predict Wait Times
Terminal 4 (T4) at the JFK International Air Terminal experienced a congestion problem at check-in areas due to passengers arriving before the service counters opened. The airport wanted to learn how passengers move through different parts of the airport and gain insights about issues at check-in to alleviate peaks in TSA screening queues and improve the traveler experience, giving customers more time to spend in retail areas. T4 partnered with Copenhagen Optimization, an international software and consultancy company that specializes in airport operations, to deploy an analytics software platform that integrates different sources of information and builds prediction models to forecast wait times. Flight schedule information is analyzed to identify trends in the number of passengers arriving throughout the day and predict passenger loads. Passenger profiles were developed to estimate the presentation times at the security checkpoints. Travelers arriving for very early morning flights show a different presentation profile than passengers arriving at times later in the day. The analytics models predict the length of the security queues and associated wait times based on flight schedule information and passenger profiles. Camera data is also used to adjust the prediction models and make better recommendations about opening additional screening lanes during peak times in passenger load.
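The general idea of combining flight schedules with presentation profiles can be sketched as follows. This is not the Copenhagen Optimization model. The profile shapes, flight list, and lane capacity are assumptions made for illustration, and time is simplified to 30-minute bins.

```python
from collections import defaultdict

BIN_MINUTES = 30
LANE_CAPACITY_PER_BIN = 75  # assumed passengers one lane screens per bin

# Assumed presentation profiles: percent of a flight's passengers reaching
# security in each 30-minute bin before departure (index 0 = last bin).
PROFILES = {
    "early": [10, 50, 40, 0],
    "daytime": [5, 25, 40, 30],
}

# Hypothetical schedule: (departure bin, passengers, profile name)
FLIGHTS = [(6, 200, "early"), (8, 300, "daytime")]


def checkpoint_arrivals(flights):
    """Spread each flight's passengers over the bins before its departure."""
    arrivals = defaultdict(float)
    for departure_bin, passengers, profile in flights:
        for bins_before, pct in enumerate(PROFILES[profile]):
            arrivals[departure_bin - 1 - bins_before] += passengers * pct / 100
    return arrivals


def simulate_queue(arrivals, lanes):
    """Return {bin: (queue length, estimated wait in minutes)}."""
    capacity = lanes * LANE_CAPACITY_PER_BIN
    queue, result = 0.0, {}
    for time_bin in range(min(arrivals), max(arrivals) + 1):
        # Carry over anyone not screened in the previous bin.
        queue = max(0.0, queue + arrivals.get(time_bin, 0.0) - capacity)
        result[time_bin] = (queue, queue * BIN_MINUTES / capacity)
    return result


arrivals = checkpoint_arrivals(FLIGHTS)
two_lanes = simulate_queue(arrivals, lanes=2)
one_lane = simulate_queue(arrivals, lanes=1)
```

Running the simulation for different lane counts shows how a forecast can be turned into a recommendation about when additional lanes are worth opening.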
Key Takeaways
- T4 deployed an analytics software platform to track passenger flow at check-in and security.
- Prediction models estimate queue length and wait times for passengers at TSA screenings.
- Models are updated with camera data to recommend opening additional screening lanes.
Optimizing Staff Rostering with Prediction Models
DAA (formerly Dublin Airport Authority) is a data-driven organization that operates the Dublin Airport (DUB). In 2014, DAA made a strategic decision to create a data culture and invest in new operating models, technical platforms, and analytics talent that was implemented in a phased, multi-year approach. As part of this initiative, Dublin Airport deployed advanced analytics models to predict passenger volumes each day based on arrival patterns, flight schedules, and seasonal information. Passenger forecasts provide a basis for making recommendations about the number of security screening lanes to open and the number of staff required in each lane to meet the anticipated demand. An automated rostering system optimizes staff rosters based on employee work schedule preferences and shift duration to match staff preferences with predicted peaks in demand. The DAA data science and engineering team integrates outputs from the prediction models directly into operational systems, as the outputs of the staffing roster system are fed into the business system that updates staffing for users. A key focus for the future is to have model outputs ingested by business tools seamlessly as part of the workflow rather than existing in separate systems. Optimizing staff rostering provides a workable solution to a well-defined business problem. This prescriptive approach achieves several business objectives by managing staffing, reducing wait times, and improving customer service.
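Matching staff preferences with required headcount can be illustrated with a simple greedy heuristic. The sketch below is not the DAA rostering system, which the source does not describe in detail; a production roster would typically be solved with an optimization or constraint solver. The shift names, headcounts, and employees are hypothetical.

```python
def build_roster(required, preferences):
    """Greedy assignment: honor first choices, then second choices, and so on.

    required: {shift: headcount needed, e.g. derived from a demand forecast}
    preferences: {employee: [shifts in order of preference]}
    """
    roster = {shift: [] for shift in required}
    unassigned = list(preferences)
    max_rank = max((len(p) for p in preferences.values()), default=0)
    for rank in range(max_rank):
        for employee in list(unassigned):
            choices = preferences[employee]
            if rank >= len(choices):
                continue
            shift = choices[rank]
            if shift in roster and len(roster[shift]) < required[shift]:
                roster[shift].append(employee)
                unassigned.remove(employee)
    shortfall = {
        shift: required[shift] - len(staff)
        for shift, staff in roster.items()
        if len(staff) < required[shift]
    }
    return roster, unassigned, shortfall


# Hypothetical inputs.
required = {"early": 2, "mid": 1, "late": 1}
preferences = {
    "Ana": ["early", "mid"],
    "Ben": ["early", "late"],
    "Cy": ["early", "mid"],
    "Di": ["late"],
    "Eli": ["late", "mid"],
}

roster, unassigned, shortfall = build_roster(required, preferences)
```

The returned shortfall and unassigned lists show where the heuristic could not reconcile demand with preferences, which is the point at which a human scheduler or a more capable solver would step in.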
Key Takeaways
- Dublin Airport optimizes staff rostering based on prediction models of passenger demand.
- The outputs of the prediction models are fed directly into the staff rostering system.
- The rostering system matches staff shift preferences with expected passenger demand.
Implementing and expanding an analytics program requires airports to maintain an up-to-date IT infrastructure for collecting, integrating, storing, and analyzing the very large amounts of data collected. Some historical airport data may be stored in legacy systems such as Excel spreadsheets or relational databases (e.g., Oracle). Cloud-based systems for storing and managing data, such as a data warehouse or data lake [e.g., Amazon Web Services (AWS), Google Cloud], provide state-of-the-art technology for ingesting and storing the massive amounts of data generated by different divisions of the airport. Organizations build data lakes to harness multiple data streams, consolidate them into a single source of truth, and make the data available to intended users for appropriate use cases. Many cloud-based computing systems support data science and data analytics capabilities (e.g., AWS Redshift, Snowflake). In some cases, separate connections (e.g., Redshift connector) are needed to feed data stored in the cloud to analytics tools, such as Power BI, and regulate the amount of data ingested at one time.
Descriptive analytics do not require complex analytics platforms; data reporting dashboards can be constructed using Excel spreadsheets. Analytics platforms such as Power BI, Tableau, and Alteryx provide extended capabilities for developing more sophisticated dashboards with an integrated user interface. To remain competitive, airports should invest in and upgrade their software and computing resources to deploy more sophisticated analytics techniques for analyzing and understanding the increasing amounts of data generated across the organization. Analytical prediction models are developed using statistical packages or programming languages. Airports can leverage existing software applications for conducting data analysis (e.g., Stata, SPSS) or use open-source applications that are available at lower cost (e.g., R, Python). Several off-the-shelf commercial software products are available that combine multiple analytics capabilities in a single platform (e.g., Alteryx, Qlik Sense, Sisense). Analytics teams can also build high-performance prediction models in-house using machine learning (ML) and neural network algorithms [artificial intelligence (AI)], which provide sophisticated analytics solutions.
Airports can partner with external consultants to develop complex analytics solutions rather than hiring and retaining analytics talent. For example, JFK International Air Terminal and other airport operators are working with Copenhagen Optimization to deploy advanced analytics software for intelligent queueing and predicting passenger wait times. In addition, Dallas Fort Worth International Airport provided historical data on vehicle traffic, aircraft movements, and weather to researchers at the National Renewable Energy Laboratory that developed and compared several ML and recurrent neural network models for forecasting traffic demand (Lunacek et al. 2021). In other research, deep learning models have been deployed to identify variables that influence arrival flow at airports (Yang et al. 2020) and explore how major weather conditions impact airport operations (Schultz et al. 2021). A recent review on the role of explainable AI in air traffic management systems revealed a focus on the descriptive level of analysis with some predictive characteristics (Degas et al. 2022). Additional research is needed on the predictive and prescriptive levels of AI to realize its potential value for aviation (Salinas et al. 2020).
In developing a software application or analytics platform, the data architecture describes the overall design of the computing system and the logical and physical interrelationships between its components. The Customer Journey Scorecard developed by the Houston Airport System (HAS) IT Data and Applications team provides an example of an innovative analytics platform currently in use. Data from each of the different information sources that are initially independent (i.e., siloed) are ingested into a cloud-based data lake. For tracking restroom cleanliness, an external vendor is used for data collection, and the data is pulled from the vendor every 15 minutes using an open application programming interface (API). The raw data is evaluated, transformed, and loaded into analytics applications for subsequent steps in the analysis. The backend of the user app was developed with the Amazon Web Services (AWS) Redshift, Athena, and Lambda functions. All data sources were ingested into the AWS architecture and connected to the analytics tools. A data gateway was established in Redshift to connect AWS to Power BI to constrain the amount of data pulled into Power BI at a given time. HAS hired an external consultant to assist with the initial Power BI setup. Currently, the Customer Journey Scorecard platform is supported internally by the applications team.
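The evaluate, transform, and load sequence described above can be sketched in a few functions. The vendor payload, field names, and validation rules below are hypothetical, and the fetch function is a stand-in that returns canned data rather than calling a real vendor API.

```python
import json

# Hypothetical fields expected in each vendor record.
REQUIRED_FIELDS = {"restroom_id", "timestamp", "rating"}
VALID_RATINGS = {"happy", "sad"}


def fake_vendor_fetch():
    """Stand-in for the vendor API call; returns a canned JSON payload."""
    return json.dumps([
        {"restroom_id": "T1-A", "timestamp": "2024-03-01T09:02:00", "rating": "Happy"},
        {"restroom_id": "T1-A", "timestamp": "2024-03-01T09:14:00", "rating": "sad"},
        {"restroom_id": "T2-B", "rating": "happy"},  # missing timestamp
    ])


def extract(fetch):
    """Call the supplied fetch function and decode its JSON response."""
    return json.loads(fetch())


def transform(raw_records):
    """Keep well-formed records and normalize ratings to lowercase."""
    clean, rejected = [], []
    for record in raw_records:
        rating = str(record.get("rating", "")).lower()
        if REQUIRED_FIELDS.issubset(record) and rating in VALID_RATINGS:
            clean.append({**record, "rating": rating})
        else:
            rejected.append(record)
    return clean, rejected


def load(records, table):
    """Append records to the target table and report how many were loaded."""
    table.extend(records)
    return len(records)


warehouse_table = []  # in-memory stand-in for a warehouse table
clean, rejected = transform(extract(fake_vendor_fetch))
loaded = load(clean, warehouse_table)
```

In practice a scheduler would invoke this sequence at the chosen interval, and rejected records would be logged for follow-up with the vendor.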
Figure 1 shows a sample data architecture that outlines the steps in an end-to-end analytics solution to enhance operational efficiency and improve the passenger experience. The architecture could be implemented using other comparable software products and computing resources. The steps below outline the end-to-end analytics workflow for the sample data architecture, corresponding to the numbered steps indicated in Figure 1:
- Build a data platform with roadway traffic, escalator sensors, customs processing, security screening, Wi-Fi connectivity, restroom cleanliness, water pressure, and room temperature.
- Use an API to pull data from the vendor; leverage AWS Data Exchange to collect data from other sources.
- Provide staging for ingesting the data using cost-effective storage classes in Amazon Simple Storage Service (S3). Use open standards to build the data lake using the same data as the operational data platform.
- Use a schema-on-read pattern to make the raw data and curated data ready for all user roles; build all reportable datasets in Amazon S3.
- Leverage Amazon Redshift and Amazon Athena for analytics. Optionally, build data marts in Amazon Redshift for heavily used analytics. For ad hoc requirements, publish the data catalog and use Amazon Athena for analysis directly using the data lake.
- Use purpose-built databases, such as Amazon DynamoDB, and serverless architecture to deliver microservices and events for operational data storage.
- Build operational dashboards and end-user applications by leveraging these microservices.
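The read-time schema step in the list above is commonly called schema-on-read: raw records are stored unchanged, and structure is applied only when the data is queried. The sketch below illustrates the idea in plain Python without any cloud services. The sensor names, fields, and user roles are hypothetical.

```python
import json

# Raw events are stored unchanged, one JSON document per line,
# as they might sit in a data lake. All values arrive as strings.
RAW_LINES = [
    '{"sensor": "escalator-3", "ts": "2024-03-01T09:00:00", "value": "1", "unit": "running"}',
    '{"sensor": "water-T2", "ts": "2024-03-01T09:00:00", "value": "54.2", "unit": "psi"}',
    '{"sensor": "temp-T2", "ts": "2024-03-01T09:00:00", "value": "71.5", "unit": "F", "note": "gate B4"}',
]

# Each schema maps a field name to a converter and is applied at read time.
# The roles shown here are assumptions for the example.
SCHEMAS = {
    "operations": {"sensor": str, "ts": str, "value": float, "unit": str},
    "executive": {"sensor": str, "value": float},
}


def read_with_schema(lines, schema):
    """Parse each raw line and project it onto the requested schema."""
    for line in lines:
        document = json.loads(line)
        yield {field: convert(document[field]) for field, convert in schema.items()}


operations_view = list(read_with_schema(RAW_LINES, SCHEMAS["operations"]))
executive_view = list(read_with_schema(RAW_LINES, SCHEMAS["executive"]))
```

Because the raw lines are never rewritten, a new role or report can be supported later by adding a schema rather than reloading the data.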
Figure 1. Example Data Architecture for Airports
Airports can realize substantial benefits by implementing and improving their use of data and analytics. The airports interviewed for the case studies provide several illustrative examples of gains obtained from analytics solutions.
Business Intelligence
- The primary value of data analytics is to inform decision-making and achieve strategic business objectives.
- For Des Moines International Airport, becoming a more data-driven organization will support more effective business and operational decisions based on data.
Data Integration and Storage
- The benefit of a cloud-based data warehouse or data lake is that it stores all the different streams of information into a single source of truth, from which end users can extract data for further analysis.
- For San Antonio International Airport, the benefit of developing a central airport information management system is that division managers can enter data directly into a central data lake or data warehouse and relevant information can be queried according to the needs of internal stakeholders.
Key Performance Indicators
- The benefit of key performance indicators (KPIs) is that they provide essential information about financial performance across operational divisions.
- Phoenix Sky Harbor International Airport leveraged data from customer surveys to improve customer-facing services and processes.
Reporting Information in Near Real Time
- The benefit of data visualization and reporting dashboards is that they visually summarize data at a glance and communicate progress on KPIs toward predefined targets.
- The Customer Journey Scorecard digital airport platform developed at the Houston Airport System is a reporting tool that provides value by improving operational efficiency. Obtaining customer feedback in near real time is beneficial for managers to mitigate issues affecting the passenger experience.
- In addition, Seattle-Tacoma International Airport repurposed an existing technology system to obtain actionable data about roadway congestion and travel time to the airport. Alerts about congestion were sent out to internal teams in real time, which helped to improve customer service.
Forecasting Target Outcomes
- Advanced analytics techniques and prediction models are useful for identifying trends in historical data and forecasting what is likely to happen in the future for a target outcome based on what has happened in the past (e.g., predicting passenger demand throughout the day).
- The JFK International Air Terminal developed prediction models that integrated flight schedule information and passenger profiles to forecast security queue lengths and reduce wait times at security checkpoints.
- Dashboards informed internal teams about increased passenger demand so they could open additional security screening lanes.
Recommendations for Solving Business Problems
- Data from the descriptive and predictive steps is taken to prescribe a course of action to solve a designated business problem.
- The advanced analytics passenger forecast models developed by Dublin Airport were essential for determining the number of staff required for security screening lanes. The staff rostering system was optimized by matching employee shift preferences with anticipated peaks in demand to provide schedule updates for users.
Airports can face several challenges in implementing analytics, from the initial understanding of the business problem to data integration to model development to making actionable decisions based on the data analytics findings.
Understanding the Problem
- At the initial phases of analytics, the business problem under consideration may not be well-defined. Internal stakeholders may not know what data is needed to address the problem, or the relevant data may not be available.
- Understanding the problem is key to finding a solution and can streamline conversations between the end users, analysts, and developers. Having a dedicated team responsible for data analytics is helpful for delineating project goals and data needs.
- The analytics team can coordinate with internal stakeholders to identify manageable projects, relevant sources of information, and suitable analytics approaches to meet the user needs.
Integrating Data Sources
- Bringing together and formatting data from diverse sources was reported as a major challenge for the airports interviewed. The quality and granularity of the data are also important considerations. For example, survey data are highly structured and easily analyzed, although self-reported information can be biased and not fully accurate.
- At Phoenix Sky Harbor International Airport, parking services generate a huge amount of data (e.g., 18 million records per day) that is very accurate but can be unmanageable.
- At San Antonio International Airport, various departments and divisions submit financial information and budget statistics as separate Excel spreadsheets on a weekly basis. The data is ingested into a data lake as a central repository and source of truth.
- Automating data collection and processing is a pain point for many airports. The Seattle-Tacoma International Airport is using Alteryx to optimize dataflows and transform data for analysis.
Limitations of Data Reporting
- The data reported on dashboards and data visualizations is often from the past day, the past week, or the past month. Collecting passenger data in real time can be challenging.
- Data that is captured in near real time (e.g., every 5 or 15 minutes), or unstructured data from video cameras, requires an additional step of processing before it can be reported. Video camera data also does not provide information about what flight a passenger is arriving for.
- Furthermore, when building dashboards that report data in near real time, the amount of data pulled from the vendor into the analytics platform (e.g., Power BI) must be limited so that each refresh of the application does not load an excessive volume of data.
- The technical details must be worked out in coordination with the relevant teams and internal stakeholders (e.g., IT Data and Applications team and terminal managers in the case of the Houston Airport System Customer Journey Scorecard digital airport platform).
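Two of the processing steps described above, rolling raw events up into fixed intervals and limiting how much history each refresh loads, can be sketched as follows. This is a minimal sketch under stated assumptions: the 15-minute interval, the two-hour window, the sample timestamps, and the function names are hypothetical. It does not represent how Power BI or any vendor feed handles refreshes.

```python
from collections import Counter
from datetime import datetime, timedelta


def bin_start(ts, minutes=15):
    """Round a timestamp down to the start of its interval.

    Assumes `minutes` divides 60 evenly (e.g., 5 or 15).
    """
    return ts.replace(minute=ts.minute - ts.minute % minutes, second=0, microsecond=0)


def summarize_recent(events, now, window=timedelta(hours=2), minutes=15):
    """Count events per interval, keeping only those inside the window."""
    cutoff = now - window
    counts = Counter()
    for ts in events:
        if cutoff <= ts <= now:
            counts[bin_start(ts, minutes)] += 1
    return dict(sorted(counts.items()))


# Hypothetical passenger-detection timestamps from a sensor feed.
now = datetime(2024, 1, 8, 12, 0)
events = [
    datetime(2024, 1, 8, 9, 30),   # older than the window, so it is skipped
    datetime(2024, 1, 8, 11, 2),
    datetime(2024, 1, 8, 11, 14),
    datetime(2024, 1, 8, 11, 47),
]
summary = summarize_recent(events, now)
```

Filtering to a recent window before aggregating keeps the amount of data handled on each refresh roughly constant, regardless of how much history the source system holds.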
User Adoption and Culture Change
- Many airports reported that user adoption is a challenge for implementing analytics and that culture change is needed across different airport divisions.
- DAA (formerly Dublin Airport Authority) found that teams in operational divisions, such as security checkpoints, were more risk averse about staffing needs (i.e., they scheduled more staff) and slower to adopt analytics-based changes than commercial divisions of the airport (e.g., parking).
- Improving the data literacy of end users may help to reduce risk aversion and to facilitate operational changes based on the results of analytics models.
- Members of the analytics team can engage with end users and internal stakeholders to provide training and explain how data analytics models can provide solutions to address practical business needs.
Implementing and improving analytics for airports is a complex and iterative process that requires collaboration, data governance, and ongoing maintenance of data and technical resources to ensure long-term success. Planning and coordination among airport leaders and internal stakeholders are essential for supporting analytics initiatives, as are the needed investments in technology and expertise. Finding the right balance between an idea, user needs, and the technical implementation of the analytics solution can provide a foundation for the success of projects such as the Customer Journey Scorecard. The process of implementing data analytics involves several key steps that are outlined below:
Align Data and Analytics with Business Objectives
- Define Objectives. Identify a small number of well-defined, high-leverage business problems that can be addressed and will produce results that demonstrate value to business leaders. This drives the team to identify the data and analytics approaches needed.
- Identify Data Sources. Determine relevant sources of available data, including passenger data (e.g., ticketing, boarding, baggage), flight data (e.g., schedules, delays, cancellations), operational data (e.g., staffing, equipment usage, maintenance), and other data (e.g., security).
- Stakeholder Engagement. Engage relevant airport stakeholders—leaders, managers, end users, service providers—to promote engagement and collaboration in analytics implementation, align analytics initiatives with user needs, and foster a culture of data-driven decision-making.
Data Management
- Data Quality and Cleaning. Ensure the quality of the data via preprocessing and data cleaning steps to remove duplicates, handle missing values, resolve inconsistencies, standardize data formats, perform validation, and ensure more accurate and reliable data for subsequent analysis.
- Data Integration. Establish mechanisms for collecting and integrating data resources into a unified data platform such as a data warehouse or lake. Work with airport stakeholders, airlines, third-party vendors, and technology service providers to ensure the availability of data.
- Data Storage and Infrastructure. Set up an appropriate infrastructure to store and manage the collected data. This can involve utilizing cloud-based platforms such as a data warehouse or lake. Establish data governance and security procedures to protect customer privacy.
Analytics Model Development
- Data Visualization and Reporting. Construct visualizations of data and establish mechanisms for communicating the analyzed data in a meaningful way. Reporting dashboards and interactive visualizations can help airport stakeholders easily interpret data and understand insights.
- Data Analysis and Predictive Modeling. Develop predictive models to forecast target outcomes (e.g., passenger load) using statistical and computational methods [e.g., machine learning (ML) algorithms] that extract insights from historical and real-time data.
- Actionable Insights for Decision-Making. Deploy prescriptive analytics to translate insights from data into actionable recommendations for data-based decision-making. This includes optimizing resource allocation, enhancing operational efficiency, and improving the passenger experience.
Iteration and Optimization
- Continuous Improvement. Implement an iterative process for continuously monitoring the effectiveness of analytics solutions. Collect feedback from stakeholders and evaluate the impact of data-driven decisions to refine and improve the airport analytics capabilities over time.
- Process Automation. Automate analytics processes and functions to accomplish defined business goals. Examples include automated data ingestion, integration and storage, automatic updates to reporting dashboards, and using automated statistical, ML, and artificial intelligence algorithms.
- Data Architecture. Optimize the data architecture that specifies the overall design of the computing system, including the hardware, software applications, and network access protocols, as well as the interrelationships between the system components.
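The data cleaning and predictive modeling steps outlined above can be illustrated with a small end-to-end sketch. It assumes daily passenger counts arrive as text records. The sample rows, the two accepted date formats, and the function names are hypothetical. Missing values are simply dropped here, although a real pipeline might impute them, and the ordinary least-squares trend line is a deliberately simple stand-in for the statistical or ML models an airport would actually select.

```python
from datetime import datetime

# Hypothetical raw daily passenger counts with typical quality problems.
RAW_ROWS = [
    {"date": "2024-01-01", "passengers": "1000"},
    {"date": "2024-01-01", "passengers": "1000"},   # exact duplicate
    {"date": "01/02/2024", "passengers": "1100"},   # different date format
    {"date": "2024-01-03", "passengers": ""},       # missing value
    {"date": "2024-01-04", "passengers": "1300"},
    {"date": "2024-01-05", "passengers": "-5"},     # fails validation
]

DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")  # assumed input formats


def parse_date(text):
    """Standardize a date string to a date object, or None if unrecognized."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    return None


def clean(rows):
    """Drop duplicates, missing values, and invalid counts; sort by date."""
    cleaned = {}
    for row in rows:
        day = parse_date(row["date"].strip())
        value = row["passengers"].strip()
        if day is None or not value:
            continue  # Unparseable date or missing count.
        count = int(value)
        if count < 0:
            continue  # A negative passenger count is invalid.
        cleaned[day] = count  # One entry per date; a later duplicate overwrites.
    return sorted(cleaned.items())


def fit_trend(series):
    """Fit an ordinary least-squares line through (day index, count) pairs."""
    start = series[0][0]
    xs = [(day - start).days for day, _ in series]
    ys = [count for _, count in series]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return start, slope, intercept


def forecast(model, day):
    """Predict the count for a given date from the fitted line."""
    start, slope, intercept = model
    return intercept + slope * (day - start).days


series = clean(RAW_ROWS)
model = fit_trend(series)
next_day = forecast(model, datetime(2024, 1, 5).date())
```

Cleaning runs before model fitting because duplicated or invalid records would otherwise distort the fitted trend. Keeping the two stages as separate functions also makes each one easier to monitor and automate independently.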
The abundant data available to airports from a range of sources raises issues related to the storage and privacy of data, as well as the sharing of data with external partners or competitors (see Table 2). Certain data types, such as personal identifying information (PII), are subject to strict regulations for storage and disposal, whereas other data types, such as passenger count, are fully public and displayed on airline websites. It is critical for airports to take proactive steps to ensure data security and to prevent the negative outcomes of security failures.
After collecting data, it is beneficial for airports to engage in data sharing with other entities (e.g., other airports, airlines, concession vendors) to obtain better outcomes in areas such as passenger satisfaction. The complex nature of aviation, and the potential for extreme outcomes if there are incidents, fuels the need for data sharing at all levels within airports (Global Aerospace 2020). Airports can use data sharing agreements to outline how data will be shared and how the data may be used appropriately (U.S. Geological Survey n.d.). When developing the agreement, certain content areas should be addressed: authority, access provisions, confidentiality and disclaimers, time limits, and modification processes. Data sharing may also occur at a group level through data sharing programs in which multiple airports share their experiences. To be involved in data sharing effectively, an airport must first identify where it can find useful data, establish how to analyze it, and then implement actions (Global Aerospace 2020). Without a learning and change implementation aspect to an airport’s data sharing involvement, there may be little to no measurable benefit.
Table 2. Potential Legal Considerations of Data Sharing and Privacy for Different Sources of Data
The information in this table should not be construed as legal advice; local, state, and federal laws should be consulted as appropriate.
Airports are also regulated by law in terms of the data that they are required to collect and track. Section 214 of the Federal Aviation Administration (FAA) Modernization and Reform Act of 2012 requires airports to collect data to satisfy certain environmental regulations, obtain and maintain Part 139 Airport Certification, or be eligible to receive Airport Improvement Program (AIP) and Passenger Facility Charge (PFC) Program funding. The National Environmental Policy Act and the Airport Noise Compatibility Planning Program require all airports to track local noise exposure [14 Code of Federal Regulations (CFR) Part 150], and the Clean Air Act requires all airports to track pollutant emissions. Airports geographically located in cold climates that regularly apply deicing or anti-icing agents to equipment and runways are required by federal Airport Deicing Effluent Guidelines (40 CFR Part 449) to track statistics on usage and amount recovered. Airports must provide data on a variety of key performance indicators to secure Part 139 Airport Certification from the FAA, including pavement classification number by runway, aircraft rescue and firefighting responses within mandated response times, and runway incursions. Airports must also collect and provide data on metrics such as debt service coverage ratio, airport concession revenue per enplaned passenger, and annual enplaned passengers as a condition of receiving AIP or PFC funding from the FAA.