Data processing platform for a major European bus manufacturer

The client

Our client, a manufacturer of public transport vehicles, faces the challenge of exploiting vehicle operating data for activity analysis and predictive maintenance.

The challenge

Since 2018, Akkodis has set up three processing chains collecting data from 9,000 buses in operation throughout Europe. The challenge was to migrate these processing chains and the associated storage solutions to an AWS / Databricks environment.

The solution

We access Databricks through an SSO connection that was already set up, without any intervention on our part.

We use S3 for data storage, CloudWatch to monitor that storage, EC2 to run the virtual machines associated with Databricks jobs, and Databricks for Python and SQL processing.
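
For illustration, a minimal sketch of this S3-to-Databricks pattern might look like the following; the bucket, prefix, and column names are placeholders, not the client's actual resources:

```python
# Minimal sketch of the S3 -> Databricks pattern described above.
# Bucket name, prefix, and column names are illustrative placeholders.
from pyspark.sql import functions as F

RAW_PATH = "s3://example-bus-telemetry/raw/events/"   # hypothetical bucket/prefix

# In a Databricks notebook the `spark` session is already available.
events = (
    spark.read
    .json(RAW_PATH)                                   # raw vehicle events landed in S3
    .withColumn("event_date", F.to_date("event_timestamp"))
)

# Expose the data to the SQL-based processing steps.
events.createOrReplaceTempView("vehicle_events")
spark.sql(
    "SELECT event_date, COUNT(*) AS events FROM vehicle_events GROUP BY event_date"
).show()
```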

Security threat management is not handled directly by our team. We only ensure that Databricks runtimes are kept up to date with the latest improvements and fixes.

Performance monitoring is done through a Databricks job that tracks several metrics (a sketch of two of these checks follows the list):

  1. Cluster usage (in minutes) to process collected events.
  2. Number of files rejected per day.
  3. Number of unique VINs.
  4. Non-standard dates over the last two weeks.
  5. Event delays over the last 120 days.
  6. PCM and Intellibus events per day.
  7. Unknown VINs by date and distinct unknown VINs.
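
As an illustration, a minimal PySpark sketch of two of these checks might look like the following; the table names (processed_events, rejected_files) and column names (vin, event_date) are assumptions, not the actual schema:

```python
# Hedged sketch of two of the monitored metrics; table and column names
# (processed_events, rejected_files, vin, event_date) are assumptions.
from pyspark.sql import functions as F

events = spark.table("processed_events")
rejected = spark.table("rejected_files")

# Metric 3: number of unique VINs seen in the processed data.
unique_vins = events.select(F.countDistinct("vin").alias("unique_vins"))

# Metric 2: number of files rejected per day.
rejected_per_day = (
    rejected.groupBy("event_date")
    .count()
    .withColumnRenamed("count", "rejected_files")
    .orderBy("event_date")
)

unique_vins.show()
rejected_per_day.show()
```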

Solution architecture

KPIs currently being tracked include (a cost-estimation sketch follows the list):

  1. Cost of each cluster based on processing time.
  2. Cluster usage in minutes, depending on the types of machines used.
  3. Comparison of costs between different cluster types to optimize resources.
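
As a hedged illustration of how such KPIs can be derived, the sketch below computes an approximate cost per cluster from usage minutes; the cluster names, instance types, and per-minute rates are placeholder assumptions, not actual pricing:

```python
# Illustrative cost estimate from cluster usage minutes; the usage data,
# instance types, and per-minute rates are placeholder assumptions.
usage_minutes = {
    ("etl-cluster", "i3.xlarge"): 5400,
    ("etl-cluster", "i3.2xlarge"): 1200,
    ("reporting-cluster", "m5.xlarge"): 900,
}

# Assumed blended AWS + Databricks rate per instance-minute (USD).
rate_per_minute = {
    "i3.xlarge": 0.012,
    "i3.2xlarge": 0.024,
    "m5.xlarge": 0.009,
}

for (cluster, instance_type), minutes in usage_minutes.items():
    cost = minutes * rate_per_minute[instance_type]
    print(f"{cluster:20s} {instance_type:12s} {minutes:6d} min  ~${cost:,.2f}")
```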

Incidents are discussed during sprint review meetings or by email. If necessary, a ticket is created and tracked in Azure DevOps for processing by the Akkodis teams.

Before each update, we check:

  1. Correct configuration of the Databricks cluster (see the sketch after this list).
  2. The availability of the associated EC2 instances.
  3. That the notebooks are correctly versioned and pushed to GitLab.
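
As an example of the first check, the following sketch queries the Databricks Clusters REST API (GET /api/2.0/clusters/get); the workspace URL, token, and cluster id are placeholders:

```python
# Hedged sketch of the cluster-configuration check using the Databricks
# Clusters REST API. Workspace URL, token, and cluster id are placeholders.
import requests

WORKSPACE = "https://example.cloud.databricks.com"   # hypothetical workspace URL
TOKEN = "dapi-REDACTED"                              # personal access token (placeholder)
CLUSTER_ID = "0123-456789-abcde"                     # placeholder cluster id

resp = requests.get(
    f"{WORKSPACE}/api/2.0/clusters/get",
    headers={"Authorization": f"Bearer {TOKEN}"},
    params={"cluster_id": CLUSTER_ID},
    timeout=30,
)
resp.raise_for_status()
cluster = resp.json()

# Basic sanity checks before rolling out an update.
print("State:          ", cluster.get("state"))
print("Runtime version:", cluster.get("spark_version"))
print("Node type:      ", cluster.get("node_type_id"))
```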

Network security management is not directly our responsibility and is handled by a client-side team.

The management of in-flight data encryption and certificates (including certificate renewal and TLS protocol configuration) is not our responsibility and is handled by a client-side team.

Job deployment is primarily handled by Databricks, which provides the tools to schedule and execute them. We also integrated GitLab to version notebooks and allow jobs to be automatically triggered from there. However, this setup is not a typical CI/CD pipeline, as it does not include automated phases like testing or validation before deployment.
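
As a rough sketch of how a job run can be triggered once notebooks have been pushed to GitLab, the following calls the Databricks Jobs API (POST /api/2.1/jobs/run-now); the workspace URL, token, and job id are placeholders:

```python
# Hedged sketch of triggering a Databricks job from a GitLab pipeline step
# using the Jobs API. Workspace URL, token, and job id are placeholders.
import requests

WORKSPACE = "https://example.cloud.databricks.com"   # hypothetical workspace URL
TOKEN = "dapi-REDACTED"                              # personal access token (placeholder)
JOB_ID = 123456                                      # placeholder job id

resp = requests.post(
    f"{WORKSPACE}/api/2.1/jobs/run-now",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={"job_id": JOB_ID},
    timeout=30,
)
resp.raise_for_status()
print("Triggered run:", resp.json().get("run_id"))
```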

RTO and RPO are not formally defined in our current environment; they rely on standard Databricks and AWS configurations.

Our architecture is deployed in a single AWS region. We have no visibility into how Multi-AZ is used or configured.

TCO analysis focuses on costs related to:

  1. Storing data in S3, depending on volume and frequency of access (see the sketch after this list).
  2. Running Databricks clusters, including usage time and the size of the configured instances.
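
As a back-of-envelope illustration of the S3 part of this analysis, the sketch below sums object sizes under a prefix and applies an assumed per-GB-month rate; the bucket, prefix, and rate are placeholders, not actual pricing:

```python
# Rough TCO sketch for the S3 storage component: sum object sizes under a
# prefix and apply an assumed per-GB-month rate. Bucket, prefix, and rate
# are placeholder assumptions.
import boto3

BUCKET = "example-bus-telemetry"     # hypothetical bucket
PREFIX = "raw/events/"
RATE_PER_GB_MONTH = 0.023            # assumed S3 Standard rate (USD)

s3 = boto3.client("s3")
total_bytes = 0
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        total_bytes += obj["Size"]

total_gb = total_bytes / (1024 ** 3)
print(f"Stored: {total_gb:,.1f} GB  ->  ~${total_gb * RATE_PER_GB_MONTH:,.2f} / month")
```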

The result

  • The data chains are functional, and we ensure their maintenance and operation.
  • Control of the scope.
  • Mastery of the data chains in an AWS environment.
  • Mastery of the migration processes.