Companies collect billions of data points daily from sensors, machines, and systems, raising the risk of making decisions based on unreliable data. As the saying goes in the data world: Garbage in, garbage out. Poor data leads to poor results, and that can be costly.

Danfoss was aware of this when they asked FORCE Technology to assess the quality of data from a test rig in the Controls and Thermal Management (CTM) department. It turned out to be a wise decision, as the data quality did not meet expectations, and the system had been hiding minor errors in test data for years. 

The system hid errors

The project was part of a broader digitalisation initiative across Danfoss's global test centres. It was led by Daniel Frederik Busemann (then Senior Engineer, Process Automation) and aimed, among other things, to optimise data handling, which was often done manually.

FORCE Technology analysed the data quality from a test rig capable of testing four valves in series. The rig simulates a refrigeration system used in medium to large supermarkets.

[Image: Pipes and valves. The analysis revealed hidden errors in test data from the valve system, which led Danfoss to change its workflows.]

The analysis revealed that the system began losing data points after 24 hours, something that had previously gone unnoticed. The error escaped detection because the old system automatically filled in missing values with the most recent measurement, so the stored data looked complete. Most of Danfoss's previous tests, however, were shorter than 24 hours and therefore unaffected by the issue.

"We knew we couldn't be sure about the data quality, but we had never looked at our data in that way before and realised that there were actually errors throughout the entire period," says Daniel Frederik Busemann.

Changes in workflows 

Following the project, Danfoss changed its practices. Test runs are now limited to a maximum of 24 hours, and equipment with a high risk of failure has been prioritised for replacement. Prioritisation is necessary because some of the test equipment is ageing and expensive to replace.

"In the worst case, incomplete test data can lead us to draw the wrong conclusions. And that becomes a quality issue," says Daniel Frederik Busemann.

Faulty data can also lead to errors in AI models trained on that data, creating a chain reaction where a single error propagates throughout the system.

FORCE Technology as a process partner

The project ran over several months, starting with an introductory meeting. Shortly after, FORCE Technology visited the test facility in Nordborg and met the experts who work with the test equipment daily.

"It was really important to have those physical days where we could hand over our knowledge and data for FORCE Technology to work on remotely. Our experts felt like an important part of the process," says Daniel Frederik Busemann.

FORCE Technology's ability to translate technical concepts, so that data specialists and mechanical engineers could speak the same language, strengthened the collaboration. Danfoss's experts contributed the domain knowledge necessary to assess data quality.

Validation rules for data

Although Danfoss is a global company with around 40,000 employees, not all test facilities have dedicated data specialists. That made external help essential:

"We wouldn't have had the resources to do this ourselves. It was much easier to bring in an external partner than to wait for our own data analytics people," says Daniel Frederik Busemann.

FORCE Technology didn't deliver a complete software solution, but rather a method and tools, including validation rules and code, that Danfoss can integrate into its own systems. One example of a validation rule is that a test should automatically stop if more than 10% of the data is missing. 
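As a hedged illustration, such a rule takes only a few lines of Python. The 10% threshold comes from the project, while the function name and data layout below are assumptions for this sketch, not the code actually delivered.

```python
import pandas as pd

MAX_MISSING_FRACTION = 0.10  # threshold named in the validation rule above

def should_stop_test(samples: pd.Series, expected_count: int) -> bool:
    """Return True if more than 10% of the expected samples are missing."""
    missing_fraction = 1 - samples.notna().sum() / expected_count
    return missing_fraction > MAX_MISSING_FRACTION

# Example: a run expected to log 1,000 samples has only 850 valid ones.
print(should_stop_test(pd.Series(range(850), dtype="float64"), 1000))  # True
```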

[Image: Measuring equipment. Danfoss's data was assessed using a two-phase framework: first system analysis, then development and implementation of data validation rules.]

The rules were developed in Jupyter Notebooks and visualised in Grafana. Because both are open-source tools, the solution remains scalable and accessible after the project's end. The visualisations made it clear that data quality dropped significantly after 24 hours, something the existing graphs had not shown.

A systematic approach to data quality

FORCE Technology's method is based on international standards such as ISO 8000-61. It follows a standardised framework with seven steps divided into two phases: first, the system and data architecture are analysed; then, data validation rules are developed and implemented.

Michael Vaa, Head of IoT Architecture & Technology in Digital and Sustainable Innovation, the FORCE Technology department that handled the project, emphasises the method's broad applicability:

"It's about creating processes that customers and companies can implement so that systematic data checks become a regular part of their daily operations." 

Throughout the process, data is assessed across six dimensions: completeness, correctness, timeliness, uniqueness, reusability, and validity.
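To make the dimensions concrete, here is an illustrative sketch of how three of them could be scored for a time-indexed measurement series. It is an assumption-laden example, not FORCE Technology's implementation; the column name "value" and the range check are hypothetical.

```python
import pandas as pd

def quality_scores(df: pd.DataFrame, expected_rows: int,
                   valid_range: tuple[float, float]) -> dict[str, float]:
    """Score completeness, uniqueness, and validity for a 'value' column."""
    lo, hi = valid_range
    return {
        # Completeness: share of expected samples actually present.
        "completeness": df["value"].notna().sum() / expected_rows,
        # Uniqueness: share of rows whose timestamp occurs only once.
        "uniqueness": 1.0 - df.index.duplicated(keep=False).mean(),
        # Validity: share of values inside a physically plausible range
        # (missing values count as invalid here).
        "validity": df["value"].between(lo, hi).mean(),
    }
```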

Data quality in an IoT world

Projections estimate that the number of IoT devices will exceed 75 billion this year, yet only about 1% of the data they generate is used, partly because the quality isn't high enough.

"When you look at the use of AI and machine learning, which can be incredibly powerful tools, you have to ask: what kind of data are you feeding into the algorithms?" says Michael Vaa.

At Danfoss, there was awareness of uncertainty in measurement equipment, sensors, and prototypes, but not everyone realised that the data transfer itself could be flawed.

"Poor data quality is a barrier to unlocking the potential of AI and digitalisation. In the data economy, where data is a commodity, you also want to ensure that it meets the necessary quality standards," says Michael Vaa.

Lessons learned and next steps

Daniel Frederik Busemann has used the results to apply for funding for a root cause analysis of the test equipment and to prioritise equipment upgrades.

In the long term, the validation rules should ideally be implemented in a dashboard so that experts can monitor data quality in real time.

"My dream is for our database to only be filled with validated data," says Daniel Frederik Busemann.

[Image: Test setup. Visualisations in open-source tools made it possible to see how data quality dropped significantly after 24 hours.]

Because FORCE Technology uses open-source tools, the method can also be adopted by smaller companies without large IT budgets. But it requires a commitment to prioritising data validation.

"Data quality assurance might not immediately impact your bottom line, but it ensures you don't lose money later. It's an indispensable part of quality and risk management in data-driven companies and organisations. A production manager typically assumes there's nothing wrong with data from often expensive machines, sensors, and software solutions. It's only when complaints arise that people start investigating," says Michael Vaa.

He suggests that a quality label for data should be developed in the future, similar to labelling schemes for food.

System for data validation

Phase 1: System understanding (steps 1–3) 

1. System analysis

2. Data architecture analysis

3. Initial review 

Phase 2: Validation process (steps 4–7) 

4. Design of validation rules

5. Implementation of rules

6. Execution of rules

7. Review and improvement