The Illusion of Compliance: Why Your Digital Data Formats Are More Fragile Than You Think

We live in an era defined by the shift from paper-based to digital data exchange. As computer-based systems take over more and more of our communication, it is increasingly important that digital data exchange be robust, secure, and accurate. Organizations across all sectors, particularly in highly regulated industries like banking, insurance, and enterprise software, rely on standardized data formats (XML, JSON, etc.) as the basis of their operations, compliance, and innovation. In this article, we demonstrate that software systems that merely adhere to standards are not necessarily resilient in daily operations. Using e-invoicing as an example, we reveal how fragile “compliant” systems can be. We argue that critical systems need thorough validation based on effective test data, which is essential for mitigating risk and unlocking true strategic advantage.

The recent German mandate for e-invoicing, built upon the XRechnung and ZUGFeRD formats (both implementing the EN16931 standard), serves as a compelling microcosm of this broader challenge. We provided a thorough introduction to these formats in Part 1 of our series on e-invoicing. As businesses globally accelerate their adoption of complex digital formats, the question of software and standards reliability moves from a technical concern to a core strategic imperative. The legal, financial, and operational ramifications of failure are too significant to ignore.

With that in mind, InputLab set out to evaluate the readiness of open-source e-invoicing libraries against real-world demands. The motivation was two-fold: (1) to demonstrate how high-quality test data can expose subtle bugs, and (2) to ensure that companies relying on these tools aren’t caught off guard by unexpected edge cases when the mandate is fully in force.

We chose the Mustang Project as the primary subject of this case study. Mustang is a widely used open-source Java library and command-line tool that supports both ZUGFeRD and XRechnung formats. It is commonly used by SMEs and ERP developers because it can generate, read, convert, and validate invoices, making it a key tool for businesses transitioning to e-invoicing. Mustang’s popularity and active development (with version 2.16.4 released in April 2025) made it an ideal candidate. If bugs exist in Mustang, they likely exist in other tools as well or indicate general challenges in handling these formats.

Testing Methodology: Beyond Surface Checks

InputLab’s team generated a comprehensive set of synthetic test invoices using our patent-pending technology. Rather than sticking to basic test cases, we designed a rich dataset that simulated a wide variety of real-world scenarios:

  • Basic Valid Invoices: Verifying that Mustang correctly processes standard cases with straightforward structures and common data.
  • Edge Cases and Boundary Values: Testing invoices with extreme complexity, including dozens of items, mixed tax rates, and foreign currencies.
  • Special Characters and Character Encoding: Ensuring Mustang handles values with special characters (e.g., ä, ç, é, ø) and XML-reserved characters (<, &, >).
  • Optional Elements and Attributes: Including rarely used fields such as payment instructions and multiple references to assess how Mustang manages optional data.
  • Intentionally Flawed Invoices: Validating Mustang’s error handling with slightly flawed invoices, e.g., with missing mandatory fields or incorrect sums.

This approach allowed us to test the semantic resilience of the software and its ability to handle inputs that might occur only in rare situations. The diversity and quantity of test cases were intended to simulate years’ worth of real invoices, including scenarios that might occur only once in a thousand invoices but are exactly the ones likely to reveal a latent bug.

[Image: Professionals examining digital invoices and XML code for bugs and errors.]

Unearthing the Hidden Bugs: The Mustang Case Study

Our deep dive into Mustang, using this advanced testing methodology, uncovered several significant issues that underscore the fragility we see across the digital landscape. While a full list of reported issues is available publicly (see our Mustang bug report list), the following illustrative examples reveal the risks associated with fragile e-invoicing software.

1. Crashes on empty, yet valid, fields [GitHub Issue #790]

A single empty element shouldn’t crash an entire process, especially when some fields may legally remain empty. Yet, we found that when Mustang encountered an empty ram:TradingBusinessName element, it threw a NullPointerException and terminated execution. This occurred despite the document being valid per EN16931 rules.

This issue represents more than a simple bug. It’s a potential vulnerability that could disrupt automated processing pipelines. In a real-world scenario, a single missing value could prevent processing for thousands of invoices, delaying payments and disrupting business operations.
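
The failure mode can be illustrated independently of Mustang. The following is a minimal sketch in Python, not Mustang’s actual code: it shows that an empty element yields no text at all, and how a small helper can turn that into an explicit "not provided" value instead of a crash. The helper name text_or_none and the parent element used in the sample are illustrative assumptions, and the namespace URI bound to the ram prefix should be checked against the schema version you use.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace URI commonly bound to the "ram" prefix in CII documents.
RAM = "urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100"

# Illustrative fragment containing an empty element.
SAMPLE = f"""<ram:SpecifiedLegalOrganization xmlns:ram="{RAM}">
  <ram:TradingBusinessName/>
</ram:SpecifiedLegalOrganization>"""


def text_or_none(parent, tag):
    """Return the stripped text of a child element, or None if the child
    is absent, empty, or contains only whitespace (hypothetical helper)."""
    child = parent.find(tag)
    if child is None or child.text is None:
        return None
    stripped = child.text.strip()
    return stripped or None


org = ET.fromstring(SAMPLE)
name = text_or_none(org, f"{{{RAM}}}TradingBusinessName")
# Callers must handle None explicitly instead of dereferencing blindly.
print("trading name:", name if name is not None else "(not provided)")
```

The design choice here is to make absence an explicit, typed result at the parsing boundary, so that downstream code cannot accidentally treat an empty field as a string.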

2. Mismanagement of Base64 Encoded Attachments [GitHub Issue #792]

Invoices often contain base64-encoded attachments, such as spreadsheets or supporting documents. Per XML Schema rules, whitespace characters within these strings are permissible and should be ignored during decoding. However, Mustang’s decoder fails to account for such whitespace, leading to runtime errors.

If left unfixed, companies receiving attachments in invoices risk major disruptions in automated processing. Large enterprises routinely attach supplementary documents for audit and compliance purposes, so this issue could significantly obstruct data workflows, forcing manual intervention. If attachments are mishandled, critical data could be lost or delayed, affecting audits and financial reporting.
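
To make the whitespace problem concrete, here is a small Python sketch using only the standard library. It is an illustration of the general technique, not a description of how Mustang decodes attachments: a strict decoder rejects line-wrapped input, while removing whitespace first makes the same input decodable. The function name decode_attachment is a hypothetical helper.

```python
import base64
import binascii


def decode_attachment(b64_text: str) -> bytes:
    """Decode Base64 text that may contain spaces, tabs, or line breaks."""
    # str.split() without arguments splits on any run of whitespace,
    # so joining the pieces removes all whitespace characters.
    compact = "".join(b64_text.split())
    # validate=True rejects any remaining character outside the Base64 alphabet.
    return base64.b64decode(compact, validate=True)


payload = b"Hello, invoice!"
encoded = base64.b64encode(payload).decode("ascii")
# Simulate a line-wrapped attachment as it might appear inside an XML element.
wrapped = "\n  ".join(encoded[i:i + 8] for i in range(0, len(encoded), 8))

try:
    base64.b64decode(wrapped, validate=True)
    strict_ok = True
except binascii.Error:
    strict_ok = False

print("strict decoder accepts wrapped input:", strict_ok)
print("tolerant decoder round-trips:", decode_attachment(wrapped) == payload)
```

Stripping whitespace before a strict decode keeps the tolerance required for valid input while still rejecting genuinely corrupt data.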

3. Mishandling of Unescaped Characters [GitHub Issue #793]

Another issue we identified was Mustang’s inability to handle several special characters (<, >, and &) in certain attributes during export, leading to malformed output or crashes.

This vulnerability threatens data integrity, potentially rendering otherwise compliant invoices unreadable by recipient systems. In regulated industries, this could lead to rejected invoices, payment delays, and regulatory scrutiny. Ensuring robust character encoding is vital for compliance.
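
As a rough illustration of what correct escaping on export looks like, the following Python sketch writes a value containing reserved characters into both element text and an attribute, then parses the result to confirm the original values survive. The element name Name and the attribute scheme are made up for the example and do not come from the invoice formats.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

raw_name = "Müller & Söhne <Export> GmbH"
raw_scheme = 'A&B "special"'

# escape() handles &, <, and > in element text; quoteattr() additionally
# picks a safe quoting style and escapes the attribute value.
xml_fragment = f"<Name scheme={quoteattr(raw_scheme)}>{escape(raw_name)}</Name>"
print(xml_fragment)

# Round trip: a conforming parser must give back the original strings.
parsed = ET.fromstring(xml_fragment)
print(parsed.text == raw_name, parsed.get("scheme") == raw_scheme)
```

A round-trip check of this kind is a cheap regression test: if export and import disagree on any value, escaping is broken somewhere.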

These findings in a mature library like Mustang are not isolated incidents. Rather, they are symptomatic of a broader industry challenge: software can be technically “standard-compliant” yet dangerously fragile against the diverse and sometimes messy reality of real-world data.

When the Rulebook Itself is Flawed: Critical Gaps in EN16931 and XRechnung

Perhaps more alarmingly, our investigation, driven by the need to generate comprehensive test data based on EN16931 and XRechnung artifacts, uncovered critical gaps and logical flaws within the standards themselves. Relying on official validation rules is therefore not a silver bullet. Testing must critically assess the rules rather than blindly follow them. Otherwise, invalid invoices can pass validation unnoticed and cause problems only later, during payment, audit, or reporting.

1. Faulty Logic in Schematron Rule BR-CO-27 [GitHub Issue #403]

Schematron rules serve as the backbone for ensuring data accuracy in EN16931-compliant invoices. However, in rule BR-CO-27, we identified a logical flaw that effectively nullified its intended function. The rule was supposed to enforce the presence of either an IBAN or a ProprietaryID in payment data, but was instead structured as a tautology:

A or B or (not A and not B)  // always true

Because this condition is always true, it lets invoices pass without a valid payment identifier, defeating the whole purpose of the rule. Automated systems relying on this rule could silently approve invalid invoices. Fortunately, this issue was fixed in EN16931 validation v1.3.14 following our report.
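
The tautology can be checked exhaustively. The sketch below models only the boolean shape of the condition quoted above, not the actual Schematron XPath. The function intended_rule is our reading of the stated intent (at least one of the two identifiers must be present), and both function names are illustrative.

```python
from itertools import product


def flawed_rule(has_iban: bool, has_proprietary_id: bool) -> bool:
    # Shape of the condition as quoted: A or B or (not A and not B)
    return has_iban or has_proprietary_id or (not has_iban and not has_proprietary_id)


def intended_rule(has_iban: bool, has_proprietary_id: bool) -> bool:
    # Assumed intent: at least one payment identifier must be present.
    return has_iban or has_proprietary_id


# Enumerate all four combinations of the two inputs.
for a, b in product([False, True], repeat=2):
    print(f"IBAN={a!s:5} ProprietaryID={b!s:5} "
          f"flawed={flawed_rule(a, b)!s:5} intended={intended_rule(a, b)}")

always_true = all(flawed_rule(a, b) for a, b in product([False, True], repeat=2))
print("flawed rule is a tautology:", always_true)
```

The only row where the two functions differ is the one with neither identifier present, which is exactly the case the rule exists to reject.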

2. Missing Enforcement of Date Format Attributes [GitHub Issue #430]

Dates are critical in financial documents, dictating payment terms, invoice due dates, and more. Surprisingly, our testing revealed that several date fields within EN16931 invoices were not subject to strict validation. Examples of unvalidated fields include DueDateDateTime and BillingSpecifiedPeriod.

As a result, dates like "1st of January, 2022" could pass as valid despite not adhering to ISO 8601 (YYYYMMDD). Such unvalidated dates can lead to payment discrepancies, audit disputes, and cross-border transaction conflicts, especially where date formats differ significantly.
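
A strict check for the YYYYMMDD pattern is short to write. The following Python sketch is one possible validator, not the logic used by the official validation artifacts, and the function name is_valid_yyyymmdd is an assumption made for this example. It first enforces exactly eight ASCII digits and then confirms that the digits form a real calendar date.

```python
import re
from datetime import datetime


def is_valid_yyyymmdd(value: str) -> bool:
    """Return True only for eight ASCII digits that form a real calendar date."""
    # The length check is needed because strptime alone also accepts
    # shorter inputs with single-digit months or days.
    if not re.fullmatch(r"[0-9]{8}", value):
        return False
    try:
        datetime.strptime(value, "%Y%m%d")
    except ValueError:
        # Eight digits, but not a real date (e.g. February 30th).
        return False
    return True


for candidate in ["20220101", "1st of January, 2022", "20220230", "2022011"]:
    print(repr(candidate), is_valid_yyyymmdd(candidate))
```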

3. Unvalidated Elements and Incomplete Rules [GitHub Issue #429]

Our testing revealed that some elements were never validated due to gaps in the Schematron rules. Specifically, two significant oversights stood out:

  • QualifiedDataType Namespace (qdt:DateTimeString): While the udt:DateTimeString elements were subject to validation checks, their qdt: counterparts were not. As a result, wrongly formatted dates may slip through if they appear in elements from this less commonly used namespace.
  • Buyer Contact Information: While seller phone and email fields (BT-42, BT-43) are checked for correct formatting, the equivalent buyer fields (BT-57, BT-58) are not. This discrepancy means that buyer contact details could contain invalid characters or formats without triggering a validation error.

These oversights increase the risk of downstream processing issues and communication errors in automated invoice handling systems. They also demonstrate that standards are living documents prone to human error and interpretation gaps. A rigorous, adversarial testing approach that puts the interplay between data and rules to the test can uncover these hidden risks.
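
One way to guard against a namespace-specific gap like the qdt: case is to check elements by local name, regardless of the namespace they belong to. The Python sketch below illustrates the idea on a made-up document. The wrapper elements Doc, First, and Second are hypothetical, the two namespace URIs are the ones we believe are bound to udt and qdt in CII documents and should be verified against your schema, and the eight-digit check stands in for whatever date rule applies.

```python
import re
import xml.etree.ElementTree as ET

# Assumptions: namespace URIs for the udt and qdt prefixes.
UDT = "urn:un:unece:uncefact:data:standard:UnqualifiedDataType:100"
QDT = "urn:un:unece:uncefact:data:standard:QualifiedDataType:100"

# Illustrative document with one well-formed and one malformed date.
SAMPLE = f"""<Doc xmlns:udt="{UDT}" xmlns:qdt="{QDT}">
  <First><udt:DateTimeString format="102">20220101</udt:DateTimeString></First>
  <Second><qdt:DateTimeString format="102">1st of January, 2022</qdt:DateTimeString></Second>
</Doc>"""


def local_name(tag: str) -> str:
    # ElementTree represents qualified names as "{namespace-uri}local".
    return tag.rsplit("}", 1)[-1]


def malformed_date_strings(root):
    """Collect DateTimeString elements from any namespace whose text
    is not exactly eight ASCII digits."""
    bad = []
    for elem in root.iter():
        if local_name(elem.tag) == "DateTimeString":
            value = (elem.text or "").strip()
            if not re.fullmatch(r"[0-9]{8}", value):
                bad.append((elem.tag, value))
    return bad


findings = malformed_date_strings(ET.fromstring(SAMPLE))
for tag, value in findings:
    print("malformed:", tag, repr(value))
```

Matching on the local name trades precision for coverage: it may flag elements a namespace-aware rule would deliberately skip, but it cannot silently miss a whole namespace.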

Responsible Disclosure: Reporting Our Findings

We followed a responsible disclosure process throughout our research. All identified bugs were reported to the relevant parties. For Mustang, we filed detailed GitHub issues with steps to reproduce and suggestions for resolution. We also contacted the maintainers of the EN16931 and XRechnung artifacts to share discrepancies in the validation rules.

Our goal in disclosing these findings was to mitigate immediate risks and contribute to broader software resilience. We aim to prompt timely corrections, ensuring that widely used e-invoicing tools and validation artifacts uphold the high standards of data integrity and accuracy mandated by EN16931 and German regulations.

A New Imperative of Testing for Resilient Software

The examples above illustrate systemic issues across both e-invoicing libraries and the standards themselves. By rigorously testing Mustang and EN16931/XRechnung artifacts, we uncovered multiple points of failure that could undermine invoice processing accuracy and regulatory compliance. Whether stemming from logical oversights in Schematron rules or incomplete validation logic, these bugs highlight the need for more robust testing frameworks that go beyond basic schema checks.

The harsh reality is that software can be technically “standard-compliant” yet still fragile in real-world use cases. Validating the structure is only part of the equation. Bugs often arise not from obvious violations but from edge cases, rare data combinations, and untested logic paths. Without comprehensive and customizable test data, even mature tools can break in production.

As we move to Part 3, we will broaden the lens. The lessons from this e-invoice project have implications far beyond invoices. We will see how customizable synthetic test data can be a cornerstone for building reliable software in any compliance-heavy domain, and how this fits into modern QA strategies like shift-left testing and test data as a service.
