Glossary Term

XML Schema (XSD)

The W3C standard for describing the structure, data types, and constraints of an XML document. XSD is the first schema-based validation layer in an e-invoice pipeline, applied once the file parses as XML: does the file have the right elements, in the right order, with the right types? Schematron handles business rules on top; both UBL 2.1 and CII D16B ship authoritative XSD schemas.

Quick Facts

Catches: Structure, types, cardinality, enumerations
Does not catch: Calculation errors, VAT validity, Peppol routing
Standards body: W3C (XSD 1.0 / 1.1)
Validation layer: Layer 2 of 3 (after parse, before Schematron)
CII schema source: UN/CEFACT D16B Core Components Library
UBL schema source: OASIS (ISO/IEC 19845:2015)
Typical per-invoice cost: Microseconds with pre-compiled schema
EN 16931 syntaxes covered: UBL 2.1, CII D16B (SCRDM subset)

Definition

What it is

XML Schema (XSD, for XML Schema Definition) is a W3C recommendation for formally describing what an XML document is allowed to contain. An XSD specifies which elements exist, in what order they may appear, which attributes they carry, what data types they hold (string, decimal, date, enumeration), and what cardinality constraints apply (minOccurs, maxOccurs).

Validating an XML document against its XSD answers a precise question: is this document structurally valid and type-correct under the schema? (Well-formedness is a separate, earlier check made by the XML parser.) It does not answer "is the content correct?" That second question is what Schematron and the EN 16931 business rules exist to answer.

XSD has two versions:

  • XSD 1.0 (2001, revised 2004) — universal support; this is what every e-invoicing standard uses.

  • XSD 1.1 (2012) — adds conditional type assignment and limited assertions; almost no e-invoicing tooling targets it.

    The validation pyramid

    In a serious e-invoicing pipeline, XSD is the middle layer of a three-layer validation stack:

    1. Well-formed XML. The bytes parse as XML at all. No mismatched tags, no broken character references, valid encoding. This is the parser's job, not XSD's.
    2. XSD-valid. The document conforms to the structural schema. cbc:IssueDate is a xs:date. cac:InvoiceLine appears at least once. cbc:LineExtensionAmount is a decimal with a currency attribute.
    3. Business-rule valid. EN 16931 rules (BR-01..BR-66), calculation rules (BR-CO-*), CIUS rules (XRechnung's BR-DE-*, Peppol BIS's PEPPOL-EN16931-*). These are Schematron, not XSD.

    Skipping layer 2 — going straight from "the bytes parsed" to "the Schematron passed" — is technically possible but a bad idea: Schematron errors against a structurally invalid document are misleading and slow to debug. The disciplined order is XSD first, Schematron second.
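The ordering above can be sketched as a small driver that stops at the first failing layer. This is a minimal illustration, not a reference implementation: `validate_layers`, `well_formed`, and the two pluggable check functions are hypothetical names, and the XSD and Schematron checks are passed in as callables because the choice of validator library is left open here.

```python
import xml.etree.ElementTree as ET
from typing import Callable

# A check takes the raw document bytes and returns a list of error strings.
Check = Callable[[bytes], list[str]]


def well_formed(xml_bytes: bytes) -> list[str]:
    """Layer 1: does the input parse as XML at all?"""
    try:
        ET.fromstring(xml_bytes)
        return []
    except ET.ParseError as exc:
        return [str(exc)]


def validate_layers(
    xml_bytes: bytes, xsd_check: Check, schematron_check: Check
) -> tuple[str, list[str]]:
    """Run the layers in order and stop at the first one that reports errors.

    Returns (layer_name, errors), or ("ok", []) when every layer passes.
    """
    layers = [
        ("well-formed", well_formed),
        ("xsd", xsd_check),                # hypothetical: supplied by the caller
        ("schematron", schematron_check),  # hypothetical: supplied by the caller
    ]
    for name, check in layers:
        errors = check(xml_bytes)
        if errors:
            return name, errors
    return "ok", []
```

Because the loop returns on the first non-empty error list, a Schematron check never runs against a document that failed the structural layer, which is the point of the disciplined order.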

    The UBL 2.1 schema

    OASIS UBL 2.1 (ISO/IEC 19845:2015) ships a set of XSD files that define every UBL document type, including UBL-Invoice-2.1.xsd and UBL-CreditNote-2.1.xsd. The schema imports a library of common components (UBL-CommonBasicComponents-2.1.xsd, UBL-CommonAggregateComponents-2.1.xsd) that define the building blocks shared across documents.

    A UBL invoice declares its conformance by namespace:

    <Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
             xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
             xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2">

    The cbc: namespace holds basic-component types (single-value fields). The cac: namespace holds aggregate components (groups). XSD validation against the UBL schemas catches issues like "cbc:LineExtensionAmount is missing the currencyID attribute" or "cbc:DueDate is not a valid xs:date."
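To make the prefix-to-namespace mapping concrete, here is a short sketch that reads namespaced fields with Python's standard library. The `SAMPLE` fragment and the `NS` mapping are illustrative assumptions; the fragment is deliberately incomplete and would not pass the UBL schema, and the prefixes in `NS` are local to the code (only the URIs have to match the document).

```python
import xml.etree.ElementTree as ET

# Local prefix -> namespace URI. The prefixes are arbitrary; the URIs are not.
NS = {
    "inv": "urn:oasis:names:specification:ubl:schema:xsd:Invoice-2",
    "cbc": "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2",
    "cac": "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2",
}

# Illustrative fragment only: not a complete, schema-valid invoice.
SAMPLE = """\
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
         xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
         xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2">
  <cbc:ID>INV-2026-001</cbc:ID>
  <cbc:IssueDate>2026-05-21</cbc:IssueDate>
  <cac:LegalMonetaryTotal>
    <cbc:LineExtensionAmount currencyID="EUR">100.00</cbc:LineExtensionAmount>
  </cac:LegalMonetaryTotal>
</Invoice>
"""

root = ET.fromstring(SAMPLE)

# A basic component (cbc) is a single value.
issue_date = root.findtext("cbc:IssueDate", namespaces=NS)

# An aggregate component (cac) groups basic components.
amount = root.find("cac:LegalMonetaryTotal/cbc:LineExtensionAmount", NS)
currency = amount.get("currencyID")
```

ElementTree stores the root tag in expanded form, `{urn:oasis:names:specification:ubl:schema:xsd:Invoice-2}Invoice`, which is why lookups go through a namespace mapping rather than the literal prefix text.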

    The CII D16B schema

    UN/CEFACT publishes the Cross Industry Invoice as a set of XSD files derived from the UN/CEFACT Core Components Library. EN 16931's CII syntax uses the D16B schema, specifically the SCRDM (Supply Chain Reference Data Model) subset. The schema files live in the uncefact namespace tree (urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100, ...:ReusableAggregateBusinessInformationEntity:100, etc.).

    CII's schema is considerably more verbose than UBL's. A simple invoice total in UBL takes one element with an attribute; in CII it takes nested structures with explicit type and unit codes. This is a feature, not a bug — CII reuses Core Components across dozens of trade documents and pays the verbosity cost once.

    What XSD validation catches

    In day-to-day ERP integration, XSD validation reliably catches:

  • Missing required elements. BT-1 (Invoice number) absent? XSD fails before Schematron sees the document.

  • Wrong type. BT-9 (Payment due date) shipped as "30 January 2026" instead of 2026-01-30. XSD says: not a valid xs:date.

  • Wrong element order. UBL's Invoice requires its children in a specific order. A cbc:IssueDate before cbc:CustomizationID is invalid against the schema.

  • Unbounded extensions. The schemas allow custom content only at declared extension points (in UBL, ext:UBLExtensions). ERPs that drop custom elements outside these extension points get XSD errors.

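  • Invalid enumerations. Where a schema declares a code type with a fixed enumeration list, a value outside the list fails XSD: sending "INVOICE" instead of "380" as a document type code (380, 381, 384, ...) is rejected. Where the schema leaves the code type open, as UBL 2.1 does for most code fields, the code-list check falls to Schematron instead.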

    It does not catch:

  • VAT calculation errors. Schematron BR-CO-* rules.

  • Whether BT-31 (Seller VAT) is a real registered number. VIES validation.

  • Whether BT-49 (Buyer electronic address) routes to a real Peppol participant. SMP lookup.
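A small sketch of the first gap: every amount below is a correctly typed decimal with a currency attribute, so type checking has nothing to object to, yet the declared total does not equal the sum of the lines. The function name `line_total_mismatch` and the sample fragment are assumptions for illustration; the comparison is the kind of arithmetic the BR-CO-* rules perform, not a copy of any one rule.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

NS = {
    "cbc": "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2",
    "cac": "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2",
}

# Illustrative fragment only: the lines sum to 100.00, the total claims 110.00.
SAMPLE = """\
<Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
         xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
         xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2">
  <cac:LegalMonetaryTotal>
    <cbc:LineExtensionAmount currencyID="EUR">110.00</cbc:LineExtensionAmount>
  </cac:LegalMonetaryTotal>
  <cac:InvoiceLine>
    <cbc:LineExtensionAmount currencyID="EUR">40.00</cbc:LineExtensionAmount>
  </cac:InvoiceLine>
  <cac:InvoiceLine>
    <cbc:LineExtensionAmount currencyID="EUR">60.00</cbc:LineExtensionAmount>
  </cac:InvoiceLine>
</Invoice>
"""


def line_total_mismatch(xml_text: str) -> bool:
    """Return True when the declared line total differs from the sum of the lines."""
    root = ET.fromstring(xml_text)
    lines = root.findall("cac:InvoiceLine/cbc:LineExtensionAmount", NS)
    declared = root.findtext(
        "cac:LegalMonetaryTotal/cbc:LineExtensionAmount", namespaces=NS
    )
    # Decimal avoids binary floating-point rounding in money arithmetic.
    line_sum = sum((Decimal(e.text) for e in lines), Decimal("0"))
    return line_sum != Decimal(declared)
```

A structural schema describes each element in isolation; it has no way to relate the value of one element to the values of others, which is why this class of error belongs to the rule layer.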

    Practical pitfalls

    The pitfalls that bite ERP teams in production are mostly about schema resolution, not the schema language itself:

  • xsi:schemaLocation is advisory. Many parsers ignore it; some chase the URL and fail on network errors. Production validators resolve schemas from a local catalog, not from the internet.

  • Schema imports. UBL's Invoice schema imports CommonAggregateComponents, which imports CommonBasicComponents, which imports unqualified data types. All four must resolve. A single missing file in the deployment package breaks the whole stack.

  • Validation cost. First-time schema compilation is expensive (tens of milliseconds for UBL, more for CII); per-document validation against a pre-compiled schema is cheap (microseconds). Cache the compiled schema; do not recompile per invoice.

  • Permissive mode vs. strict mode. Some parsers accept documents that violate XSD when fed in "non-validating" mode. Tests need to assert that validation is actually on; "the file parsed" is not the same as "the file is schema-valid."
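Two of these pitfalls, schema caching and proving that validation is switched on, can be sketched with the third-party lxml library (an assumption; any validating parser would do). `TOY_XSD` is a made-up stand-in for the real UBL or CII schema, which in production would be loaded from a local file so that its imports resolve without network access. `compiled_schema` and `xsd_errors` are hypothetical helper names.

```python
from functools import lru_cache
from io import BytesIO

from lxml import etree  # third-party: pip install lxml

# Toy schema standing in for the real one (assumption for illustration).
TOY_XSD = b"""\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Invoice">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="ID" type="xs:string"/>
        <xs:element name="IssueDate" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


@lru_cache(maxsize=None)
def compiled_schema(xsd_bytes: bytes) -> etree.XMLSchema:
    """Compile the schema once; later calls return the cached object."""
    return etree.XMLSchema(etree.parse(BytesIO(xsd_bytes)))


def xsd_errors(xml_bytes: bytes, xsd_bytes: bytes = TOY_XSD) -> list[str]:
    """Return an empty list when the document is schema-valid, else messages."""
    schema = compiled_schema(xsd_bytes)
    doc = etree.fromstring(xml_bytes)
    if schema.validate(doc):
        return []
    return [f"line {err.line}: {err.message}" for err in schema.error_log]
```

A test suite can then feed a known-bad document, such as one with `<IssueDate>30 January 2026</IssueDate>`, and assert that `xsd_errors` returns a non-empty list. That assertion fails loudly if validation is ever silently disabled, which a "the file parsed" check would not.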

    Relation to EN 16931

    EN 16931 is a semantic standard. It mandates that conformant invoices are expressed in either UBL 2.1 or CII D16B, and that they pass the Schematron rules. The XSD schemas themselves are not part of EN 16931 — they are inherited from OASIS UBL and UN/CEFACT — but conformance implicitly requires the document to be XSD-valid against the appropriate schema, because the Schematron rules assume a schema-valid input.

    For an ERP vendor, this means the XSD-validation step is mandatory in spirit even when the standards body doesn't spell it out: a document that is Schematron-passing but XSD-invalid is undefined behaviour, and downstream receivers (Peppol access points, national platforms, customer ERPs) will reject it.

    XML Examples

    UBL (Peppol, XRechnung)

    <!-- Minimal UBL Invoice header showing schema declaration -->
    <Invoice xmlns="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2"
             xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2"
             xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="urn:oasis:names:specification:ubl:schema:xsd:Invoice-2 UBL-Invoice-2.1.xsd">
      <cbc:CustomizationID>urn:cen.eu:en16931:2017</cbc:CustomizationID>
      <cbc:ID>INV-2026-001</cbc:ID>
      <cbc:IssueDate>2026-05-21</cbc:IssueDate>
      <cbc:InvoiceTypeCode>380</cbc:InvoiceTypeCode>
      <cbc:DocumentCurrencyCode>EUR</cbc:DocumentCurrencyCode>
      <!-- ... -->
    </Invoice>

    CII (ZUGFeRD, Factur-X)

    <!-- Minimal CII Invoice header showing schema namespaces -->
    <rsm:CrossIndustryInvoice xmlns:rsm="urn:un:unece:uncefact:data:standard:CrossIndustryInvoice:100"
                              xmlns:ram="urn:un:unece:uncefact:data:standard:ReusableAggregateBusinessInformationEntity:100"
                              xmlns:udt="urn:un:unece:uncefact:data:standard:UnqualifiedDataType:100"
                              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
      <rsm:ExchangedDocumentContext>
        <ram:GuidelineSpecifiedDocumentContextParameter>
          <ram:ID>urn:cen.eu:en16931:2017</ram:ID>
        </ram:GuidelineSpecifiedDocumentContextParameter>
      </rsm:ExchangedDocumentContext>
      <!-- ... -->
    </rsm:CrossIndustryInvoice>
