Parsing Xbrl part I


Xbrl is an xml derived markup language for describing financial documents.

In the UK and Ireland, a large subset of companies is now legally required to submit financial accounts in ixbrl (inline xbrl) format along with their annual tax return.

For software vendors in the accounting and tax industry, this means xbrl integration is now expected, if not yet essential.

In this series of articles I will examine xbrl in depth, beginning with a technical overview, then moving on to practical application of the standard, and finishing up with an opinion piece.

If you are already familiar with xbrl and just want to see some working code feel free to check out my xbrl related open source projects:

Overview of XBRL

The users of xbrl software are accountants, shareholders, tax and government authorities and of course software developers.

Xbrl provides a format with which the aforementioned stakeholders can encode financial information and transfer it electronically.

The information should be comparable, searchable, translatable and bound by other semantic rules, for example International Financial Reporting Standards (IFRS).

Xbrl is also designed with extensibility in mind, and thus intuitively simple concepts are often modeled in surprisingly complex ways.

Xbrl, as with historical methods of presenting accounting information, should give a true and fair view, that is, the figures and notes should finally be presented in plain text.

Xbrl instance documents (the equivalent of a set of financial statements) can be validated programmatically against predefined rules.

XML-based

Xbrl employs various XML ideas and standards, namely:

  • Xml 1.0;
  • Xml namespaces;
  • Xlink and Xpointer;
  • Xml schema;
  • Xml schema datatypes;
  • Xhtml (for inline xbrl).

Practical application

A user with a set of accounts essentially needs to attach up to three pieces of meta-data to every piece of actual data in the document.

  1. The concept: "uk-gaap_TangibleFixedAssetsAtCostOrValuation"
  2. The period: "31-12-2016"
  3. A dimension label: "Land"*

* This is optional in most cases. If the user omits (3) above, a default will be assumed, in this case "AllTangibleFixedAssetsClasses".

With these three pieces of meta-data set, a valid inline xbrl tag and context for that tag can be generated for any financial concept.
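As a rough illustration of that mapping, the sketch below holds the three pieces of meta-data in a small Python structure and renders an inline xbrl tag as a string. The Tag class, the "ctx_..." id scheme, the unitRef value and the rule for turning a concept id into a prefixed name are all assumptions made for this example, not something taken from a taxonomy. A real document would also need the matching context and unit definitions, and a format attribute if the displayed number contains thousands separators.

```python
from dataclasses import dataclass
from typing import Optional
from xml.sax.saxutils import escape, quoteattr

# Assumed default for the dimension used in the example above.
DEFAULT_MEMBER = "AllTangibleFixedAssetsClasses"


@dataclass(frozen=True)
class Tag:
    concept: str                  # (1) e.g. "uk-gaap_TangibleFixedAssetsAtCostOrValuation"
    period: str                   # (2) ISO date, e.g. "2016-12-31"
    member: Optional[str] = None  # (3) optional dimension label

    @property
    def effective_member(self) -> str:
        # Fall back to the default when the user supplied no dimension label.
        return self.member or DEFAULT_MEMBER

    @property
    def context_id(self) -> str:
        # Hypothetical id scheme: one context per (period, member) pair.
        return f"ctx_{self.period}_{self.effective_member}"


def render_fact(tag: Tag, value: str) -> str:
    # Assumption: the concept id is "<prefix>_<name>", so the name is "<prefix>:<name>".
    qname = tag.concept.replace("_", ":", 1)
    return (
        f"<ix:nonFraction name={quoteattr(qname)} "
        f"contextRef={quoteattr(tag.context_id)} "
        f'unitRef="GBP" decimals="0">{escape(value)}</ix:nonFraction>'
    )


land = Tag("uk-gaap_TangibleFixedAssetsAtCostOrValuation", "2016-12-31", "Land")
total = Tag("uk-gaap_TangibleFixedAssetsAtCostOrValuation", "2016-12-31")

print(render_fact(land, "1333333"))
print(total.context_id)
```

Keeping the context id a pure function of the period and dimension label means two facts that share both can share one context.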

Here's how the aforementioned meta-data would map to items on a normal financial report:

report with normal context

Balance Sheet
                       (2) 2016
(1) Fixed Assets      1,333,333

report with multi-dimensional context

Fixed Assets note
(1) At cost                (3) Land
(1) Balance (2) 2016      1,333,333

The first example is straightforward: the balance has a row label and a column label.

The second example is a little trickier to comprehend at first as the presentation format of the financial concept and the underlying xbrl modeling of the same have diverged.

The balance has the same implied row label ("tangible fixed assets at cost") but has, in effect, two column labels, the year and the asset type, even though only one column is visible.

This is why columns are more properly referred to as dimensions in xbrl: they are not necessarily vertical.

Implementation choices

As a software vendor you have a few choices on how to go about implementing xbrl accounts production for your users:

  • "Pre-tag": If you prescribe labels for accounts production you can simply attach the correct xbrl information to each possible label.

  • "Tag-picker": If you allow your users to design and edit their financial statements you will need to make sure to offer them valid choices for each of the three tag elements mentioned above when defining labels in their reports.

  • A mix of the two.
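The difference between the approaches can be sketched with two lookup tables. Everything here (the table names, their contents and the helper functions) is hypothetical; in a real product both tables would be derived from the DTS rather than written by hand.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical lookup tables; in practice both would be derived from the DTS.
PRE_TAGGED: Dict[str, Tuple[str, Optional[str]]] = {
    # report label -> (concept, dimension label or None for the default)
    "Fixed Assets": ("uk-gaap_TangibleFixedAssetsAtCostOrValuation", None),
}
VALID_DIMENSION_LABELS: Dict[str, List[str]] = {
    "uk-gaap_TangibleFixedAssetsAtCostOrValuation": [
        "AllTangibleFixedAssetsClasses",
        "Land",
    ],
}


def pre_tag(report_label: str) -> Optional[Tuple[str, Optional[str]]]:
    """'Pre-tag': a prescribed label already carries its xbrl information."""
    return PRE_TAGGED.get(report_label)


def tag_picker_choices(concept: str) -> List[str]:
    """'Tag-picker': offer only the dimension labels valid for the concept."""
    return VALID_DIMENSION_LABELS.get(concept, [])


def accept_choice(concept: str, dimension_label: str) -> bool:
    """Reject a pick that is not among the choices offered for the concept."""
    return dimension_label in tag_picker_choices(concept)


print(pre_tag("Fixed Assets"))
print(tag_picker_choices("uk-gaap_TangibleFixedAssetsAtCostOrValuation"))
```

A mixed design would try pre_tag first and fall back to the picker for labels the user has created.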

There are a few other prerequisites to mention here. While this article is about the xbrl standard, in practice you may be required to produce reports in inline xbrl format.

Mixing html and xbrl inline can be tricky especially in the context of adding xbrl production on top of an existing software design.

The key point to bear in mind when implementing xbrl is that you want to ensure the validity of the end document.

Whichever strategy you decide to go with you will need to be able to parse the underlying xbrl schemas (the Discoverable Taxonomy Set or DTS) and understand how they are validated.

Discoverable Taxonomy Sets and instance documents

Discoverable Taxonomy Sets (DTS) contain all the markup and validation information that is needed to create valid ixbrl instance documents (i.e. fully tagged sets of financial accounts).

Each financial system or standard generally has its own DTS, although elements may be shared amongst similar standards.

For example UK GAAP, UK IFRS, FRS 101 and FRS 102 all have their own DTS but all share common elements.

Some DTS extend others adding and overwriting elements of the parent DTS - Irish GAAP and Irish FRS 102 for example.

All DTS must import the ‘xbrl-instance-2003-12-31’ xsd file to be valid and all xbrl instance documents must have the xbrl element as their root or container element.
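A minimal sketch of the root element check, using Python's standard library, might look like the following. The sample instance and the example.com schema URL are made up for illustration; the namespace URIs are the standard xbrl instance, linkbase and xlink namespaces. The check applies to plain xbrl instance documents; an inline xbrl document is xhtml and has an html root instead.

```python
import xml.etree.ElementTree as ET
from typing import List

XBRLI = "http://www.xbrl.org/2003/instance"
LINK = "http://www.xbrl.org/2003/linkbase"
XLINK = "http://www.w3.org/1999/xlink"

# Made-up instance document; the schema URL is a placeholder.
INSTANCE = """\
<xbrl xmlns="http://www.xbrl.org/2003/instance"
      xmlns:link="http://www.xbrl.org/2003/linkbase"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <link:schemaRef xlink:type="simple"
                  xlink:href="https://example.com/taxonomy/entry-point.xsd"/>
</xbrl>
"""


def is_xbrl_instance(xml_text: str) -> bool:
    """True when the root element is xbrl in the xbrl instance namespace."""
    return ET.fromstring(xml_text).tag == f"{{{XBRLI}}}xbrl"


def schema_refs(xml_text: str) -> List[str]:
    """Return the DTS entry points that the instance document points at."""
    root = ET.fromstring(xml_text)
    return [ref.get(f"{{{XLINK}}}href") for ref in root.findall(f"{{{LINK}}}schemaRef")]


print(is_xbrl_instance(INSTANCE))
print(schema_refs(INSTANCE))
```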

Schema files

A DTS usually comprises a single entrypoint xsd schema file that imports many others.

Each imported schema file may represent a reporting function ("uk auditors report"), a data type ("countries") or other shared DTS data.
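To give a feel for what "discovering" a taxonomy set involves, here is a small sketch that lists the schemas an entry point imports. The entry point below is invented for the example (the example.com namespaces and the two local file names are placeholders); only the xbrl instance schema location is real.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

XS = "http://www.w3.org/2001/XMLSchema"

# Invented entry point schema, loosely modelled on the description above.
ENTRY_POINT = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://example.com/entry-point">
  <xs:import namespace="http://www.xbrl.org/2003/instance"
             schemaLocation="http://www.xbrl.org/2003/xbrl-instance-2003-12-31.xsd"/>
  <xs:import namespace="http://example.com/auditors-report"
             schemaLocation="auditors-report.xsd"/>
  <xs:import namespace="http://example.com/countries"
             schemaLocation="countries.xsd"/>
</xs:schema>
"""


def imported_schemas(xsd_text: str) -> List[Tuple[str, str]]:
    """Return (namespace, schemaLocation) for every xs:import in a schema."""
    root = ET.fromstring(xsd_text)
    return [
        (imp.get("namespace"), imp.get("schemaLocation"))
        for imp in root.iter(f"{{{XS}}}import")
    ]


for namespace, location in imported_schemas(ENTRY_POINT):
    print(namespace, "->", location)
```

A full parser would repeat this for each imported file, resolving relative locations against the importing file and remembering which files it has already visited.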

Within each schema there may be one or many role-type links which are used to organize financial facts into logical categories for presentation e.g. "30 - Detailed Profit and Loss".

Each schema comprises multiple elements which represent financial concepts e.g. "Gross profit (loss)".

Each element is unique within the schema, although elements may be referenced multiple times within tuple elements.

Each element has specific attributes which define its proper usage in ixbrl instance documents.
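The sketch below reads those attributes from a schema with Python's standard library. The two element declarations are invented for the example (the names, ids and target namespace are assumptions, not copied from a published taxonomy), but the attributes shown (type, substitutionGroup, xbrli:periodType, xbrli:balance, abstract) are the ones xbrl concept declarations normally carry.

```python
import xml.etree.ElementTree as ET
from typing import Dict

XS = "http://www.w3.org/2001/XMLSchema"
XBRLI = "http://www.xbrl.org/2003/instance"

# Invented concept declarations for illustration only.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:xbrli="http://www.xbrl.org/2003/instance"
           targetNamespace="http://example.com/uk-gaap">
  <xs:element id="uk-gaap_GrossProfitLoss" name="GrossProfitLoss"
              type="xbrli:monetaryItemType" substitutionGroup="xbrli:item"
              xbrli:periodType="duration" xbrli:balance="credit" nillable="true"/>
  <xs:element id="uk-gaap_TangibleFixedAssetsAtCostOrValuation"
              name="TangibleFixedAssetsAtCostOrValuation"
              type="xbrli:monetaryItemType" substitutionGroup="xbrli:item"
              xbrli:periodType="instant" xbrli:balance="debit" nillable="true"/>
</xs:schema>
"""


def concepts(xsd_text: str) -> Dict[str, dict]:
    """Index top-level element declarations by id, with their xbrl attributes."""
    root = ET.fromstring(xsd_text)
    found = {}
    for el in root.findall(f"{{{XS}}}element"):
        found[el.get("id")] = {
            "name": el.get("name"),
            "type": el.get("type"),
            "substitutionGroup": el.get("substitutionGroup"),
            # Namespaced attributes are keyed as "{namespace}localname".
            "periodType": el.get(f"{{{XBRLI}}}periodType"),
            "balance": el.get(f"{{{XBRLI}}}balance"),
            "abstract": el.get("abstract") == "true",
        }
    return found


for concept_id, attrs in concepts(SCHEMA).items():
    print(concept_id, attrs["periodType"], attrs["balance"])
```

The periodType matters when tagging: an "instant" concept needs a context with a single date, while a "duration" concept needs a start and an end date.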

Tuples are collections of elements and have different requirements in terms of markup and validation from regular elements.

Tuples have been deprecated in favour of dimensions in the latest DTS releases and a software vendor should not now need to consider their usage when implementing XBRL reporting.

Linkbases

All DTS come bundled with ancillary xml files called linkbases, which are referenced from the main schema via linkbaseRef elements (the schemaRef element, by contrast, is what an instance document uses to point at the schema).

The purpose of a linkbase is to associate different types of static information with each base element of a DTS, similar to how a database record could have many associated records.

For example a financial concept might have several associated linkbase entries detailing how and where to present it in various reports, or references to extra documentation users may find useful, or how it should be used in a calculation if it is a numerical entry.

DTS linkbases have defined roles, namely:

  • Calculation;
  • Definition;
  • Label;
  • Presentation;
  • Reference.

The calculation linkbase contains information regarding how its parent concept should be used in calculating aggregate totals. UK and Ireland DTS do not use this aspect of the XBRL standard.

Definition linkbases are primarily concerned with the dimensional presentation of financial concepts and validity. As a mental model, you could think of dimensions as the different column labels that a cell of data might fall under in a set of financial statements.

The label linkbase contains human readable labels for financial concepts.

The presentation linkbase contains information on where to position financial concepts when presenting them in human readable form.

The reference linkbase contains exact references to legal and other external sources that are relevant to a particular financial concept.
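In the taxonomy schemas I am aware of, each linkbase file is referenced by a link:linkbaseRef element whose xlink:role says which of the five kinds it is. The sketch below groups linkbase files by kind. The schema fragment and file names are invented; the role URIs follow the standard http://www.xbrl.org/2003/role/...LinkbaseRef pattern. Treat the "unspecified" bucket as an assumption about how to handle a reference with no recognised role.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

LINK = "http://www.xbrl.org/2003/linkbase"
XLINK = "http://www.w3.org/1999/xlink"
ROLE_PREFIX = "http://www.xbrl.org/2003/role/"

# Invented schema fragment with two linkbase references.
SCHEMA = """\
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:link="http://www.xbrl.org/2003/linkbase"
           xmlns:xlink="http://www.w3.org/1999/xlink">
  <xs:annotation>
    <xs:appinfo>
      <link:linkbaseRef xlink:type="simple" xlink:href="example-label.xml"
          xlink:role="http://www.xbrl.org/2003/role/labelLinkbaseRef"
          xlink:arcrole="http://www.w3.org/1999/xlink/properties/linkbase"/>
      <link:linkbaseRef xlink:type="simple" xlink:href="example-presentation.xml"
          xlink:role="http://www.xbrl.org/2003/role/presentationLinkbaseRef"
          xlink:arcrole="http://www.w3.org/1999/xlink/properties/linkbase"/>
    </xs:appinfo>
  </xs:annotation>
</xs:schema>
"""


def linkbases_by_kind(xsd_text: str) -> Dict[str, List[str]]:
    """Group linkbase hrefs by kind (label, presentation, ...)."""
    root = ET.fromstring(xsd_text)
    found: Dict[str, List[str]] = {}
    for ref in root.iter(f"{{{LINK}}}linkbaseRef"):
        role = ref.get(f"{{{XLINK}}}role", "")
        if role.startswith(ROLE_PREFIX) and role.endswith("LinkbaseRef"):
            kind = role[len(ROLE_PREFIX):-len("LinkbaseRef")]
        else:
            kind = "unspecified"
        found.setdefault(kind, []).append(ref.get(f"{{{XLINK}}}href"))
    return found


print(linkbases_by_kind(SCHEMA))
```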

How linkbases work: Xlink and Xpointer

Linkbases contain information that organizes financial concepts into one or more extended links. An extended link is similar to a doubly linked list, with each node having a pointer to the previous (xlink:from) and next (xlink:to) position.

Each position in the list (or resource) may be in the same linkbase file (local) or in a different one (remote).

A remote resource is represented by a locator element, that is, an element of any type that has an xlink:type attribute with the value locator.

A local resource is represented by a resource element, that is, an element of any type that has an xlink:type attribute with the value resource.

Locator elements are associated with their targets using the Xpointer standard, which in its simplest form works much like html anchor links, i.e. the fragment after the # in a locator's href matches the id of the element it points to.

An arc between two resources, whether local or remote, is represented by an arc element, that is, an element of any type that has an xlink:type attribute with the value arc.

Elements of type arc reference locators and resources by label: an arc's xlink:from and xlink:to attributes each name the xlink:label of another element in the same extended link, thus forming a linked list, or an "extended link" in Xlink parlance.
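Putting locators, resources and arcs together, the following sketch resolves the human readable labels in a tiny label linkbase. The linkbase content, the example.xsd file name and the concept id are invented. The sketch also assumes each xlink:label identifies a single locator, which is a simplification, since several elements may share a label.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

XLINK = "http://www.w3.org/1999/xlink"

# Invented label linkbase with one locator, one resource and one arc.
LABEL_LINKBASE = """\
<link:linkbase xmlns:link="http://www.xbrl.org/2003/linkbase"
               xmlns:xlink="http://www.w3.org/1999/xlink">
  <link:labelLink xlink:type="extended"
                  xlink:role="http://www.xbrl.org/2003/role/link">
    <link:loc xlink:type="locator" xlink:label="GrossProfitLoss_loc"
              xlink:href="example.xsd#uk-gaap_GrossProfitLoss"/>
    <link:label xlink:type="resource" xlink:label="GrossProfitLoss_lbl"
                xlink:role="http://www.xbrl.org/2003/role/label"
                xml:lang="en">Gross profit (loss)</link:label>
    <link:labelArc xlink:type="arc"
                   xlink:arcrole="http://www.xbrl.org/2003/arcrole/concept-label"
                   xlink:from="GrossProfitLoss_loc" xlink:to="GrossProfitLoss_lbl"/>
  </link:labelLink>
</link:linkbase>
"""


def xl(name: str) -> str:
    """Build the namespaced key ElementTree uses for an xlink attribute."""
    return f"{{{XLINK}}}{name}"


def resolve_labels(linkbase_text: str) -> Dict[str, List[str]]:
    """Map each concept id to the label texts its arcs point at."""
    root = ET.fromstring(linkbase_text)
    labels: Dict[str, List[str]] = {}
    for ext in root.iter():
        if ext.get(xl("type")) != "extended":
            continue
        locators, resources, arcs = {}, {}, []
        for child in ext:
            kind = child.get(xl("type"))
            if kind == "locator":
                # Keep only the fragment after '#': the id of the target element.
                locators[child.get(xl("label"))] = child.get(xl("href")).split("#", 1)[-1]
            elif kind == "resource":
                resources.setdefault(child.get(xl("label")), []).append(child.text or "")
            elif kind == "arc":
                arcs.append((child.get(xl("from")), child.get(xl("to"))))
        for source, target in arcs:
            if source in locators:
                labels.setdefault(locators[source], []).extend(resources.get(target, []))
    return labels


print(resolve_labels(LABEL_LINKBASE))
```

Note that the code never looks at element names such as loc or labelArc, only at xlink:type, which mirrors the "an element of any type" wording above.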

Arc roles

Linked lists of related elements are further refined by an "arcrole" attribute (xlink:arcrole) that is present on the arc elements providing linkage information.

Arc role types are defined in the xbrl standard and may include:

  • parent-child;
  • all;
  • notAll;
  • domain-member;
  • dimension-domain;
  • dimension-default;
  • hypercube-dimension.

The parent-child arc-role is used to create a tree representation of a DTS for visual presentation to the end user.

The remaining arc roles are interdependent and are used to define dimensionally valid relationships between financial concepts.
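As an illustration of the parent-child case, the sketch below turns the arcs of a small presentation linkbase into an ordered tree. The linkbase, the role URI and the concept ids are invented. The sketch assumes a single extended link (a real parser would keep one tree per role) and treats a missing order attribute as 1.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict
from typing import Dict, List

XLINK = "http://www.w3.org/1999/xlink"
PARENT_CHILD = "http://www.xbrl.org/2003/arcrole/parent-child"

# Invented presentation linkbase: one heading with three children.
PRESENTATION = """\
<link:linkbase xmlns:link="http://www.xbrl.org/2003/linkbase"
               xmlns:xlink="http://www.w3.org/1999/xlink">
  <link:presentationLink xlink:type="extended"
                         xlink:role="http://example.com/role/DetailedProfitAndLoss">
    <link:loc xlink:type="locator" xlink:label="pl" xlink:href="example.xsd#ex_ProfitAndLossHeading"/>
    <link:loc xlink:type="locator" xlink:label="gp" xlink:href="example.xsd#ex_GrossProfitLoss"/>
    <link:loc xlink:type="locator" xlink:label="to" xlink:href="example.xsd#ex_Turnover"/>
    <link:loc xlink:type="locator" xlink:label="cos" xlink:href="example.xsd#ex_CostOfSales"/>
    <link:presentationArc xlink:type="arc" xlink:from="pl" xlink:to="gp" order="3"
        xlink:arcrole="http://www.xbrl.org/2003/arcrole/parent-child"/>
    <link:presentationArc xlink:type="arc" xlink:from="pl" xlink:to="to" order="1"
        xlink:arcrole="http://www.xbrl.org/2003/arcrole/parent-child"/>
    <link:presentationArc xlink:type="arc" xlink:from="pl" xlink:to="cos" order="2"
        xlink:arcrole="http://www.xbrl.org/2003/arcrole/parent-child"/>
  </link:presentationLink>
</link:linkbase>
"""


def xl(name: str) -> str:
    """Build the namespaced key ElementTree uses for an xlink attribute."""
    return f"{{{XLINK}}}{name}"


def build_tree(linkbase_text: str) -> Dict[str, List[str]]:
    """Map each parent concept id to its children, sorted by the arc's order."""
    root = ET.fromstring(linkbase_text)
    children = defaultdict(list)
    for ext in root.iter():
        if ext.get(xl("type")) != "extended":
            continue
        locs = {
            c.get(xl("label")): c.get(xl("href")).split("#", 1)[-1]
            for c in ext if c.get(xl("type")) == "locator"
        }
        for arc in ext:
            if arc.get(xl("type")) == "arc" and arc.get(xl("arcrole")) == PARENT_CHILD:
                order = float(arc.get("order", "1"))
                children[locs[arc.get(xl("from"))]].append((order, locs[arc.get(xl("to"))]))
    return {parent: [child for _, child in sorted(kids)] for parent, kids in children.items()}


def render(tree: Dict[str, List[str]], node: str, depth: int = 0) -> List[str]:
    """Return the subtree under node as indented lines."""
    lines = ["  " * depth + node]
    for child in tree.get(node, []):
        lines.extend(render(tree, child, depth + 1))
    return lines


print("\n".join(render(build_tree(PRESENTATION), "ex_ProfitAndLossHeading")))
```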

Wrap up and part II

That's it for part one of parsing xbrl! In part two I will look at how to parse the UK GAAP DTS schema.