Life cycle inventory databases

We develop comprehensive global, national and regional LCA databases based on economic and environmental statistics. For individual customers we develop proprietary databases, often combining input-output data with specific process data in so-called hybrid databases.


Our data collection strategy

To reduce the time and cost of a life cycle study, data collection should be limited to the data that are most important for answering the questions asked. It is for these key data that high quality (validation, level of detail, completeness, and representativeness) is required, while the remaining part of the study may apply default data. Thus, the identification of key data may be used to guide data collection in a specific study.

The identification of key data is based on the following characteristics:

  • Key data are relatively large and have a large variation.
  • Key data explain causes of variation or other causal relations.

Key data are not just those data that are large, but those data for which there is also a large improvement potential. A large improvement potential may be revealed by a large variation in the data (although variation may also be unavoidable).
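The screening described above can be sketched as a simple ranking. This is only an illustration under stated assumptions: all figures are hypothetical contributions to one impact indicator (same unit), observed at three sites, and the "improvement potential" is approximated as the gap between the average and the best observed value. As noted above, some variation may be unavoidable, so such a ranking only points to where closer investigation is worthwhile.

```python
from statistics import mean

def rank_key_data(observations):
    """Rank data items by improvement potential (mean minus best observed value)."""
    ranked = []
    for name, values in observations.items():
        average = mean(values)
        potential = average - min(values)  # assumed proxy for improvement potential
        ranked.append((name, average, potential))
    # Largest potential first
    return sorted(ranked, key=lambda row: row[2], reverse=True)

# Hypothetical observations from three sites, all in the same impact unit
observations = {
    "electricity": [50.0, 52.0, 48.0],  # large, but little variation
    "heat": [30.0, 60.0, 45.0],         # large and strongly varying
    "packaging": [3.0, 5.0, 4.0],       # varying, but small
}

ranking = rank_key_data(observations)
for name, average, potential in ranking:
    print(f"{name}: mean={average:.1f}, potential={potential:.1f}")
```

In this hypothetical case, heat would be the key data item: it is both large and highly variable, while electricity is large but stable and packaging is variable but small.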

Knowledge on the determining factors for the variation may be just as important as the data themselves, since this may allow modelling of the data to the specific situation. For example, if it is known that the energy consumption in a dairy is largely determined by the size of the dairy and the degree of process integration (incl. recovery of heat), the key energy consumption data for dairies must be expressed not only as energy per kg processed milk, but also related to dairy size and degree of process integration.

Such causal relations may be particularly important when they determine what activity should be covered by a study, including influence on other systems. For example, if milk is identified as a major determining factor for the number of consumer shopping trips, a change in the distribution (e.g., provision of home deliveries of milk) might lead to a reduction of the number of shopping trips, which could be of larger environmental importance than the entire remaining life cycle of the milk itself.

When a specific desired dataset is not available, you may decide to use another dataset as a default, with or without adjustments, e.g. older data than desired, data from a different geographical location, or data from a slightly different technology. Data from statistical Input-Output databases (see the section below) may be applied for less important parts of the product systems. The additional uncertainty introduced into the study by the use of such data must be assessed, and when more than one default dataset is available (e.g., one old dataset from the right region, and a more recent dataset from an adjacent region), the best of these datasets is the one that (with possible adjustments) minimises the additional uncertainty.
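The choice between default datasets can be sketched as follows. The example assumes that the additional uncertainty of each candidate can be expressed as independent variance components (temporal, geographical, technological) that add up; both this decomposition and the numbers are hypothetical and serve only to illustrate the selection rule.

```python
import math

def additional_uncertainty(components):
    """Combine independent variance components into one standard deviation."""
    return math.sqrt(sum(components.values()))

# Hypothetical additional variances introduced by each candidate default dataset
candidates = {
    "old data, right region": {
        "temporal": 0.04, "geographical": 0.00, "technological": 0.01,
    },
    "recent data, adjacent region": {
        "temporal": 0.00, "geographical": 0.02, "technological": 0.01,
    },
}

# The best default is the one that minimises the additional uncertainty
best = min(candidates, key=lambda name: additional_uncertainty(candidates[name]))
print(best, round(additional_uncertainty(candidates[best]), 3))
```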

Data management

Having applied a lot of effort in obtaining adequate data, you will want these data to be available also for future use and documentation. Therefore, adequate procedures are needed for documenting the data collection. The “data” concept covers both qualitative and quantitative information as well as “meta-data”, i.e. data about the data, e.g. information on how the data were obtained and on their validity and limitations.

Efficient data documentation, storage, and retrieval for later use require an electronic data format, preferably a standardised one that allows export and import between different software and databases. For this purpose we developed the SPOLD format for LCI data, in co-operation with a large number of LCI data suppliers. A modified version of this format (EcoSPOLD) was later adopted by the leading database supplier Ecoinvent.

A guideline for LCA data collection systems was developed as part of the CASCADE project. This now serves as a key input to managers of national and industry database initiatives.

Input-Output databases

Databases for LCA that are based on national economic and environmental statistics are known as “Input-Output databases” or “IO-databases” for short. A number of such IO-databases are now available in SimaPro format. The main advantage of the IO-databases is that one does not have to make cut-offs, i.e., to exclude parts of the product system. Another advantage is their consistency in terms of the way data are collected. Since data are collected for the same environmental exchanges and in the same way, consistently for all activities in the economy, you avoid inconsistencies and false results, such as an activity showing up as important only because data on the emission of a specific substance are available for that activity but not for the others. The disadvantage of IO-databases for LCA is that activities and products may be relatively aggregated, i.e., at the level of product groups rather than individual products. However, this disadvantage can be overcome by the use of hybrid analysis, where specific industries are represented in more detail. Such hybrid databases (see below) are available to order.

The characteristics of the hybrid version of EXIOBASE compared with other IO-models

The EXIOBASE model differs from most other IO-models in the following ways:

  • The IO-model is a hybrid unit IO-model based on complete, balanced monetary and physical supply-use tables. Because the model is in hybrid units, it can operate with different prices across activities, e.g. energy intensive industries such as electricity generation and production of basic metals often pay less for fuels than other activities. Traditional monetary IO-models, which do not capture these price differences, therefore often underestimate the use of fuels and electricity by energy intensive activities.
  • The model includes generation and treatment of wastes in physical units.
  • The model distinguishes between virgin production of materials and recycling of waste, and between several treatment options for the different waste fractions: incineration, landfill and waste water treatment.
  • Waste flows are calculated based on detailed national mass flow analysis including mass balances on the level of products as well as activities.
  • In accordance with the ISO 14040/44 standards on LCA and the ILCD Handbook on LCA, the model handles by-product allocation by system expansion.
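The price effect mentioned in the first point above can be illustrated with a small calculation. The industries, fuel quantities and prices are hypothetical and are not taken from EXIOBASE; the sketch only shows why distributing fuel use in proportion to expenditure underestimates the physical use of a buyer that pays a lower price.

```python
# Hypothetical physical fuel use (GJ) and the price each industry pays (EUR/GJ)
physical_use_gj = {"basic metals": 600.0, "services": 400.0}
price_per_gj = {"basic metals": 5.0, "services": 10.0}  # energy intensive buyer pays less

# Expenditure as it would appear in a monetary IO-table
monetary_use = {k: physical_use_gj[k] * price_per_gj[k] for k in physical_use_gj}
total_money = sum(monetary_use.values())
total_gj = sum(physical_use_gj.values())

# A purely monetary model implicitly assumes one average price for all buyers,
# so the fuel is distributed in proportion to expenditure.
implied_use_gj = {k: total_gj * monetary_use[k] / total_money for k in monetary_use}

for industry in physical_use_gj:
    print(f"{industry}: physical={physical_use_gj[industry]:.0f} GJ, "
          f"implied by monetary model={implied_use_gj[industry]:.0f} GJ")
```

Here the monetary model attributes about 429 GJ to basic metals instead of the 600 GJ actually used, and correspondingly overestimates the fuel use of services.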

How to use the databases for hybrid LCA

Using IO-tables as a starting point for analysing interrelationships in an economy and the importance of different product groups is known as Input-Output Analysis (IOA). When the IO-tables are supplemented with environmental data for each industry, and IOA is applied to environmental issues, the terms “Environmental IOA” or “IO-LCA” are used. As a “top-down” approach, it allows a complete allocation of all activities to all products. IOA thus has the advantage of being complete with regard to inclusion of all relevant activities related to a product. On the other hand, IOA cannot deal with very specific questions, since it relies on a grouping of activities into a limited number of industries. This makes it difficult to use for detailed studies, such as environmental product life cycle assessment (LCA), except for very homogeneous industries. Also, the necessary environmental statistics are not always available, which means that for some environmental exchanges, adequate information may be missing.
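The core calculation of Environmental IOA can be sketched with the standard Leontief relation: the total output x needed to satisfy a final demand f is the solution of (I - A) x = f, and the associated emissions are obtained by multiplying x with the emission intensity of each industry. The two-industry economy, the coefficients and the intensities below are hypothetical, and the small solver is included only to keep the example self-contained.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

# Hypothetical technical coefficients:
# A[i][j] = input from industry i needed per unit of output of industry j
A = [[0.1, 0.2],
     [0.3, 0.1]]
emission_intensity = [0.5, 2.0]  # kg CO2 per unit of output (hypothetical)
final_demand = [100.0, 0.0]      # final demand for the product of industry 0 only

n = len(A)
identity_minus_A = [[(1.0 if i == j else 0.0) - A[i][j] for j in range(n)]
                    for i in range(n)]

# Total output of every industry required, directly and indirectly
total_output = solve(identity_minus_A, final_demand)
total_emissions = sum(b * x for b, x in zip(emission_intensity, total_output))
print(total_output, total_emissions)
```

Although the final demand is for industry 0 only, industry 1 also has to produce, and its emissions are included in the result. This is the completeness that the top-down approach provides.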

Instead, LCA has traditionally been performed as a “bottom-up” process analysis, based on linking the input and output flows of the specific activity datasets into a product system. A significant advantage of such process analysis is exactly its capability for detail. However, a major problem in process-based LCA is the likelihood that important parts of the product systems are left out of the analysis, simply because it is a very difficult task to follow all the flows in the product system in detail.

Combining process-based LCA and IOA in what has become known as “hybrid analysis” can yield a result that has the advantages of both methods (i.e. both detail and completeness). The name hybrid analysis refers to the combination of process-based LCA and Environmental IOA.

There are several ways in which the IO-database may be combined with more specific process data. The two main approaches are:

  • Tiered hybrid analysis
  • Embedded hybrid analysis

Tiered hybrid analysis

The typical (and simple) application of IO-data in a process LCA is to start from one or more specific activities that are better documented in terms of emissions and inputs than the corresponding industries in the IO data. To this “foreground” kernel, the IO data are simply added, linking each input to the process-based system with the corresponding best fitting final use group or industry output in the IO-database. Downstream processes like recycling or waste treatment may also be added. In this way, the IO data are used to complete the upstream and downstream parts of the product system not covered by specific process data.
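A minimal sketch of the tiered approach, assuming that the IO-database provides cumulative emission multipliers per unit of purchase for each product group. The product groups, multipliers and foreground data are hypothetical; downstream processes are left out for brevity.

```python
# Hypothetical IO multipliers: total upstream kg CO2 per EUR purchased
io_multiplier = {"electricity": 1.2, "chemicals": 0.8, "transport": 0.5}

# Hypothetical foreground activity documented with specific process data
foreground = {
    "direct_emissions_kg": 40.0,
    "purchases_eur": {"electricity": 50.0, "chemicals": 20.0, "transport": 10.0},
}

def tiered_hybrid_total(activity, multipliers):
    """Direct emissions of the foreground activity plus IO-based upstream emissions."""
    upstream = sum(
        amount * multipliers[group]  # each input is linked to its best fitting IO group
        for group, amount in activity["purchases_eur"].items()
    )
    return activity["direct_emissions_kg"] + upstream

total = tiered_hybrid_total(foreground, io_multiplier)
print(total)
```

The IO data only flow into the foreground system here. Nothing flows back from the foreground data into the IO-database.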

A very simple hybrid application starts from one single foreground activity (for example a specific industry site), identifies in the IO-database the final use group or industry output that best covers this activity, makes a copy of this IO-based process, and uses the more precise data of the foreground activity to replace the less precise data in the IO-based activity (leaving the IO data as a proxy for those parts of the foreground data which are not adequate or complete). The resulting hybrid activity dataset may then be used in a direct comparison (benchmarking) with the original IO-based activity, or it may be used in further modelling of a more complex foreground system.

The advantage of this approach is that it is simple. The drawback is that there are no links back from the IO data to the foreground activities, i.e. the upstream IO data do not take advantage of the added information available on the foreground activities. This also means that knowledge does not accumulate in the database, i.e. when applying the IO database to another foreground system, the added information from the first foreground system is not automatically linked into the new product system.

This is the main reason for the development of embedded hybrid analysis (see below), where these drawbacks are overcome. However, embedded hybrid analysis is more demanding and therefore appeals more to the advanced user who wishes to make several LCAs while continuously improving the underlying database.

Embedded hybrid analysis

This more advanced hybrid approach utilises the common matrix nature of process-based and IO-based data, by embedding the process-based data in the IO-matrix.

The first step of this approach is identical to that of the tiered approach: the starting point is an identification in the IO-database of the final use group or industry output that best covers the activity for which more specific data are available. A copy of this IO-based activity is then made, and the more precise data of the foreground activity are used to replace the less precise data in the IO-based activity.

Two additional steps are then needed to embed the new activity in the IO-database:

  • The original IO-based activity is modified by subtracting the inputs, outputs and emissions now represented by the new hybrid activity. In order to do this, the relative production volume of the two activities needs to be known.
  • The output of the new hybrid activity is linked as inputs to all the activities that it supplies. This can be the same activities and proportions as for the original IO-based activity, or it can be a different distribution when the specific activity is supplying a specific segment of the market. The original supplies from the IO-based activity are reduced with the amounts now supplied by the new hybrid activity.
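The two embedding steps above can be sketched as operations on a flow matrix. This is an illustration under stated assumptions, not a description of any particular software: the matrix holds total flows rather than coefficients, the inputs and emissions of the hybrid activity are already scaled to its own production volume, and the hybrid activity supplies its customers in the same proportions as the original IO-based activity. The function name and all figures are hypothetical.

```python
def embed_activity(flows, emissions, k, hybrid_inputs, hybrid_emissions, share):
    """Embed a hybrid activity as a new last row and column.

    flows[i][j]       flow from activity i to activity j
    emissions[j]      total emissions of activity j
    k                 index of the original IO-based activity
    hybrid_inputs[i]  inputs from activity i to the hybrid activity
    share             hybrid production volume relative to the original activity
    """
    n = len(flows)
    new = [row[:] + [0.0] for row in flows] + [[0.0] * (n + 1)]
    new_emissions = emissions[:] + [hybrid_emissions]

    # Step 1: subtract from the original activity what the hybrid activity now represents
    for i in range(n):
        new[i][n] = hybrid_inputs[i]
        new[i][k] -= hybrid_inputs[i]
    new_emissions[k] -= hybrid_emissions

    # Step 2: link the hybrid output to the activities it supplies and
    # reduce the supplies of the original activity by the same amounts
    for j in range(n + 1):
        new[n][j] = share * new[k][j]
        new[k][j] -= new[n][j]
    return new, new_emissions

# Hypothetical economy: activity 0 = dairy industry (IO-based), 1 = electricity
flows = [[0.0, 10.0],
         [40.0, 5.0]]
emissions = [100.0, 200.0]

# Hypothetical specific site covering 25 % of the dairy production volume
new_flows, new_emissions = embed_activity(
    flows, emissions, k=0, hybrid_inputs=[0.0, 6.0], hybrid_emissions=15.0, share=0.25
)
for row in new_flows:
    print(row)
print(new_emissions)
```

The totals are preserved: the inputs, outputs and emissions of the original activity and the hybrid activity together equal those of the original activity before the embedding.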

Both these steps, but especially the latter, are rather cumbersome if performed within SimaPro, since every input and output needs to be accessed separately. For the advanced user, it is therefore preferable to perform these operations in the original matrix structure, e.g. in spreadsheet software, utilising the advantage that operations in a spreadsheet can be performed on entire rows and columns. The entire database can be exported to Excel with the “Export to matrix” function of SimaPro. The two embedding steps may then be performed by adding a row and column representing the new hybrid process, performing the additions and subtractions described above, and re-importing the adjusted matrix into SimaPro or any other matrix calculation tool. The procedure is illustrated in a poster from the 2005 SETAC conference. Import of matrices to SimaPro is performed via a CSV file, which can e.g. be generated by a macro in Excel. A macro for this purpose is available to our Executive Club members.

The advantage of the embedded approach to hybrid LCA is that the adjustments made will automatically be available for all future applications of the database. It is therefore the approach preferred by database developers and advanced users that perform several LCAs using the same underlying database.