Wednesday, December 4, 2019

v0.6.0

Many people perhaps don't follow the development of IfcOpenShell actively and are happily using the master branch of the GitHub repository. They might be surprised to know that there is a lot of activity happening in the v0.6.0 and v0.7.0 branches. This post discusses the changes in the v0.6.0 branch. The following post will elaborate on some of the design decisions we are making in the v0.7.0 branch.

Schemas


One of the most significant improvements in the v0.6.0 branch is that multiple schemas (IFC2X3, IFC4, IFC4X1 and IFC4X2) are supported from within the same executable, module or plug-in. Previously, selecting the schema had been a compile-time option.

In IfcOpenShell and most other EXPRESS-based toolkits, the IFC schema is compiled into (a) the early-bound definitions: a class hierarchy with member functions, and (b) a set of methods to operate on the schema definitions at runtime (late-bound access). C++ only allows very limited runtime reflection (but the development of C++ is very active, see for example P1240), so to compensate for the lack of introspection a set of methods exists to query, for example, all attribute names or the sub- and supertypes of an entity. In the master branch these methods are static. In the v0.6.0 branch they are member functions of a schema class, which is a more complete reference that mirrors the EXPRESS schema definition at runtime. See
IfcBaseEntity::declaration()
or
IfcParse::schema::declaration_by_name("IfcWall")->as_entity()->all_attribute_names()
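
To make the idea of a runtime mirror of the schema a bit more concrete, here is a minimal sketch. It does not use the IfcOpenShell classes: `schema`, `entity_declaration` and `make_toy_schema` are hypothetical stand-ins, and the toy hierarchy puts IfcWall directly under IfcRoot, skipping the intermediate supertypes of the real schema.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for an entity definition kept at runtime.
struct entity_declaration {
    std::string name;
    const entity_declaration* supertype;       // nullptr for a root entity
    std::vector<std::string> attribute_names;  // attributes declared on this entity only

    // Inherited attributes first, followed by the entity's own attributes.
    std::vector<std::string> all_attribute_names() const {
        std::vector<std::string> names;
        if (supertype) names = supertype->all_attribute_names();
        names.insert(names.end(), attribute_names.begin(), attribute_names.end());
        return names;
    }
};

// Hypothetical schema object: the queries are member functions, not statics.
class schema {
    std::map<std::string, entity_declaration> declarations_;
public:
    schema() = default;
    // Declarations point at each other, so copying is disabled in this sketch.
    schema(const schema&) = delete;
    schema& operator=(const schema&) = delete;
    schema(schema&&) = default;

    // Registers an entity; a non-empty supertype must have been added before.
    void add(const std::string& name, const std::string& supertype,
             std::vector<std::string> attrs) {
        const entity_declaration* parent =
            supertype.empty() ? nullptr : declaration_by_name(supertype);
        assert(supertype.empty() || parent != nullptr);
        declarations_[name] = entity_declaration{name, parent, std::move(attrs)};
    }

    // Returns nullptr when the schema has no entity with this name.
    const entity_declaration* declaration_by_name(const std::string& name) const {
        auto it = declarations_.find(name);
        return it == declarations_.end() ? nullptr : &it->second;
    }
};

// Toy schema with an abbreviated hierarchy (assumption, see the text above).
schema make_toy_schema() {
    schema s;
    s.add("IfcRoot", "", {"GlobalId", "OwnerHistory", "Name", "Description"});
    s.add("IfcWall", "IfcRoot", {"PredefinedType"});
    return s;
}
```

With this sketch, `make_toy_schema().declaration_by_name("IfcWall")->all_attribute_names()` yields the four IfcRoot attributes followed by PredefinedType, and looking up a name that was never registered yields a null pointer.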

Writing schema agnostic code


The code generated from the four schemas forms completely unrelated class hierarchies. For the C++ compiler there is no relationship between an Ifc2x3::IfcWall and an Ifc4::IfcWall. But IfcOpenShell offers three ways to write code that adapts to the schema of the file at runtime.

(a) preprocessor


This is the approach taken in the IfcGeom modules in v0.6.0. Essentially the same code base is compiled multiple times, each time with the schema available as a preprocessor constant. This means you can enable specific code paths with, for example, #ifdef directives. In this way the entities added in Ifc4 (IfcBSplineSurface, yay!) can be selectively compiled.

https://github.com/IfcOpenShell/IfcOpenShell/blob/e283b51dbcced6d8121c55fafd49c9ee1f954b74/src/ifcgeom/IfcGeomFaces.cpp#L1147

Smaller code blocks can be written as macros as well.

https://github.com/IfcOpenShell/IfcOpenShell/blob/e283b51dbcced6d8121c55fafd49c9ee1f954b74/src/ifcgeom_schema_agnostic/Kernel.cpp#L74

IfcOpenShell uses CMake to create multiple shared libraries from the same code, see https://github.com/IfcOpenShell/IfcOpenShell/blob/e283b51dbcced6d8121c55fafd49c9ee1f954b74/cmake/CMakeLists.txt#L557 for the foreach loop creating multiple libraries with different directives.

Benefits: fairly readable code, full autocompletion in a typical IDE when using the static library approach
Downsides: Some infrastructure required to compile the different libraries and select the correct implementation at runtime
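
As a rough illustration of the preprocessor approach, the sketch below shows what a single translation unit could look like. The macro names `SCHEMA_NAME` and `SCHEMA_HAS_BSPLINE_SURFACE` and the function names are hypothetical, not the ones used in IfcOpenShell. In a real build the build system would compile this file once per schema with different definitions; the defaults here only exist so that the sketch compiles on its own.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Normally supplied per library by the build system (assumption).
#ifndef SCHEMA_NAME
#define SCHEMA_NAME Ifc4
#define SCHEMA_HAS_BSPLINE_SURFACE
#endif

// Two-step helpers so that SCHEMA_NAME is expanded before pasting/stringifying.
#define CONCAT_IMPL(a, b) a##b
#define CONCAT(a, b) CONCAT_IMPL(a, b)
#define STRINGIFY_IMPL(x) #x
#define STRINGIFY(x) STRINGIFY_IMPL(x)

// Expands to e.g. supported_surfaces_Ifc4, so that the libraries compiled
// for the different schemas export distinct symbols.
std::vector<std::string> CONCAT(supported_surfaces_, SCHEMA_NAME)() {
    std::vector<std::string> names = {"IfcPlane", "IfcSurfaceOfLinearExtrusion"};
#ifdef SCHEMA_HAS_BSPLINE_SURFACE
    // Only compiled for schemas that define this entity.
    names.push_back("IfcBSplineSurface");
#endif
    return names;
}

// Runtime side: pick the implementation that matches the schema of the file.
// Only one implementation is compiled into this sketch, hence the assert.
std::vector<std::string> supported_surfaces(const std::string& file_schema) {
    assert(file_schema == STRINGIFY(SCHEMA_NAME) && "no library for this schema");
    return CONCAT(supported_surfaces_, SCHEMA_NAME)();
}
```

Compiled with the defaults, `supported_surfaces("Ifc4")` returns three names ending in IfcBSplineSurface. Compiling the same file with `SCHEMA_NAME` set to `Ifc2x3` and without `SCHEMA_HAS_BSPLINE_SURFACE` would produce a `supported_surfaces_Ifc2x3` function that returns only the first two.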

(b) late-bound access


There are two modes of accessing schemas. In the early-bound approach function signatures and return types are known at compilation time. In the late-bound approach attribute names are referenced by strings and types are (basically) a tagged union of all data types used in the IFC schema, mapped to C++ types.

Ifc2x3::IfcWall* wall;
// Early-bound access.
{std::string global_id = wall->GlobalId();}
// Late-bound access.
{std::string global_id = *wall->get("GlobalId");}
// ERROR: the dereferenced value is converted to the type of the variable it is assigned to, which throws an exception *at runtime* when the types do not match.
{int global_id = *wall->get("GlobalId");}

Benefits:
fairly readable code
no complicated setup of different libraries
Downsides:
no code completion
errors are only spotted at runtime, not compile-time
late-bound manipulation of inverse attributes is not well supported currently in IfcOpenShell
fewer opportunities for the compiler to generate highly optimized code
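
To show why a late-bound type mismatch only surfaces at runtime, here is a minimal sketch of a tagged value with conversion operators. `attribute_value`, `late_bound_entity` and `make_wall` are hypothetical names, the GlobalId is a made-up placeholder, and the real IfcOpenShell argument types are considerably richer than this.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for a late-bound attribute value: a tag plus storage.
// Separate members are used instead of a real union to keep the sketch short.
class attribute_value {
public:
    attribute_value(int v) : tag_(INT), int_(v), double_(0.0) {}
    attribute_value(double v) : tag_(DOUBLE), int_(0), double_(v) {}
    attribute_value(const std::string& v)
        : tag_(STRING), int_(0), double_(0.0), string_(v) {}

    // Conversions compile for every target type; the tag is checked at runtime.
    operator int() const { require(INT); return int_; }
    operator double() const { require(DOUBLE); return double_; }
    operator std::string() const { require(STRING); return string_; }

private:
    enum tag { INT, DOUBLE, STRING };
    void require(tag expected) const {
        if (tag_ != expected) throw std::runtime_error("attribute type mismatch");
    }
    tag tag_;
    int int_;
    double double_;
    std::string string_;
};

// Hypothetical entity whose attributes are addressed by name.
class late_bound_entity {
    std::map<std::string, attribute_value> attributes_;
public:
    void set(const std::string& name, const attribute_value& value) {
        assert(!name.empty());
        attributes_.erase(name);
        attributes_.insert(std::make_pair(name, value));
    }
    // Throws std::out_of_range for an unknown attribute name.
    const attribute_value& get(const std::string& name) const {
        return attributes_.at(name);
    }
};

// Builds an entity with placeholder values (not real IFC data).
late_bound_entity make_wall() {
    late_bound_entity wall;
    wall.set("GlobalId", std::string("example-guid"));
    wall.set("Name", std::string("Wall-001"));
    return wall;
}
```

With this sketch, `std::string id = make_wall().get("GlobalId");` succeeds, while `int id = make_wall().get("GlobalId");` compiles just as happily and throws `std::runtime_error` when executed. The compiler cannot tell the two apart, which is exactly the downside listed above.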

(c) templates


C++ has very extensive support for compile-time generic arguments: templates. For this purpose the Ifc2x3 and Ifc4 definitions are no longer namespaces (as they were in v0.5.0) but structs, so that they can be passed as template arguments and their members accessed as dependent names.

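#include <iostream>

// Schema is Ifc2x3 or Ifc4; typename is required because IfcWall is a dependent name.
template <typename Schema>
void print_globalid(typename Schema::IfcWall* wall) {
    std::cout << wall->GlobalId();
}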

Benefits:
no complicated setup of different libraries
errors are caught at compile time (although autocompletion typically does not work)
Downsides:
fairly unreadable code due to the necessity to sprinkle additional template and typename keywords throughout.
error messages are harder to make sense of (due to two-phase lookup rules for example)
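
The template approach still needs one place where the schema of the file, known only at runtime, is mapped to a template instantiation. The sketch below shows one way this could look. The miniature `Ifc2x3` and `Ifc4` structs, their `name()` member and the functions are hypothetical stand-ins for the generated code, not the IfcOpenShell API.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical miniature "generated" schemas. They are structs rather than
// namespaces so that they can be passed as template arguments.
struct Ifc2x3 {
    static const char* name() { return "IFC2X3"; }
    struct IfcWall {
        std::string id;
        std::string GlobalId() const { return id; }
    };
};

struct Ifc4 {
    static const char* name() { return "IFC4"; }
    struct IfcWall {
        std::string id;
        std::string GlobalId() const { return id; }
    };
};

// Schema-agnostic code, written once. Schema cannot be deduced from the
// argument, so callers have to spell it out: describe_wall<Ifc4>(wall).
template <typename Schema>
std::string describe_wall(const typename Schema::IfcWall& wall) {
    std::ostringstream out;
    out << Schema::name() << " wall " << wall.GlobalId();
    return out.str();
}

// Creates a wall of the given schema and describes it.
template <typename Schema>
std::string make_and_describe(const std::string& id) {
    assert(!id.empty());
    typename Schema::IfcWall wall;
    wall.id = id;
    return describe_wall<Schema>(wall);
}

// The single runtime switch from a schema identifier to an instantiation.
std::string describe_for(const std::string& schema_identifier, const std::string& id) {
    if (schema_identifier == "IFC2X3") return make_and_describe<Ifc2x3>(id);
    if (schema_identifier == "IFC4") return make_and_describe<Ifc4>(id);
    throw std::invalid_argument("unsupported schema: " + schema_identifier);
}
```

Here `describe_for("IFC4", "abc")` returns "IFC4 wall abc". Everything below the switch is checked by the compiler for both schemas, at the price of the extra `typename` keywords and explicit template arguments mentioned in the downsides.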

All three approaches are used in the IfcOpenShell code-base.

Other improvements:


Multi-threading in collaboration with TNO, MAUC and Airsquire

Direct binary glTF v2.0 output (previously supported through Collada and Collada2Gltf) in collaboration with Schuco US.

Exciting developments called BlenderBIM

Much more efficient handling of detailed facesets.

Remember, get the latest builds from IfcOpenBot https://github.com/IfcOpenBot/IfcOpenShell/commit/9bcd932bed48486bf5b5f48d24b49329c280462f#comments

2 comments:

  1. Just want to mention: oipExpress also contains an EXPRESS parser, see here: https://bitbucket.org/tumcms/oipexpress/src/default/ - The EXPRESS parser is used to generate early bindings for different IFC versions. The different bindings coexist in different namespaces, i.e. namespace IFC4x1, namespace IFC2x3, etc. To avoid writing identical code to handle geometry data, some template abuse is done, e.g. https://bitbucket.org/tumcms/openinfraplatform/src/default/EMTCodeGen/CodeGen/EMTIfcEntityTypes.h - have you ever seen a template with more than 600 parameters ;) - and yes it works ;). You can read more about this in my PhD thesis, which is unfortunately written in German (https://mediatum.ub.tum.de/doc/1453871/1453871.pdf). In the meantime, I learned some lessons and I would change a few design and tool decisions (e.g. use Bazel instead of CMake, do test-driven development). Nevertheless, consider it as a proof of concept. I was often thinking about implementing an interpreter for the EXPRESS language, and here too oipExpress can serve as a first building block.

    Replies
    1. Hi Julian, thanks for dropping by. Congrats on your thesis, I heard it was very well received. So with your approach you essentially have namespaces and templated dependent names combined. Interesting. Losing namespaces has been painful for some users who were doing a `using Ifc2x3` for example. But in the end I think all code has to be schema agnostic. Are there other benefits to namespaces over a schema struct, or other benefits to your approach that I'm overlooking?
