The Annotated VRML 97 Reference

  Copyright © 1997-99

Chapter 2
Key Concepts

2.6 Node semantics

2.6.1 Introduction

Each node may have the following characteristics:

  1. A type name. Examples include Box, Color, Group, Sphere, Sound, or SpotLight.
  2. Zero or more fields that define how each node differs from other nodes of the same type. Field values are stored in the VRML file along with the nodes, and encode the state of the virtual world.
  3. A set of events that it can receive and send. Each node may receive zero or more different kinds of events which will result in some change to the node's state. Each node may also generate zero or more different kinds of events to report changes in the node's state.
  4. An implementation. The implementation of each node defines how it reacts to events it can receive, when it generates events, and its visual or auditory appearance in the virtual world (if any). The VRML standard defines the semantics of built-in nodes (i.e., nodes with implementations that are provided by the VRML browser). The PROTO statement may be used to define new types of nodes, with behaviours defined in terms of the behaviours of other nodes.
  5. A name. Nodes can be named. This is used by other statements to reference a specific instantiation of a node.

TIP: The most commonly used values have been selected as the default values for each field. Therefore, it is recommended that you do not explicitly specify fields with default values since this will unnecessarily increase file size.
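As a rough illustration of the tip above, an exporter can compare each field against the node type's defaults and write only the fields that differ. The sketch below is in Python rather than VRML. The MATERIAL_DEFAULTS table lists the default field values of the Material node; write_node and the shape of its arguments are hypothetical helpers invented for this example, not part of any VRML tool.

```python
# Default field values of the VRML 2.0 Material node.
MATERIAL_DEFAULTS = {
    "ambientIntensity": (0.2,),
    "diffuseColor": (0.8, 0.8, 0.8),
    "emissiveColor": (0.0, 0.0, 0.0),
    "shininess": (0.2,),
    "specularColor": (0.0, 0.0, 0.0),
    "transparency": (0.0,),
}


def write_node(type_name, values, defaults):
    """Hypothetical helper: emit a node, skipping fields left at their defaults."""
    parts = []
    for name in sorted(values):
        if tuple(values[name]) != tuple(defaults[name]):
            numbers = " ".join(format(v, "g") for v in values[name])
            parts.append(name + " " + numbers)
    if not parts:
        return type_name + " { }"
    return type_name + " { " + " ".join(parts) + " }"
```

Given a red material whose other fields were left at their defaults, only diffuseColor is written; a material that matches every default collapses to an empty pair of braces.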

TECHNICAL NOTE: VRML's object model doesn't really match any of the object models found in formal programming languages (object oriented, delegation, functional, etc.). This is because VRML is not a general-purpose programming language; it is a persistent file format designed to store the state of a virtual world efficiently and to be read and written easily by both humans and a wide variety of tools.

TECHNICAL NOTE: Nodes in general may have a couple of other characteristics:
  1. A name assigned using the DEF keyword--See Section 2.3.2, Instancing, for details.
  2. An implementation--The implementations of the 54 nodes in the VRML 2.0 specification are built in. The PROTO mechanism (see Section 2.6, Prototypes) can be used to specify implementations for new nodes (specified as a composition of built-in nodes) and the EXTERNPROTO mechanism (see Section 2.6.4, Defining Prototypes in External Files) may be used to define new nodes with implementations that are outside the VRML file (see Section 2.8, Browser Extensions). Implementations are typically written in C, C++, or Java, and use a variety of system libraries for 3D graphics, sound, and other low-level support. The VRML specification defines an abstract functional model that is independent of any specific library.

2.6.2 DEF/USE semantics

A node given a name using the DEF keyword may later be referenced by name with USE or ROUTE statements. The USE statement does not create a copy of the node. Instead, the same node is inserted into the scene graph a second time, resulting in the node having multiple parents. Using an instance of a node multiple times is called instantiation.

Node names are limited in scope to a single file or prototype definition. A DEF name goes into scope immediately. Given a node named "NewNode" (i.e., DEF NewNode), any "USE NewNode" statements in SFNode or MFNode fields inside NewNode's scope refer to NewNode (see "2.4.4 Transformation hierarchy" for restrictions on self-referential nodes). PROTO statements define a node name scope separate from the rest of the file in which the prototype definition appears.

If multiple nodes are given the same name, each USE statement refers to the closest node with the given name preceding it in either the file or prototype definition.

TECHNICAL NOTE: DEF was an unfortunate choice of keyword, because it implies to many people that the node is merely being defined. The DEF syntax is
    DEF nodeName nodeType { fields } 

For example:

    DEF Red Material { diffuseColor 1 0 0 } 

A vote was taken during the VRML 2.0 design process to see if there was consensus that the syntax should be changed, either to change the keyword to something less confusing (like NAME) or to change the syntax to

    nodeType nodeName { fields } 

For example:

    Material Red { diffuseColor 1 0 0 } 

VRML 1.0 compatibility won out, so DEF is still the way you name nodes in VRML 2.0.

The rules for scoping node names in VRML also seem to cause a lot of confusion, probably because people see all of the curly braces in the VRML file format and think it must be a strange dialect of the C programming language. The rules are actually pretty simple: When you encounter a USE, just search backward from that point in the file for a matching DEF (skipping over PROTO definitions; see Section 2.6.3, Prototype Scoping Rules, for prototype scoping rules). Choosing some other scoping rule would either make VRML more complicated or would limit the kinds of graph structures that could be created in the file format, both of which are undesirable.
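The backward search described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the statements list is a hypothetical flattened view of one name scope, holding ("DEF", name) and ("USE", name) pairs in file order, with any PROTO bodies already removed because they form their own scopes.

```python
def resolve_use(statements, use_index):
    """Return the index of the DEF that the USE at use_index refers to, or None.

    statements is a hypothetical list of (keyword, name) pairs in file order,
    all belonging to the same name scope.
    """
    kind, name = statements[use_index]
    if kind != "USE":
        raise ValueError("statement at use_index is not a USE")
    # Walk backward from the USE; the first matching DEF is the closest one.
    for index in range(use_index - 1, -1, -1):
        if statements[index] == ("DEF", name):
            return index
    return None
```

When the same name is DEF'ed twice, each USE binds to whichever DEF most recently precedes it, so the result depends only on position in the file and not on brace nesting.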


TECHNICAL NOTE: Similarly, if an authoring tool allows users to multiply instance unnamed nodes, the tool will need to generate a name automatically in order to write the VRML file. The recommended convention for such names is an underscore followed by an integer (e.g., _3).

DEF/USE is in essence a simple mechanism for writing out pointers. The Inventor programming library required its file format to represent in-memory data structures that included nodes that pointed to other nodes (grouping nodes that contained other nodes as children, for example). The solution chosen was DEF/USE. One algorithm for writing out any arbitrary graph of nodes using DEF/USE is

  1. Traverse the scene graph and count the number of times that each node needs to be written out
  2. Traverse the scene graph again in the same order. At each node, if the node has not yet been written out and it will need to be written out multiple times, it is written out with a unique DEF name. If it has already been written out, just USE and the unique name are written. If it only needs to be written once, then it does not need to be DEF'ed and may be written without a name.

This algorithm writes out any arrangement of nodes, including recursive structures.

A simple way of generating unique names is to increment an integer every time a node is written out and give each node written the name "_integer": The first node is written as DEF _0 Node { ... } and so on. Another way of generating unique names is to write out an underscore followed by the address where the node is stored in memory (if you're using a programming language such as C, which allows direct access to pointers).
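The two-pass algorithm and the integer naming scheme can be sketched as follows. This is a minimal Python illustration, not code from any VRML tool: the Node class with a single children list is a hypothetical stand-in for a real scene graph, and the counter here advances once per DEF'ed node, which is one possible reading of the naming convention.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """Hypothetical in-memory node: a type name plus child references."""
    type_name: str
    children: list = field(default_factory=list)


def count_references(root):
    """Pass 1: count how many times each node is reached from the root."""
    counts = {}

    def visit(node):
        counts[id(node)] = counts.get(id(node), 0) + 1
        if counts[id(node)] == 1:  # descend only on the first visit
            for child in node.children:
                visit(child)

    visit(root)
    return counts


def write_scene(root):
    """Pass 2: write DEF on the first occurrence of a shared node, USE afterward."""
    counts = count_references(root)
    names = {}

    def emit(node):
        key = id(node)
        if key in names:
            return "USE " + names[key]
        prefix = ""
        if counts[key] > 1:
            # Register the name before descending so cycles resolve to USE.
            names[key] = "_%d" % len(names)
            prefix = "DEF " + names[key] + " "
        if not node.children:
            return prefix + node.type_name + " { }"
        body = " ".join(emit(child) for child in node.children)
        return prefix + node.type_name + " { children [ " + body + " ] }"

    return emit(root)
```

Because a shared node's name is recorded before its children are written, a node that refers back to itself is written once with DEF and then referenced with USE instead of recursing forever.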

The DEF feature also serves another purpose—you can give your nodes descriptive names, perhaps in an authoring tool that might display node names when you select objects to be edited, and thus allow you to select things by name and so on. The two uses for DEF—to give nodes a name and to allow arbitrary graphs to be written out—are orthogonal, and the conventions for generating unique names suggested in the specification (appending an underscore and an integer to the user-given name, if any) essentially suggest a scheme for separating these two functions. Given a name of the suggested form

     DEF userGivenName_instanceID ... 

The first part of the name, userGivenName, is the node's "true" name—the name given to the node by the user. The second part of the name, instanceID, is used only to ensure that the name is unique, and should never be shown to the user. If tools do not follow these conventions and come up with their own schemes for generating unique DEF/USE names, then after going through a series of read/write cycles a node originally named Spike might end up with a name that looks like %3521%Spike$83EFF*952—not what the user expects to see!
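A tool that follows this convention might split and rebuild names roughly as sketched below. The function names make_def_name and display_name are hypothetical, and the sketch assumes that the instance ID is always a trailing underscore followed by decimal digits; a user-given name that itself ends in that pattern would be ambiguous and would need extra handling.

```python
import re

# Trailing "_<digits>" is treated as the instance ID (an assumption of this sketch).
_INSTANCE_SUFFIX = re.compile(r"_\d+$")


def make_def_name(user_name, instance_id):
    """Combine the user's name (possibly empty) with a uniquifying integer."""
    return "%s_%d" % (user_name, instance_id)


def display_name(def_name):
    """Recover the part of a DEF name that should be shown to the user."""
    return _INSTANCE_SUFFIX.sub("", def_name)
```

Stripping the old suffix before appending a new one keeps the name stable across read/write cycles: Spike_12 is shown as Spike and can be rewritten as Spike_40 without accumulating decorations.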


2.6.3 Shapes and geometry

2.6.3.1 Introduction

The Shape node associates a geometry node with nodes that define that geometry's appearance. Shape nodes must be part of the transformation hierarchy to have any visible result, and the transformation hierarchy must contain Shape nodes for any geometry to be visible (the only nodes that render visible results are Shape nodes and the Background node). A Shape node contains exactly one geometry node in its geometry field. The following node types are valid geometry nodes:

  • Box
  • Cone
  • Cylinder
  • ElevationGrid
  • Extrusion
  • IndexedFaceSet
  • IndexedLineSet
  • PointSet
  • Sphere
  • Text

2.6.3.2 Geometric property nodes

Several geometry nodes contain Coordinate, Color, Normal, and TextureCoordinate as geometric property nodes. The geometric property nodes are defined as individual nodes so that instancing and sharing is possible between different geometry nodes.

2.6.3.3 Appearance nodes

Shape nodes may specify an Appearance node that describes the appearance properties (material and texture) to be applied to the Shape's geometry. The following node type may be specified in the material field of the Appearance node:

  • Material

The following nodes may be specified by the texture field of the Appearance node:

  • ImageTexture
  • MovieTexture
  • PixelTexture

The following node may be specified in the textureTransform field of the Appearance node:

  • TextureTransform

The interaction between such appearance nodes and the Color node is described in "2.14 Lighting Model".

TECHNICAL NOTE: Putting the geometric properties in separate nodes, instead of just giving the geometry or Shape nodes more fields, will also make it easier to extend VRML in the future. For example, supporting new material properties such as index of refraction requires only the specification of a new type of Material node, instead of requiring the addition of a new field to every geometry node. The texture nodes that are part of the specification are another good example of why making properties separate nodes is a good idea. Any of the three texture node types (ImageTexture, PixelTexture, or MovieTexture) can be used with any of the geometry nodes.

Separating out the properties into different nodes makes VRML files a little bigger and makes them harder to create using a text editor. The prototyping mechanism can be used to create new node types that don't allow properties to be shared, but reduce file size. For example, if you want to make it easy to create cubes at different positions with different colors you might define

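PROTO ColoredCube [ field SFVec3f position 0 0 0 
                    field SFColor color 1 1 1 ] 
{ 
  # position and color are passed down to the nodes below with IS
  Transform { translation IS position 
    children Shape { 
      geometry Box { }   # the VRML 1.0 Cube primitive is named Box in VRML 2.0
      appearance Appearance { 
        material Material { diffuseColor IS color } 
      } 
    } 
  } 
} 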

which might be used like this:

Group { children [ 
  ColoredCube { color 1 0 0 position 1.3 4.97 0 } 
  ColoredCube { color 0 1 0 position 0 -6.8 3 } 
]} 

Using the PROTO mechanism to implement application-specific compression can result in very small VRML files, but does make it more difficult to edit in general-purpose, graphical VRML tools.


2.6.3.4 Shape hint fields

The ElevationGrid, Extrusion, and IndexedFaceSet nodes each have three SFBool fields that provide hints about the shape such as whether the shape contains ordered vertices, whether the shape is solid, and whether the shape contains convex faces. These fields are ccw, solid, and convex, respectively.

The ccw field defines the ordering of the vertex coordinates of the geometry with respect to user-given or automatically generated normal vectors used in the lighting model equations. If ccw is TRUE, the normals shall follow the right hand rule; the orientation of each normal with respect to the vertices (taken in order) shall be such that the vertices appear to be oriented in a counterclockwise order when the vertices are viewed (in the local coordinate system of the Shape) from the opposite direction as the normal. If ccw is FALSE, the normals shall be oriented in the opposite direction. If normals are not generated but are supplied using a Normal node, and the orientation of the normals does not match the setting of the ccw field, results are undefined.

TIP: See Figure 2-3 for an illustration of the effect of the ccw field on an IndexedFaceSet's default normals.

Figure 2-3: ccw Field
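To make the ccw rule concrete, the following Python sketch computes a default geometric normal for one polygon from its ordered vertices and flips it when ccw is FALSE. It uses Newell's method, which is a common choice but is not mandated by the specification; the function name face_normal is hypothetical.

```python
import math


def face_normal(vertices, ccw=True):
    """Unit normal of a planar polygon given as a list of (x, y, z) vertices.

    With ccw=True the normal follows the right-hand rule: the vertices appear
    counterclockwise when viewed from the side the normal points toward.
    """
    nx = ny = nz = 0.0
    count = len(vertices)
    for i in range(count):
        x1, y1, z1 = vertices[i]
        x2, y2, z2 = vertices[(i + 1) % count]
        # Newell's method: accumulate contributions from each edge.
        nx += (y1 - y2) * (z1 + z2)
        ny += (z1 - z2) * (x1 + x2)
        nz += (x1 - x2) * (y1 + y2)
    length = math.sqrt(nx * nx + ny * ny + nz * nz)
    if length == 0.0:
        raise ValueError("degenerate polygon has no normal")
    sign = 1.0 if ccw else -1.0
    return (sign * nx / length, sign * ny / length, sign * nz / length)
```

For a unit square in the z = 0 plane listed counterclockwise as seen from +z, the result points along +z when ccw is TRUE and along -z when ccw is FALSE.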

The solid field determines whether one or both sides of each polygon shall be displayed. If solid is FALSE, each polygon shall be visible regardless of the viewing direction (i.e., no backface culling shall be done, and two-sided lighting shall be performed to illuminate both sides of lit surfaces). If solid is TRUE, the visibility of each polygon shall be determined as follows: Let V be the position of the viewer in the local coordinate system of the geometry. Let N be the geometric normal vector of the polygon, and let P be any point (besides the local origin) in the plane defined by the polygon's vertices. Then if (V dot N) - (N dot P) is greater than zero, the polygon shall be visible; if it is less than or equal to zero, the polygon shall be invisible (backface culled).
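The visibility test for solid TRUE translates directly into code. The Python sketch below uses the same symbols as the text (V, N, and P as 3-tuples in the geometry's local coordinate system); the function names are hypothetical.

```python
def dot(a, b):
    """Dot product of two 3-component vectors."""
    return sum(x * y for x, y in zip(a, b))


def is_polygon_visible(viewer, normal, point_on_plane, solid=True):
    """Apply the solid-field rule: (V dot N) - (N dot P) > 0 means visible."""
    if not solid:
        return True  # solid FALSE: both sides are displayed, no culling
    return dot(viewer, normal) - dot(normal, point_on_plane) > 0.0
```

A polygon lying in the z = 0 plane with normal (0, 0, 1) is visible from (0, 0, 5), culled from (0, 0, -5), and also culled when the viewer lies exactly in the polygon's plane, because the expression is then equal to zero.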

The convex field indicates whether all polygons in the shape are convex (TRUE). A polygon is convex if it is planar, does not intersect itself, and all of the interior angles at its vertices are less than 180 degrees. Non-planar and self-intersecting polygons may produce undefined results even if the convex field is FALSE.

TIP: It is recommended that you avoid creating nonplanar polygons, even though it is legal within VRML. Since the VRML specification does not specify a triangulation scheme, each browser may triangulate differently. This is especially important when creating objects with a low number of polygons; the triangulation is more apparent. One way to avoid this issue is to generate triangles rather than polygons.

TIP: Default field values throughout VRML were chosen to optimize for rendering speed. You should try to create objects that adhere to the following defaults: solid TRUE, convex TRUE, and ccw TRUE. If you provide normals for your objects, be especially careful that the orientation of the normals matches the setting of the ccw field; getting this wrong can result in completely black surfaces in some renderers.

TECHNICAL NOTE: It might be simpler if VRML simply had backface and twoSide flags to control polygon backface removal and two-sided lighting (although another flag to indicate the orientation of polygons would still be needed). However, the hints chosen allow implementations to perform these common optimizations without tying the VRML specification to any particular rendering technique. Backface removal, for example, should not be done if using a renderer that can display reflections.

2.6.3.5 Crease angle field

The creaseAngle field, used by the ElevationGrid, Extrusion, and IndexedFaceSet nodes, affects how default normals are generated. If the angle between the geometric normals of two adjacent faces is less than the crease angle, normals shall be calculated so that the faces are smooth-shaded across the edge; otherwise, normals shall be calculated so that a lighting discontinuity across the edge is produced. For example, a crease angle of .5 radians means that an edge between two adjacent polygonal faces will be smooth shaded if the geometric normals of the two faces form an angle that is less than .5 radians. Otherwise, the faces will appear faceted. Crease angles must be greater than or equal to 0.0.
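The per-edge decision can be sketched in Python as shown below. Only the comparison against creaseAngle comes from the text; the function names are hypothetical, and how a browser then averages normals across smooth edges is left out.

```python
import math


def angle_between(n1, n2):
    """Angle in radians between two nonzero 3-component normals."""
    dot = sum(a * b for a, b in zip(n1, n2))
    len1 = math.sqrt(sum(a * a for a in n1))
    len2 = math.sqrt(sum(b * b for b in n2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (len1 * len2)))
    return math.acos(cosine)


def is_smooth_edge(n1, n2, crease_angle):
    """True if the edge between two adjacent faces should be smooth-shaded."""
    if crease_angle < 0.0:
        raise ValueError("creaseAngle must be greater than or equal to 0.0")
    return angle_between(n1, n2) < crease_angle
```

With a crease angle of .5 radians, two faces whose normals differ by about 0.3 radians are smooth-shaded, while two perpendicular faces are faceted. A crease angle of 0.0 makes every edge faceted, since no angle is less than zero.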

TIP: See Figure 2-4 for an illustration of the effects of the creaseAngle field. Polygon face a and polygon face b have an angle between their normals that is less than the specified creaseAngle, and thus the generated normals at the vertex shared by a and b (Na and Nb) are identical and produce a smooth surface effect. However, the angle between the normals of polygons c and d is greater than the specified creaseAngle, and thus the generated normals (Nc and Nd) produce a faceted surface effect.

TIP: Specifying a single crease angle for each of your shapes instead of specifying individual normals at each of its vertices is a great bandwidth-saving technique. For almost every shape there is an appropriate crease angle that will produce smooth surfaces and sharp creases in the appropriate places.

TECHNICAL NOTE: An almost infinite number of geometry nodes could have been added to VRML 2.0. It was not easy to decide what should be included and what should be excluded, and additions were kept to a minimum because an abundance of geometry types makes it more difficult to write tools that deal with VRML files. A new geometry was likely to be included if it:

  1. Is much smaller than the equivalent IndexedFaceSet. The Open Inventor IndexedTriangleStripSet primitive was considered and rejected, because it was only (on average) one and one-half to two times smaller than the equivalent IndexedFaceSet. ElevationGrids and Extrusions are typically more than four times smaller than the equivalent IndexedFaceSet.
  2. Is reasonably easy to implement. Constructive Solid Geometry (CSG) and trimmed Non-Uniform Rational B-Splines (NURBS) were often-requested features that pass the "much smaller" criterion, but are very difficult to implement robustly.
  3. Is used in a large percentage of VRML worlds. Any number of additional primitive shapes (Torus, TruncatedCylinder, Teapot) could have been added as VRML primitives, but none of them is used often enough (outside of computer graphics research literature) to justify inclusion in the standard. In fact, the designers of VRML felt that the Sphere, Cone, Cylinder and Box primitives would not satisfy this criterion, either; they are part of VRML 2.0 only because they were part of VRML 1.0, and it is very difficult to remove any feature once a product or specification is widely used.


Figure 2-4: creaseAngle Field

2.6.4 Bounding boxes

Several of the nodes include a bounding box specification comprised of two fields, bboxSize and bboxCenter. A bounding box is a rectangular parallelepiped of dimension bboxSize centred on the location bboxCenter in the local coordinate system. This is typically used by grouping nodes to provide a hint to the browser on the group's approximate size for culling optimizations. The default size for bounding boxes (-1, -1, -1) indicates that the user did not specify the bounding box and the browser is to compute it or assume the most conservative case. A bboxSize value of (0, 0, 0) is valid and represents a point in space (i.e., an infinitely small box). Specified bboxSize field values shall be >= 0.0 or equal to (-1, -1, -1). The bboxCenter fields specify a position offset from the local coordinate system.

TECHNICAL NOTE: Why does VRML use axis-aligned bounding boxes instead of some other bounding volume representation such as bounding spheres? The choice was fairly arbitrary, but tight bounding boxes are very easy to calculate, easy to transform, and they have a better "worst-case" behavior than bounding spheres (the bounding box of a spherical object encloses less empty space than the bounding sphere of a long, skinny object).


The bboxCenter and bboxSize fields may be used to specify a maximum possible bounding box for the objects inside a grouping node (e.g., Transform). These are used as hints to optimize certain operations such as determining whether or not the group needs to be drawn. If the specified bounding box is smaller than the true bounding box of the group, results are undefined. The bounding box shall be large enough to completely contain the effects of all sound and light nodes that are children of this group. If the size of this group changes over time due to animating children or due to the addition of children nodes, the bounding box shall also be large enough to contain all possible changes. The bounding box shall be large enough to contain the union of the group's children's bounding boxes; it shall not include any transformations performed by the group itself (i.e., the bounding box is defined in the local coordinate system of the group).
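As an illustration of these rules, the Python sketch below checks a bboxSize value and computes the smallest box that contains the union of several child boxes. It assumes that every child box is already expressed in the group's local coordinate system and has an explicit size (not the (-1, -1, -1) sentinel); the function names are hypothetical. A world author would typically enlarge the result further to cover animation, lights, and sounds.

```python
UNSPECIFIED_SIZE = (-1.0, -1.0, -1.0)


def is_valid_bbox_size(size):
    """bboxSize must be all >= 0.0, or exactly (-1, -1, -1) for "unspecified"."""
    return tuple(size) == UNSPECIFIED_SIZE or all(s >= 0.0 for s in size)


def union_bbox(boxes):
    """Return (bboxCenter, bboxSize) enclosing every (center, size) box given."""
    boxes = list(boxes)
    if not boxes:
        raise ValueError("at least one child bounding box is required")
    lows = [float("inf")] * 3
    highs = [float("-inf")] * 3
    for center, size in boxes:
        for axis in range(3):
            half = size[axis] / 2.0
            lows[axis] = min(lows[axis], center[axis] - half)
            highs[axis] = max(highs[axis], center[axis] + half)
    center = tuple((lo + hi) / 2.0 for lo, hi in zip(lows, highs))
    size = tuple(hi - lo for lo, hi in zip(lows, highs))
    return center, size
```

For example, a 2 x 2 x 2 box at the origin combined with a 2 x 4 x 2 box centred at (3, 0, 0) gives a bboxCenter of (1.5, 0, 0) and a bboxSize of (5, 4, 2).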

TIP: See the illustration in Figure 2-5 of a grouping node and its bounding box. In this figure the grouping node contains three shapes: a Cone, a Cylinder, and a Sphere. The bounding box size is chosen to enclose the three geometries completely.


Figure 2-5: Grouping Node Bounding Boxes

TECHNICAL NOTE: Prespecified bounding boxes help browsers do two things: avoid loading parts of the world from across the network and avoid simulating parts of the world that can't be sensed. Both of these rely on the "out-of-sight-out-of-mind" principle: If the user cannot see or hear part of the world, then there's no reason for the VRML browser to spend any time loading or simulating that part of the world.

For many operations, a VRML browser can automatically calculate bounding volumes and automatically optimize away parts of the scene that aren't perceptible. For example, even if you do not prespecify bounding boxes in your VRML world, browsers can compute the bounding box for each part of the world and then avoid drawing the parts of the scene that are not visible. Since computing a bounding box for part of the world is almost always faster than drawing it, if parts of the world are not visible (which is usually the case), then doing this "render culling" will speed up the total time it takes to draw the world. Again, this can be done automatically and should not require that you prespecify bounding boxes.

However, some operations cannot be automatically optimized in this way because they suffer from a "chicken-and-egg" problem: The operation could be avoided if the bounding box is known, but to calculate the bounding box requires that the operation be performed!

Delaying loading parts of the world (specified using either the Inline node or an EXTERNPROTO definition) that are not perceptible falls into this category. If the bounding box of those parts of the world is known, then the browser will know if those parts of the world might be perceptible. However, the bounding box cannot be automatically calculated until those parts of the world are loaded.

One possible solution would be to augment the standard Web protocols (such as HTTP) to support a "get bounding box" request; then, instead of asking for an entire .wrl file to be loaded, a VRML browser could just ask the server to send it the bounding box of the .wrl file. Perhaps, eventually, Web servers will support such requests, but until VRML becomes ubiquitous it is unlikely there will be enough demand on server vendors to add VRML-specific features. Also, often the network bottleneck is not transferring the data, but just establishing a connection with a server, and this solution could worsen that bottleneck since it might require two connections (once for the bounding box information and once for the actual data) for each perceptible part of the world.

Extending Web servers to give bounding box information would not help avoid simulating parts of the world that aren't perceptible, either. Imagine a VRML world that contained a toy train set with a train that constantly traveled around the tracks. If the user is not looking at the train set, then there is no reason the VRML browser should spend any time simulating the movement of the train (which could be arbitrarily complicated and might involve movement of the train's wheels, engine, etc.). But the browser can't determine if the train is visible unless it knows where the train is; and it won't know exactly where the train is unless it has simulated its movement, which is exactly the work we hoped to avoid.

The solution is for the world creator to give the VRML browser some extra information in the form of an assertion about what might possibly happen. In the case of the toy train set, the user can give a maximum possible bounding box for the train that surrounds all the possible movements of the train. Note that if the VRML browser could determine all the possible movements of the train, then it could also do this calculation. However, calculating all possible movements can be very complicated and is often not possible at all because the movements might be controlled by an arbitrary program contained in a Script node. Usually it is much easier for the world creator (whether a computer program or a human being) to tell the browser the maximum possible extent of things.

Note also that the world's hierarchy can be put to very good use to help the browser minimize work. For example, it is common that an object have both a "large" motion through the world and "small" motions of the object's parts (e.g., a toy train moves along its tracks through the world, but may have myriad small motions of its wheels, engine, drive rods, etc.). If the object is modeled this way and appropriate maximum bounding boxes are specified, then a browser may be able to optimize away the simulation of the small motions after it simulates the large motion and determines that the object as a whole cannot be seen.

Once set, maximum bounding boxes cannot be changed. A maximum bounding box specification is an assertion; allowing the assertion to change over time makes implementations that rely on the assertion more complicated. The argument for allowing maximum bounding boxes to be changed is that the world author can often easily compute the bounding box for changing objects and thus offload the VRML browser from the work. However, this would require the VRML browser to execute the code continually to calculate the bounding box. It might be better to extend the notion of a bounding box to the more general notion of a bounding box that is valid until a given time. World authors could give assertions about an object's possible location over a specific interval of time, and the browser would only need to query the world-creator-defined Script after that time interval had elapsed. In any case, experimentation with either approach is possible by extending a browser with additional nodes defined with the EXTERNPROTO extension mechanism (see Section 2.8, Browser Extensions).


2.6.5 Grouping and children nodes

Grouping nodes have a children field that contains a list of nodes (exceptions to this rule are Inline, LOD, and Switch). Each grouping node defines a coordinate space for its children. This coordinate space is relative to the coordinate space of the node of which the group node is a child. Such a node is called a parent node. This means that transformations accumulate down the scene graph hierarchy.
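The way transformations accumulate down the hierarchy can be sketched with 4 x 4 matrices. The Python below is a simplified illustration: it models only translation and uniform scale (rotation and the other Transform fields are omitted), and all function names are hypothetical.

```python
def translation(tx, ty, tz):
    """4x4 matrix that translates by (tx, ty, tz)."""
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


def uniform_scale(s):
    """4x4 matrix that scales all three axes by s."""
    return [[s, 0, 0, 0], [0, s, 0, 0], [0, 0, s, 0], [0, 0, 0, 1]]


def multiply(a, b):
    """Matrix product a * b for 4x4 nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def local_to_world(ancestor_matrices):
    """Accumulate matrices ordered from the root group down to the innermost one."""
    result = translation(0, 0, 0)  # identity
    for matrix in ancestor_matrices:
        result = multiply(result, matrix)
    return result


def transform_point(matrix, point):
    """Apply a 4x4 matrix to an (x, y, z) point."""
    x, y, z = point
    v = (x, y, z, 1)
    return tuple(sum(matrix[i][k] * v[k] for k in range(4)) for i in range(3))
```

If a parent group scales by 2 and its child group translates by (1, 0, 0), the child's local origin lands at (2, 0, 0) in world space, because the child's translation is itself expressed in the parent's scaled coordinate space.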

The following node types are grouping nodes:

  • Anchor
  • Billboard
  • Collision
  • Group
  • Transform

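The following node types are children nodes:

  • Anchor
  • Background
  • Billboard
  • Collision
  • ColorInterpolator
  • CoordinateInterpolator
  • CylinderSensor
  • DirectionalLight
  • Fog
  • Group
  • Inline
  • LOD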
  • NavigationInfo
  • NormalInterpolator
  • OrientationInterpolator
  • PlaneSensor
  • PointLight
  • PositionInterpolator
  • ProximitySensor
  • ScalarInterpolator
  • Script
  • Shape
  • Sound
  • SpotLight
  • SphereSensor
  • Switch
  • TimeSensor
  • TouchSensor
  • Transform
  • Viewpoint
  • VisibilitySensor
  • WorldInfo
  • PROTO'd children nodes

The following node types are not valid as children nodes:

  • Appearance
  • AudioClip
  • Box
  • Color
  • Cone
  • Coordinate
  • Cylinder
  • ElevationGrid
  • Extrusion
  • ImageTexture
  • IndexedFaceSet
  • IndexedLineSet
  • Material
  • MovieTexture
  • Normal
  • PointSet
  • Sphere
  • Text
  • TextureCoordinate
  • TextureTransform

    TECHNICAL NOTE: Unlike VRML 1.0, the VRML 2.0 scene graph serves only as a transformation and spatial-grouping hierarchy. The transformation hierarchy allows the creation of jointed, rigid-body motion figures. The transformation hierarchy is also often used for spatial grouping. Tables and chairs can be defined in their own coordinate systems, grouped to form a set that can be moved around a house, which in turn is defined in its own coordinate system and grouped with other houses to create a neighborhood. Grouping things in this way is not only convenient, it also improves performance in most implementations.

    The VRML 1.0 scene graph also defined an object property hierarchy. For example, a texture property could be placed at any level of the scene hierarchy and could affect an entire subtree of the hierarchy. VRML 2.0 puts all properties inside the hierarchy's lowest level nodes—a texture property cannot be associated with a grouping node; it can only be associated with one or more Shape nodes.

    This simplified scene graph structure is probably the biggest difference between VRML 1.0 and VRML 2.0, and was motivated by feedback from several different implementors. Some rendering libraries have a simpler notion of rendering state than VRML 1.0, and the mismatch between these libraries and VRML was causing performance problems and implementation complexity.

    VRML 2.0's ability to change the values and topology of the scene graph over time makes it even more critical for the scene graph structure to match existing rendering libraries. It is fairly easy to convert a VRML file to the structure expected by a rendering library once; it is much more difficult to come up with a conversion scheme that efficiently handles a constantly changing scene.

    VRML 2.0's simpler structure means that each part of the scene graph is almost completely self-contained. An implementation can render any part of the scene graph if it knows:

    • what part of the scene graph to render (which children nodes)
    • the transformation for that part of the scene graph (the accumulated transformation of all Transform and Billboard nodes above that part of the scene graph)
    • the currently bound Fog parameters and all light sources that might affect this part of the scene graph

    For example, this makes it much easier for an implementation to render different parts of the scene graph at the same time or to rearrange the order in which it decides to render the scene (e.g., to group objects that use the same texture map, which is faster on some graphics hardware).


    All grouping nodes also have addChildren and removeChildren eventIn definitions. The addChildren event appends nodes to the grouping node's children field. Any nodes passed to the addChildren event that are already in the group's children list are ignored. For example, if the children field contains the nodes Q, L and S (in order) and the group receives an addChildren eventIn containing (in order) nodes A, L, and Z, the result is a children field containing (in order) nodes Q, L, S, A, and Z.

    The removeChildren event removes nodes from the grouping node's children field. Any nodes in the removeChildren event that are not in the grouping node's children list are ignored. If the children field contains the nodes Q, L, S, A and Z and it receives a removeChildren eventIn containing nodes A, L, and Z, the result is Q, S.
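
    The two events can be modelled with a short Python sketch. This is only an illustration of the semantics described above, not browser code: nodes are represented by strings, and the function names add_children and remove_children are made up for the example. Skipping a node that appears twice within a single addChildren event is an assumption of this sketch.

```python
def add_children(children, new_nodes):
    """Return children with new_nodes appended; nodes already present are ignored."""
    result = list(children)
    for node in new_nodes:
        if node not in result:
            result.append(node)
    return result


def remove_children(children, nodes_to_remove):
    """Return children without the listed nodes; nodes not present are ignored."""
    return [node for node in children if node not in nodes_to_remove]


group = add_children(["Q", "L", "S"], ["A", "L", "Z"])
print(group)                                    # ['Q', 'L', 'S', 'A', 'Z']
print(remove_children(group, ["A", "L", "Z"]))  # ['Q', 'S']
```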

    The Inline, Switch and LOD nodes are special group nodes that do not have all of the semantics of the regular grouping nodes (see "3.25 Inline", "3.26 LOD", and "3.46 Switch" for specifics).

    TECHNICAL NOTE: The order of a grouping node's children has no effect on the perceivable result; the children can be rearranged and there will be no change to the VRML world. This was a conscious design decision that simplifies the Open Inventor scene graph by eliminating most of the traversal state and enabling easier integration with rendering libraries (very few rendering libraries today support Inventor's rich traversal state). The net effect of this decision is smaller and simpler implementations, but more burden on the author to share attributes in the scene graph. It is important to note that the order of children is deterministic and cannot be altered by the implementation, since Script nodes may access children and assume that the order does not change.

    TECHNICAL NOTE: The LOD and Switch nodes are not considered grouping nodes because they have different semantics from the grouping nodes. Grouping nodes display all of their children, and the order of children for a grouping node is unimportant, while Switch and LOD display, at most, one of their "children" and their order is very important.


    Note that a variety of node types reference other node types through fields. Some of these are parent-child relationships, while others are not (there are node-specific semantics). Table 2-3 lists all node types that reference other nodes through fields.

    Table 2-3: Nodes with SFNode or MFNode fields

    Node Type        Field        Valid Node Types for Field
    ---------        -----        --------------------------
    Anchor           children     Valid children nodes
    Appearance       material     Material
                     texture      ImageTexture, MovieTexture, PixelTexture
    Billboard        children     Valid children nodes
    Collision        children     Valid children nodes
    ElevationGrid    color        Color
                     normal       Normal
                     texCoord     TextureCoordinate
    Group            children     Valid children nodes
    IndexedFaceSet   color        Color
                     coord        Coordinate
                     normal       Normal
                     texCoord     TextureCoordinate
    IndexedLineSet   color        Color
                     coord        Coordinate
    LOD              level        Valid children nodes
    Shape            appearance   Appearance
                     geometry     Box, Cone, Cylinder, ElevationGrid, Extrusion,
                                  IndexedFaceSet, IndexedLineSet, PointSet, Sphere, Text
    Sound            source       AudioClip, MovieTexture
    Switch           choice       Valid children nodes
    Text             fontStyle    FontStyle
    Transform        children     Valid children nodes

    2.6.6 Light sources

    Shape nodes are illuminated by the sum of all of the lights in the world that affect them. This includes the contribution of both the direct and ambient illumination from light sources. Ambient illumination results from the scattering and reflection of light originally emitted directly by light sources. The amount of ambient light is associated with the individual lights in the scene. This is a gross approximation to how ambient reflection actually occurs in nature.

    TECHNICAL NOTE: The VRML lighting model is a gross approximation of how lighting actually occurs in nature. It is a compromise between speed and accuracy, with more emphasis put on speed. A more physically accurate lighting model would require extra lighting calculations and result in slower rendering. VRML's lighting model is similar to those used by current computer graphics software and hardware.

    The following node types are light source nodes:

    • DirectionalLight
    • PointLight
    • SpotLight

    All light source nodes contain an intensity, a color, and an ambientIntensity field. The intensity field specifies the brightness of the direct emission from the light, and the ambientIntensity specifies the intensity of the ambient emission from the light. Light intensity may range from 0.0 (no light emission) to 1.0 (full intensity). The color field specifies the spectral colour properties of both the direct and ambient light emission, as an RGB value.

    TECHNICAL NOTE: The intensity field is really a convenience; adjusting the RGB values in the color field appropriately is equivalent to changing the intensity of the light. Or, in other words, the light emitted by a light source is equal to intensity × color. Similarly, setting the on field to FALSE is equivalent to setting the intensity and ambientIntensity fields to zero.

    Some photorealistic rendering systems allow light sinks — light sources with a negative intensity. They also sometimes support intensities of greater than 1.0. Interactive rendering libraries typically don't support those features, and since VRML is designed for interactive playback the specification only defines results for values in the 0.0 to 1.0 range.
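
    The equivalences described in this note can be written as a few lines of Python. This is a minimal sketch of the arithmetic only; the function name direct_emission and its parameters are invented for the example and do not correspond to any browser API.

```python
def direct_emission(color, intensity, on=True):
    """Return the RGB light emitted directly by a source: intensity x color.

    Treating on=False as zero intensity follows the equivalence stated in
    the note; the function itself is a hypothetical helper.
    """
    if not on:
        intensity = 0.0
    return tuple(intensity * channel for channel in color)


# Halving the intensity is the same as halving each RGB component.
print(direct_emission((1.0, 0.5, 0.0), 0.5))               # (0.5, 0.25, 0.0)
print(direct_emission((0.5, 0.25, 0.0), 1.0))              # (0.5, 0.25, 0.0)
# A light that is switched off emits nothing.
print(direct_emission((1.0, 0.5, 0.0), 1.0, on=False))     # (0.0, 0.0, 0.0)
```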


    PointLight and SpotLight illuminate all objects in the world that fall within their volume of lighting influence regardless of location within the file. PointLight defines this volume of influence as a sphere centred at the light (defined by a radius). SpotLight defines the volume of influence as a solid angle defined by a radius and a cutoff angle. DirectionalLights illuminate only the objects descended from the light's parent grouping node, including any descendent children of the parent grouping nodes.

    TECHNICAL NOTE: A good light source specification is difficult to design. There are two primary problems: first, how to scope light sources so that the "infinitely scalable" property of VRML is maintained and second, how to specify both the light's coordinate system and the objects that it illuminates.

    If light sources are not scoped in some way, then a VRML world that contains a lot of light sources requires that all of the light sources be taken into account when drawing any part of the world. By scoping light sources, only a subset of the lights in the world ever need to be considered, allowing worlds to grow arbitrarily large.

    For PointLight and SpotLight, the scoping problem is addressed by giving them a radius of effect. Nothing outside of the radius is affected by the light. Implementors will be forced to approximate this ideal behavior, because current interactive rendering libraries typically only support light attenuation and do not support a fixed radius beyond which no light falls. Content creators should choose attenuation constants such that the intensity of a light source is very close to zero at the cutoff radius (or, alternatively, choose a cutoff radius based on the attenuation constants).
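
    One possible way to follow this advice is sketched below in Python. It uses the attenuation factor from the lighting equations (see 2.14.4), 1/max(attenuation[0] + attenuation[1]×r + attenuation[2]×r², 1), and picks a purely quadratic falloff so that the light has dropped to a chosen residual fraction at the cutoff radius. The helper names and the default residual of 1% are assumptions made for this example, not values taken from the specification.

```python
def attenuation_factor(attenuation, r):
    """Attenuation factor at distance r for an (a0, a1, a2) attenuation triple."""
    a0, a1, a2 = attenuation
    return 1.0 / max(a0 + a1 * r + a2 * r * r, 1.0)


def quadratic_attenuation_for(radius, residual=0.01):
    """Return (1, 0, c) such that the factor at `radius` equals `residual`.

    Hypothetical helper: `residual` is the fraction of full intensity that
    is still tolerated at the cutoff radius (0 < residual <= 1).
    """
    if radius <= 0.0:
        raise ValueError("radius must be positive")
    if not 0.0 < residual <= 1.0:
        raise ValueError("residual must be in (0, 1]")
    c = (1.0 / residual - 1.0) / (radius * radius)
    return (1.0, 0.0, c)


radius = 10.0
attenuation = quadratic_attenuation_for(radius)
print("attenuation", attenuation)
print("factor at radius:", attenuation_factor(attenuation, radius))
print("factor at half radius:", attenuation_factor(attenuation, radius / 2))
```

    The alternative mentioned above, choosing the radius from existing attenuation constants, amounts to solving the same expression for r instead of for the constants.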

    A directional light sends parallel rays of light from a particular direction. Attenuation makes no sense for a directional light, since the light is not emanating from any particular location. Therefore, it makes no sense to try to specify a cutoff radius or any other spatial scoping. Instead, DirectionalLight is scoped by its position in the scene hierarchy, illuminating only sibling geometry (geometry underneath the same Group or Transform as the DirectionalLight). Although unrealistic, defining DirectionalLight this way allows efficient implementations and allows content creators a reasonable amount of control over the lighting of their virtual worlds.

    The second problem--defining the light's coordinate system separately from which objects the light illuminates--is addressed by the cutoff radius field of PointLight and SpotLight. Their position in the scene hierarchy determines only their location in space; they illuminate all objects that fall within the cutoff radius of that location. This makes implementing them more difficult, since the position of all point lights and spot lights must be known before anything is drawn. Current interactive rendering hardware and software make it even more difficult, since they support only a small number of light sources (e.g., eight) at once. Implementors can either turn light sources on and off as different pieces of geometry are drawn or can just use a few of the light sources and ignore the rest. The VRML 2.0 specification requires only that eight simultaneous light sources be supported (see Chapter 5, Conformance and Minimum Support Requirements). World creators should bear this in mind and minimize the number of light sources turned on at any given time.

    DirectionalLight does not attempt to decouple its position in the scene hierarchy from the objects that it illuminates. That can result in unrealistic behavior. For example, a directional light that illuminates everything inside a room will not illuminate an object that travels into the room unless that object is in the room's part of the scene hierarchy, and an object that moves outside the room will continue to be lit by the directional light until it is moved outside of the room Group. A better solution for moving objects around the scene hierarchy as their position in the virtual world changes may eventually be needed, but until then content creators will have to use existing mechanisms to get their desired results (e.g., by knowing the Group for each room in their virtual world and using addChildren/removeChildren events to move objects from one Group to another as they travel around the virtual world).


    2.6.7 Sensor nodes

    2.6.7.1 Introduction to sensors

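    The following node types are sensor nodes:

    • Anchor
    • Collision
    • CylinderSensor
    • PlaneSensor
    • ProximitySensor
    • SphereSensor
    • TimeSensor
    • TouchSensor
    • VisibilitySensor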

    Sensors are children nodes in the hierarchy and therefore may be parented by grouping nodes as described in "2.6.5 Grouping and children nodes."

    TECHNICAL NOTE: They are called sensors because they sense changes to something. Sensors detect changes to the state of an input device (TouchSensor, CylinderSensor, PlaneSensor, SphereSensor), changes in time (TimeSensor), or changes related to the motion of the viewer or objects in the virtual world (ProximitySensor, VisibilitySensor, and Collision group).

    Some often-requested features that did not make it into VRML 2.0 could be expressed as new sensor types. These are object-to-object collision detection, support for 3D input devices, and keyboard support.

    Viewer-object collision detection is supported by the Collision group, but object-to-object collision detection is harder to implement and much harder to specify. Only recently have robust, fast implementations for detecting collisions between any two objects in an arbitrary virtual world become available, and efficient algorithms for object-to-object collision detection are still an area of active research. Even assuming fast, efficient algorithms are widely available and reasonably straightforward to implement, it is difficult to specify precisely which nodes should be tested for collisions and what events should be produced when they collide. Designing a solution that works for a particular application (e.g., a game) is easy; designing a general solution that works for a wide range of applications is much harder.

    Support for input devices like 3D mice, 3D joysticks, and spatial trackers was also an often-requested feature. Ideally, a world creator would describe the desired interactions at a high level of abstraction so that users could use any input device they desired to interact with the world. There might be a Motion3DSensor that gives 3D positions and orientations in the local coordinate system, driven by whatever input device the user happened to be using.

    In practice, however, creating an easy-to-use experience requires knowledge of the capabilities and limitations of the input device being used. This is true even in the well-researched world of 2D input devices; drawing applications treat a pressure-sensitive tablet differently than a mouse.

    One alternative to creating a general sensor to support 3D input devices was to create many different sensors, one for each different device or class of devices. There were two problems with doing this: First, the authors of the VRML 2.0 specification are not experts in the subtleties of all of the various 3D input device technologies and second, it isn't clear that many world creators would use these new sensors since they would restrict the use of their worlds to people who had the appropriate input device (a very small percentage of computer users). It is expected that prototype extensions that support 3D input devices will be available and proposed for future revisions of the VRML specification.

    Unlike 3D input devices, keyboards are ubiquitous in the computing world. However, there is no KeyboardSensor in the VRML 2.0 standard. Virtual reality purists might argue that this is a good thing since keyboards have no place in immersive virtual worlds (and we should have SpeechSensor and FingerSensor instead), but that isn't the reason for its absence from the VRML specification. During the process of designing KeyboardSensor several difficult design issues arose for which no satisfactory solution was found. In addition, VRML is not designed to be a stand-alone, do-everything standard. It was designed to take advantage of the other standards that have been defined for the Internet whenever possible, such as JPEG, MPEG, Java, HTTP, and URLs.

    The simplest keyboard support would be reporting key-press and key-release events. For example, a world creator might want a platform to move up while a certain key is pressed and to move down when another key is pressed. Or, different keys on the keyboard might be used to "teleport" the user to different locations in the world. Adding support for a single KeyboardSensor of this type in a world would be straightforward, but designing for just a single KeyboardSensor goes against the composability design goals for VRML. It also duplicates functionality that is better left to other standards. For example, Java defines a set of keyboard events that may be received by a Java applet. Rather than wasting time duplicating the functionality of Java inside VRML, defining a general communication mechanism between a Java applet and a VRML world will give this functionality and much more.

    Java also defines TextArea and TextField components that allow entry of arbitrary text strings. Designing the equivalent functionality for text input inside a 3D world (e.g., fill-in text areas on the walls of a room) would require the definition of a 2D windowing system inside the 3D world. Input methods for international characters, keyboard focus management, and a host of other features would have to be reimplemented if a VRML solution were invented. Again, rather than wasting time duplicating the functionality of existing windowing systems, it might be better to define a general way of embedding existing 2D standards into the 3D world. Experimentation along these lines is certainly possible using the current VRML 2.0 standard. The ImageTexture node can point to arbitrary 2D content, and although only the PNG and JPEG image file formats are required, browser implementors could certainly support ImageTexture nodes that pointed to Java applets. They could even map mouse and keyboard events over the texture into the 2D coordinate space of the Java applet to support arbitrary interaction with Java applets pasted onto objects in a 3D world.


    Each type of sensor defines when an event is generated. The state of the scene graph after several sensors have generated events shall be as if each event is processed separately, in order. If sensors generate events at the same time, the state of the scene graph will be undefined if the results depend on the ordering of the events.

    TECHNICAL NOTE: Events generated by sensor nodes are given time stamps that specify exactly when the event occurred. These time stamps should be the exact or ideal time that the event occurred and not the time that the event happened to be generated by the sensor. For example, the time stamp for a TouchSensor's isActive TRUE event generated by clicking the mouse should be the actual time when the mouse button was pressed, even if it takes a few microseconds for the mouse-press event to be delivered to the VRML application. This isn't very important if events are handled in isolation, but can be critical in cases when the sequence or timing of multiple events is important. For example, the world creator might set a double-click threshold on an object. If the user clicks the mouse (or, more generally, activates the pointing device) twice rapidly enough, an animation is started. The browser may happen to receive one click just before it decides to rerender the scene and the other click after it is finished rendering the scene. If it takes the browser longer to render the scene than the double-click threshold and the browser time stamps the click events based on when it gets around to processing them, then the double-click events will be lost and the user will be very frustrated. Happily, modern operating and windowing systems are multithreaded and give the raw device events reasonably accurate time stamps that can be retrieved and used by VRML browsers.
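
    The double-click scenario in this note can be made concrete with a small Python sketch. The threshold, the sample times, and the function name is_double_click are all invented for illustration; the point is only that the same test gives different answers depending on which time stamps are fed into it.

```python
def is_double_click(first_press, second_press, threshold):
    """Return True if two presses (in seconds) fall within the threshold."""
    return 0.0 <= second_press - first_press <= threshold


THRESHOLD = 0.3  # hypothetical double-click threshold, in seconds

# Times at which the device actually reported the two presses.
device_times = (10.0, 10.2)
# Times at which a busy browser got around to processing them: the second
# click was only handled after a slow frame finished rendering.
processing_times = (10.0, 10.6)

print(is_double_click(*device_times, THRESHOLD))      # True
print(is_double_click(*processing_times, THRESHOLD))  # False
```

    With device time stamps the double-click is recognized regardless of how long the frame took to render; with processing times it is lost.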

    It is possible to create dependencies between various types of sensors. For example, a TouchSensor may result in a change to a VisibilitySensor node's transformation, which in turn may cause the VisibilitySensor node's visibility status to change.

    The following two sections classify sensors into two categories: environmental sensors and pointing-device sensors.

    TIP: If you create a paradoxical or indeterministic situation, your world may behave differently on different VRML browsers. Achieving identical (or at least almost-identical) results on different implementations is the primary reason for defining a VRML specification, so a lot of thought was put into designs that removed any possibilities of indeterministic results. For example, two sensors that generated events at exactly the same time could be given a well-defined order, perhaps based on which was created first or their position in the scene graph. Requiring implementations to do this was judged to be unreasonable, because different implementations will have different strategies for delaying the loading of different parts of the world (affecting the order in which nodes are created) and because the scene graph ordering can change over time. The overhead required to make all possible worlds completely deterministic isn't worth the runtime costs. Indeterministic situations are easy to avoid, can be detected and reported at run-time (so the world creator knows that they have a problem), and are never useful.


    2.6.7.2 Environmental sensors

    The ProximitySensor detects when the user navigates into a specified region in the world. The ProximitySensor itself is not visible. The TimeSensor is a clock that has no geometry or location associated with it; it is used to start and stop time-based nodes such as interpolators. The VisibilitySensor detects when a specific part of the world becomes visible to the user. The Collision grouping node detects when the user collides with objects in the virtual world. Pointing-device sensors detect user pointing events such as the user clicking on a piece of geometry (e.g., TouchSensor). Proximity, time, collision, and visibility sensors are each processed independently of whether others exist or overlap.

    2.6.7.3 Pointing-device sensors

    The following node types are pointing-device sensors:

    • Anchor
    • CylinderSensor
    • PlaneSensor
    • SphereSensor
    • TouchSensor

    A pointing-device sensor is activated when the user locates the pointing device over geometry that is influenced by that specific pointing-device sensor. Pointing-device sensors have influence over all geometry that is descended from the sensor's parent groups. In the case of the Anchor node, the Anchor node itself is considered to be the parent group. Typically, the pointing-device sensor is a sibling to the geometry that it influences. In other cases, the sensor is a sibling to groups which contain geometry (i.e., are influenced by the pointing-device sensor).

    The appearance properties of the geometry do not affect activation of the sensor. In particular, transparent materials or textures shall be treated as opaque with respect to activation of pointing-device sensors.

    TECHNICAL NOTE: It is a little bit strange that pointing device sensors sense hits on all of their sibling geometry. Geometry that occurs before the pointing device sensor in the children list is treated exactly the same as geometry that appears after the sensor in the children list. This is a consequence of the semantics of grouping nodes. The order of children in a grouping node is irrelevant, so the position of a pointing device sensor in the children list does not matter.

    Adding a sensor MFNode field to the grouping nodes as a place for sensors (instead of just putting them in the children field) was considered, but rejected because it added complexity to the grouping nodes, was less extensible, and produced little benefit.
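
    The scoping rule described above (a pointing-device sensor influences all geometry descended from its parent group, wherever the sensor sits in the children list) can be sketched in Python as follows. The Node class, its kind strings, and the influencing_sensors function are hypothetical and exist only for this example; node names are assumed to be unique, and Anchor's special case of acting as its own parent group is not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical scene graph node: kind is 'group', 'sensor', or 'geometry'."""
    name: str
    kind: str = "group"
    children: list = field(default_factory=list)


def influencing_sensors(root, target_name):
    """Return names of the sensors that influence the named geometry.

    Sensors attached to the nearest enclosing group come first.
    """
    def ancestors_of(node, path):
        if node.name == target_name:
            return path
        for child in node.children:
            found = ancestors_of(child, path + [node])
            if found is not None:
                return found
        return None

    ancestors = ancestors_of(root, [])
    if ancestors is None:
        return []
    names = []
    for group in reversed(ancestors):
        names.extend(c.name for c in group.children if c.kind == "sensor")
    return names


scene = Node("world", children=[
    Node("outerTouch", "sensor"),
    Node("room", children=[
        Node("box", "geometry"),         # listed before its sibling sensor
        Node("innerTouch", "sensor"),
    ]),
    Node("sphere", "geometry"),
])
print(influencing_sensors(scene, "box"))     # ['innerTouch', 'outerTouch']
print(influencing_sensors(scene, "sphere"))  # ['outerTouch']
```

    The box is influenced by innerTouch even though it precedes the sensor in the children list, matching the order-independence of grouping node children.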


    For a given user activation, the lowest, enabled pointing-device sensor in the hierarchy is activated. All other pointing-device sensors above the lowest, enabled pointing-device sensor are ignored. The hierarchy is defined by the geometry node over which the pointing-device sensor is located and the entire hierarchy upward. If there are multiple pointing-device sensors tied for lowest, each of these is activated simultaneously and independently, possibly resulting in multiple