The Virtual Reality Modeling Language

Version 1.0 Specification


Gavin Bell, Silicon Graphics, Inc.
Anthony Parisi, Intervista Software
Mark Pesce, VRML List Moderator

This document is located at http://www.vrml.org/VRML1.0/vrml10c.html


The Virtual Reality Modeling Language (VRML) is a language for describing multi-participant interactive simulations -- virtual worlds networked via the global Internet and hyper-linked with the World Wide Web. All aspects of virtual world display, interaction and internetworking can be specified using VRML. It is the intention of its designers that VRML become the standard language for interactive simulation within the World Wide Web.

The first version of VRML allows for the creation of virtual worlds with limited interactive behavior. These worlds can contain objects which have hyper-links to other worlds, HTML documents or other valid MIME types. When the user selects an object with a hyper-link, the appropriate MIME viewer is launched. When the user selects a link to a VRML document from within a correctly configured WWW browser, a VRML viewer is launched. Thus VRML viewers are the perfect companion applications to standard WWW browsers for navigating and visualizing the Web. Future versions of VRML will allow for richer behaviors, including animations, motion physics and real-time multi-user interaction.

This document specifies the features and syntax of Version 1.0 of VRML.

VRML Mission Statement

The history of the development of the Internet has had three distinct phases; first, the development of the TCP/IP infrastructure which allowed documents and data to be stored in a proximally independent way; that is, Internet provided a layer of abstraction between data sets and the hosts which manipulated them. While this abstraction was useful, it was also confusing; without any clear sense of "what went where", access to Internet was restricted to the class of sysops/net surfers who could maintain internal cognitive maps of the data space.

Next, Tim Berners-Lee's work at CERN, where he developed the hyper-media system known as World Wide Web, added another layer of abstraction to the existing structure. This abstraction provided an "addressing" scheme, a unique identifier (the Universal Resource Locator), which could tell anyone "where to go and how to get there" for any piece of data within the Web. While useful, it lacked dimensionality; there's no there there within the web, and the only type of navigation permissible (other than surfing) is by direct reference. In other words, I can only tell you how to get to the VRML Forum home page by saying, "http://www.wired.com/", which is not human-centered data. In fact, I need to make an effort to remember it at all. So, while the World Wide Web provides a retrieval mechanism to complement the existing storage mechanism, it leaves a lot to be desired, particularly for human beings.

Finally, we move to "perceptualized" Internetworks, where the data has been sensualized, that is, rendered sensually. If something is represented sensually, it is possible to make sense of it. VRML is an attempt (how successful, only time and effort will tell) to place humans at the center of the Internet, ordering its universe to our whims. In order to do that, the most important single element is a standard that defines the particularities of perception. Virtual Reality Modeling Language is that standard, designed to be a universal description language for multi-participant simulations.

These three phases, storage, retrieval, and perceptualization are analogous to the human process of consciousness, as expressed in terms of semantics and cognitive science. Events occur and are recorded (memory); inferences are drawn from memory (associations), and from sets of related events, maps of the universe are created (cognitive perception). What is important to remember is that the map is not the territory, and we should avoid becoming trapped in any single representation or world-view. Although we need to design to avoid disorientation, we should always push the envelope in the kinds of experience we can bring into manifestation!

This document is the living proof of the success of a process that was committed to being open and flexible, responsive to the needs of a growing Web community. Rather than re-invent the wheel, we have adapted an existing specification (Open Inventor) as the basis from which our own work can grow, saving years of design work and perhaps many mistakes. Now our real work can begin; that of rendering our noospheric space.


VRML was conceived in the spring of 1994 at the first annual World Wide Web Conference in Geneva, Switzerland. Tim Berners-Lee and Dave Raggett organized a Birds-of-a-Feather (BOF) session to discuss Virtual Reality interfaces to the World Wide Web. Several BOF attendees described projects already underway to build three dimensional graphical visualization tools which inter-operate with the Web. Attendees agreed on the need for these tools to have a common language for specifying 3D world description and WWW hyper-links -- an analog of HTML for virtual reality. The term Virtual Reality Markup Language (VRML) was coined, and the group resolved to begin specification work after the conference. The word 'Markup' was later changed to 'Modeling' to reflect the graphical nature of VRML.

Shortly after the Geneva BOF session, the www-vrml mailing list was created to discuss the development of a specification for the first version of VRML. The response to the list invitation was overwhelming: within a week, there were over a thousand members. After an initial settling-in period, list moderator Mark Pesce of Labyrinth Group announced his intention to have a draft version of the specification ready by the WWW Fall 1994 conference, a mere five months away. There was general agreement on the list that, while this schedule was aggressive, it was achievable provided that the requirements for the first version were not too ambitious and that VRML could be adapted from an existing solution. The list quickly agreed upon a set of requirements for the first version, and began a search for technologies which could be adapted to fit the needs of VRML.

The search for existing technologies turned up several worthwhile candidates. After much deliberation the list came to a consensus: the Open Inventor ASCII File Format from Silicon Graphics, Inc. The Inventor File Format supports complete descriptions of 3D worlds with polygonally rendered objects, lighting, materials, ambient properties and realism effects. A subset of the Inventor File Format, with extensions to support networking, forms the basis of VRML. Gavin Bell of Silicon Graphics has adapted the Inventor File Format for VRML, with design input from the mailing list. SGI has publicly stated that the file format is available for use in the open market, and has contributed a file format parser into the public domain to bootstrap VRML viewer development.

This is a clarified version of the 1.0 specification. No features have been added or changed from the original 1.0 version of the spec. This is a 'bug-fix' release of the spec, correcting misspellings, vague wording and misleading examples, and adding wording to better define the semantics of VRML.

Version 1.0 Requirements

VRML 1.0 is designed to meet the following requirements:

As with HTML, the above are absolute requirements for a network language standard; they should need little explanation here.

Early on the designers decided that VRML would not be an extension to HTML. HTML is designed for text, not graphics. Also, VRML requires even more finely tuned network optimizations than HTML; it is expected that a typical VRML world will be composed of many more "inline" objects and served up by many more servers than a typical HTML document. Moreover, HTML is an accepted standard, with existing implementations that depend on it. To impede the HTML design process with VRML issues and constrain the VRML design process with HTML compatibility concerns would be to do both languages a disservice. As a network language, VRML will succeed or fail independent of HTML.

It was also decided that, except for the hyper-linking feature, the first version of VRML would not support interactive behaviors. This was a practical decision intended to streamline design and implementation. Design of a language for describing interactive behaviors is a big job, especially when the language needs to express behaviors of objects communicating on a network. Such languages do exist; if we had chosen one of them, we would have risked getting into a "language war." People don't get excited about the syntax of a language for describing polygonal objects; people get very excited about the syntax of real languages for writing programs. Religious wars can extend the design process by months or years. In addition, networked inter-object operation requires brokering services such as those provided by CORBA or OLE, services which don't exist yet within WWW; we would have had to invent them. Finally, by keeping behaviors out of Version 1, we have made it a much smaller task to implement a viewer. We acknowledge that support for arbitrary interactive behaviors is critical to the long-term success of VRML; they will be included in Version 2.

Language Specification

The language specification is divided into the following sections:

Language Basics

At the highest level of abstraction, VRML is just a way for objects to read and write themselves. Theoretically, the objects can contain anything -- 3D geometry, MIDI data, JPEG images, anything. VRML defines a set of objects useful for doing 3D graphics. These objects are called Nodes.

Nodes are arranged in hierarchical structures called scene graphs. Scene graphs are more than just a collection of nodes; the scene graph defines an ordering for the nodes. The scene graph has a notion of state -- nodes earlier in the world can affect nodes that appear later in the world. For example, a Rotation or Material node will affect the nodes after it in the world. A mechanism is defined to limit the effects of properties (Separator nodes), allowing parts of the scene graph to be functionally isolated from other parts.

Applications that interpret VRML files need not maintain the scene graph structure internally; the scene graph is merely a convenient way of describing objects.
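The effect of state and of Separator nodes can be illustrated with a small traversal sketch. This is not part of the specification: the dictionary-based node representation and the traverse function are hypothetical, and only a single piece of state (the diffuse color) is tracked.

```python
def traverse(node, state, drawn):
    """Walk a node in file order, recording each shape with the current color."""
    kind = node["type"]
    if kind == "Separator":
        # Children see, and may change, a private copy of the state, so
        # nothing set inside the Separator leaks out to later siblings.
        local_state = dict(state)
        for child in node.get("children", []):
            traverse(child, local_state, drawn)
    elif kind == "Material":
        # A property node changes the state seen by the nodes after it.
        state["diffuseColor"] = node["diffuseColor"]
    else:
        # Anything else is treated as a shape in this sketch.
        drawn.append((kind, state.get("diffuseColor")))


scene = {"type": "Separator", "children": [
    {"type": "Separator", "children": [
        {"type": "Material", "diffuseColor": (1, 0, 0)},
        {"type": "Sphere"},
    ]},
    {"type": "Cube"},   # not affected by the red Material above
    {"type": "Material", "diffuseColor": (0, 0, 1)},
    {"type": "Cone"},   # affected by the blue Material just before it
]}

drawn = []
traverse(scene, {}, drawn)
```

In this sketch a color of None simply means that no Material has been seen yet; a real viewer would substitute the default material instead.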

A node has the following characteristics:

The syntax chosen to represent these pieces of information is straightforward:

DEF objectname objecttype { fields  children }

Only the object type and curly braces are required; nodes may or may not have a name, fields, and children.

Node names must not begin with a digit, and must not contain spaces or control characters, single or double quote characters, backslashes, curly braces, the plus character or the period character.

For example, this file contains a simple world defining a view of a red sphere and a blue cube, lit by a directional light:

#VRML V1.0 ascii
Separator {
    DirectionalLight {
        direction 0 0 -1  # Light shining from viewer into world
    }
    PerspectiveCamera {
        position    -8.6 2.1 5.6
        orientation -0.1352 -0.9831 -0.1233  1.1417
        focalDistance       10.84
    }
    Separator {   # The red sphere
        Material {
            diffuseColor 1 0 0   # Red
        }
        Translation { translation 3 0 1 }
        Sphere { radius 2.3 }
    }
    Separator {  # The blue cube
        Material {
            diffuseColor 0 0 1  # Blue
        }
        Transform {
            translation -2.4 .2 1
            rotation 0 1 1  .9
        }
        Cube {}
    }
}

General Syntax

For easy identification of VRML files, every VRML file must begin with the characters:

#VRML V1.0 ascii

Any characters after these on the same line are ignored. The line is terminated by either the ASCII newline or carriage-return characters.
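A reader might check for the header along the following lines. This is a minimal sketch, not a mandated algorithm; the function names has_vrml_header and split_header are made up for illustration.

```python
VRML_HEADER = "#VRML V1.0 ascii"


def has_vrml_header(text: str) -> bool:
    """Return True if the text begins with the required header characters."""
    return text.startswith(VRML_HEADER)


def split_header(text: str):
    """Split the text into (header line, remainder).

    The header line ends at the first newline or carriage return. Anything
    after the required characters on that line is kept in the header part,
    where a reader would simply ignore it.
    """
    if not has_vrml_header(text):
        raise ValueError("not a VRML 1.0 file")
    for index, ch in enumerate(text):
        if ch in "\r\n":
            return text[:index], text[index + 1:]
    return text, ""
```

If the line ends in a carriage return followed by a newline, the remainder starts with the newline, which is harmless because it is ordinary whitespace.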

The '#' character begins a comment; all characters until the next newline or carriage return are ignored. The only exception to this is within double-quoted SFString and MFString fields, where the '#' character will be part of the string.
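The comment rule, including the exception for double-quoted strings, can be sketched as a small character scanner. The function name strip_comments is hypothetical, and the sketch assumes that a backslash inside a string escapes the character that follows it.

```python
def strip_comments(text: str) -> str:
    """Remove '#' comments, leaving '#' inside double-quoted strings alone."""
    out = []
    in_string = False
    in_comment = False
    i = 0
    while i < len(text):
        ch = text[i]
        if in_comment:
            # A comment runs until the next newline or carriage return.
            if ch in "\r\n":
                in_comment = False
                out.append(ch)
        elif in_string:
            out.append(ch)
            if ch == "\\" and i + 1 < len(text):
                # Keep the escaped character so that \" does not end the string.
                out.append(text[i + 1])
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            out.append(ch)
        elif ch == "#":
            in_comment = True
        else:
            out.append(ch)
        i += 1
    return "".join(out)
```

Because the required header line itself begins with '#', a reader would check the header first and apply a routine like this only to the rest of the file.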

Note: Comments and whitespace may not be preserved; in particular, a VRML document server may strip comments and extraneous whitespace from a VRML file before transmitting it. Info nodes should be used for persistent information like copyrights or author information. Info nodes could also be used for object descriptions. New uses of named Info nodes for conveying syntactically meaningful information are deprecated. Use the extension nodes mechanism instead.

Blanks, tabs, newlines and carriage returns are whitespace characters wherever they appear outside of string fields. One or more whitespace characters separates the syntactical entities in VRML files, where necessary.

After the required header, a VRML file contains exactly one VRML node. That node may of course be a group node, containing any number of other nodes.

VRML is case-sensitive; 'Sphere' is different from 'sphere'.

Node names must not begin with a digit, and must not contain spaces or control characters, single or double quote characters, backslashes, curly braces, the sharp (#) character, the plus (+) character or the period character.
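The node name rule translates directly into a validity check. The sketch below is illustrative only: the function name is_valid_node_name is made up, "control characters" is taken to mean ASCII codes below 20H plus 7FH, and an empty name is treated as invalid.

```python
# Characters that may not appear anywhere in a node name: space, single and
# double quotes, backslash, curly braces, sharp, plus and period.
FORBIDDEN_IN_NODE_NAME = set(" '\"\\{}#+.")


def is_valid_node_name(name: str) -> bool:
    """Check a node name against the rule stated above."""
    if not name or name[0] in "0123456789":
        return False
    for ch in name:
        if ch in FORBIDDEN_IN_NODE_NAME:
            return False
        if ord(ch) < 0x20 or ord(ch) == 0x7F:  # control characters
            return False
    return True
```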

Field names start with lower case letters; node types start with upper case letters. The remaining characters may be any printable ASCII character (21H-7EH) except curly braces {}, square brackets [], single ' or double " quotes, sharp #, backslash \, plus +, period . or ampersand &.

Coordinate System

VRML uses a Cartesian, right-handed, 3-dimensional coordinate system. By default, objects are projected onto a 2-dimensional device by projecting them in the direction of the positive Z axis, with the positive X axis to the right and the positive Y axis up. A camera or modeling transformation may be used to alter this default projection.

The standard unit for lengths and distances specified is meters. The standard unit for angles is radians.

VRML worlds may contain an arbitrary number of local (or "object-space") coordinate systems, defined by modeling transformations using Translation, Rotation, Scale, Transform, and MatrixTransform nodes. Given a vertex V and a series of transformations such as:

Translation { translation T }
Rotation { rotation R }
Scale { scaleFactor S }
Coordinate3 { point V } PointSet { numPoints 1 }

the vertex is transformed into world-space to get V' by applying the transformations in the following order:

V' = T·R·S·V (if you think of vertices as column vectors) OR
V' = V·S·R·T (if you think of vertices as row vectors)
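The order of application can be checked numerically. The sketch below uses the column-vector convention and concrete values chosen only for illustration: T translates by 1 along X, R rotates 90 degrees about the Z axis, and S scales uniformly by 2. The helper names are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two 4x4 matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(m, v):
    """Apply a 4x4 matrix to a 3D point treated as a column vector."""
    col = [v[0], v[1], v[2], 1.0]
    out = [sum(m[i][k] * col[k] for k in range(4)) for i in range(4)]
    return tuple(out[:3])


def translation(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]


def rotation_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


def scale(sx, sy, sz):
    return [[sx, 0, 0, 0], [0, sy, 0, 0], [0, 0, sz, 0], [0, 0, 0, 1]]


T = translation(1, 0, 0)
R = rotation_z(math.pi / 2)
S = scale(2, 2, 2)

# V' = T·R·S·V: the scale is applied to the vertex first, the translation last.
M = matmul(matmul(T, R), S)
v_world = apply(M, (1.0, 0.0, 0.0))
```

With these values the vertex (1, 0, 0) is first scaled to (2, 0, 0), then rotated to (0, 2, 0), then translated to approximately (1, 2, 0). In other words, the transformation written closest to the geometry in the file acts on it first.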

Conceptually, VRML also has a "world" coordinate system as well as a viewing or "Camera" coordinate system. The various local coordinate transformations map objects into the world coordinate system. This is where the scene is assembled. The scene is then viewed through a camera, introducing another conceptual coordinate system. Nothing in VRML is specified using these coordinates. They are rarely found in optimized implementations where all of the steps are concatenated. However, having a clear model of the object, world and camera spaces will help authors.


There are two general classes of fields: fields that contain a single value (where a value may be a single number, a vector, or even an image), and fields that contain multiple values. Single-valued fields all have names that begin with "SF"; multiple-valued fields have names that begin with "MF". Each field type defines the format for the values it writes.

Multiple-valued fields are written as a series of values separated by commas, all enclosed in square brackets. If the field has zero values then only the square brackets ("[]") are written. The last value may optionally be followed by a comma. If the field has exactly one value, the brackets may be omitted and just the value written. For example, all of the following are valid for a multiple-valued field containing the single integer value 1:

1
[1,]
[ 1 ]
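The bracket and comma rules for multiple-valued fields can be sketched as follows for a field of integers. The function name parse_mf_long is hypothetical, and for brevity the sketch accepts decimal integers only.

```python
def parse_mf_long(text: str) -> list:
    """Parse a multiple-valued integer field written in any of the valid forms."""
    text = text.strip()
    if text.startswith("["):
        if not text.endswith("]"):
            raise ValueError("missing closing bracket")
        inner = text[1:-1].strip()
        if not inner:
            return []              # "[]" means zero values
        if inner.endswith(","):
            inner = inner[:-1]     # optional comma after the last value
        return [int(item) for item in inner.split(",")]
    # Without brackets the field must hold exactly one value.
    return [int(text)]
```

Under these assumptions the forms "1", "[1,]" and "[ 1 ]" all parse to the same one-element list.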


SFBitMask

A single-value field that contains a mask of bit flags. Nodes that use this field class define mnemonic names for the bit flags. SFBitMasks are written to file as one or more mnemonic enumerated type names, in this format:

( flag1 | flag2 | ... )

If only one flag is used in a mask, the parentheses are optional. These names differ among uses of this field in various node classes.

No more than 32 separate flags may be defined for an SFBitMask.
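A parser for this format might look like the sketch below. The function name parse_sf_bitmask and the flag table used in the usage lines are illustrative assumptions; each node class supplies its own mnemonic names.

```python
def parse_sf_bitmask(text: str, flags: dict) -> int:
    """Combine the named flags into one integer mask.

    `flags` maps each mnemonic name to its bit value.
    """
    text = text.strip()
    parenthesized = text.startswith("(")
    if parenthesized:
        if not text.endswith(")"):
            raise ValueError("missing closing parenthesis")
        text = text[1:-1]
    names = [name.strip() for name in text.split("|")]
    if len(names) > 1 and not parenthesized:
        # Parentheses are optional only when a single flag is used.
        raise ValueError("parentheses are required for more than one flag")
    mask = 0
    for name in names:
        if name not in flags:
            raise ValueError(f"unknown flag: {name!r}")
        mask |= flags[name]
    return mask


# Hypothetical flag table for illustration only.
PARTS = {"SIDES": 1, "TOP": 2, "BOTTOM": 4}
sides_and_top = parse_sf_bitmask("( SIDES | TOP )", PARTS)
bottom_only = parse_sf_bitmask("BOTTOM", PARTS)
```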


SFBool

A field containing a single boolean (true or false) value. SFBools may be written as 0 (representing FALSE), 1 (representing TRUE), TRUE, or FALSE.


SFColor/MFColor

Fields containing one (SFColor) or zero or more (MFColor) RGB colors. Each color is written to file as an RGB triple of floating point numbers in ANSI C floating point format, in the range 0.0 to 1.0. For example:

[ 1.0 0. 0.0, 0 1 0, 0 0 1 ]

is an MFColor field containing the three colors red, green, and blue.
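Reading such a field amounts to splitting on commas and then on whitespace. The sketch below uses a hypothetical function name, parse_mf_color, and is deliberately lenient: it skips empty entries, which tolerates a trailing comma but would also accept doubled commas.

```python
def parse_mf_color(text: str) -> list:
    """Parse an MFColor field into a list of (r, g, b) tuples."""
    text = text.strip()
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]
    colors = []
    for chunk in text.split(","):
        parts = chunk.split()
        if not parts:
            continue  # empty entry, e.g. after a trailing comma
        if len(parts) != 3:
            raise ValueError(f"expected an RGB triple, got {chunk!r}")
        rgb = tuple(float(p) for p in parts)
        if any(c < 0.0 or c > 1.0 for c in rgb):
            raise ValueError(f"color component out of range: {chunk!r}")
        colors.append(rgb)
    return colors


palette = parse_mf_color("[ 1.0 0. 0.0, 0 1 0, 0 0 1 ]")
```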


SFEnum

A single-value field that contains an enumerated type value. Nodes that use this field class define mnemonic names for the values. SFEnums are written to file as a mnemonic enumerated type name. The name differs among uses of this field in various node classes.


SFFloat/MFFloat

Fields that contain one (SFFloat) or zero or more (MFFloat) single-precision floating point numbers. SFFloats are written to file in ANSI C floating point format. For example:

[ 3.1415926, 12.5e-3, .0001 ]

is an MFFloat field containing three values.


SFImage

A field that contains an uncompressed 2-dimensional color or grey-scale image.

SFImages are written to file as three integers representing the width, height and number of components in the image, followed by width*height hexadecimal values representing the pixels in the image, separated by whitespace. A one-component image will have one-byte hexadecimal values representing the intensity of the image. For example, 0xFF is full intensity, 0x00 is no intensity. A two-component image puts the intensity in the first (high) byte and the transparency in the second (low) byte. Pixels in a three-component image have the red component in the first (high) byte, followed by the green and blue components (so 0xFF0000 is red). Four-component images put the transparency byte after red/green/blue (so 0x0000FF80 is semi-transparent blue). A value of 1.0 is completely transparent, 0.0 is completely opaque. Note: each pixel is actually read as a single unsigned number, so a 3-component pixel with value "0x0000FF" can also be written as "0xFF" or "255" (decimal). Pixels are specified from left to right, bottom to top. The first hexadecimal value is the lower left pixel of the image, and the last value is the upper right pixel.

For example,

1 2 1 0xFF 0x00

is a 1 pixel wide by 2 pixel high grey-scale image, with the bottom pixel white and the top pixel black. And:

2 4 3 0xFF0000 0xFF00 0 0 0 0 0xFFFFFF 0xFFFF00

is a 2 pixel wide by 4 pixel high RGB image, with the bottom left pixel red, the bottom right pixel green, the two middle rows of pixels black, the top left pixel white, and the top right pixel yellow.
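The pixel packing can be made concrete with a short decoder. This is a sketch under stated assumptions: the function name parse_sf_image is hypothetical, each value is read as a single unsigned number as described above, and only decimal and 0x-prefixed values are handled.

```python
def parse_sf_image(text: str):
    """Decode an SFImage into (width, height, components, rows).

    rows[0] is the bottom row of the image. Each pixel is a tuple of bytes
    ordered from the high byte to the low byte.
    """
    tokens = text.split()
    width, height, components = (int(t, 0) for t in tokens[:3])
    values = [int(t, 0) for t in tokens[3:]]
    if len(values) != width * height:
        raise ValueError("pixel count does not match width*height")
    pixels = []
    for value in values:
        # Peel off one byte per component, highest byte first.
        channels = tuple((value >> (8 * shift)) & 0xFF
                         for shift in range(components - 1, -1, -1))
        pixels.append(channels)
    rows = [pixels[r * width:(r + 1) * width] for r in range(height)]
    return width, height, components, rows


width, height, components, rows = parse_sf_image(
    "2 4 3 0xFF0000 0xFF00 0 0 0 0 0xFFFFFF 0xFFFF00")
```

Here rows[0] holds the red and green pixels of the bottom row and rows[3] holds the white and yellow pixels of the top row, matching the description of the second example.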


SFLong/MFLong

Fields containing one (SFLong) or zero or more (MFLong) 32-bit integers. SFLongs are written to file as an integer in decimal, hexadecimal (beginning with '0x') or octal (beginning with '0') format. For example:

[ 17, -0xE20, -518820 ]

is an MFLong field containing three values.
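The three number formats can be told apart by prefix. In the sketch below the function name parse_sf_long is hypothetical, and the range check assumes that the 32-bit integers are signed, which the text above does not state explicitly.

```python
def parse_sf_long(token: str) -> int:
    """Parse one SFLong written in decimal, hexadecimal or octal format."""
    text = token.strip()
    sign = 1
    if text.startswith("-"):
        sign = -1
        text = text[1:]
    elif text.startswith("+"):
        text = text[1:]
    if text.lower().startswith("0x"):
        value = int(text[2:], 16)      # hexadecimal
    elif len(text) > 1 and text.startswith("0"):
        value = int(text[1:], 8)       # octal
    else:
        value = int(text, 10)          # decimal
    value *= sign
    if not -2**31 <= value < 2**31:    # assumed signed 32-bit range
        raise ValueError("value does not fit in 32 bits")
    return value


values = [parse_sf_long(t) for t in "17, -0xE20, -518820".split(",")]
```

The prefix is tested explicitly because Python's own integer parser rejects C-style octal literals such as 017.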


SFMatrix

A field containing a transformation matrix. SFMatrices are written to file in row-major order as 16 floating point numbers separated by whitespace. For example, a matrix expressing a translation of 7.3 units along the X axis is written as:

1 0 0 0  0 1 0 0  0 0 1 0  7.3 0 0 1
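In the example the translation occupies the last row, which corresponds to treating points as row vectors multiplied on the left of the matrix. The sketch below parses the 16 values and applies the matrix that way; the function names parse_sf_matrix and transform_point are hypothetical.

```python
def parse_sf_matrix(text: str) -> list:
    """Parse 16 whitespace-separated numbers into a 4x4 list of rows."""
    values = [float(token) for token in text.split()]
    if len(values) != 16:
        raise ValueError("an SFMatrix needs exactly 16 values")
    return [values[r * 4:(r + 1) * 4] for r in range(4)]


def transform_point(point, m):
    """Apply the matrix to a 3D point treated as the row vector (x, y, z, 1)."""
    row = [point[0], point[1], point[2], 1.0]
    out = [sum(row[k] * m[k][j] for k in range(4)) for j in range(4)]
    return tuple(out[:3])


matrix = parse_sf_matrix("1 0 0 0  0 1 0 0  0 0 1 0  7.3 0 0 1")
moved = transform_point((1.0, 2.0, 3.0), matrix)
```

With this matrix the point (1, 2, 3) moves to approximately (8.3, 2, 3).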


SFRotation

A field containing an arbitrary rotation. SFRotations are written to file as four floating point values separated by whitespace. The 4 values represent an axis of rotation followed by the amount of right-handed rotation about that axis, in radians. For example, a 180 degree rotation about the Y axis is:

0 1 0  3.14159265
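One way to apply an axis and angle to a vector is Rodrigues' rotation formula, sketched below. The specification does not prescribe this method; the function name rotate is hypothetical, and the axis is normalized first on the assumption that it may not be written with unit length.

```python
import math


def rotate(vector, rotation):
    """Rotate a 3D vector by an SFRotation given as (x, y, z, angle)."""
    ax, ay, az, angle = rotation
    length = math.sqrt(ax * ax + ay * ay + az * az)
    if length == 0.0:
        raise ValueError("rotation axis must not be the zero vector")
    kx, ky, kz = ax / length, ay / length, az / length
    vx, vy, vz = vector
    c, s = math.cos(angle), math.sin(angle)
    dot = kx * vx + ky * vy + kz * vz
    # Cross product k x v.
    cx = ky * vz - kz * vy
    cy = kz * vx - kx * vz
    cz = kx * vy - ky * vx
    # v' = v cos(a) + (k x v) sin(a) + k (k . v) (1 - cos(a))
    return (vx * c + cx * s + kx * dot * (1 - c),
            vy * c + cy * s + ky * dot * (1 - c),
            vz * c + cz * s + kz * dot * (1 - c))


flipped = rotate((1.0, 0.0, 0.0), (0.0, 1.0, 0.0, 3.14159265))
```

Rotating the X axis by the 180 degree rotation from the example yields approximately (-1, 0, 0).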


SFString/MFString

Fields containing one (SFString) or zero or more (MFString) ASCII strings (sequences of characters). Strings are written to file as a sequence of ASCII characters in double quotes (optional if the string doesn't contain any whitespace). Any characters (including newlines and '#') may appear within the quotes. To include a double quote character within the string, precede it with a backslash. For example:

"One, Two, Three"
"He said, \"Immel did it!\""

are both valid strings.
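The quoting rules can be sketched as a small parser. The function name parse_sf_string is hypothetical, and the sketch assumes that a backslash is special only when it comes directly before a double quote; any other backslash is kept as written.

```python
def parse_sf_string(text: str) -> str:
    """Return the value of one SFString, quoted or unquoted."""
    text = text.strip()
    if not text.startswith('"'):
        # Quotes are optional only when the string contains no whitespace.
        if not text or any(ch.isspace() for ch in text):
            raise ValueError("unquoted strings may not contain whitespace")
        return text
    out = []
    i = 1
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] == '"':
            out.append('"')        # escaped double quote
            i += 2
        elif ch == '"':
            if i != len(text) - 1:
                raise ValueError("unexpected text after closing quote")
            return "".join(out)
        else:
            out.append(ch)
            i += 1
    raise ValueError("unterminated string")
```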


SFVec2f

Field containing a two-dimensional vector. SFVec2fs are written to file as a pair of floating point values separated by whitespace.


SFVec3f

Field containing a three-dimensional vector. SFVec3fs are written to file as three floating point values separated by whitespace.


VRML defines several different classes of nodes. Most of the nodes can be classified into one of three categories: shape, property or group. Shape nodes define the geometry in the world. Conceptually, they are the only nodes that draw anything. Property nodes affect the way shapes are drawn. Group nodes gather other nodes together, allowing collections of nodes to be treated as a single object. Some group nodes also control whether or not their children are drawn.

Nodes may contain zero or more fields. Each node type defines the type, name, and default value for each of its fields. The default value for the field is used if a value for the field is not specified in the VRML file. The order in which the fields of a node are read is not important; for example, "Cube { width 2 height 4 depth 6 }" and "Cube { height 4 depth 6 width 2 }" are equivalent.
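Defaults and order independence can be sketched by starting from a table of default values and overwriting whichever fields the file supplies. The function name parse_simple_node is hypothetical, the sketch handles only nodes whose fields are single numbers, and the Cube defaults of 2 for each dimension are an assumption not stated in this section.

```python
# Assumed defaults for illustration; this section does not list them.
CUBE_DEFAULTS = {"width": 2.0, "height": 2.0, "depth": 2.0}


def parse_simple_node(text: str, defaults: dict):
    """Parse 'Type { name value ... }' where every field is a single number."""
    head, _, rest = text.partition("{")
    body = rest.rsplit("}", 1)[0]
    tokens = body.split()
    if len(tokens) % 2:
        raise ValueError("fields must come in name/value pairs")
    fields = dict(defaults)            # start from the defaults
    for name, value in zip(tokens[0::2], tokens[1::2]):
        if name not in defaults:
            raise ValueError(f"unknown field: {name!r}")
        fields[name] = float(value)    # file order does not matter
    return head.strip(), fields


first = parse_simple_node("Cube { width 2 height 4 depth 6 }", CUBE_DEFAULTS)
second = parse_simple_node("Cube { height 4 depth 6 width 2 }", CUBE_DEFAULTS)
```

Both forms produce the same node, and a node written as "Cube {}" simply keeps every default.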

Here are the 36 nodes grouped by type. The first group is the shape nodes. These specify geometry:

The second group are the properties. These can be further grouped into properties of the geometry and its appearance, and matrix or transform properties:

These are the group nodes:

Finally, the following nodes do not fit neatly into any category.