ARB Shading Language Reference

This page is not meant to replace the official shader specifications. It is simply a quick reference that lists the most useful assembly commands, the available program parameters and the state variables.

Official ARB Shader Specifications

Shader Assembly Language (ARB/NV) Quick Reference Guide for OpenGL

ARB Vertex Program

Vertex Program Inputs:

Size: 1, 2, 3 or 4 values
Normalized: yes / no (defined in the RED::RenderCode)
Values: byte; unsigned byte; short; unsigned short; int; unsigned int; float; double. This is defined by the format of the corresponding mesh channels.
Syntax: vertex.attrib[0 - 15]

Vertex Program Outputs:

result.position: position in clip coordinates
result.color.primary: primary color
result.color.secondary: secondary color
result.fogcoord: fog coordinates
result.pointsize: point size
result.texcoord[0 - 6]: output texture coordinates

Vertex Program Parameters (Not Exhaustive):

program.env[a]: (x,y,z,w). Program environment parameter number ‘a’.
program.local[a]: (x,y,z,w). Program local parameter ‘a’.

Vertex Program State Matrices (Not Exhaustive):

state.matrix.modelview: Modelview matrix
state.matrix.projection: Projection matrix
state.matrix.mvp: Modelview-projection matrix
state.matrix.program[0]: View matrix (specified in the RED::RenderCode)
state.matrix.program[1]: Model matrix (specified in the RED::RenderCode)

Access to inverse and transpose matrices is possible through the addition of a suffix to the targeted matrix: ‘.inverse’ to get the inverse matrix; ‘.transpose’ to get the transposed matrix; ‘.invtrans’ to get the inverse transposed matrix.

Vertex program temporaries:

TEMP R0;
TEMP myValue;
PARAM myParam = program.local[10];

Note

Shader performance is better when fewer temporaries are used. Low-end hardware may limit the number of temporaries that can be declared.

Vertex program commands: (V: Vector; S: Scalar; U: Texture image unit identifier; T: Texture target)

Instruction	Inputs	Outputs	Description
ABS	V	V	Absolute value. ABS R1, R1; // R1 = |R1|
ADD	V, V	V	Addition. ADD R0, R1, R2; // R0 = R1 + R2;
DP3	V, V	S, S, S, S	3 components dot product. DP3 R0, R1, R2; // R0 = R1.x * R2.x + R1.y * R2.y + R1.z * R2.z;
DP4	V, V	S, S, S, S	4 components dot product. DP4 R0, R1, R2; // R0 = R1.x * R2.x + R1.y * R2.y + R1.z * R2.z + R1.w * R2.w;
DPH	V, V	S, S, S, S	Homogeneous dot product. DPH R0, R1, R2; // R0 = R1.x * R2.x + R1.y * R2.y + R1.z * R2.z + R2.w;
DST	V, V	V	Distance vector. Refer to full specification for details.
EX2	S	S, S, S, S	Exponential base 2 (approximate). Refer to full specification for details.
EXP	S	V	Exponential base 2 (approximate). Refer to full specification for details.
FLR	V	V	Floor. Component wise floor operation. Floor is the largest integer value less than or equal to the value. The floor of 2.3 is 2.0; the floor of -3.6 is -4.0.
FRC	V	V	Fractional part. Subtracts the floor of each component of the operand from that component to generate the result vector. The result is always in [0, 1). FRC R0, R0; // Is equivalent to: FLR R1, R0; SUB R0, R0, R1;
LG2	S	S, S, S, S	Logarithm base 2 (approximate). Refer to full specification for details.
LIT	V	V	Compute lighting coefficients. Refer to full specification for details.
LOG	S	V	Logarithm base 2 (approximate). Refer to full specifications for details.
MAD	V, V, V	V	Multiply and add. Multiply the two first operands, add the third one. Component wise operation. MAD R0, R1, R2, R3; // R0 = R1 * R2 + R3
MAX	V, V	V	Maximum. Component wise maximum of the two operands. MAX R0, R1, R2; // R0.x = (R1.x > R2.x) ? R1.x : R2.x; Same for y, z, w.
MIN	V, V	V	Minimum. Component wise minimum of the two operands. MIN R0, R1, R2; // R0.x = (R1.x < R2.x) ? R1.x : R2.x; Same for y, z, w.
MOV	V	V	Move. Vector move. MOV R1, value; // R1 = value;
MUL	V, V	V	Multiply. Component wise multiplication of the two operands. MUL R0, R1, R2; // R0 = R1 * R2
POW	S, S	S, S, S, S	Exponentiate (approximate). Raises the first operand to the power of the second operand. The result is replicated to the four components of the result vector. POW R0, R1.x, R2.z; // R0.xyzw = pow( R1.x, R2.z )
RCP	S	S, S, S, S	Reciprocal. Approximates the reciprocal of the scalar operand, and replicates the result to the four components of the result vector. RCP R0, R0.x; // R0.xyzw = 1.0 / R0.x
RSQ	S	S, S, S, S	Reciprocal square root. Approximates the reciprocal square root of the absolute value of the scalar operand, and replicates the result to the four components of the result vector. RSQ R0, R0.x; // R0.xyzw = 1.0 / sqrt( |R0.x| )
SGE	V, V	V	Set on greater than or equal to. Performs a component wise comparison of the two operands. Each component of the result vector is 1.0 if the corresponding component of the first operand is greater than or equal to the value in the second operand and 0.0 otherwise. SGE R0, R1, R2; // R0.x = (R1.x >= R2.x) ? 1.0 : 0.0; Same for y, z, w.
SLT	V, V	V	Set on less than. Performs a component wise comparison of the two operands. Each component of the result vector is 1.0 if the corresponding component of the first operand is less than that of the second and 0.0 otherwise. SLT R0, R1, R2; // R0.x = (R1.x < R2.x) ? 1.0 : 0.0; Same for y, z, w.
SUB	V, V	V	Subtract. Component wise subtraction of the second operand from the first to yield a result vector. SUB R0, R1, R2; // Is equivalent to ADD R0, R1, -R2;
SWZ	V	V	Extended swizzle. Refer to full specifications for details. Each component of the result is taken from any component of the operand, optionally negated, or set to the constant 0 or 1. SWZ R0, R1, z, w, x, y; // R0 = R1.zwxy. A regular swizzle suffix written on an operand performs the same kind of reordering inline: ADD R0, R1, R0.zwxy;
XPD	V, V	V	Cross product. Refer to full specification for details. XPD R0, R1, R2; // With R0 distinct from R1 and R2, is equivalent to: MUL R0.xyz, R1.zxyw, R2.yzxw; MAD R0.xyz, R1.yzxw, R2.zxyw, -R0;
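
The dot product, cross product and extended swizzle rows above can be read as plain arithmetic on four-component registers. The following C++ sketch emulates them on the CPU for illustration only. The Vec4 type and the dp3, dph, xpd and swz function names are hypothetical helpers introduced here; they are not part of OpenGL or of HOOPS Luminate. The selector string passed to swz is also an assumption of this sketch and is not the assembly syntax.

```cpp
#include <array>
#include <cassert>
#include <cstring>

// Hypothetical register type for this illustration: x, y, z, w.
using Vec4 = std::array<float, 4>;

// DP3: 3-component dot product, replicated to all four result components.
inline Vec4 dp3(const Vec4& a, const Vec4& b) {
    const float d = a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
    return Vec4{d, d, d, d};
}

// DPH: like DP4, but the w component of the first operand is taken as 1.
inline Vec4 dph(const Vec4& a, const Vec4& b) {
    const float d = a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + b[3];
    return Vec4{d, d, d, d};
}

// XPD: cross product of the xyz parts. The instruction leaves w undefined;
// this sketch sets it to 0 as an arbitrary choice.
inline Vec4 xpd(const Vec4& a, const Vec4& b) {
    return Vec4{a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0],
                0.0f};
}

// SWZ: every result component is either a source component ('x', 'y', 'z',
// 'w') or the constant '0' or '1'. Per-component negation is left out here.
inline Vec4 swz(const Vec4& src, const char* selectors) {
    assert(selectors != nullptr && std::strlen(selectors) == 4);
    Vec4 r{};
    for (int i = 0; i < 4; ++i) {
        switch (selectors[i]) {
            case 'x': r[i] = src[0]; break;
            case 'y': r[i] = src[1]; break;
            case 'z': r[i] = src[2]; break;
            case 'w': r[i] = src[3]; break;
            case '0': r[i] = 0.0f; break;
            case '1': r[i] = 1.0f; break;
            default: assert(false && "unknown swizzle selector");
        }
    }
    return r;
}
```

For instance, swz(v, "zw01") copies v.z and v.w into the first two components and fills the last two with the constants 0 and 1, which a regular swizzle suffix cannot express.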

ARB Pixel / Fragment Program

Pixel Program Inputs:

fragment.texcoord[0-6]: texture coordinates interpolators

fragment.color.primary: Primary color. Clamped to [0, 1]

fragment.color.secondary: Secondary color. Clamped to [0, 1]

fragment.fogcoord: Fog coordinates. Single interpolated value

Pixel Program Outputs:

result.color: Output RGBA color

result.depth: Output fragment depth (in result.depth.z)

A pixel program must output at least one of these two values.

Pixel Program Parameters (not exhaustive):

program.env[a]: (x,y,z,w). Program environment parameter number ‘a’

program.local[a]: (x,y,z,w). Program local parameter ‘a’

Pixel program temporaries:

TEMP R0;

TEMP myValue;

PARAM myParam = program.local[10];

Note that shader performance is better when fewer temporaries are used. Low-end hardware may limit the number of temporaries that can be declared.

Pixel Program State Parameters:

The same state information is available as for vertex programs.

Pixel program commands:

Commands that are identical for both vertex and pixel programs are not described again here. This includes: ABS, ADD, DP3, DP4, DPH, DST, EX2, FLR, FRC, LG2, LIT, MAD, MAX, MIN, MOV, MUL, POW, RCP, RSQ, SGE, SLT, SUB, SWZ, XPD. EXP and LOG are not available in pixel programs. The commands specific to pixel programs are listed below:

Instruction	Inputs	Outputs	Description
CMP	V, V, V	V	Compare. Performs a component wise comparison of the first operand against zero, and copies the value of the second operand or third operand based on the results of the comparison. CMP R0, R1, R2, R3; // R0.x = (R1.x < 0.0) ? R2.x : R3.x; Same applies for y, z, w.
COS	S	S, S, S, S	Cosine with reduction to [-PI, PI]. Approximates the trigonometric cosine of the angle specified by the scalar operand and replicates it to all four components of the result vector.
KIL	V	-	Kill fragment. This instruction prevents a fragment from receiving any further processing if any component of its source vector is negative. Subsequent stages of the GL pipeline will be skipped for this fragment. It writes no result.
LRP	V, V, V	V	Linear interpolation. Performs a component wise linear interpolation between the second and third operands using the first operand as blend factor. LRP R0, R1, R2, R3; // R0.x = R1.x * R2.x + (1.0 - R1.x) * R3.x; Same applies for y, z, w.
SCS	S	S, S, -, -	Sine / Cosine without reduction. Refer to full specifications for details.
SIN	S	S, S, S, S	Sine with reduction to [-PI, PI]. Approximates the trigonometric sine of the angle specified by the scalar operand and replicates it to all four components of the result vector.
TEX	V, U, T	V	Texture sample. Takes the first three components of the source vector to sample from the specified texture target on the specified texture image unit. The resulting sample is mapped to RGBA and written to the result vector. Example: TEX R0, fragment.texcoord[2], texture[3], 2D;
TXB	V, U, T	V	Texture sample with bias. Refer to full specification for details.
TXP	V, U, T	V	Texture sample with projection. Divides the first three components of the source vector by the fourth component of the source vector. Performs a regular TEX after that.
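
The selection, blending and projection rows above can be summarized as plain arithmetic. The C++ sketch below emulates CMP, LRP, the KIL test and the coordinate division performed by TXP on the CPU, for illustration only. Vec4, cmp, lrp, kil and txpCoord are hypothetical names introduced here; the actual texture lookup is not modeled, and the guard against a zero fourth component exists only in this sketch.

```cpp
#include <array>
#include <cassert>

// Hypothetical register type for this illustration: x, y, z, w.
using Vec4 = std::array<float, 4>;

// CMP: per component, take b where the first operand is negative, else c.
inline Vec4 cmp(const Vec4& a, const Vec4& b, const Vec4& c) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i) r[i] = (a[i] < 0.0f) ? b[i] : c[i];
    return r;
}

// LRP: per component, blend b and c with the first operand as blend factor.
inline Vec4 lrp(const Vec4& t, const Vec4& b, const Vec4& c) {
    Vec4 r{};
    for (int i = 0; i < 4; ++i) r[i] = t[i] * b[i] + (1.0f - t[i]) * c[i];
    return r;
}

// KIL: returns true when the fragment would be discarded, that is when
// at least one component of the operand is negative.
inline bool kil(const Vec4& v) {
    for (int i = 0; i < 4; ++i) {
        if (v[i] < 0.0f) return true;
    }
    return false;
}

// TXP: coordinate used for the lookup after dividing x, y, z by w.
// The fourth component of the returned value is set to 1 by this sketch.
inline Vec4 txpCoord(const Vec4& tc) {
    assert(tc[3] != 0.0f);  // sketch-only guard
    return Vec4{tc[0] / tc[3], tc[1] / tc[3], tc[2] / tc[3], 1.0f};
}
```

A blend factor of 0.25 in lrp gives a quarter of the second operand plus three quarters of the third, and a component equal to exactly 0.0 does not trigger kil because only strictly negative values do.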

Indirect Shaders

Indirect shading refers to all GPU lighting calculations that are performed after a ray-casting process. Indirect lighting calculations work like direct lighting workflows, except that they do not use the same shaders. The whole indirect lighting workflow is described in the ‘My first indirect shader’ tutorial. Here, we focus on describing the services offered by HOOPS Luminate to program indirect lighting shaders.

Indirect Vertex Programs

Indirect vertex programs render quads instead of triangles. An indirect vertex program operates in WCS and transmits the triangle attributes to the pixel program. It rarely does more than that because the results of the ray versus triangle intersection are not known yet at this stage.

Indirect vertex program inputs:

Input position: This is a quad position that must be transformed to DCS as usual.

Triangle attributes bound by the RED::RenderCode used by the indirect shader. Every attribute is repeated 3 times, once for each vertex of the triangle. Therefore, if we send vertices and normals, we’ll receive vertex.attrib[1] to vertex.attrib[6] filled respectively with P0, P1, P2, N0, N1, N2, where P stands for position and N for normal.

The last received attribute is the triangle ID. In our previous example, it’s received as vertex.attrib[7].

Indirect vertex programs use the ‘Object to World’ space matrices. These matrices are set on the RED::RenderCode.
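
The attribute layout described above can be expressed as a small index computation. The C++ sketch below assumes that the pattern of the positions and normals example generalizes to any number of bound channels, with three consecutive slots per channel starting at vertex.attrib[1] and the triangle ID in the slot that follows. This generalization is inferred from the example and is not confirmed by the text. The function names are hypothetical and do not belong to the HOOPS Luminate API.

```cpp
#include <cassert>

// Index of the vertex.attrib[] slot holding a triangle attribute.
// channel: 0-based rank of the bound channel (0 = positions, 1 = normals in
//          the example of the text); vertex: 0, 1 or 2.
// Assumption: slot 0 is left for the quad position and channels are packed
// three slots at a time, as in the two-channel example.
inline int triangleAttribIndex(int channel, int vertex) {
    assert(channel >= 0 && vertex >= 0 && vertex < 3);
    return 1 + 3 * channel + vertex;
}

// Index of the slot holding the triangle ID, which comes right after the
// last triangle attribute.
inline int triangleIdAttribIndex(int channelCount) {
    assert(channelCount >= 0);
    return 1 + 3 * channelCount;
}
```

With two channels this gives slots 1 to 3 for P0, P1, P2, slots 4 to 6 for N0, N1, N2 and slot 7 for the triangle ID. Under the same assumption, the vertex.attrib[0 - 15] range leaves room for at most four such channels plus the triangle ID.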

Indirect vertex program outputs:

Triangle ID. Usually transferred using RED::ShaderString::TriangleIDTransfer.

Triangle attributes at P0, P1 - P0 and P2 - P0. Every received attribute should be transferred.

Indirect Pixel Programs

Indirect pixel programs perform the ray versus triangle intersection for the provided attributes. They then interpolate the triangle attributes at the intersection point and perform all lighting calculations from this local information. Indirect programs start by emitting the standard intersection calculation code:

psh.TriangleOwnershipTest();
psh.IntersectRayVsTriangle("ray_uv","ray_start","ray_dir",0);
psh.GetRayVsTriangleHitPoint("ray_hit","ray_uv","ray_start","ray_dir");
psh.GetRayVsTriangleUV("ray_uv");
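
Transferring each attribute as its value at P0 plus the two differences P1 - P0 and P2 - P0 matches the inputs of a standard ray versus triangle test, and makes interpolation a simple multiply and add. The C++ sketch below shows this on the CPU using the well-known Möller-Trumbore algorithm. It is only an illustration of the principle: it is not the code generated by the psh calls above, and Vec3, RayHit, intersectRayTriangle and interpolate are hypothetical names introduced here.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;

inline Vec3 sub(const Vec3& a, const Vec3& b) {
    return Vec3{a[0] - b[0], a[1] - b[1], a[2] - b[2]};
}
inline double dot(const Vec3& a, const Vec3& b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
inline Vec3 cross(const Vec3& a, const Vec3& b) {
    return Vec3{a[1] * b[2] - a[2] * b[1],
                a[2] * b[0] - a[0] * b[2],
                a[0] * b[1] - a[1] * b[0]};
}

// Ray parameter t and barycentric coordinates (u, v) of the hit point.
struct RayHit { double t; double u; double v; };

// Moller-Trumbore test. The triangle is given as P0 and its two edges
// e1 = P1 - P0 and e2 = P2 - P0. Returns false when the ray misses.
inline bool intersectRayTriangle(const Vec3& rayStart, const Vec3& rayDir,
                                 const Vec3& p0, const Vec3& e1, const Vec3& e2,
                                 RayHit& hit) {
    assert(dot(rayDir, rayDir) > 0.0);  // the direction must not be null
    const Vec3 pvec = cross(rayDir, e2);
    const double det = dot(e1, pvec);
    if (std::fabs(det) < 1e-12) return false;  // ray parallel to the triangle
    const double invDet = 1.0 / det;
    const Vec3 tvec = sub(rayStart, p0);
    const double u = dot(tvec, pvec) * invDet;
    if (u < 0.0 || u > 1.0) return false;
    const Vec3 qvec = cross(tvec, e1);
    const double v = dot(rayDir, qvec) * invDet;
    if (v < 0.0 || u + v > 1.0) return false;
    const double t = dot(e2, qvec) * invDet;
    if (t < 0.0) return false;  // triangle is behind the ray start
    hit = RayHit{t, u, v};
    return true;
}

// Interpolates any attribute stored as (A0, A1 - A0, A2 - A0) at (u, v).
inline Vec3 interpolate(const Vec3& a0, const Vec3& d1, const Vec3& d2,
                        double u, double v) {
    return Vec3{a0[0] + u * d1[0] + v * d2[0],
                a0[1] + u * d1[1] + v * d2[1],
                a0[2] + u * d1[2] + v * d2[2]};
}
```

Calling interpolate with the position triple returns the hit point itself. Calling it with the UV or normal triples returns the interpolated UV or normal at that point, which is why every received attribute is worth transferring in this form.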

Indirect pixel program inputs:

Triangle ID transferred from the indirect vertex program.

Ray starting position textures (RED_SHADER_INDIRECT_RAY_POS_TEX reference).

Ray direction textures (RED_SHADER_INDIRECT_RAY_DIR_TEX reference).

Triangle attributes: Vertex positions P0, P1 - P0, P2 - P0.

Triangle attributes (optional): UV at P0, P1 - P0, P2 - P0.

Other triangle attributes for interpolation (optional): Normals, tangents, etc.

Indirect pixel program outputs:

result.color: Fragment color.

An indirect pixel program cannot output depth. Both depth testing and depth writing are disabled for indirect pixel programs.