Embedded.com   

Non-modifiable Lvalues
By Dan Saks, Embedded Systems Programming
Jul 2 2001 (9:27 AM)
URL: http://www.embedded.com/showArticle.jhtml?articleID=9900207

Lvalues actually come in a variety of flavors. If you really want to understand how compilers evaluate expressions, you'd better develop a taste.

An expression is a sequence of operators and operands that specifies a computation. That computation might produce a resulting value and it might generate side effects. An assignment expression has the form:

e1 = e2

where e1 and e2 are themselves expressions. The right operand e2 can be any expression, but the left operand e1 must be an lvalue expression. That is, it must be an expression that refers to an object. As I explained last month ("Lvalues and Rvalues," June 2001, p. 70), the "l" in lvalue stands for "left," as in "the left side of an assignment expression." For example:

int n;

declares n as an object of type int. When you use n in an assignment expression such as:

n = 3;

the n is an expression (a subexpression of the assignment expression) referring to an int object. The expression n is an lvalue. On the other hand:

3 = n;

causes a compilation error, and well it should, because it's trying to change the value of an integer constant. Although the assignment's left operand 3 is an expression, it's not an lvalue. It's an rvalue. An rvalue is simply any expression that is not an lvalue. It doesn't refer to an object; it just represents a value.

Although lvalue gets its name from the kind of expression that must appear to the left of an assignment operator, that's not really how Kernighan and Ritchie defined it. In the first edition of The C Programming Language (Prentice-Hall, 1978), they defined an lvalue as "an expression referring to an object." At that time, the set of expressions referring to objects was exactly the same as the set of expressions eligible to appear to the left of an assignment operator. But that was before the const qualifier became part of C and C++.

The const qualifier renders the basic notion of lvalues inadequate to describe the semantics of expressions. We need to be able to distinguish between different kinds of lvalues. And that's what I'm about to show you how to do. But first, let me recap.

A few key points

The assignment operator is not the only operator that requires an lvalue as an operand. The unary & (address-of) operator requires an lvalue as its sole operand. That is, &n is a valid expression only if n is an lvalue. Thus, an expression such as &3 is an error. The literal 3 does not refer to an object, so it's not addressable.

Not only is every operand either an lvalue or an rvalue, but every operator yields either an lvalue or an rvalue as its result. For example, the binary + operator yields an rvalue. Given integer objects m and n:

m + 1 = n;

is an error. The + operator has higher precedence than the = operator. Thus, the assignment expression is equivalent to:

(m + 1) = n; // error

which is an error because m + 1 is an rvalue.

An operator may require an lvalue operand, yet yield an rvalue result. The unary & is one such operator. For example:

int n, *p;
...
p = &n; // ok
&n = p; // error: &n is an rvalue

On the other hand, an operator may accept an rvalue operand, yet yield an lvalue result, as is the case with the unary * operator. A valid, non-null pointer p always points to an object, so *p is an lvalue. For example:

int a[N];
int *p = a;
...
*p = 3; // ok

Although the result is an lvalue, the operand can be an rvalue, as in:

*(p + 1) = 4; // ok

With this in mind, let's look at how the const qualifier complicates the notion of lvalues.

Lvalues and the const qualifier

A const qualifier appearing in a declaration modifies the type in that declaration, or some portion thereof. For example: int const n = 127;

declares n as object of type "const int." The expression n refers to an object, almost as if const weren't there, except that n refers to an object the program can't modify. For example, an assignment such as:

n = 0; // error, can't modify n

produces a compile-time error, as does:

++n; // error, can't modify n

(I covered the const qualifier in depth in several of my earlier columns. See "Placing const in Declarations," June 1998, p. 19 or "const T vs. T const," February 1999, p. 13, among others.) How is an expression referring to a const object such as n any different from an rvalue? After all, if you rewrite each of the previous two expressions with an integer literal in place of n, as in:

7 = 0; // error, can't modify literal
++7; // error, can't modify literal

they're both still errors. You can't modify n any more than you can an rvalue, so why not just say n is an rvalue, too? The difference is that you can take the address of a const object, but you can't take the address of an integer literal. For example:

int const *p;
...
p = &n; // ok
p = &7; // error

Notice that p declared just above must be a "pointer to const int." If you omitted const from the pointer type, as in:

int *p;

then the assignment:

p = &n; // error, invalid conversion

would be an error. When you take the address of a const int object, you get a value of type "pointer to const int," which you cannot convert to "pointer to int" unless you use a cast, as in:

p = (int *)&n; // (barely) ok

Although the cast makes the compiler stop complaining about the conversion, it's still a hazardous thing to do. (See "What const Really Means," August 1998, p. 11.)

Thus, an expression that refers to a const object is indeed an lvalue, not an rvalue. However, it's a special kind of lvalue called a non-modifiable lvalue-an lvalue that you can't use to modify the object to which it refers. This is in contrast to a modifiable lvalue, which you can use to modify the object to which it refers.

Once you factor in the const qualifier, it's no longer accurate to say that the left operand of an assignment must be an lvalue. Rather, it must be a modifiable lvalue. In fact, every arithmetic assignment operator, such as += and *=, requires a modifiable lvalue as its left operand. For all scalar types:

x += y; // arithmetic assignment

is equivalent to:

x = x + y; // assignment

except that it evaluates x only once. Since the x in this assignment must be a modifiable lvalue, it must also be a modifiable lvalue in the arithmetic assignment. Not every operator that requires an lvalue operand requires a modifiable lvalue. The unary & operator accepts either a modifiable or a non-modifiable lvalue as its operand. For example, given:

int m;
int const n = 10;

&m is a valid expression returning a result of type "pointer to int," and &n is a valid expression returning a result of type "pointer to const int."

What it is that's really non-modifiable

Earlier, I said a non-modifiable lvalue is an lvalue that you can't use to modify an object. Notice that I did not say a non-modifiable lvalue refers to an object that you can't modify-I said you can't use the lvalue to modify the object. The distinction is subtle but nonetheless important, as shown in the following example. Consider:

int n = 0;
int const *p;
...
p = &n;

At this point, p points to n, so *p and n are two different expressions referring to the same object. However, *p and n have different types. As I explained in an earlier column ("What const Really Means"), this assignment uses a qualification conversion to convert a value of type "pointer to int" into a value of type "pointer to const int." Expression n has type "(non-const) int." It is a modifiable lvalue. Thus, you can use n to modify the object it designates, as in:

n += 2;

On the other hand, p has type "pointer to const int," so *p has type "const int." Expression *p is a non-modifiable lvalue. You cannot use *p to modify the object n, as in:

*p += 2;

even though you can use expression n to do it. Such are the semantics of const in C and C++.

In summary

Every expression in C and C++ is either an lvalue or an rvalue. An lvalue is an expression that designates (refers to) an object. Every lvalue is, in turn, either modifiable or non-modifiable. An rvalue is any expression that isn't an lvalue. Operationally, the difference among these kinds of expressions is this:

Again, as I cautioned last month, all this applies only to rvalues of a non-class type. Classes in C++ mess up these concepts even further.

Dan Saks is a high school track coach and the president of Saks & Associates, a C/C++ training and consulting company. You can write to him at dsaks@wittenberg.edu.

Return to July 2001 Table of Contents

Copyright 2003 CMP Media LLC