DevelopTime: C# Volatile Constructs

Volatile Constructs

Back in the early days of computing, software was written using assembly language. Assembly

language is very tedious, because programmers must explicitly state everything—use this CPU register

for this, branch to that, call indirect through this other thing, and so on. To simplify programming,

higher-level languages were introduced. These higher-level languages introduced common useful

constructs, like if/else, switch/case, various loops, local variables, arguments, virtual method calls,

operator overloads, and much more. Ultimately, these language compilers must convert the high-level

constructs down to the low-level constructs so that the computer can actually do what you want it to

do.

In other words, the C# compiler translates your C# constructs into Intermediate Language (IL),

which is then converted by the just-in-time (JIT) compiler into native CPU instructions, which must then

be processed by the CPU itself. In addition, the C# compiler, the JIT compiler, and even the CPU itself

can optimize your code. For example, the following ridiculous method can ultimately be compiled into

nothing:

private static void OptimizedAway() {
    // Constant expression is computed at compile time resulting in zero
    Int32 value = (1 * 100) - (50 * 2);

    // If value is 0, the loop never executes
    for (Int32 x = 0; x < value; x++) {
        // There is no need to compile the code in the loop since it can never execute
        Console.WriteLine("Jeff");
    }
}

In this code, the compiler can see that value will always be 0; therefore, the loop will never execute

and consequently, there is no need to compile the code inside the loop. This method could be

compiled down to nothing. In fact, when JITting a method that calls OptimizedAway, the JITter will try

to inline the OptimizedAway method’s code. Since there is no code, the JITter will even remove the

code that tries to call OptimizedAway. We love this feature of compilers. As developers, we get to

write the code in the way that makes the most sense to us. The code should be easy to write, read, and

maintain. Then compilers translate our intentions into machine-understandable code. We want our

compilers to do the best job possible for us.

When the C# compiler, JIT compiler, and CPU optimize our code, they guarantee us that the

intention of the code is preserved. That is, from a single-threaded perspective, the method does what

we want it to do, although it may not do it exactly the way we described in our source code. However,

the intention might not be preserved from a multithreaded perspective. Here is an example where the

optimizations make the program not work as expected:

using System;
using System.Threading;

internal static class StrangeBehavior {
    // As you'll see later, mark this field as volatile to fix the problem
    private static Boolean s_stopWorker = false;

    public static void Main() {
        Console.WriteLine("Main: letting worker run for 5 seconds");
        Thread t = new Thread(Worker);
        t.Start();
        Thread.Sleep(5000);
        s_stopWorker = true;
        Console.WriteLine("Main: waiting for worker to stop");
        t.Join();
    }

    private static void Worker(Object o) {
        Int32 x = 0;
        while (!s_stopWorker) x++;
        Console.WriteLine("Worker: stopped when x={0}", x);
    }
}

In this code, the Main method creates a new thread that executes the Worker method. This Worker

method counts as high as it can before being told to stop. The Main method allows the Worker thread

to run for 5 seconds before telling it to stop by setting the static Boolean field to true. At this

point, the Worker thread should display what it counted up to, and then the thread will terminate. The

Main thread waits for the Worker thread to terminate by calling Join, and then the Main thread

returns, causing the whole process to terminate.

Looks simple enough, right? Well, the program has a potential problem due to all the optimizations

that could happen to it. You see, when the Worker method is compiled, the compiler sees that

s_stopWorker is either true or false, and it also sees that this value never changes inside the

Worker method itself. So the compiler could produce code that checks s_stopWorker first. If

s_stopWorker is true, then “Worker: stopped when x=0” will be displayed. If s_stopWorker is

false, then the compiler produces code that enters an infinite loop that increments x forever. You

see, the optimizations cause the loop to run very fast because checking s_stopWorker only occurs

once before the loop; it does not get checked with each iteration of the loop.
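To make the transformation concrete, here is a hand-written sketch of source code that behaves the way the optimized Worker method is described above. The names HoistedWorkerSketch and WorkerAsOptimized are hypothetical. The real JIT output is machine code that varies by JIT compiler and version, so treat this only as an illustration. The method returns x instead of printing it so that the early-exit path is easy to check.

```csharp
using System;

internal static class HoistedWorkerSketch {
    // Plays the role of s_stopWorker in StrangeBehavior; deliberately NOT volatile
    internal static Boolean s_stopWorker = false;

    // Hypothetical source-level equivalent of the optimized Worker loop
    internal static Int32 WorkerAsOptimized() {
        Int32 x = 0;

        // The field is read exactly once, before the loop
        if (s_stopWorker) return x;   // the original Worker would report x=0 here

        // From here on the field is never read again, so a later
        // s_stopWorker = true from another thread cannot end the loop
        while (true) x++;
    }
}
```

Calling WorkerAsOptimized while s_stopWorker is false never returns, which is exactly the "runs forever" behavior described above.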

If you actually want to see this in action, put this code in a .cs file and compile the code using C#’s

/platform:x86 and /optimize+ switches. Then run the resulting EXE file, and you’ll see that the

program runs forever. Note that you have to compile for x86 to ensure that the x86 JIT compiler is used

at runtime. The x86 JIT compiler is more mature than the x64 JIT compiler, so it performs more

aggressive optimizations. The x64 JIT compiler does not perform this particular optimization, and

therefore the program runs to completion. This highlights another interesting point about all of this.

Whether your program behaves as expected depends on a lot of factors, such as which compiler

version and compiler switches are used, which JIT compiler is used, and which CPU your code is

running on. In addition, to see the program above run forever, you must not run the program under a

debugger because the debugger causes the JIT compiler to produce unoptimized code that is easier to

step through.

Let’s look at another example, which has two threads that are both accessing two fields:

internal sealed class ThreadsSharingData {
    private Int32 m_flag = 0;
    private Int32 m_value = 0;

    // This method is executed by one thread
    public void Thread1() {
        // Note: These could execute in reverse order
        m_value = 5;
        m_flag = 1;
    }

    // This method is executed by another thread
    public void Thread2() {
        // Note: m_value could be read before m_flag
        if (m_flag == 1)
            Console.WriteLine(m_value);
    }
}

The problem with this code is that the compilers/CPU could translate the code in such a way as to

reverse the two lines of code in the Thread1 method. After all, reversing the two lines of code does

not change the intention of the method. The method needs to get a 5 in m_value and a 1 in m_flag.

From a single-threaded application’s perspective, the order of executing this code is unimportant. If

these two lines do execute in reverse order, then another thread executing the Thread2 method could

see that m_flag is 1 and then display 0.

Let’s look at this code another way. Let’s say that the code in the Thread1 method executes in

program order (the way it was written). When compiling the code in the Thread2 method, the

compiler must generate code to read m_flag and m_value from RAM into CPU registers. It is possible

that RAM will deliver the value of m_value first, which would contain a 0. Then the Thread1 method

could execute, changing m_value to 5 and m_flag to 1. But the CPU register that Thread2 already loaded still holds the stale 0 for m_value. The value of m_flag could then be read from RAM into a CPU register as 1, causing Thread2 to again display 0.

This is all very scary stuff and is more likely to cause problems in a release build of your program

than in a debug build of your program, making it particularly tricky to detect these problems and

correct your code. Now, let’s talk about how to correct your code.

The static System.Threading.Volatile class offers two static methods that look like this [68]:

public static class Volatile {
    public static void Write(ref Int32 location, Int32 value);
    public static Int32 Read(ref Int32 location);
}

These methods are special. In effect, these methods disable some optimizations usually performed

by the C# compiler, the JIT compiler, and the CPU itself. Here’s how the methods work:

• The Volatile.Write method forces the value to be written to location at the point of the call. In addition, any earlier program-order loads and stores must occur before the call to Volatile.Write.

• The Volatile.Read method forces the value to be read from location at the point of the call. In addition, any later program-order loads and stores must occur after the call to Volatile.Read.

Important: I know that this can be very confusing, so let me summarize it as a simple rule. When

threads are communicating with each other via shared memory, write the last value by calling

Volatile.Write and read the first value by calling Volatile.Read.

So now we can fix the ThreadsSharingData class using these methods:

using System;
using System.Threading;

internal sealed class ThreadsSharingData {
    private Int32 m_flag = 0;
    private Int32 m_value = 0;

    // This method is executed by one thread
    public void Thread1() {
        // Note: 5 must be written to m_value before 1 is written to m_flag
        m_value = 5;
        Volatile.Write(ref m_flag, 1);
    }

    // This method is executed by another thread
    public void Thread2() {
        // Note: m_value must be read after m_flag is read
        if (Volatile.Read(ref m_flag) == 1)
            Console.WriteLine(m_value);
    }
}

First, notice that we are following the rule. The Thread1 method writes two values out to fields that are shared by multiple threads. The last value that we want written (setting m_flag to 1) is performed by calling Volatile.Write. The Thread2 method reads two values from fields shared by multiple threads, and the first value being read (m_flag) is performed by calling Volatile.Read.

[68] There are also overloads of Read and Write that operate on the following types: Boolean, (S)Byte, (U)Int16, UInt32, (U)Int64, (U)IntPtr, Single, Double, and T where T is a generic type constrained to 'class' (reference types).

But what is really happening here? Well, for the Thread1 method, the Volatile.Write call

ensures that all the writes above it are completed before a 1 is written to m_flag. Since m_value = 5 is

before the call to Volatile.Write, it must complete first. In fact, if there were many variables being

modified before the call to Volatile.Write, they would all have to complete before 1 is written to

m_flag. Note that the writes before the call to Volatile.Write can be optimized to execute in any

order; it’s just that all the writes have to complete before the call to Volatile.Write.

For the Thread2 method, the Volatile.Read call ensures that all variable reads after it start after

the value in m_flag has been read. Since reading m_value is after the call to Volatile.Read, the

value must be read after having read the value in m_flag. If there were many reads after the call to

Volatile.Read, they would all have to start after the value in m_flag has been read. Note that the

reads after the call to Volatile.Read can be optimized to execute in any order; it’s just that the reads

can’t start happening until after the call to Volatile.Read.
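As a minimal sketch of the "many variables" case described above, the class below publishes two ordinary fields behind a single flag. The class PublishedPoint and its members are hypothetical names chosen for illustration. The sketch also assumes that Publish is called at most once per instance; repeated publishing would need a different scheme.

```csharp
using System;
using System.Threading;

internal sealed class PublishedPoint {
    private Int32 m_x = 0;
    private Int32 m_y = 0;
    private Int32 m_ready = 0;

    // Executed by the writing thread (assumed to be called at most once)
    internal void Publish(Int32 x, Int32 y) {
        // Ordinary writes: these two may be reordered relative to each other
        m_x = x;
        m_y = y;

        // The last write is volatile, so both writes above must complete first
        Volatile.Write(ref m_ready, 1);
    }

    // Executed by the reading thread
    internal Boolean TryRead(out Int32 x, out Int32 y) {
        // The first read is volatile, so the reads below cannot start before it
        if (Volatile.Read(ref m_ready) == 1) {
            x = m_x;
            y = m_y;
            return true;
        }

        x = 0;
        y = 0;
        return false;
    }
}
```

A reader that gets true from TryRead sees the values that were written before the flag was set. A reader that gets false simply tries again later.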

C#’s Support for Volatile Fields

Making sure that programmers call the Volatile.Read and Volatile.Write methods correctly is a

lot to ask. It’s hard for programmers to keep all of this in their minds and to start imagining what other

threads might be doing to shared data in the background. To simplify this, the C# compiler has the

volatile keyword, which can be applied to static or instance fields of any of these types: Boolean,

(S)Byte, (U)Int16, (U)Int32, (U)IntPtr, Single, or Char. You can also apply the volatile

keyword to reference types and any enum field so long as the enumerated type has an underlying type

of (S)Byte, (U)Int16, or (U)Int32. The JIT compiler ensures that all accesses to a volatile field are

performed as volatile reads and writes, so that it is not necessary to explicitly call Volatile's static

Read or Write methods. Furthermore, the volatile keyword tells the C# and JIT compilers not to cache the field in a CPU register, ensuring that every read of the field actually fetches the value from memory.

Using the volatile keyword, we can rewrite the ThreadsSharingData class as follows:

internal sealed class ThreadsSharingData {
    private volatile Int32 m_flag = 0;
    private Int32 m_value = 0;

    // This method is executed by one thread
    public void Thread1() {
        // Note: 5 must be written to m_value before 1 is written to m_flag
        m_value = 5;
        m_flag = 1;
    }

    // This method is executed by another thread
    public void Thread2() {
        // Note: m_value must be read after m_flag is read
        if (m_flag == 1)
            Console.WriteLine(m_value);
    }
}
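The same keyword is the fix that the comment in the earlier StrangeBehavior program alludes to. The sketch below applies it. The class name StrangeBehaviorFixed and the Run helper with its two timing parameters are hypothetical additions, introduced so that the run time is adjustable and the result can be checked. The Worker method is unchanged from the earlier listing.

```csharp
using System;
using System.Threading;

internal static class StrangeBehaviorFixed {
    // volatile: every iteration of the worker loop must reread this field
    private static volatile Boolean s_stopWorker = false;

    // Hypothetical helper: lets the worker run for runMilliseconds, asks it to
    // stop, and returns true if it terminated within joinTimeoutMilliseconds
    internal static Boolean Run(Int32 runMilliseconds, Int32 joinTimeoutMilliseconds) {
        s_stopWorker = false;
        Thread t = new Thread(Worker);
        t.Start();
        Thread.Sleep(runMilliseconds);
        s_stopWorker = true;
        return t.Join(joinTimeoutMilliseconds);
    }

    private static void Worker(Object o) {
        Int32 x = 0;
        while (!s_stopWorker) x++;
        Console.WriteLine("Worker: stopped when x={0}", x);
    }
}
```

Because the field is volatile, the check of s_stopWorker can no longer be moved out of the loop, so the worker is expected to stop shortly after the field is set to true, even in an optimized x86 build.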

There are some developers (and I am one of them) who do not like C#’s volatile keyword, and

they think that the language should not provide it. Our thinking is that most algorithms require few

volatile read or write accesses to a field and that most other accesses to the field can occur normally,

improving performance; seldom is it required that all accesses to a field be volatile. For example, it is

difficult to interpret how to apply volatile read operations to algorithms like this one:

m_amount = m_amount + m_amount; // Assume m_amount is a volatile field defined in a class

Normally, an integer number can be doubled simply by shifting all bits left by 1 bit, and many

compilers can examine the code above and perform this optimization. However, if m_amount is a

volatile field, then this optimization is not allowed. The compiler must produce code to read

m_amount into a register and then read it again into another register, add the two registers together,

and then write the result back out to the m_amount field. The unoptimized code is certainly bigger and

slower; it would be unfortunate if it were contained inside a loop.
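One way to express the doubling with explicit volatile accesses is sketched below: the field is left non-volatile, read once with Volatile.Read, doubled in a local variable, and written back once with Volatile.Write. The class AmountDoubler and its members are hypothetical names. Note that the read-modify-write sequence as a whole is not atomic, so a write made by another thread between the read and the write would be lost.

```csharp
using System;
using System.Threading;

internal sealed class AmountDoubler {
    // Deliberately NOT marked volatile; volatile access is requested per call
    private Int32 m_amount;

    internal AmountDoubler(Int32 initialAmount) {
        m_amount = initialAmount;
    }

    internal Int32 DoubleAmount() {
        Int32 amount = Volatile.Read(ref m_amount);   // exactly one volatile read
        Int32 doubled = amount + amount;              // ordinary arithmetic on a local
        Volatile.Write(ref m_amount, doubled);        // exactly one volatile write
        return doubled;
    }
}
```

Working on a local copy makes it explicit that the field is read once, and it leaves the compiler free to optimize the arithmetic in between.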

Furthermore, C# does not support passing a volatile field by reference to a method. For example,

if m_amount is defined as a volatile Int32, attempting to call Int32’s TryParse method causes

the compiler to generate a warning as shown here:

Boolean success = Int32.TryParse("123", out m_amount);

// The above line causes the C# compiler to generate a warning:

// CS0420: a reference to a volatile field will not be treated as volatile
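A common way to avoid the warning is to pass a local variable to the method and then assign the result to the volatile field, so that the field itself is only touched by an ordinary volatile write. The sketch below assumes a hypothetical class AmountHolder with a volatile m_amount field; it is one possible workaround, not the only one.

```csharp
using System;

internal sealed class AmountHolder {
    private volatile Int32 m_amount = 0;

    internal Int32 Amount {
        get { return m_amount; }   // volatile read of the field
    }

    // Parses text and stores the result in the volatile field on success
    internal Boolean TrySetAmount(String text) {
        Int32 parsed;

        // The local is passed by reference, so CS0420 does not apply
        Boolean success = Int32.TryParse(text, out parsed);

        // The field is assigned directly, which is a volatile write
        if (success) m_amount = parsed;
        return success;
    }
}
```

The field is left unchanged when parsing fails, which differs from passing the field directly as an out argument, where TryParse would overwrite it with 0.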

Finally, volatile fields are not Common Language Specification (CLS) compliant because many

languages (including Visual Basic) do not support them.


Saturday, February 8, 2014
