I have a simple performance test that indirectly calls WriteAsync many times. It performs reasonably as long as WriteAsync is implemented as shown below. However, when I inline WriteByte into WriteAsync, performance degrades by a factor of about 7.
(To be clear: The only change that I make is replacing the statement containing the WriteByte call with the body of WriteByte.)
Can anybody explain why this happens? I've looked at the differences in the generated code with Reflector, but nothing struck me as different enough to explain the huge perf hit.
public sealed override async Task WriteAsync(
    byte[] buffer, int offset, int count, CancellationToken cancellationToken)
{
    var writeBuffer = this.WriteBuffer;
    var pastEnd = offset + count;

    while ((offset < pastEnd) && ((writeBuffer.Count < writeBuffer.Capacity) ||
        await writeBuffer.FlushAsync(cancellationToken)))
    {
        offset = WriteByte(buffer, offset, writeBuffer);
    }

    this.TotalCount += count;
}

private int WriteByte(byte[] buffer, int offset, WriteBuffer writeBuffer)
{
    var currentByte = buffer[offset];

    if (this.previousWasEscapeByte)
    {
        this.previousWasEscapeByte = false;
        this.crc = Crc.AddCrcCcitt(this.crc, currentByte);
        currentByte = (byte)(currentByte ^ Frame.EscapeXor);
        ++offset;
    }
    else
    {
        if (currentByte < Frame.InvalidStart)
        {
            this.crc = Crc.AddCrcCcitt(this.crc, currentByte);
            ++offset;
        }
        else
        {
            currentByte = Frame.EscapeByte;
            this.previousWasEscapeByte = true;
        }
    }

    writeBuffer[writeBuffer.Count++] = currentByte;
    return offset;
}
async methods are rewritten by the compiler into a giant state machine, very similar to methods using yield return. All of your locals become fields in the state machine's class. The compiler currently doesn't try to optimize this at all, so any optimization is up to the coder.

Every local which would have sat happily in a register is now being read from and written to memory. Refactoring synchronous code out of async methods and into a sync method is a very valid performance optimization. You're just doing the reverse!
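
To make the pattern concrete, here is a minimal sketch of keeping the hot loop in a synchronous helper and leaving only the await in the async method. DemoBuffer, Demo and WriteChunk are hypothetical names invented for this illustration, not the types from the question, and the helper copies a whole run of bytes per call rather than one byte, which is a variation on the original code rather than a copy of it.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the question's WriteBuffer (assumption, not the real type).
sealed class DemoBuffer
{
    private readonly byte[] data;

    public DemoBuffer(int capacity) { this.data = new byte[capacity]; }

    public int Count { get; private set; }
    public int Capacity { get { return this.data.Length; } }
    public int FlushCount { get; private set; }

    public void Add(byte value) { this.data[this.Count++] = value; }

    // Pretends to flush: empties the buffer and completes synchronously.
    public Task<bool> FlushAsync()
    {
        this.Count = 0;
        this.FlushCount++;
        return Task.FromResult(true);
    }
}

static class Demo
{
    // Async shell: only loop control and the await live here, so only
    // these few locals end up as fields of the state machine.
    public static async Task WriteAsync(byte[] source, DemoBuffer buffer)
    {
        var offset = 0;
        while (offset < source.Length)
        {
            if (buffer.Count == buffer.Capacity)
            {
                await buffer.FlushAsync();
            }

            offset = WriteChunk(source, offset, buffer);
        }
    }

    // Synchronous hot loop: an ordinary method, so its locals are
    // ordinary locals that the JIT is free to keep in registers.
    private static int WriteChunk(byte[] source, int offset, DemoBuffer buffer)
    {
        var room = buffer.Capacity - buffer.Count;
        var end = Math.Min(source.Length, offset + room);
        while (offset < end)
        {
            buffer.Add(source[offset++]);
        }

        return offset;
    }
}
```

Calling Demo.WriteAsync(new byte[10], new DemoBuffer(4)) flushes twice and leaves two bytes in the buffer. How much this helps in practice depends on the compiler and JIT version, so measure before and after.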