Click here to Skip to main content
15,400,029 members
Articles / Programming Languages / C++
Technical Blog
Posted 23 Oct 2014

Tagged as

Stats

20.5K views
11 bookmarked

Using MemoryStream to Wrap Existing Buffers: Gotchas and Tips

Rate me:
Please Sign up or sign in to vote.
5.00/5 (11 votes)
23 Oct 2014CPOL5 min read
Using MemoryStream to wrap existing buffers: Gotchas and tips

Introduction

A MemoryStream is a useful thing to have around when you want to process an array of bytes as a stream, but there are a few gotchas you need to be aware of, and some alternatives that may be better in a few cases. In Writing High-Performance .NET Code, I mention some situations where you may want to use pooled buffers, but in this article, I will talk specifically about using MemoryStream specifically to wrap existing buffers to avoid additional memory allocations.

There are essentially two ways to use a MemoryStream:

  1. MemoryStream creates and manages a resizable buffer for you. You can write to and read from it however you want.
  2. MemoryStream wraps an existing buffer. You can choose how the underlying buffer is treated.

Let’s look at the constructors of MemoryStream and how they lead to one of those situations.

  • MemoryStream() – The default constructor. MemoryStream owns the buffer and resizes it as needed. The initial capacity (buffer size) is 0.
  • MemoryStream(int capacity) – Same as default, but initial capacity is what you pass in.
  • MemoryStream(byte[] buffer)MemoryStream wraps the given buffer. You can write to it, but not change the size—basically, this buffer is all the space you will have. You cannot call GetBuffer() to retrieve the original array.
  • MemoryStream(byte[] buffer, bool writable)MemoryStream wraps the given buffer, but you can choose whether to make the stream writable at all. You could make it a pure read-only stream. You cannot call GetBuffer() to retrieve the original array.
  • MemoryStream(byte[] buffer, int index, int count) – Wraps an existing buffer, allowing writes, but allows you to specify an offset (aka origin) into the buffer that the stream will consider position 0. It also allows you to specify how many bytes to use after that origin as part of the stream. The stream is read-only. You cannot call GetBuffer() to retrieve the original array.
  • MemoryStream(byte[] buffer, int index, int count, bool writable) – Same as previous, but you can choose whether the stream is read-only. The buffer is still not resizable, and you cannot call GetBuffer() to retrieve the original array.
  • MemoryStream(byte[] buffer, int index, int count, bool writable, bool exposable)– Same as previous, but now you can specify whether the buffer should be exposed via GetBuffer(). This is the ultimate control you’re given here, but using it comes with an unfortunate catch, which we’ll see later.

Stream-Managed Buffer

If MemoryStream is allowed to manage the buffer size itself (the first two constructors above), then understanding the growth algorithm is important. The algorithm as currently coded looks like this:

  1. If requested buffer size is less than the current size, do nothing.
  2. If requested buffer size is less than 256 bytes, set new size to 256 bytes.
  3. If requested buffer size is less than twice the current buffer size, set the new size to twice the current size.
  4. Otherwise set capacity to exactly what was requested.

Essentially, if you’re not careful, you will start doubling the amount of memory you’re using, which may be overkill for some situations.

Wrapping an Existing Buffer

You would want to wrap an existing buffer in any situation where you have an existing array of bytes and don’t want to needlessly copy them, causing more memory allocations and thus a higher rate of garbage collections. For example, you’ve read a bunch of bytes from the wire via HTTP, you’ve got an existing buffer. You need to pass those bytes to a parser that expects a Stream. So far, so good.

However, there is a gotcha here. To illustrate, let’s see a hypothetical example.

You pull some bytes off the wire. The first few bytes are a standard header that all input has, and can be ignored. For the actual content, you have a parser that expects a Stream. Within that stream, suppose there are a subsections of data that have their own parsers. In other words, you have a buffer that looks something like this:

Image 1

To deal with this, you wrap a MemoryStream around the content section, like this:

C++
// comes from network
byte[] buffer = ...
// Content starts at byte 24
MemoryStream ms = new <code>MemoryStream(buffer, 24, buffer.Length – 24, writable:false, publiclyVisible:true);

So far so good, but what if the parser for the sub-section really needs to operate on the raw bytes instead of as a Stream? This should be possible, right? After all, publiclyVisible was set to true, so you can call GetBuffer(), which returns the original buffer. There’s just one (major) problem: You don’t know where you are in that buffer.

This may sound like a contrived situation, but it’s completely possible and I’ve run into it multiple times.

See, when you wrapped that buffer and told MemoryStream to consider byte 24 as the start, it set a private field called origin to 24. If you set the stream’s Position to 24, the index into the array is set to 24. That’s position 0. Unfortunately, MemoryStream will not surface the origin to you. You can’t even deduce it from other properties like Capacity, Length, and Position. The origin just disappears, which means that the buffer you get back from GetBuffer() is useless. I consider this a bug in the .NET Framework—why have the ability to retrieve the original buffer if you don’t surface the offset being used? It may be that it would be more confusing, with an additional property that many people won’t understand.

There are a few ways you can choose to handle this:

  1. Derive a child class from MemoryStream that doesn’t do anything except mimic the underlying constructors and store this offset itself and make it available via a property.
  2. Pass down the offset separately. Works, but feels very annoying.
  3. Instead of using MemoryStream, use an ArraySegment<byte> and wrap that (or parts of it) as needed in a MemoryStream.

Probably the cleanest option is #1. If you can’t do that, convert the deserialization methods to take ArraySegment<byte>, which wraps a buffer, an offset, and a length into a cheap struct, which you can then pass to all deserialization functions. If you need a Stream, you can easily construct it from the segment:

C++
byte[] buffer = ...
var segment = new ArraySegment<byte>(buffer, 24, buffer.Length – 24);
...
var stream = new MemoryStream(segment.Array, segment.Offset, segment.Count);

Essentially, make your master buffer source the ArraySegment, not the Stream.


Check out my latest book, the essential, in-depth guide to performance for all .NET developers:

Writing High-Performance.NET Code by Ben Watson. Available now in print and as an eBook at:

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Ben M Watson
Software Developer (Senior) Microsoft
United States United States
Ben Watson has been a software engineer at Microsoft since 2008. On the Bing platform team, he has built one of the world's leading .NET-based, high-performance server applications, handling high-volume, low-latency requests across thousands of machines for millions of customers. In his spare time, he enjoys geocaching, books of all kinds, classical music, and spending time with his family. He is the author of Writing High-Performance .NET Code and C# 4.0 How-To. He blogs at Philosophical Geek.

Comments and Discussions

 
QuestionWhay about Spans? Pin
Member 122993214-Jul-19 23:53
MemberMember 122993214-Jul-19 23:53 
QuestionInteresting Pin
dommy1A24-Oct-14 6:35
Memberdommy1A24-Oct-14 6:35 
GeneralImage missing Pin
Roger Beaverstock24-Oct-14 2:26
MemberRoger Beaverstock24-Oct-14 2:26 
GeneralRe: Image missing Pin
Ben M Watson24-Oct-14 3:18
MemberBen M Watson24-Oct-14 3:18 
GeneralRe: Image missing Pin
Roger Beaverstock24-Oct-14 4:07
MemberRoger Beaverstock24-Oct-14 4:07 
GeneralMy vote of 5 Pin
den2k8823-Oct-14 22:05
professionalden2k8823-Oct-14 22:05 
GeneralRe: My vote of 5 Pin
Ben M Watson24-Oct-14 3:18
MemberBen M Watson24-Oct-14 3:18 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.