Wednesday, June 27, 2012

Asynchronous Reads Of Large Files And Fighting The Large Object Heap

I was doing some benchmarking of my event loop architecture recently, particularly in the realm of reading files, and I saw pretty much what I expected to see in most cases. For small files (< 80KB), synchronous file reads were a hair faster overall. This is to be expected. The I/O doesn't take very long, so just blocking and waiting for the response is going to be faster than offloading the work to the OS and then waiting for a callback, with all of the overhead that entails. For files larger than that, async reads got through multiple files much faster. Again, this is to be expected. What I didn't expect, however, were OutOfMemoryExceptions while reading multiple large files asynchronously. I'd check PerfMon and everything looked just fine. Memory-wise, I had gigs to spare. So what was going on? I stepped through, and the errors seemed arbitrary about when they'd happen, but not where they would happen. The error always occurred in the same bit of code: a simple allocation of a buffer.

What was happening was that I was allocating each buffer to be the exact same size as the incoming FileStream. That's fine if I'm only doing it once in a while. You see, when you allocate an object of 85,000 bytes or more, it goes straight onto the Large Object Heap (LOH). The LOH is only collected during a full (generation 2) collection and it isn't compacted by default, so dead objects sit there a lot longer than they normally would, and the space they leave behind fragments. This means that as I'm looping through my files to read, I'm collecting dead space in memory, and once the limits of my allocated space are reached or my LOH gets too fragmented... OutOfMemoryException.
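To see the threshold in action, here is a minimal sketch (not code from my event loop; the class name LohDemo and the buffer sizes are made up for illustration). It asks the GC which generation a freshly allocated array belongs to. On the runtimes I know of, objects on the Large Object Heap report as generation 2, while an ordinary small allocation starts in generation 0.

```csharp
using System;

static class LohDemo
{
    static void Main()
    {
        // Illustrative sizes only: one just under the 85,000-byte threshold, one at it.
        var small = new byte[84000];
        var large = new byte[85000];

        // A fresh small-object-heap allocation normally reports generation 0.
        Console.WriteLine("small buffer generation: " + GC.GetGeneration(small));

        // Large Object Heap allocations are collected with generation 2,
        // so this one is expected to report 2 right away.
        Console.WriteLine("large buffer generation: " + GC.GetGeneration(large));
    }
}
```

A buffer sized to match a large file lands in the second category every single time, which is exactly the pattern that was biting me.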

The only good answer here is to use a Stream and read out smaller pieces individually when you're dealing with large files, keeping each buffer under the 85,000-byte mark. Also, it's very important to make sure you're doing a good job caching frequently read file data... or any large objects, for that matter. In the end it made me question a lot of my I/O use and my buffer allocations. It was certainly a learning experience.
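Here's a rough sketch of what reading in smaller pieces might look like. It is not the actual code from my event loop: the ChunkedReader class, the ReadInChunksAsync helper, the processChunk callback and the 64 KB chunk size are all assumptions for the sake of the example, and it assumes .NET 4.5 or later for async/await and Stream.ReadAsync.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class ChunkedReader
{
    // Assumed chunk size: 64 KB is below the 85,000-byte threshold,
    // so the buffer is allocated on the small object heap.
    private const int ChunkSize = 64 * 1024;

    // Hypothetical helper: reads a file asynchronously in fixed-size chunks,
    // passing each chunk (and the number of valid bytes in it) to processChunk.
    // Returns the total number of bytes read.
    public static async Task<long> ReadInChunksAsync(string path, Action<byte[], int> processChunk)
    {
        // One buffer, reused for every chunk, instead of one file-sized buffer per read.
        var buffer = new byte[ChunkSize];
        long total = 0;

        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, ChunkSize, useAsync: true))
        {
            int read;
            while ((read = await stream.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                // Only the first 'read' bytes of the buffer are valid on this pass.
                processChunk(buffer, read);
                total += read;
            }
        }

        return total;
    }
}
```

The buffer gets overwritten on every pass, so the callback has to finish with (or copy) the bytes before returning. If you end up copying everything into one big array anyway, you're right back on the Large Object Heap.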

PS: I want to thank Chris Klein from Ares Sportswear for his recommendation to switch the EventLoop over to a Task and ContinueWith architecture. It sped things up a bit and it cleaned up the code a lot.
