Wednesday, June 27, 2012

Asynchronous Reads Of Large Files And Fighting The Large Object Heap

I was doing some benchmarking of my event loop architecture recently, particularly in the realm of reading files, and I saw pretty much what I expected to see in most cases. For small files (< 80KB), synchronous file reads were a hair faster overall. This is to be expected, though. The I/O doesn't take very long, so just blocking and waiting for the response is going to be faster than handing the work off to the OS and then waiting for a callback, with all of the overhead that entails. For files larger than that, async file reads got through multiple files much faster. Again, this is to be expected. What I didn't expect, however, were OutOfMemoryExceptions while reading multiple large files asynchronously. I'd check PerfMon and see everything was just fine. Memory-wise I would have gigs to spare. So what was going on? I stepped through, and the errors seemed arbitrary about when they'd happen, but not where they would happen. The error always occurred in the same bit of code, a simple allocation of a buffer.

What was happening was that I was allocating each buffer to be exactly the size of the incoming FileStream. That's fine if I'm only doing it once in a while. You see, when you allocate an object of 85,000 bytes or more, it goes straight onto the Large Object Heap (LOH). The LOH is only collected during generation 2 garbage collections and is not compacted by default, so dead buffers sit there a lot longer than small objects would, and the free space they leave behind gets fragmented. This means that as I'm looping through my files to read, I'm collecting dead space in memory, and once the limits of my allocated space are reached or my LOH gets too fragmented... OutOfMemoryException.

The only good answer here is to use a Stream and read out smaller pieces individually when you're dealing with large files. Also, it's very important to make sure you're doing a good job caching frequently read file data... or any large objects for that matter. In the end it made me question a lot of my I/O use and my buffer allocations. It was certainly a learning experience.
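To make the "smaller pieces" idea concrete, here is a minimal sketch of reading a stream through one reusable buffer that stays under the 85,000-byte threshold. The ChunkedReader class, the ReadInChunks method and the 80 KB default are names and values made up for illustration; this is not code from ALE, and it reads synchronously just to keep the example short.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not part of ALE.
public static class ChunkedReader
{
    // 80 KB (81,920 bytes) is below the 85,000-byte threshold,
    // so this buffer is allocated on the normal heap, not the LOH.
    public const int BufferSize = 80 * 1024;

    // Reads the stream chunk by chunk, handing each chunk (and the number
    // of valid bytes in it) to onChunk. Returns the total bytes read.
    public static long ReadInChunks(Stream source, Action<byte[], int> onChunk, int bufferSize = BufferSize)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (onChunk == null) throw new ArgumentNullException("onChunk");

        // One buffer is reused for the whole stream instead of
        // allocating a single array as large as the file.
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            onChunk(buffer, read);
            total += read;
        }
        return total;
    }
}
```

The callback only gets to look at the buffer while it is being called, since the same array is overwritten by the next read. Anything that needs to keep the data has to copy it out or write it somewhere else right away.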

PS: I want to thank Chris Klein from Ares Sportswear for his recommendation to switch the EventLoop over to a Task and ContinueWith architecture. It sped things up a bit and it cleaned up the code a lot.

Monday, June 25, 2012

Event Loop Architecture in IIS with C# and ASP.Net

Big Changes


I made a few breaking changes to my event loop architecture... but it's alpha, and AFAIK, I'm the only one using it, so whatever. However, these changes were to accommodate the new asynchronous HttpHandler I added to handle sending web requests into the event loop via IIS!

The downside to all of this is that, like everything with IIS, it requires a bit more setup than I'd like. To get started with the new ALE HttpHandler, here are a few steps:

Getting Started With ALE in IIS

  1. Start a new Web Project. You can go ahead and gut the project leaving only the Web.Config and the Global.asax.
  2. Remove cruft from the Web.Config and register the AleHttpHandler.
    <?xml version="1.0"?>
    <configuration>
       <system.web>
          <compilation debug="true" targetFramework="4.0" />
       </system.web>
    
       <system.webServer>
           <validation validateIntegratedModeConfiguration="false"/>
           <modules runAllManagedModulesForAllRequests="true"/>
           <handlers>
              <add verb="*" path="*"
                  name="AleHttpHandler"
                  type="ALE.Web.AleHttpHandler"/>
           </handlers>
       </system.webServer>
    </configuration>
    
  3. Add initialization code to Application_Start in the Global.asax.
    void Application_Start(object sender, EventArgs e)
    {
        // Start the event loop.
        EventLoop.Start();
    
        // Get the ALE server instance and wire up your middleware.
        ALE.Web.Server.Create()
           .Use((req, res) => res.Write("Hello World"))
           .Use((req, res) => res.Write("<br/>No really."));
    }
    
  4. Add tear down code to Application_End in the Global.asax.
    void Application_End(object sender, EventArgs e)
    {
        // Shut down the event loop.
        EventLoop.Stop();
    }
    
  5. Have fun.
This has been such a fun project. I am really thankful for all of the feedback and support I've received from my friends and others both on and offline.

Sunday, June 24, 2012

Added Node.js Connect Style "Middleware" To ALE


Inspired by Connect

I've added middleware functionality to ALE. This was mostly to lay the groundwork for a routed server. The idea borrows heavily from the Node.js module Connect. Basically, this just allows the developer to register a series of methods that will be executed in order as the request is processed. In JavaScript, however, this requires the use of a cumbersome "next()" function object that gets passed to each piece of middleware as a means of calling the next. Since we have events and delegates in C#, that isn't necessary, and it's a little cleaner, IMO.

The implementation looks like so:

EventLoop.Start(() => 
{
   Server.Create()
      .Use(DoSomePrepWork)
      .Use((req, res) => {
         var foo = req.Context.ContextBag.Foo;
         res.Write("Foo: " + foo + "\n\n");
      })
      .Use(DoSomeLogging)
      .Listen("http://*:1337/");
});

Where there would be methods for middleware like so:

public void DoSomePrepWork(IRequest req, IResponse res)
{
   res.Context.ContextBag.Foo = "Wut, wut, wut? Socks and sandles!";
}

public void DoSomeLogging(IRequest req, IResponse res) 
{
   Logger.Log("A request was made to: " + req.Url);
}



What is middleware? How will it be used?


Well, in this case it's a poor use of a term that has sort of stuck when it comes to Node.js. Normally middleware would be considered to be some broker software, like a proxy, or a web api, or something like that. In this case it's just some code that is being executed between some other calls.




Because of what it is, I think it should be obvious how this could be used. It could be used for logging, or reporting, or authentication, or additional processing of incoming requests of any sort. I realize at this point the server really only does one thing, and that there's no reason you couldn't just code your pre-processing and post-processing directly into the body of a single processing delegate... But I added this to lay the groundwork for a routing implementation that will be coming in the near future.
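For anyone curious how a chain like this can work without a next() callback, here is a minimal sketch. MiddlewarePipeline and its Execute method are hypothetical names invented for this example, and the generic type parameters stand in for IRequest and IResponse; ALE's actual Server class may be built quite differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a middleware-capable server; not ALE's real code.
public class MiddlewarePipeline<TRequest, TResponse>
{
    // Middleware is just a list of delegates kept in registration order.
    private readonly List<Action<TRequest, TResponse>> _steps =
        new List<Action<TRequest, TResponse>>();

    // Registers a step and returns the pipeline so calls can be chained.
    public MiddlewarePipeline<TRequest, TResponse> Use(Action<TRequest, TResponse> step)
    {
        if (step == null) throw new ArgumentNullException("step");
        _steps.Add(step);
        return this;
    }

    // Runs every registered step, in order, against the same request/response pair.
    public void Execute(TRequest req, TResponse res)
    {
        foreach (var step in _steps)
        {
            step(req, res);
        }
    }
}
```

Because the pipeline itself walks the list, no step has to remember to call the next one. The trade-off in this simple form is that a step has no way to stop the chain early, which an authentication step would probably want.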

Thursday, June 21, 2012

ALE - Added A Non-Blocking Web-Sockets Implementation

Continuing Work

Tonight after the kids went to bed I put together a non-blocking Web Sockets implementation. After a lot of trial and error I've finally got it sending and receiving over Web Sockets, at least with Chrome. The changes have been pushed to github. As of this post that would be version v0.0.3.0 ... still very alpha. I'm trying to get the method signatures to be something that flows well for what this architecture does. It's sort of a hard thing to do, IMO. Here's what I've come up with for Web Sockets in ALE. It's loosely based on what Node.js does:

EventLoop.Start(() => {
   Net.CreateServer((socket) => {
      socket.Send("Wee! sockets!");
      socket.Recieve((text) => {
          //do something here.
      });
   }).Listen("127.0.0.1",1337,"http://origin/");
});


I think I would like to get the Listen method's signature down to just two parameters that are both string representations of URIs, rather than three separate arguments. I'm also not sure I'm good with how I've implemented the Receive method, which is a little abnormal in the C# world. What it's doing is actually "binding" to an "event", which is really just putting an Action in a List<Action> to be queued to the EventLoop when something is received.
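The "binding" described above can be sketched roughly like this. CallbackSocket, OnTextReceived and the enqueue delegate are all hypothetical names for illustration, and the handlers take the received text as a string; the real ALE socket class may look different.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of callback "binding"; not ALE's actual socket type.
public class CallbackSocket
{
    private readonly List<Action<string>> _receiveHandlers = new List<Action<string>>();
    private readonly Action<Action> _enqueue;

    // enqueue is assumed to be whatever schedules an Action on the event loop.
    public CallbackSocket(Action<Action> enqueue)
    {
        if (enqueue == null) throw new ArgumentNullException("enqueue");
        _enqueue = enqueue;
    }

    // "Binds" a handler. Nothing is read here; the handler is only stored.
    public void Receive(Action<string> handler)
    {
        if (handler == null) throw new ArgumentNullException("handler");
        _receiveHandlers.Add(handler);
    }

    // Assumed to be called by the transport layer when a text message arrives.
    // Each handler is queued to the loop rather than invoked directly.
    public void OnTextReceived(string text)
    {
        foreach (var handler in _receiveHandlers)
        {
            var h = handler; // local copy so each closure captures its own handler
            _enqueue(() => h(text));
        }
    }
}
```

The point of queuing instead of calling the handlers directly is that user code always runs on the event loop thread, never on whatever thread the I/O completed on.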

I probably should go back in and clean up the code quite a bit, add comments etc. I've been sprinting so fast and hard whipping this architecture up that I haven't been very diligent with code cleanup or Unit Tests. Mostly just functional testing at this point.

Well... bed time for now...

Attempting To Peer Up The Skirt of Non-Blocking I/O in Windows

First A Thank You

Thanks to some fantastic advice I received from Wim Coenen and svick in my other blog post about my little event loop architecture, I've gone back and made some changes that take advantage of .NET's built-in Async I/O, rather than my silly BeginInvoked actions.

Great advice like this is why I started this blog. So I can learn.

A Little Digging

Anyhow, after their comments, I started thinking, "How in the world can any I/O be done asynchronously and not block at least one thread?" I mean, a non-blocking main worker thread, sure, but non-blocking I/O? What, is there some sort of magic I don't know about whereby the OS will just "know" where to start my code back up again? Well, after some digging I came across I/O Completion Ports, which apparently are such magic. Well... mostly.

My (limited) Understanding of I/O Completion Ports and .NET Async I/O

Bearing in mind I just learned about all of this recently, here's my dumbed down version of what's going on here: Microsoft's asynchronous I/O calls in .NET (e.g. FileStream.BeginRead or Socket.BeginReceive) leverage I/O Completion Ports. An I/O Completion Port is created with a file handle (which are endpoint handles for anything I/O... e.g. sockets, named pipes, file access, etc) and additional information (seemingly in the form of a pointer to something like an object or a method) about what to call upon completion. I followed the calls around in DotPeek starting at FileStream.BeginRead, which only gives me a very fuzzy idea that this is what is happening, when combined with the specs of I/O Completion Ports mentioned above. This is because most of the real magic happens inside native calls I can't see. Frankly, even if I could see them, I doubt I have the smarts to figure out what they're doing quickly, or the patience to try to unravel the mystery.
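From the calling side, the pattern looks something like the sketch below. FileStream, FileOptions.Asynchronous, BeginRead and EndRead are real .NET APIs; the AsyncReadDemo class and ReadFirstChunk method are hypothetical wrappers written only for this example, and the comments about what the OS does underneath reflect the same limited understanding as the paragraph above.

```csharp
using System;
using System.IO;

// Hypothetical wrapper for illustration; only the FileStream calls are real API.
public static class AsyncReadDemo
{
    // Starts a read of up to chunkSize bytes and returns immediately.
    // onDone is invoked later, on a thread-pool thread, with the buffer
    // and the number of bytes actually read.
    public static void ReadFirstChunk(string path, int chunkSize, Action<byte[], int> onDone)
    {
        // FileOptions.Asynchronous requests asynchronous (overlapped) I/O
        // for the underlying handle instead of emulating it with a blocked thread.
        var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                    FileShare.Read, 4096, FileOptions.Asynchronous);
        var buffer = new byte[chunkSize];

        stream.BeginRead(buffer, 0, buffer.Length, asyncResult =>
        {
            int read;
            try
            {
                // EndRead completes the operation and rethrows any I/O error.
                read = stream.EndRead(asyncResult);
            }
            finally
            {
                stream.Dispose();
            }
            onDone(buffer, read);
        }, null);
    }
}
```

Between BeginRead returning and the callback firing, no managed thread is parked waiting on the read, which is the part that felt like magic. In an event loop, onDone would typically just queue the result back onto the loop instead of doing real work on the callback thread.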

ALE Updated to 0.0.2.1

As I stated above, I completely overhauled my implementation of non-blocking I/O to leverage .NET's asynchronous I/O methods. Everything should still be up on github. I've also made some changes to how the EventLoop is used. Now, to start the EventLoop and begin working with it, it looks something like this:

EventLoop.Start(() => {
   Server.Create((req, res) => {
      res.Write("<h1>Hello World!</h1>");
   }).Listen("http://*:80/");
});

I also removed a lot of other asynchronous helpers that would block. I think my future implementations of Async calls that don't leverage .NET native async stuff will simply Pend more events on the event loop. I'll probably implement that soon.

Thanks again to the people who gave me feedback via email, facebook, in person, etc!

Tuesday, June 19, 2012

An Event Loop Architecture in C# similar to Node.js

I Really Like Node.js


As some of you know, I've been working with Node.js quite a bit recently. There are a lot of things I like about it, and a few things I don't. I'm not going to get into all of that now. I think it's a great tool, and I'll continue to use it. But thinking about its shortcomings had me wondering if an architecture like that was possible in my "home language" of C#.

So while I was showering, or pooping, or some bathroom related event (all of my best thoughts are usually in a bathroom I think), I came up with an idea of how I could implement an Event Loop in C#. The idea is simple enough, start a thread (or maybe more than one if I want) that pulls Actions off of a FIFO Queue and executes them. The second step would then be to implement I/O in a non-blocking way.
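The first half of that idea, a thread pulling Actions off a FIFO queue, fits in a few lines. This is a minimal sketch and not ALE's actual EventLoop: MiniEventLoop and its members are hypothetical names, and it leans on BlockingCollection from .NET 4 to do the waiting.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical minimal event loop for illustration; ALE's EventLoop may differ.
public sealed class MiniEventLoop
{
    // BlockingCollection over a ConcurrentQueue gives thread-safe FIFO ordering.
    private readonly BlockingCollection<Action> _queue =
        new BlockingCollection<Action>(new ConcurrentQueue<Action>());
    private readonly Thread _worker;

    public MiniEventLoop()
    {
        _worker = new Thread(Run) { IsBackground = true };
        _worker.Start();
    }

    // Queues an event (an Action) to be executed on the loop thread.
    public void Pend(Action action)
    {
        if (action == null) throw new ArgumentNullException("action");
        _queue.Add(action);
    }

    // Stops accepting new events, lets queued ones finish, then waits for the thread.
    // Must not be called from inside an event, or Join would wait on itself.
    public void Stop()
    {
        _queue.CompleteAdding();
        _worker.Join();
    }

    private void Run()
    {
        // Blocks while the queue is empty; ends once CompleteAdding
        // has been called and the queue has drained.
        foreach (var action in _queue.GetConsumingEnumerable())
        {
            try { action(); }
            catch (Exception ex) { Console.Error.WriteLine(ex); } // one bad event shouldn't kill the loop
        }
    }
}
```

Using it would look like creating a MiniEventLoop, calling Pend with a few lambdas, and calling Stop on shutdown. The loop thread does sleep while the queue is empty, which is fine: the goal is only that it never waits on I/O while there is other work queued.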

Why An Event Loop?


Well, simply put, it's non-blocking, which means threads are used to their fullest potential. With other, synchronous architectures, threads chug along doing their thing until they need to wait for I/O (reading something from a disk, waiting for something over the network, waiting for user input, etc). At that point they block, which means they're sitting there doing nothing. With an Event Loop architecture, the thread is plugging away at bits of code (events) that have queued up. When one of those nasty I/O steps needs to be done, the architecture sends that off to some other thread in a thread pool, then continues processing what's next in the queue. The end result is that the main processing thread does not block and is used to its fullest potential.

I realize there are much, much, better explanations of the Event Loop architecture out there, and I've probably butchered the explanation to some degree, but I think that gets the general idea across.

Introducing ALE


So I created a new project, which I've open sourced on github called ALE (Another Looping Event)... I'm not sure it's the best name, and I'm up for suggestions, but ALE... beer... beer is good. Seemed good enough. Especially for what started off as a simple toy project or proof of concept.

This project thus far represents a brainstorm. A lot of fun effort, but it's an experiment at this stage, very alpha. What I'm really hoping for is your feedback. If I'm doing something wrong, tell me. If I could do something better, tell me. If you'd like to contribute, let me know. This has been a fun project to play with and I'm pretty excited about it.


Here is a basic example of how you can implement a simple web server:

UPDATE: I've added an asynchronous http handler so IIS can handle the incoming requests and send them into the event loop. More about that here.

using ALE.Http;

//Start the EventLoop and start the web server.
EventLoop.Start(() => {
   Server.Create((req, res) => {
      res.Write("Hello World");
   }).Listen("http://*:1337/");
});


Here's an example of some file system interaction:

using ALE.FileSystem;

//Start the EventLoop so it can process events.
EventLoop.Start(() => {
   //Read all of the text from a file, asynchronously.
   File.ReadAllText(@"C:\Foo.txt", (text) => {
      Console.WriteLine(text);
   });
});


An example of a simple web socket server.

using ALE.Tcp;

EventLoop.Start(() => {
   //Create a new web socket server.
   Net.CreateServer((socket) => {
      //send data to the client.
      socket.Send("Wee! sockets!");
      //set up callbacks for receiving data.
      socket.Recieve((text) => {
          //do something here.
      });
   }).Listen("127.0.0.1",1337,"http://origin/");
});

I've gone through the trouble of adding a few other things, a basic SqlClient implementation, for example. In the near future I'm planning on implementing some Entity Framework integration, Web Server middleware (ala Node.js's Express modules), web request routing, and Web Sockets.

But first things first: I'm pleading with my much smarter developer friends to tell me what I'm doing wrong. I realize it's light on documentation (there is really none, haha)... but have a look and play around. It's pretty simplistic.

---
EDIT: Comments and recommendations below are dead on and I did a little research and wrote a bit about my findings here.

---
EDIT 2: Updated the syntax to reflect current project state.

---
EDIT 3: Updated the examples to include a web socket server.

Thursday, June 14, 2012

Don't Like My Language? Well You're A Dumbass

Okay, I kid, I kid. Sort of...

Experts Become Polyglots

This last weekend I attended the Pittsburgh Tech Fest. I met a lot of good people, and I heard a lot of really good talks. It started off with a keynote speaker named "Doc" Norton, whose name inspired shuddering fears of an old anti-virus tool I used to use that would eat up all of my CPU at random... but I digress: He had a very good talk to start off the day. He spoke about what it means to be an expert programmer and talked briefly about becoming a "polyglot" (or a speaker of many languages) of programming. He talked about how when someone becomes an expert in one language, it's often a good idea for them to move on and learn other languages, because when they do that, they learn new paradigms, new methods and new ideas for solving problems. In the end, being well versed in more than one language actually makes them better at every language they use.

This really spoke to me, because while I definitely don't believe I'm an "expert" in any one language, since reaching a certain point with C#, I've been trying to dive more into other languages and get a deeper understanding of them. The first language I decided to do that with has been JavaScript. This is mostly because I know it's something I'll use whenever I'm doing web development.

JavaScript/Node.js Haters

Since starting to really try to master JavaScript, I've been finding myself running into a lot of other people's opinions on JavaScript and particularly JavaScript when it comes to Node.js. It seems that, especially in circles outside of Microsoft developers, there is a very low opinion of JavaScript and Node.js. At least two of the speakers at the Pittsburgh Tech Fest, that were talking about nothing to do with Node.js, made it a point to mention they didn't like Node.js. When pressed on their reasoning, they offered up mostly uninformed opinion, but one thing they both said was "because it's JavaScript". When asked why that was a hindrance, they'd say something to the effect of "the scoping is terrible" or "there's no strong typing".

Languages Are Like Onions... 

Sorry, I just watched Shrek with my daughter the other day. The point I want to make about "other languages": Programming languages, all of them, have a specification. That specification may include things like dynamic typing (no strong typing), or a different scoping of variables than the language you're used to. This doesn't make that language "incorrect" in its own implementation, it just makes it different from your favorite language. Each of these features is generally implemented by some very smart people for very specific reasons. The choice to use loose typing, for example, is just a choice saying "hey, I trust the developer that is using this knows what he's doing and doesn't need his hand held in the form of compiler errors while dealing with multiple data types". The choice to scope variables to functions, rather than any set of curly brackets, is just that, a choice.

Your Generalizations Are Bad... Allow Me To Generalize

A general dislike for an entire programming language, in this author's opinion, just amounts to a lack of understanding of that language. All too often I think it probably amounts to a little insecurity as well. Or fear of the unknown. It's hard to say. Some languages are verbose to the point of being a little annoying (CGI or VB come to mind). Some languages don't have very good frameworks associated with them (ASP Classic anyone?). Some languages are downright archaic (COBOL, RPG make my eyes bleed). Some languages are stuck in one OS environment (C#, VB.Net, and I'm not counting Mono for now). Other languages just lack features and want you to throw interfaces on everything (I'm looking at you, Java). Some languages feel like they've been cobbled together haphazardly from remnants of Perl and a thousand open source projects (*cough*PHP*cough*). Offended? Oh no? I didn't touch on your favorite language? Damn. I'll try harder next time. I was just trying to illustrate that a complaint can be made about any language.

Can't We All Just Get Along?

... the point is, they're all good languages. All of them. And I'm happy to learn more about them and try them out, and you should be too. They all have strong points and weak points. What language should you choose for your next project? Whatever you want.

Sunday, June 10, 2012

Helpful JQuery Selectors: Select By Regular Expression, >, < and More

I've created a small library of custom JQuery selectors I thought I would share.

The code below adds the following selectors to JQuery:
  • :btwn(x,y) - selects elements from a collection of elements between two indices x and y. So if $('div') was to return ten elements, $('div:btwn(1,4)') would get the three divs between indexes 1 and 4. Another way to do this with the default JQuery selectors is like so: $('div:gt(0):lt(3)') which gets all indexes after 0, then gets all indexes from that set before 3. It's just a different way to approach the problem.
  • :regex(test) -  selects elements whose text contents match the regular expression in test. This value can be surrounded by quotes if need be.
  • :startsWith(str) - selects elements whose text contents start with str. This value can be surrounded by quotes if need be.
  • :endsWith(str) - selects elements whose text contents end with str. This value can be surrounded by quotes if need be.
  • :attrgt(name, val) - selects elements by testing the attribute named name to see if it is greater than the value supplied as val.
  • :attrlt(name, val) - selects elements by testing the attribute named name to see if it is less than the value supplied as val.
  • :attrgte(name, val) - selects elements by testing the attribute named name to see if it is greater than or equal to the value supplied as val.
  • :attrlte(name, val) - selects elements by testing the attribute named name to see if it is less than or equal to the value supplied as val.
  • :attrregex(name, test) - selects elements by testing the attribute named name with the regular expression supplied to test. The regular expression may be surrounded with quotes.

Download the minified source or view the development files on github.