Source: https://ak.ari.lt/notice/AnLMnvNZU6b7xrMbOS
Writing this post capped and allat since I don't want to get lost in my train of thought at the moment.
I want to talk a little about Vessel.
A quick reminder of what Vessel is: Vessel is a simple web application microframework focusing on hookability and low-level control, while keeping it efficient, simple, and powerful like C.
Well, since Vessel is a web application framework, similar to Flask, I kind of have to deal with HTTP requests, and since Vessel supports 2 versions of HTTP (1.0 and 1.1) I have to support all features of those protocols. 1.0 is easy enough - it is one request per socket plus other simplistic features, which can allow people to set up a simple web server in C in just a few lines.
Of course, 1.1 is mostly HTTP 1.0 with some fancy stuff, which is where the problem is. The fancy stuff includes chunked content and keep-alive connections. On their own, these aren't much of a problem, but once you try to combine them, they tend to clash. This is because we don't know the length of chunked content up front, which is pretty much the whole point of it.
Handling keep-alive connections requires us to be able to parse concatenated HTTP requests since that's how 1.1 handles keep-alive requests, which is not much of a problem when we know the content length:
[headers]\r\n
[known content length bytes]
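To make the easy case concrete, here is a minimal sketch (not Vessel's actual code) of finding where the next request starts when the length is known. The function name `next_request_offset` is made up for illustration, and it assumes a NUL-terminated buffer with a text-only body and a header spelled exactly `Content-Length:`; a real parser would match header names case-insensitively and work on byte counts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of Vessel: given a NUL-terminated buffer
 * holding one or more concatenated requests, return the offset at which the
 * next request starts, or 0 if the first request is not fully buffered yet. */
size_t next_request_offset(const char *buf) {
    assert(buf != NULL);

    /* The header section ends at the first empty line. */
    const char *head_end = strstr(buf, "\r\n\r\n");
    if (head_end == NULL)
        return 0; /* header section incomplete */

    size_t body_start = (size_t)(head_end - buf) + 4;
    size_t body_len = 0;

    /* Assumption: the header is spelled exactly "Content-Length:". */
    const char *cl = strstr(buf, "Content-Length:");
    if (cl != NULL && cl < head_end)
        body_len = (size_t)strtoul(cl + strlen("Content-Length:"), NULL, 10);

    if (strlen(buf) < body_start + body_len)
        return 0; /* body incomplete */

    /* Skipping the body is a single addition, no parsing of it needed. */
    return body_start + body_len;
}
```

With two requests glued together in one buffer, the returned offset points straight at the second request line, without ever looking inside the first body.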
But with chunked content we don't know the length of the content up front, so we cannot easily skip it.
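For contrast, a rough sketch of what finding the end of a chunked body involves: every chunk-size line has to be parsed in turn, because no total length exists anywhere. `chunked_body_size` is a hypothetical name, the whole body is assumed to already sit in one buffer, and trailer fields are assumed absent (chunk extensions are skipped, not interpreted).

```c
#include <assert.h>
#include <stddef.h>

static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: return how many bytes a complete chunked body occupies
 * in buf[0..len), or 0 if it is incomplete or malformed. */
size_t chunked_body_size(const char *buf, size_t len) {
    assert(buf != NULL);
    size_t pos = 0;

    for (;;) {
        /* Chunk size: one or more hex digits. */
        size_t size = 0, digits = 0;
        while (pos < len && hex_value(buf[pos]) >= 0) {
            size = size * 16 + (size_t)hex_value(buf[pos]);
            pos++;
            digits++;
        }
        if (digits == 0)
            return 0;

        /* Skip to the CRLF ending the size line (this skips chunk extensions). */
        while (pos + 1 < len && !(buf[pos] == '\r' && buf[pos + 1] == '\n'))
            pos++;
        if (pos + 1 >= len)
            return 0;
        pos += 2;

        if (size == 0) {
            /* Last chunk: expect the final CRLF (assumption: no trailers). */
            if (len - pos < 2 || buf[pos] != '\r' || buf[pos + 1] != '\n')
                return 0;
            return pos + 2;
        }

        /* Chunk data followed by its own CRLF. */
        if (len - pos < 2 || size > len - pos - 2)
            return 0;
        pos += size;
        if (buf[pos] != '\r' || buf[pos + 1] != '\n')
            return 0;
        pos += 2;
    }
}
```

The length only becomes known after walking the entire body, which is exactly what makes skipping it awkward when the data is still sitting in a socket.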
Well, you could say 'just read the chunks vro', and you would be right, but it's not possible because of how Vessel's abstraction layers are structured, as well as the limitations of sockets. Vessel, at its core, is just a very customizable socket server implementation, where web stuff (HTTP/1.x) is handled in a separate hook called WebServer. In the web server hook we set up basic path and header parsing as well as other primitive HTTP stuff, but we don't touch the content at all and assume that the handler function (usually a route function) will handle the content appropriately. The thing is, routes are not required to handle all the content.
Which is where the problem is: the web server has to continue reading HTTP requests, yet if the route doesn't handle the content appropriately it will end up reading garbage or insecure data, which sucks. But, as we are in the web server hook, we can't read the data (loading multiple GBs into memory, for instance, would be terrible for the server, and the route handler could not read the content afterwards since it would already be popped off the queue; MSG_PEEK only gets us so far), we cannot peek far enough (due to socket limitations), and we don't know the length of the data (since chunks are dynamic), so we cannot know how many bytes to skip.
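For reference, this is roughly what peeking looks like with the POSIX `recv()` call. The two wrapper names are hypothetical; the only real API used is `recv()` with and without `MSG_PEEK`.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: look at up to cap queued bytes without consuming them. */
ssize_t peek_bytes(int fd, char *out, size_t cap) {
    assert(out != NULL);
    return recv(fd, out, cap, MSG_PEEK);
}

/* Hypothetical helper: consume up to cap bytes for real. */
ssize_t take_bytes(int fd, char *out, size_t cap) {
    assert(out != NULL);
    return recv(fd, out, cap, 0);
}
```

A peek followed by a normal read returns the same bytes twice, so the route handler would still see the content. The catch is that a peek can only return what is already queued on the socket, so it cannot look past a large body to find the next request.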
Also, there's the problem of generally having to manually handle chunked data, which is annoying.
Therefore, I think I need to add another abstraction layer to Vessel lmao. Essentially, an HTTPStream. We currently have 3 main abstraction layers when dealing with everything:
- File (raw file descriptor essentially, for cross-compat)
- Stream (buffered file descriptor with some fancy stuff, like FILE* but very customised)
- SocketServer (which manages and curates all the Stream-s)
Well, I believe this calls for another abstraction layer:
- HTTPStream (which should handle HTTP content, i.e. chunks and stuff)
This would help us since we would be able to track the stream's progress: whether it is chunked data, where we stopped reading, and so on, and we could also handle chunked data appropriately. After the hook has been run, we can just call HTTPStream_skip_content() or something. And since we'd have more than just a raw TCP stream, we can also keep context data, for instance bytes read, last chunk, next chunk, etc. With this abstraction, we can effectively implement skipping of content, and therefore keep-alive connections :D
Having this abstraction would also allow us to handle chunked content in a nice way where the routes won't need to deal with it and the HTTPStream, dealing with a raw TCP stream, could do that.
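A minimal sketch of how such a layer could look, purely as an illustration and not Vessel's real design: `MemStream` is an in-memory stand-in for Stream, the struct fields and `HTTPStream_read` are assumptions, and only `HTTPStream_skip_content` borrows its name from the idea above. Chunk extensions and trailers are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for Vessel's Stream (assumption): an in-memory byte source. */
typedef struct {
    const char *data;
    size_t len, pos;
} MemStream;

/* Hypothetical HTTPStream: a stream plus body-framing context. */
typedef struct {
    MemStream *src;
    int chunked;      /* 1 = chunked body, 0 = Content-Length body */
    size_t left;      /* bytes left in the current chunk (or whole body) */
    size_t body_read; /* decoded body bytes consumed so far */
    int done;         /* 1 once the whole body has been consumed */
} HTTPStream;

static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

static int eat_crlf(MemStream *s) {
    if (s->len - s->pos < 2 || memcmp(s->data + s->pos, "\r\n", 2) != 0)
        return -1;
    s->pos += 2;
    return 0;
}

/* Parse the next chunk header and store its size in hs->left. */
static int next_chunk(HTTPStream *hs) {
    MemStream *s = hs->src;
    size_t size = 0, digits = 0;

    /* Every chunk after the first follows the CRLF closing the previous one. */
    if (hs->body_read > 0 && eat_crlf(s) != 0)
        return -1;
    while (s->pos < s->len && hex_value(s->data[s->pos]) >= 0) {
        size = size * 16 + (size_t)hex_value(s->data[s->pos++]);
        digits++;
    }
    if (digits == 0 || eat_crlf(s) != 0)
        return -1;
    if (size == 0) { /* last chunk: "0\r\n" followed by a final CRLF */
        if (eat_crlf(s) != 0)
            return -1;
        hs->done = 1;
    }
    hs->left = size;
    return 0;
}

/* Hand out up to cap decoded body bytes (out may be NULL to discard them).
 * Returns the byte count, 0 at end of body, -1 on malformed or short input. */
long HTTPStream_read(HTTPStream *hs, char *out, size_t cap) {
    assert(hs != NULL && hs->src != NULL && cap > 0);
    MemStream *s = hs->src;

    if (!hs->done && hs->left == 0) {
        if (!hs->chunked)
            hs->done = 1;
        else if (next_chunk(hs) != 0)
            return -1;
    }
    if (hs->done)
        return 0;

    size_t avail = s->len - s->pos;
    size_t n = hs->left < cap ? hs->left : cap;
    if (avail == 0)
        return -1; /* input ended in the middle of the body */
    if (n > avail)
        n = avail;
    if (out != NULL)
        memcpy(out, s->data + s->pos, n);
    s->pos += n;
    hs->left -= n;
    hs->body_read += n;
    return (long)n;
}

/* Drain whatever the route left unread so src->pos lands on the next request. */
int HTTPStream_skip_content(HTTPStream *hs) {
    long n;
    while ((n = HTTPStream_read(hs, NULL, 4096)) > 0)
        ;
    return n == 0 ? 0 : -1;
}
```

In this sketch a route can read two bytes of the body and return; the server then calls `HTTPStream_skip_content()` and the underlying position ends up exactly at the start of the next request, because the chunk state survived between the two calls. For a Content-Length body, the same struct would be initialised with `chunked = 0` and `left` set to the declared length.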
HTTPStream would most likely wrap Stream with just some context information and data pre-processing, so it'd be a pretty lightweight wrapper. Having access to that context data makes keep-alive connections easier to implement, which in turn lets us support more of HTTP/1.1.
TL;DR: keep-alive sucks, but adding another abstraction layer will be helpful, so I think I will do that :)
I'm not looking for any help, I just kind of wanted to put my thoughts somewhere and also share the progress and direction Vessel is heading in.