The topics this week continue to be stream-related.
When building an API, some teams have a strict rule to always call ToList on a LINQ query and materialize the results before sending them back to an HTTP client. There are some advantages to this eager approach:
1) You’ll fail fast; any problem with the query surfaces when ToList executes, instead of later during deferred execution.
2) You’ll never worry about controller code using IQueryable operators that can destroy a query plan.
However, every application is different and may require a different set of rules. For APIs that move large amounts of data, materializing results in memory can lead to high memory usage, paging, and sluggish performance. Let’s look at an example using the following simple model.
public class Widget
{
    public int Id { get; set; }
    public string Name { get; set; }
}
Imagine we have 1 million widgets to move across the Internet. We’ll use the following controller action to respond to a request for all the widgets.
[HttpGet]
public List<Widget> Get()
{
    var model = _widgetData.GetAll();
    return model;
}
Notice the return type of List<Widget>, meaning the GetAll method has already placed every widget into memory. ASP.NET Core handles the request well. For the client, the 35 megabytes of JSON streams out of the server with chunked encoding, which is ideal.
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/json; charset=utf-8
Server: Kestrel
The downside appears when looking at the memory consumed on the server. After just three requests, the dotnet server process is using over 650MB of memory, even though it is serving no other requests. Move 4 million widgets at a time and the process grows to over 1.3GB.
Assuming your data source can stream results (like the firehose cursor of SQL Server), you can keep more memory available on the server by moving data with IEnumerable or IQueryable all the way to the client.
[HttpGet]
public IEnumerable<Widget> Get()
{
    // GetAll now returns IEnumerable,
    // not using ToList, but perhaps using
    // AsEnumerable
    var model = _widgetData.GetAll();
    return model;
}
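The post doesn't show the data layer behind _widgetData, but a minimal sketch of a deferred implementation might look like the following. The WidgetContext type and its Widgets DbSet are assumptions for illustration, not part of the original example.

public class WidgetData
{
    // hypothetical EF Core DbContext exposing a Widgets DbSet
    private readonly WidgetContext _context;

    public WidgetData(WidgetContext context)
    {
        _context = context;
    }

    public IEnumerable<Widget> GetAll()
    {
        // AsEnumerable changes the static type without executing the
        // query, so rows stream from the database as the serializer
        // iterates; calling ToList here would materialize everything.
        return _context.Widgets.AsEnumerable();
    }
}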
With multiple inbound requests, moving even 4 million widgets requires less than 250MB of heap space over time.
Streaming yields considerable savings in a world where memory is still relatively precious. Don’t let development rules overrule your context or take away your precious resources, and always use benchmarks and tests to determine how your specific application will behave.
Still catching up on Pluralsight announcements from last year ...
The "Everything .NET Developers Should Know About Microsoft Azure" channel is a list of the Azure courses I made last year. 10+ hours of content, with more to come and updates this year.
I've seen people working hard to stream content in ASP.NET Core. The content could be a video file or other large download. The hard work usually involves byte arrays and memory streams.
Some of this hard work was necessary in ASP.NET Core 1.0, but there are a number of features in 2.0 that make the extra work unnecessary.
The ASP.NET static files middleware will add ETag headers and can respond to partial range requests. The transfer is efficient. Here's a cURL request for bytes 0 to 10 of a file named sample.mp4 that lives in the wwwroot folder.
$ curl -sD - -o /dev/null -H "Range: bytes=0-10" localhost:51957/sample.mp4

HTTP/1.1 206 Partial Content
Content-Length: 11
Content-Type: video/mp4
Content-Range: bytes 0-10/10630899
Last-Modified: Wed, 14 May 2014 23:39:15 GMT
Accept-Ranges: bytes
ETag: "1cf6fcdb79b6573"
In the output you can see the 206 partial content response and Accept-Ranges header.
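The only server-side setup assumed here is a standard static files registration in Startup.Configure; no range-specific configuration is required for the behavior above.

public void Configure(IApplicationBuilder app)
{
    // serves files from wwwroot (including sample.mp4) with
    // ETag headers and Range request support built in
    app.UseStaticFiles();

    app.UseMvc();
}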
There are many scenarios where serving files with static files middleware is problematic. One scenario is when the files require authorization checks. It is possible to enforce authorization checks on static files, but another approach is to use controller actions. Imagine serving up the following HTML:
<h2>File Stream</h2>

<video autoplay controls src="/stream/samplevideo"></video>
The video source points to the following controller action.
[Authorize(Policy = "viewerPolicy")]
public IActionResult SampleVideoStream()
{
    var path = Path.Combine(pathToVideos, "sample.mp4");
    return File(System.IO.File.OpenRead(path), "video/mp4");
}
FileStreamResult and VirtualFileResult also support Range headers with a 206 response, although they won't send ETags without using some additional code. In many cases, however, the behavior is good enough to move forward without creating a lot of additional work.
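If the entity tag matters for your scenario, one option is the File overload that accepts a last-modified timestamp and an EntityTagHeaderValue (from Microsoft.Net.Http.Headers). The sketch below is only an illustration; deriving the tag from the file's last write time is an assumption, not a recommendation from the original example.

[Authorize(Policy = "viewerPolicy")]
public IActionResult SampleVideoStream()
{
    var path = Path.Combine(pathToVideos, "sample.mp4");

    // placeholder ETag strategy: derive the tag from the last write time
    var lastModified = System.IO.File.GetLastWriteTimeUtc(path);
    var etag = new EntityTagHeaderValue($"\"{lastModified.Ticks:x}\"");

    return File(System.IO.File.OpenRead(path), "video/mp4",
                lastModified, etag);
}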
Catching up on some Pluralsight announcements...
My ASP.NET Core Fundamentals Course has been re-recorded for ASP.NET Core 2.0.
Some pieces have changed dramatically. For example, I give only a brief overview of the ASP.NET Identity framework and spend more time on integrating with an OpenID Connect provider. New topics include how to use the Razor pages feature in v2. Enjoy!
It’s easy to miscommunicate intentions.
Teams starting to build HTTP based APIs need to think in terms of resources and the operations they want a client to perform on those resources. The resource name should appear in the URL. The operation is defined by the HTTP method. These concepts are important to understand when building APIs for the outside world to consume. Sometimes the frameworks and abstractions we use make it easy to forget the ultimate goal of an HTTP API.
Here's an example.
I came across a bit of code that looks something like the following. The idea is to expose a resource with associated child resources. In my example, the parent is an invoice, and the children are line items on the invoice.
public class InvoiceController : Controller
{
    [Route("GetAll/{invoiceId}")]
    [HttpGet]
    public IEnumerable<LineItem> GetByInvoiceId(string invoiceId)
    {
        // ... return all items for the invoice
    }

    [Route("Get/{lineItemId}")]
    [HttpGet]
    public LineItem GetByLineItemId(string lineItemId)
    {
        // ... return a specific line item
    }
}
From a C# developer's perspective, the class name and method names are reasonable. However, an HTTP client only interacts with the API using HTTP messages. If I am putting together code to send an HTTP GET request to /getall/87, I'm not going to feel comfortable knowing what resource I'm interacting with. The URL (remember the R stands for resource) does not identify the resource in any manner; it only names an operation, which is something the HTTP method should convey instead of the URL.
One of the keys to building an effective HTTP API is to map the resources in your domain to a set of URLs. There are various subtleties to take into account, but in general you'll build a more effective API if you think about the interface you are exposing to clients first, and then figure out how to implement the interface. In this scenario, I'd think about each invoice being a resource, and each line item being a nested resource inside a specific invoice. Thinking about the interface first, I'd try to expose an API like so:
GET /invoices/3 <- get details on the invoice with an ID of 3
GET /invoices/3/lineitems <- get all the line items for an invoice with an id of 3
GET /invoices/3/lineitems/87 <- get the line item with an id of 87 (which is inside the invoice with an id of 3)
There is also nothing wrong with giving a nested resource a top level URL. In other words, a single resource can have multiple locators. It all depends on how the client will need to use the API.
GET /lineitems/87 <- get the line item with an id of 87
In the end, this interface is more descriptive compared to:
GET /getall/87
And the controller could look like the following (not all endpoints listed above are implemented):
[Route("invoices")] public class InvoiceController : Controller { [Route("{invoiceId}/lineitems")] [HttpGet] public IEnumerable<LineItem> GetByInvoiceId(string invoiceId) { // ... return all line items for given invoice } [Route("{invoiceId}/lineitems/{lineItemId}")] [Route("~/lineitems/{lineItemId}"] [HttpGet] public LineItem GetByLineItemId(string lineItemId) { // .. return a specific item } }
Just remember contract first. Implementation later.
Platform database services in Azure, like Azure SQL and Cosmos DB, need to charge each of us according to our utilization.
Azure SQL measures utilization using a Database Throughput Unit (DTU). A DTU is a relative number we can use to compare pricing tiers. As a relative number, a single DTU remains intangible and mysterious.
Cosmos DB measures utilization using a Request Unit (RU). An RU is well defined, and there is even a calculator available to estimate RU provisioning.
Cosmos includes the resource charge for every operation as an HTTP response header. It is not immediately obvious how to access the request charge when using the .NET SDK, but as the following code shows, you can access a RequestCharge property as part of any IFeedResponse. The following code collects the charges over the course of iterating a paged query.
var query = client.CreateDocumentQuery<T>(
        UriFactory.CreateDocumentCollectionUri(DatabaseId, CollectionId),
        new FeedOptions { MaxItemCount = -1 })
    .AsDocumentQuery();

var results = new List<T>();
var totalRequestCharge = 0.0;

while (query.HasMoreResults)
{
    var response = await query.ExecuteNextAsync<T>();
    totalRequestCharge += response.RequestCharge;
    results.AddRange(response);
}
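The charge is reported for single-document operations, too. As a quick sketch (the documentId below is a placeholder), a point read with the DocumentClient exposes the same RequestCharge property on its ResourceResponse:

var response = await client.ReadDocumentAsync(
    UriFactory.CreateDocumentUri(DatabaseId, CollectionId, documentId));

// the cost, in request units, of this single read
var charge = response.RequestCharge;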
The DefaultFiles middleware of ASP.NET Core is often misunderstood.
app.UseDefaultFiles();
The DefaultFiles middleware by itself will not serve any files. DefaultFiles only re-writes the request Path value to point at a default file name when the request targets a directory where such a file exists.
You still need the StaticFiles middleware to serve the default file, and UseDefaultFiles must be registered before UseStaticFiles so the rewritten path is in place when StaticFiles runs.
app.UseDefaultFiles();
app.UseStaticFiles();
It’s also important to know that DefaultFiles, like StaticFiles, will work in the web root path by default (wwwroot). If you want both pieces of middleware to work with a different (or second) folder of files, you’ll need to give both pieces of middleware a file provider that points to the folder.
var clientPath = Path.Combine(env.ContentRootPath, "client");
var fileProvider = new PhysicalFileProvider(clientPath);

app.UseDefaultFiles(new DefaultFilesOptions
{
    DefaultFileNames = new[] { "foo.html" },
    FileProvider = fileProvider
});

app.UseStaticFiles(new StaticFileOptions
{
    FileProvider = fileProvider
});
In most cases, the middleware you want to use for these scenarios is the FileServer middleware, which combines DefaultFiles and StaticFiles.
var clientPath = Path.Combine(env.ContentRootPath, "client");
var fileProvider = new PhysicalFileProvider(clientPath);

var fileServerOptions = new FileServerOptions();
fileServerOptions.DefaultFilesOptions.DefaultFileNames = new[] { "foo.html" };
fileServerOptions.FileProvider = fileProvider;

app.UseFileServer(fileServerOptions);