Friday, 20 July 2018

Prevent hotlinking of CDN items to harvesting sites with Azure


CDNs are great

They provide a relatively cheap way of hosting static content, and caching is usually configurable both globally and per item, so you get maximum use of client caching and 304 responses. They also serve content from multiple sites (or caches) around the world, often called Points of Presence (PoPs), which makes requests much faster than attempting to serve them all from an origin server.

Reducing Application Burden

They also take work away from the application server, which mostly doesn't care about serving static stuff and would rather utilise its CPU and disk for important logical things! This should leave maximum resource and minimum complexity for your application, which will be running on higher-cost and possibly load-balanced infrastructure.

This is both an advantage and a disadvantage.

Removing Application State from resources

The problem with moving resources to a static, generally public, endpoint is that it takes away the concept of state and, with it, most of the protection you have against people downloading your content. This is mostly OK from a pure privacy point of view, since your data is accessible to anyone who uses your application anyway and you ultimately can't hide anything. But there are two issues that would be easiest to control via the application logic and which are much harder when you use a CDN.

Large Content Costs

One tricky problem is large content. Imagine you have some help videos hosted as part of your application. The design is for a single user to perhaps watch each one once and that's it. That is not too expensive, does not use too much bandwidth, and can easily be contained inside the application logic, for example by restricting the number of downloads per user. What happens when this is moved to a CDN? Anyone can download the video lots of times and that could add up to a lot of dollars, especially for a small company where $500 a month is a lot. Why would someone do that? Either because you have put a valuable resource on a public server and people simply like it and download it without you getting any value in the process (as you might if you had ads on your application), or because a bad person simply wants to make you pay loads of money by downloading your big content.

CDNs are not really designed to have application-style logic to prevent this kind of abuse. Any complexity or rate limiting would make the CDN slower and more costly to run and, in most cases, you would not be able to identify what is real traffic and what isn't. You cannot easily get around this risk without a specialist video hosting platform that can handle the abuse without you needing to do anything, but you CAN do something on the Premium Azure CDN endpoint using token authentication.

I have not set this up personally but the idea is simple. Content that should only be given to specific users is obtained from the web application as a URL with a token. The CDN then uses the token to prove that the client has permission to obtain the item, on the basis that it holds something that can only have been signed by the web application. So what? What if someone copies the URL along with the token? The token can embed certain properties that are checked alongside it to ensure the request is legitimate, such as the IP address of the valid user at that point in time (or country, URL, host etc.; see here).

This solves this issue very easily but there is some setup required and you need to use the rules engine to apply the requirement to whichever paths/resources you want to.
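To make the idea concrete, here is a minimal sketch of a signed URL that is tied to a client IP address and an expiry time. This is NOT the Azure CDN token format, which has its own encoding and is configured in the portal; it is a generic HMAC illustration, and the key, the function names and the `expires`/`token` query parameters are all made up for the example.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical secret shared by the web application and the checking side.
SECRET_KEY = b"hypothetical-shared-secret"


def sign_url(path, client_ip, expires_at, key=SECRET_KEY):
    """Web application side: return the path with an expiry and a token."""
    message = f"{path}|{client_ip}|{expires_at}".encode()
    token = hmac.new(key, message, hashlib.sha256).hexdigest()
    return f"{path}?{urlencode({'expires': expires_at, 'token': token})}"


def is_request_allowed(url, client_ip, now=None, key=SECRET_KEY):
    """Checking side: recompute the token from what the request looks like."""
    now = time.time() if now is None else now
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    try:
        expires_at = int(query["expires"][0])
        token = query["token"][0]
    except (KeyError, ValueError):
        return False  # missing or malformed parameters
    if now > expires_at:
        return False  # the link has expired
    message = f"{parsed.path}|{client_ip}|{expires_at}".encode()
    expected = hmac.new(key, message, hashlib.sha256).hexdigest()
    # A copied URL fails here if it is replayed from a different IP address.
    return hmac.compare_digest(token, expected)
```

Because the IP address and expiry are part of what gets signed, copying the URL to another machine, or changing the expiry in the query string, produces a token mismatch.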

Hotlinking

This can be a really annoying issue for anyone who generates valuable content such as images, forum posts etc. Since images are often a single URL, it is hard to stop someone who realises that your site has loads of cool images from simply embedding your images into its pages. People end up on their site because of your images and they get all the revenue as a result.

This is frighteningly common and very hard to prevent because the web is kind of designed to be public and if someone can see your image on your site, a simple link is all that it takes to steal. Even if you know someone is doing it, taking action in any meaningful way is hard, even if the perpetrator is in your own country.

So what can we do? We can use the Azure premium CDN once again to set up rules. You get to these by clicking the Manage button in the Azure portal, which takes you to a very simple looking interface at cdn.windowsazure.com. There you need to select either the Application Delivery Network menu if you are fronting an application or the HTTP Large menu if you are using a normal CDN. Each of these has a Rules Engine option inside it.

Setting up a rule is described here but is fairly self-explanatory, if a little clunky looking!

Give your rule a useful name and then choose what to base the rule on from the list. This defaults to Always, which (confusingly) means there is no logic: the rule is simply applied to all requests. If you select e.g. Device instead, the options for that come up for you to edit and, fortunately, most are documented with little info icons. In this selection, you could choose referrer if your content should only appear in your site and not anyone else's. Note that this wouldn't stop someone from creating a local site on their PC with the same domain as yours and downloading everything, but they would not be able to hotlink from a site with a different URL. Having to physically find, download and host the content somewhere else is harder, more expensive and exposes the attacker to a much more blatant crime than "I thought I could just link to the content because it was public".
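The logic behind a referrer rule can be sketched in a few lines. This is only an illustration of the check, not how the Rules Engine is implemented; the host names and the function name are hypothetical, and whether to allow requests with no Referer header at all (some browsers and privacy tools omit it) is a choice I have assumed here rather than anything the portal dictates.

```python
from urllib.parse import urlparse

# Hypothetical list of hosts that are allowed to embed your content.
ALLOWED_REFERRER_HOSTS = {"www.example.com", "example.com"}


def is_hotlink_allowed(referer_header, allowed_hosts=ALLOWED_REFERRER_HOSTS,
                       allow_missing=True):
    """Return True if the request's Referer points at one of your own sites."""
    if not referer_header:
        # No Referer sent: accept or reject depending on the chosen policy.
        return allow_missing
    host = urlparse(referer_header).hostname  # lower-cased, or None if absent
    return host is not None and host in allowed_hosts
```

For example, `is_hotlink_allowed("https://www.example.com/gallery")` passes, while a request referred from a page on any other host is rejected. The comparison is an exact host match, so a lookalike such as `example.com.some-other-site.net` does not slip through.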

If you are not sure what referrer origins to add, go to the Advanced HTTP Reports under the Analytics menu, choose HTTP Large Platform and click By Referrer on the left-hand side. This will list all referrers. You should include the ones you know are yours, which hopefully is fairly easy!
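If you end up with a long list of referrer URLs, grouping them by host makes it easier to spot which ones are yours. A minimal sketch, assuming you have the referrers as a plain list of URL strings (the report's actual export format is not covered here, and the function name is made up):

```python
from collections import Counter
from urllib.parse import urlparse


def count_referrer_hosts(referrer_urls):
    """Count how many referrer URLs belong to each host."""
    hosts = Counter()
    for url in referrer_urls:
        host = urlparse(url).hostname
        # Entries without a recognisable host are grouped together.
        hosts[host or "(unparseable)"] += 1
    return hosts
```

Calling `count_referrer_hosts(urls).most_common()` lists the busiest referring hosts first.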

As always, do a step at a time and test, test, test.