I recently needed to run Varnish (a very fast web cache for busy sites) in a situation that also required HTTPS on the same box. Unfortunately, Varnish does not handle crypto, which is probably a good thing given how easy it is for programmers to make mistakes in their code, rendering the security useless!

Whilst recipes for Stunnel and Varnish together exist, information on running them on the same box whilst still presenting the original source IP to Varnish for logging/load balancing purposes was scarce – the below configuration “worked for me”, at least on Debian 7.0 (Wheezy). You will need the xt_mark module, which should be part of most distributions but which I found was missing from some hosted boxes and VMs with custom kernels. The specific versions of software we are running are based on:

  • Linux 3.13.5
  • Varnish 3.0.6
  • STunnel 5.24

If you are running Varnish 4, the VCL here will need rewriting, as the VCL syntax changed between versions 3 and 4.

IPTables – mark traffic from source port 8088 for routing
iptables -t mangle -A OUTPUT -p tcp -m multiport --sports 8088 -j MARK --set-xmark 0x1/0xffffffff

Routing configuration – anything marked by IPTables, send back to the local box. These two can be added under iface lo as “post-up” commands if you’re on a Debian box.
ip rule add fwmark 1 lookup 100
ip route add local 0.0.0.0/0 dev lo table 100

STunnel configuration. The connect IP MUST be an IP on the box other than loopback, i.e. it will not work if you specify 127.0.0.1.

[https]
accept = 443
connect = 10.1.1.1:8088
transparent = source

From default.vcl:
import std;

sub vcl_recv {
    // Set header variables in a sensible way.
    remove req.http.X-Forwarded-Proto;

    // Port 8088 only receives traffic forwarded by STunnel, so it is HTTPS.
    if (server.port == 8088) {
        set req.http.X-Forwarded-Proto = "https";
    } else {
        set req.http.X-Forwarded-Proto = "http";
    }

    set req.http.X-Forwarded-For = client.ip;
    std.collect(req.http.X-Forwarded-For);
}

sub vcl_hash {
    // SSL data returned may be different from non-SSL.
    // (E.g. including https:// in URLs)
    hash_data(server.port);
}
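On the backend side, the X-Forwarded-Proto header set in vcl_recv is what lets an application generate links with the right scheme. The following is only a minimal sketch in Python of how a backend might use it; the function name `external_url` and the plain-dict representation of request headers are assumptions made for illustration, not part of the setup described above.

```python
def external_url(headers, host, path):
    """Rebuild the URL as the client saw it, for a backend sitting behind Varnish.

    `headers` is assumed to be a plain dict of request headers keyed exactly
    as "X-Forwarded-Proto" (a hypothetical representation; real frameworks
    usually offer case-insensitive header access).
    """
    # vcl_recv always sets this header to "http" or "https"; default to
    # "http" if the request somehow arrived without passing through Varnish.
    scheme = headers.get("X-Forwarded-Proto", "http").strip().lower()
    if scheme not in ("http", "https"):
        # Never trust an unexpected value enough to put it in a URL.
        scheme = "http"
    return f"{scheme}://{host}{path}"
```

Because the backend's output can now differ between HTTP and HTTPS requests for the same path, the two variants need separate cache entries, which is what the hash_data(server.port) line in vcl_hash provides.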

Clojure Meetup as part of the Cambridge Non-dysfunctional Programmers group, in Metail’s office at 50 St Andrew’s St, Cambridge

This month’s meetup of the Cambridge NonDysFunctional Programmers will be hosted here at Metail’s Cambridge office next Thursday (26th November) from 6.30pm. I (Ray Miller) will be giving a hands-on introduction to web development with Clojure, where attendees get to implement their first Clojure web application from the ground up.

Along the way, we’ll learn about the Compojure routing library, Ring requests and responses, middleware, and generating HTML with hiccup. Time permitting, we’ll also cover interacting with a relational database and using buddy to add session-based authentication and authorization to our application.

The theme running through the tutorial is implementation of an ad server (an example I shamelessly stole from Dan Benjamin’s Meet Sinatra screencast). This demo application delivers a JavaScript snippet to embed random ads in a web page and tracks user click-throughs. It also provides an administrative interface for reporting and managing ads. If you’ve already seen Dan’s screencast, you’ll see how Clojure compares with Sinatra to implement the same application.

See the Meetup page for full details and to sign up.

 

Shortly after I joined Metail in late summer 2011 there was a typical English bit of weather; namely an apocalyptic downpour just as I was leaving work on a Wednesday. I was immediately transported back to a Kenyan balcony on which I had spent many happy hours as a proper colonial – sipping a G&T and watching the sun go down. The temperature was right, the sun was low in the sky, the tree was in full bloom (well, it had leaves on it at any rate) and a tropical storm was in the air. Thinking that it was only fair to share this experience with those who had not been fortunate enough to have exposure to the original, I packed my bag on Friday with a selection of gins, tonic and appropriate accoutrements.

This isn't www.xkcd.com

Unpacking the essentials after moving office

This went down well with colleagues and every Friday since we have endeavoured to gather and raise a glass to the end of the week. It has provided a great opportunity to relax and to meet other team mates and their partners and children. It also allowed teams to bond and cross-team conversations to happen. We also got the chance to hear the result of Nick’s nimble fingers (this harks back to the halcyon days when there was room to swing a cat and strum a guitar upstairs in 16), and share in the occasional sing song.

As the team has expanded I have not been able to support the whole cost, so I have asked for contributions of £10 a month and welcome any suggestions or requests for particular drinks. We have also widened the group who make the drinks each week. We have progressed from simple gins and tonics to brambles, why nots, and even non gin-based drinks – manhattans, daiquiris, sidecars and orange brulées spring immediately to mind. Most of our shopping is done at Cambridge Wine Merchants, who are always happy to help us out with ideas or substitutions for ingredients.

Oh and finally like any good dealer – your first few hits are free.

 

This is not http://www.smbc-comics.com/

Ace of Clubs Daiquiris made by Ian Taylor and photographed by Andrew Dunn

For anyone thinking of setting up something similar, the following price structure was designed to be as inclusive as possible and has been in place as the office head count has more than quadrupled. It has kept the club in the black and allowed it to provide something alcoholic and non-alcoholic for everyone at Christmas. Membership of the club is purely optional.

Members: £10 a month (first month free)

Interns: drink free

Partners: drink free

Guests: drink free

Non-Members from the office: drink free

Most of Metail’s try-it-on tech is delivered as a single page JavaScript application that embeds inside retailer sites, calling back to Metail-hosted services in Europe over HTTPS/JSON. In this post, Nick Day describes some of the more difficult trade-offs that we had to make in order to get the best use out of Content Distribution Networks and improve the speed of our app.

tl;dr When you’re using a CDN to accelerate the delivery of a 3rd party HTTPS / AJAX app to users far away, you end up having to choose your devils. We chose the one called “pre-flight OPTIONS request”.

We’re currently working with Dafiti, a Brazilian company whose user base is almost exclusively in-country. Our web-application and web-cache servers are based in the UK, so to aid user-experience we’ve moved as much as possible of our static and dynamic content so it’s served through a CDN (CloudFront, which handily has edge nodes in São Paulo).  

The last piece of this “CDN-ifying” has been to serve the initial HTML page from CloudFront.  Naturally, this has a large impact on page load speeds as the browser can do nothing else until this resource has been obtained; so fulfilling the request from a nearby CloudFront node is hugely preferable to having it trot under the Atlantic to hit our servers and then back.  However, as this page is now being served from the CDN’s domain, browser security constraints mean that any AJAX requests being made from it must either:

  1. also go through the CDN domain, or
  2. be served directly from your service domain using cross-origin resource sharing (CORS)

For cacheable content obtained through AJAX (in our case things like garment information) the preferred option is clear; serve it through the CDN.  The decision is more involved for content that you don’t want to or can’t cache, like requests that you need to be up-to-date (e.g. account details) or update requests. In this case your choice is for the lesser of two evils.

Serving through the CDN:

  • you’re making every one of these requests take an indirect route, as the CDN will be passing them all on for your servers to handle
  • on top of that there’s an extra hit; SSL de/encryption must happen in the CDN for these requests to be forwarded
  • you have to manage potentially complex CDN config to do the right thing for each request.  You wouldn’t have to do this with your typical CDN-cached content, but in practice you may need to add cookies and headers to a whitelist for the CDN to forward.  CloudFront makes this problem extra-fun by lacking a programmatic way to update/version your distribution configuration, making it easy to take down your service accidentally, and hard to roll back.
  • you’re paying (real money) each time you route one of these requests (somewhat unnecessarily) through a CDN

Alternatively, if you send these requests directly to your service domain:

  • you’ll need to implement and maintain CORS handling for those endpoints being requested from other domains
  • for requests deemed “non-simple” by the CORS spec (methods other than GET/HEAD or POST; using Content-Type other than a very strict set, which surprisingly does not include JSON; or using custom headers) the browser will automatically make a pre-flight OPTIONS request to check that it is able to make the request before actually making the one you desire.  This is a big deal if your longer-than-you’d-like cross-Atlantic request suddenly becomes two!  It was this point that almost pushed me toward taking the CDN route for these requests.
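To make the “non-simple” rule concrete, here is a rough sketch in Python of the check a browser performs before deciding whether to send a pre-flight. It is a simplification written for illustration: the function name `needs_preflight` is made up, and the real specification has further conditions (for example restrictions on header values) that are ignored here.

```python
# Methods, request headers and Content-Type values that the CORS spec
# allows without a pre-flight (simplified).
SIMPLE_METHODS = {"GET", "HEAD", "POST"}
SIMPLE_HEADERS = {"accept", "accept-language", "content-language", "content-type"}
SIMPLE_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}


def needs_preflight(method, headers=None):
    """Return True if a cross-origin request would trigger an OPTIONS pre-flight.

    `headers` holds only the headers set explicitly by the application,
    as a dict of name -> value.
    """
    headers = headers or {}
    if method.upper() not in SIMPLE_METHODS:
        return True
    for name, value in headers.items():
        lowered = name.lower()
        if lowered not in SIMPLE_HEADERS:
            # Any custom header makes the request non-simple.
            return True
        if lowered == "content-type":
            # Ignore parameters such as "; charset=utf-8".
            media_type = value.split(";", 1)[0].strip().lower()
            if media_type not in SIMPLE_CONTENT_TYPES:
                return True
    return False
```

For example, `needs_preflight("POST", {"Content-Type": "application/json"})` is true, which is why a typical JSON API call pays for the extra round trip, whereas the same POST sent as `text/plain` would not.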

Fortunately for us, there are two saving graces with this latter choice of sending the requests directly to our service domain:

  1. All of the requests in our app that require a pre-flight request are non-blocking; that is, they don’t require a user to wait before viewing part of, or accessing functionality in, the app. They’re all “update in the background” kind of requests. Granted, if these take too long you could imagine timing issues creeping in, but we should be coding defensively to guard against those kinds of asynchronous problems anyway.
  2. The CORS spec allows you to specify a caching max-age for the OPTIONS requests. Of course, since you’re not going through a CDN, this caching will be on a per-user basis against the web-cache. At the moment, unfortunately for us, even a cache hit must come to Europe. The answer here is that it’s probably going to make sense for us to move web-caches closer to where the users are, even if the apps stay rooted in the UK.  Another slightly niggling point is that some browsers currently have an enforced maximum cache period that’s quite short for pre-flight requests (e.g. Chrome is 5 minutes).  As this is shorter than our average user session, it means that it’s likely a user will have to make those same requests more than once per session.  From half a world away, that’s not ideal.
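The effect of the max-age and the browser cap can be sketched as follows. This is an illustrative Python example only: the function names, the example origin and the steady-usage model are assumptions, and the 5-minute cap is simply the figure quoted above for Chrome rather than something verified here.

```python
def preflight_headers(origin, allowed_origins, methods, request_headers, max_age_seconds):
    """Build the response headers for an OPTIONS pre-flight request.

    Returns an empty dict when the origin is not allowed, so the browser
    will refuse to send the real request.
    """
    if origin not in allowed_origins:
        return {}
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": ", ".join(methods),
        "Access-Control-Allow-Headers": ", ".join(request_headers),
        # How long the browser may cache this answer (subject to its own cap).
        "Access-Control-Max-Age": str(max_age_seconds),
        # The response differs per origin, so any cache in between must know.
        "Vary": "Origin",
    }


def preflights_per_session(session_seconds, max_age_seconds, browser_cap_seconds):
    """Estimate pre-flights per endpoint for a user who calls it throughout a session.

    Assumes steady usage, so a new pre-flight is needed each time the
    cached one expires.
    """
    effective = min(max_age_seconds, browser_cap_seconds)
    if effective <= 0:
        raise ValueError("cache lifetime must be positive")
    # Ceiling division without floating point, and always at least one.
    return max(1, -(-session_seconds // effective))
```

Under these assumptions, a 20-minute session with a one-day max-age but a 5-minute browser cap gives `preflights_per_session(1200, 86400, 300) == 4` pre-flights per endpoint, however generous the server’s own setting is.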

Taking all of these points into consideration, we favoured sending requests directly to our service domain; preferring to take the hit with the “background” requests that require a pre-flight in order to improve performance of most of our non-caching requests.  

As hosts of the excellent Data Insights Cambridge meetup, we’re excited for the upcoming talk by Dr Sacha Krstulovic from Audio Analytic Ltd. (who also happen to be our downstairs neighbours!). Dr Krstulovic will be speaking about ‘Filling the sensory gap in Big Data’.

What is Filling the sensory gap in Big Data?

Data analysis creates value by uncovering relationships between various types of data. Whilst a lot of thought is put into developing new data analysis techniques, this talk focuses instead on the nature of the data itself. The evolution of technology over time has unlocked access to data layers of an increasingly rich and complex nature. As a consequence, new value propositions have arisen not only from new analysis techniques, but also from treating new types of data. Nowadays, this movement has gone so deep that we can think of humans as having an equivalent data self in various computer systems. However, at the cutting edge of this movement, there is still an untapped opportunity: that of creating value from exploring one of the richest and hardest to reach data sets, human sensory data. As a case study, Audio Analytic is exploring this opportunity in the domain of acoustic data.

The Speaker

Dr Sacha Krstulovic

Dr Sacha Krstulovic is the VP of Technology at Audio Analytic, a company building a significant leadership in automatic sound recognition. Before joining AA, Sacha was a Senior Research Engineer at Nuance’s Advanced Speech Group (Nuance ASG), where he worked on pushing the limits of large scale speech recognition services such as Voicemail-to-Text and Voice-Based Mobile Assistants (Apple Siri type services). Prior to that, he was a Research Engineer at Toshiba Research Europe Ltd., developing novel Text-To-Speech synthesis approaches able to learn from data. He is the author and co-author of two book chapters, two international patents and several articles in international journals and conferences.

The meetup is scheduled for Thursday, November 12, 2015 at 7:00 pm at 50 St Andrew’s St, CB2 3AH. We hope to see you there; just sign up on the Data Insights Cambridge meetup page.