As hosts of the excellent Data Insights Cambridge meetup, we’re excited for the upcoming talk by Dr Sacha Krstulovic from Audio Analytic Ltd. (who also happen to be our downstairs neighbours!). Dr Krstulovic will be speaking about ‘Filling the sensory gap in Big Data’.

What is Filling the sensory gap in Big Data?

Data analysis creates value by uncovering relationships between various types of data. Whilst there is lots of thought put into developing new data analysis techniques, this talk, on the other hand, focuses on the nature of the data itself. As a matter of fact, the evolution of technology across time has unlocked access to data layers of an increasingly rich and complex nature. As a consequence, new value propositions have arisen not only from new analysis techniques, but also from treating new types of data. Nowadays, this movement has reached as deep as allowing to think of humans as having an equivalent data self in various computer systems. However, at the cutting edge frontier of this movement, there is still an opportunity that is untapped, that of creating value from exploring one of the richest and hardest to reach data sets: human sensory data. As a case study, Audio Analytic is exploring this opportunity in the domain of acoustic data.

The Speaker

Dr Sacha Krstulovic

Dr Sacha Krstulovic is the VP of Technology at Audio Analytic, a company building a significant leadership in automatic sound recognition. Before joining AA, Sacha was a Senior Research Engineer at Nuance’s Advanced Speech Group (Nuance ASG), where he worked on pushing the limits of large scale speech recognition services such as Voicemail-to-Text and Voice-Based Mobile Assistants (Apple Siri type services). Prior to that, he was a Research Engineer at Toshiba Research Europe Ltd., developing novel Text-To-Speech synthesis approaches able to learn from data. He is the author and co-author of two book chapters, two international patents and several articles in international journals and conferences.

The meetup is scheduled for Thursday, November 12, 2015 at 7:00 pm at 50 St Andrew’s St, CB2 3AH. We hope to see you there, just sign up for it on the Data Insights Cambridge meetup page.

As part of the Visualization Team at Metail, I spend a large proportion of my time staring at renders of models, hoping that too much virtual flesh isn’t being exposed. It’s a onerous task and any automation to make my life easier is always appreciated. Can we get computers to make sure “naughty bits” aren’t being accidentally shown to our customers?

Detecting Flesh Colours

The diversity of flesh colours is quite large; the following is a small selection of MeModel renders:

Feet Skin Colours

Feet Skin Colours

The range of colours considered “fleshy” is further compounded by the synthesised lighting environment: shadows tend to push the colours towards black, highlights towards white.

The first question to consider is which colour space to use for detecting flesh colours. There’s quite a bit of informal debate about this. Some propose perceptual spaces such as HSL or HSV, but I’m going to stick my neck out and say that there’s no reason not to use plain, old RGB.1 Or, more precisely, to overcome the effects of fluctuating lighting levels, some form of normalized RGB. For example:

M = max(RGB) or 1 iff G ≡ B ≡ 0
R* = R ÷ M
G* = G ÷ M
B* = B ÷ M

Two observations can be made at this point:

  1. If G ≡ B ≡ 0, we’re in such a dark place that we cannot tell whether we’re looking at flesh or not; and
  2. The dominant colour channel in all (healthy) human skin is red (green-skinned lizards and blue Venusians are unsupported), so MR.

Typically, we further assume that the least dominant colour channel is blue. This leads to a pleasing heuristic:

(RGB) is flesh only if R > G > B

Marrying this up with a hue/brightness wheel gives us a generous segment of potentially “fleshy” colours:

Hue/Brightness Wheel

Hue/Brightness Wheel

There are various refinements to this linear programming technique to further partition the colour space, including:

  • J. Kovac, P. Peer, and F. Solina, “Human skin colour clustering for face detection” in Proceedings of EUROCON 2003. Computer as a Tool. The IEEE Region 8, 2003
  • A. A.-S. Saleh, “A simple and novel method for skin detection and face locating and tracking” in Asia-Pacific Conference on Computer-Human Interaction 2004 (APCHI 2004), LNCS 3101, 2004
  • D. B. Swift, “Evaluating graphic image files for objectionable content” US Patent US 6895111 B1, 2006
  • G. Osman, M. S. Hitam and M. N. Ismail, “Enhanced skin colour classifier using RGB Ratio model” in International Journal on Soft Computing (IJSC) Vol.3, No.4, November 2012

For example, one in-house heuristic we tried could be coded in JavaScript as:

function IsSkin(rgba) {
  if ((rgba.a > 0.9) && (rgba.r > rgba.g) && (rgba.g > rgba.b)) {
    var g = rgba.g / rgba.r;
    if ((g > 0.6) && (g < 0.9)) {
      var b = rgba.b / rgba.r;
      var expected_b = g * 1.28 - 0.35;
      return Math.abs(b - expected_b) < 0.05;
    }
  }
  return false;
}

However, there are some fundamental flaws with merely classifying each pixel as either “fleshy” or “non-fleshy”:

  1. The portion of the colour space that is taken up by human hair, although larger, overlaps the space taken up by flesh tones. This makes distinguishing between long hair lying on top of clothes and actual flesh very difficult.
  2. Many clothes or parts of clothes are deliberately designed to be flesh-coloured or “nude” looking.
  3. You do not know if the “fleshy” pixels are at naughty locations of the body or not.
  4. As the body shape parameters of the MeModel changes, the location of the “naughty bits” changes.

We’ll address these issues in the next part. However, even with fairly simplistic heuristics, the techniques discussed thus far reduce the number of images that have to be pored over by humans, as opposed to automated systems, by up to 90%.

Footnotehsv

The maximal range of hues generally considered as skin tones by most flesh pixel detectors is 0° ≤ H ≤ 60° (red to yellow).

This range equates to R ≥ G ≥ B, as can be seen in the chart on the right.

Furthermore, the saturation and value (or lightness) components of HSV (or HSL) detectors are usually discarded. Therefore, the “flesh predicate” can be constructed purely from relative comparisons between RG and B, or pairwise differences thereof. This is why I believe the RGB colour space to be no less accurate than other colour spaces.

AWS Meetup

On the 22nd October Metail sponsored and hosted the third Cambridge AWS User Group (who have a few photos of the event up). It was great to welcome the meetup organiser Jon Green, AWS tech evangelist Ian Massingham (the main speaker for the evening), and around 25 others to our office “town hall” for drinks, snacks (including honey roasted cashews), networking and of course the talks.

Ian had two spots on the schedule. The first was a round up of the announcements made last week in at AWS re:Invent. I’m not going to produce my own recap of Ian’s summary, A Cloud Guru gave a nice summary of the keynotes at re:Invent in their medium blog post and there’s a nice summary with follow on links over at trendmicro. There were a few questions from the audience last night all of which Ian answered to their satisfaction. Seems AWS is doing something right 🙂 My favourite bit of the talk was during the overview of AWS’ now beta IoT platform (apparently the first thing AWS themselves have referred to as a platform) and the consumer stories from re:Invent. Here the focus was largely around an application in farming, mentioning the possibility of fitting monitoring devices to animals and being able to buzz them to wake them up. As someone who grew up in the valleys of Wales surrounded by sheep I found the thought of experimentally setting the state of your flock to be awake quite amusing. The scale of farming in the US is slightly larger than what I’m used so that’s a lot more sheep to wake up through patchy signal coverage ;).

Ian’s second talk was a demo of the new EC2 container service. I’ve played with docker before but it’s not my area of expertise however the audience seemed suitably impressed and there was a good bit of discussion at the end.

Having bribed my way onto the agenda by getting Metail to host (and doing the literal leg work to get the drinks and snacks to the office) it was good to give a talk to an engaged audience. I opted to go into some depth on the tracking and batch processing steps of our data architecture and skimmed over the insights and how we actually drive the pipeline. I’m hoping to get invited back to go in to a bit more depth. Elastic MapReduce is a slightly more niche service in AWS, and perhaps being overtaken by the use of real time systems such as AWS’ kinesis family. The talk itself is up on slideshare and you can see me in action on the @CambridgeAWS twitter account 🙂

We are often complimented on the selection of beers which we put on at the meetups at Metail. This is in large part owing to the excellent choice found at the King’s Parade branch of Cambridge Wine Merchants; it makes a nice lunch time outing to go over and choose the selection for the evening’s meetup. Having said that, a little data insight from our event hosting is that diet coke is the most popular drink.

Photo cc Ian M.

I’m heading off to the 6th 3D Body Scanning Technologies conference (3DBST) for two days next week (27, 28 October).

Each day will host parallel sessions about 3D human body scanning and its industrial applications. According to the conference programme, this year they focus on

  • 3D body scanning technologies
  •  body measurements and its application in fashion industry
  •  medical applications
  •  scanning for health and sport
  •  anthropometric studies & surveys
  •  body scanning assessment & use

As you can imagine, there’s plenty there that could be relevant to what we are doing in Metail R&D to create better MeModels and clothing visualisation. I’m particularly looking forward to the following topics and tech talks:

  • Mobile body modelling solution 
    • Precise and Automatic Anthropometric Measurement Extraction Using Template Registration (DFKI, Germany)
    • Challenges of Designing a 3D Camera for Mobile Handsets and Tablets (Alces Technology, USA)
    • Volume Extraction from Body Scans for Bra Sizing (University Erlangen-Nuremberg, Erlangen, Germany)
  • Low cost scanning solutions
    • From Handheld Tablets to Retail Booths – New 3D Body Scanning Solutions (Size Stream, USA)
    • Low-Cost Data-Driven 3D Reconstruction and its Application (IBV, Spain)
    • Unlocking the Body as a Digital Platform and the Rise of Reliable Consumer Scanners (Body Labs Inc., USA)
  • Cloth/garment modelling
    • 3D Product Development for Loose-Fitting Garments Based on Parametric Human Models (TU Dresden and ITU, Germany)
    • Determination of the Air Gap Thickness Underneath the Garment for Lower Body Using 3D Body Scanning
    • A Dense Surface Motion Capture System for Accurate Acquisition of Cloth Deformation (Surrey uni, UK)

Apart from the tech talks there are demos from scanning companies such as Styku (USA), 3dMD (UK/USA), Size Stream (USA), VITRONIC (Germany), Sizzy (France), IBV (Spain), TechMed 3D (Canada).

If you’re going to the conference and want to meet up for a coffee or even a refreshing beer after a long tiring conference, please contact me at dongjoe@metail.com

 

We’re delighted to be hosting and sponsoring the 3rd Cambridge AWS User Group meetup tonight. Where we’re going to be getting a run down of this year’s AWS re:Invent and I’ll give an introduction to Metail’s data pipeline with a focus on how we’re using and configuring some of AWS’ services to power Metail’s data insights. As sponsors we’ll be providing beer, soft drinks and wine as well as some snacks (I’m a big fan of honey roasted cashews and in charge of purchasing, so feel free to comment on this post with any special requests).

It’s the first of these events we’re hosting here at Metail and I’ve used it as an excuse to write a presentation for the group. Although the meetup is moving around various hosts and sponsors in Cambridge you can expect to see it back at Metail in future, and hopefully a follow up/continuation of my talk in a future meetup.

Here’s the summary of the event from the user group page:

Docker, containers and the like. Plus, those of us who made it back from Vegas alive report on our findings.

The BIG news…if you’re in the Internet of Things business, a database or data analytics specialist, are into Docker containers, or use Kinesis or Lambda, there were some major announcements – and you will definitely want to be here to hear them!

This meeting is hosted, and snacks and drinks sponsored, by Metail.

We’ve got a lot to get through, so the meeting will start at 7pm sharp, with doors open from 6:30 – please be prompt!

 

The presentations:

Notices and news – Jon Green (Adeptium Consulting) and Brief Intro to Metail – Gareth Rogers (Metail)

• re:Invent 2015 Debrief – Ian Massingham (AWS) (and/or Jon Green)  – all the gory details and new announcements!

•  Metail’s Data Pipeline and AWS – Gareth Rogers

Docker and Amazon Container Service  – Ian Massingham

• [Possibly] From Zero to Hero in AWS – follow-up – Tom Clark

Hope to see you later! We also host the Cambridge R Users Group, Data Insights Cambridge, DevOps Cambridge and Cambridge NonDysFunctional Programmers so if you can’t make it, there are plenty of other opportunities to come along to Metail for some drinks, snacks and tech chat.