GCHQ throws down the gauntlet
A while back I found the GCHQ Director’s Christmas card, which came in the form of a nonogram. GCHQ has a history of puzzle setting and even hiring people through puzzles. The WWII codebreakers were hired through crosswords and other puzzles in the newspaper, which was featured in The Imitation Game.
I was new to nonograms, but quickly found out they’re a “paint by numbers” puzzle where hints for rows and columns give you series of segments (of varying lengths) to colour in. Applying all the clues together logically lets you work out whether each cell is filled in or empty progressively until you reach the final solution.
Typically you then have a badly pixelated picture of something and a sense of accomplishment. With GCHQ’s puzzle, you end up with a QR code that leads you to the next puzzle. So the picture is not very pretty and the sense of accomplishment is short lived. Here’s what GCHQ’s best Christmas wishes look like:
And I thought I was bad at Christmas cards.
I love data and data visualizations. They can give deep insight into problems and behaviours, and they can make you interested in something you previously thought dull.
I came across the article “How to Catch a Spoofer” from Bloomberg, by Matthew Leising, Mira Rojanasakul and Adam Pearce. The article gives a fascinating view into the trading activity on the Chicago futures exchange, and how to identify “spoofing” within trading activity. The visualizations take a combination of a difficult concept and large volume of data, and extract genuine, novel insights.
I once had an interview question asking for an algorithm for a Roomba that ensures it covers every square of a room divided into grid cells, given that the room shape and location of obstacles are unknown. It’s similar to the idea of solving a maze, except that instead of getting to a specific point, you’re trying to visit every point in the room – to clean it!
It’s a pretty common problem, but I hadn’t seen it in the guise of a physical robot before. Running a Depth First Search covers every piece of floor easily enough, but casting it as a physical device that has to move implies a large cost to popping back up the stack that’s generated during DFS. There’s a lot of backtracking in a DFS based approach for a Roomba, so it makes for a slower vacuuming job.
It made me wonder whether there was some better approach than DFS that would be more efficient.
Princeton’s Algorithms II course includes an assignment on finding Boggle words. Briefly, Boggle is a game where you have a two dimensional grid of random letters and players try to find as many real words as they can from the board by stringing together neighbouring letters.
This post looks at how tweaking the initial implementation can give a 2x speedup, but picking the right data structure gives a 4,200x speedup.
I’ve been working on the Algorithms II course from Princeton on Coursera lately, which has been pretty interesting. One of the assignments is on the topic of seam carving, which is a technique for arbitrarily resizing images while maintaining as much of the important detail as possible. It’s also referred to as image re-targeting, coming from the idea of re-targeting an image for a phone/tablet screen with a different resolution and aspect ratio.
Seam carving prioritises important details during resizing
The problem with traditional resizing of images where the aspect ratio is changing is that it requires either cropping or stretching the original image. Stretching makes the main features look warped, especially people, while cropping loses detail – especially if there’s no dead zone on one side that can be cropped away, like an empty sky.
Instead of scaling all parts of the image equally, it can be advantageous to remove more of the dead zone pixels, such as gaps between people in photographs. This is why seam carving has been used to implement Content Aware Scaling in Photoshop. There’s a good example on that page, too.
Seam carving achieves this by identifying the “energy” of each pixel, which is a measurement of the how important to the image the detail of each pixel is. It then identifies horizontal or vertical seams through the image and removes the seam with the least total energy. The seams don’t have to be straight lines, allowing the algorithm to remove paths of low detail pixels that follow the shape of the image.
I recently ran a workshop on Rx, and as part of preparing I had to hunt down some useful resources for learning. I found that things had come a long way since the early days, when people mostly learned from videos of Wes Dyer, Jeff van Gogh and Bart de Smet drawing on whiteboards at Microsoft, and Lee Campbell’s blog.
With all this new exposure, there are now a lot more resources for learning Rx. Here are some that I think are particularly useful.
StackOverflow, like Wikipedia, is a dangerous place. It’s a place where you go to learn something you didn’t know you wanted to learn. In both cases, the risk is: are you going to resurface, half an hour later, much the wiser on the French Revolution, or perhaps some nitty gritty feature of the Garbage Collector? Or are you going to come out with an enhanced knowledge of the manufacturing technique used to create left handed drinking straws in the early 20th century (and the life of the mother of the creator, and the university they went to, and the wartime history of the town they grew up in)?
So it was a lucky happening that I bumped into a thread on StackOverflow a while back regarding how Math.Pow is implemented in the .Net framework. It’s one of these methods which leave you cold when you navigate to their implementation with Resharper/dotPeek – because it’s just a stub with a MethodImplOptions.InternalCall attribute on it. It’s basically a marker saying “this code is implemented with magic”.
But, as Eric Lippert says – there is no magic. Magic is just a place holder for a deeper understanding, in the same way as MethodImplOptions.InternalCall is a place holder for a lower level implementation.
The post leads on to a discussion of how methods marked with MethodImplOptions.InternalCall are actually implemented, where the implementation is, and how the jitter can even substitute some functions directly with a single machine code instruction.
In the post you can see the Sin, Cos, Sqrt, Round functions are eligible to be implemented as a direct machine code instruction. As Hans points out, this can lead to C# floating point code that’s actually faster than the same code in C++ that uses library functions.
So on this case, I managed to escape from StackOverflow intact, and with an increased knowledge of MethodImpl.InternalCall, which is something I had been curious about for over 10 years.
The Reactive Extensions (Rx) library is one of those tools that opens up your mind to new ways of thinking about your how your data and application work together. In this post I’m going to talk about an easy technique you can use to replay time series data as if it were live.
Replaying time with HistoricalScheduler
Schedulers in Rx are a deep subject, but one of the things schedulers are responsible for is managing how time based operations run. All time based operations in Rx, like Timeout, Delay, etc allow you to provide your own scheduler.
HistoricalScheduler inherits from VirtualTimeScheduler, and that name gives us an idea what we can do with it. VirtualTimeScheduler gives us control of time – at least as far as our code goes. You can’t go back in time and buy stocks to become rich, but you can make your code think it’s doing that.
I had the opportunity to run a workshop on the Reactive Extensions library with the Sydney Women Who Code group earlier this year. Rx is something I’m passionate about – it really has the power to change the way you approach your data and application design to open up a lot of possibilities.
This post gives some details about the workshop, but is mostly about the process of creating it and some insights into how it went. The workshop and code are on github if you want to check it out.
I’m a software developer, living in Australia. I love software for what you can do with it, and how much you can learn from it. I love the power of information. Information has behaviour and meaning. It moves and flows, and it has patterns. Good software is software that harmonizes with these patterns to reveal something new and powerful.
I’m a .Net developer, and I spend a lot of time with LINQ and the Reactive Extensions framework, which are both great tools for letting you explore and manipulate data.
I also have a growing interest in wine. Since returning to Australia from London, I’ve been getting to know our wines better. I love the idea of finding a wine with a personal story behind it, and putting it away to be opened up a decade later. All of which is another way of saying that I’ve acquired a wine fridge and one of my favourite hobbies is keeping it full. As a friend said to me, wine fridges that are full have greater thermal inertia, and are therefore more efficient. So really it’s just environmentally friendly…