Slide 1

Slide 1 text

Hello everyone, and thanks for having me here. My name is Alex, and since I was young, at least according to surviving evidence, I’ve been interested in two things:

Slide 2

Slide 2 text

… Computers …

Slide 3

Slide 3 text

… And photography ….

Slide 4

Slide 4 text

Nowadays I work as a designer at a lovely company down in London called Webcredible, and one of the things that keeps fascinating me is the intersection between people, computers, and images

Slide 5

Slide 5 text

– the images that we create and, more often than not, store, manipulate and share using computers. I think there are a few interesting challenges and opportunities in this space, and that’s what I’d like to talk about today. But first …

Slide 6

Slide 6 text

No content

Slide 7

Slide 7 text

These are some of the best-known Palaeolithic cave paintings, at Lascaux in France, estimated to be more than 17,000 years old. Nobody can really tell for sure who created them and why, but whoever did must have had a reason. For many thousands of years, painting was the only way people could create images. Then a magical technique known as photography came along.

Slide 8

Slide 8 text

This is the earliest photo that still survives today, shot around 1826–1827. It’s simply the view outside the photographer’s window. The materials used at the time weren’t quite as sensitive to light as you’d want them to be, so this exposure took an estimated 8 hours. Not quite what you’d call an Instagram … However, even though we’ve had photos since 1826, it took much longer until it was commonplace to work with photos on a computer screen. Even when the World Wide Web was 'invented' 23 years ago, around 1990, it was a text-based world.

Slide 9

Slide 9 text

This is a screenshot of the first web browser, showing some of the websites of the early 90s. Lots of text, zero images. Of course, you could be forgiven for not including any images – in that era, digital cameras were little more than prototypes.

Slide 10

Slide 10 text

This is one of the first digital SLR cameras. It cost US$30,000, had 200MB of storage for 1MP photos, and needed a suitcase of electronics to go with it, weighing a total of 25kg. So it’s no surprise that it took two more years after the invention of the web before, in 1992, the first photographic image was uploaded to what we would now call a website.

Slide 11

Slide 11 text

This is the image in question – a pop music group started by four women working at CERN, the physics institute where the web was born. The year 1992 marks another milestone.

Slide 12

Slide 12 text

It’s the year when the image file format we all know as JPEG was finalised. JPEG is now the most popular and most efficient format for sharing photographic images online. It took a bit longer still until people agreed on how to embed images in web pages.

Slide 13

Slide 13 text

This is Marc Andreessen in 1993, proposing the IMG tag for embedding images in HTML. The full discussion thread goes on for quite a while, and covers some interesting ideas that still haven’t been implemented in web browsers. It’s all available to read online, if you’re interested in seeing how standards are made, and what the web might have looked like if embedding images had been implemented in a different way. -- Since those discussions took place, 20 years have passed, and nowadays we seem to be infatuated with images. We create images anywhere and everywhere. In fact, creating images has become a way of interacting with the world around us.

Slide 14

Slide 14 text

Are you an athlete parading in the Olympic Stadium during the opening ceremony? Why not take a few photos to show your friends you were really there? It doesn’t matter if you’re Chinese …

Slide 15

Slide 15 text

… or Spanish – I watched the whole thing and there were people holding cameras in almost all delegations!

Slide 16

Slide 16 text

Maybe you’ve come to an event featuring your favourite science fiction TV characters – why not take some photos of them too?

Slide 17

Slide 17 text

Or maybe you’re going to a concert – enough people are now “watching” a concert through their cameraphones that artists are starting to complain!

Slide 18

Slide 18 text

Or maybe you’ve just escaped a crash-landed airplane – surely that’s a photo opportunity? -- Of course we don’t just create many more images – we also share them more than ever. It’s become our way of sharing our everyday lives.

Slide 19

Slide 19 text

350 million photos a day is rather significant when you have around 600 million daily active users.

Slide 20

Slide 20 text

Two of the largest startup acquisitions of the last year, Instagram and Tumblr, facilitate the sharing of images, static or moving. Tumblr’s $1.1 billion price tag works out at roughly 1.1 Instagrams! Even social networks originally focused on textual information, such as Twitter, quickly felt the need to create and standardise ways for people to share images.

Slide 21

Slide 21 text

And compared to the good old days of film photography, we have reached an interesting point. It’s never been easier to preserve images for an indefinite amount of time. Because visual information can now be compressed so efficiently, and because storage is so cheap, we can make as many copies of an image as we like, with no degradation over time. But it’s not just that we’re creating loads of images – the way we’re creating them is also changing.

Slide 22

Slide 22 text

We're looking into a future where images might be created in an unattended, subconscious way, with little direct human intervention. The obvious device that's received a lot of publicity around the future of image-making is Google Glass. Unless you’ve been away from Earth for the last few months, you’ve probably seen this:

Slide 23

Slide 23 text

It’s funny that people have focused so much on Google Glass because, at least in its current incarnation, it isn't programmed to record everything around you. It’s actually quite cumbersome to take a photo: you have to give it a voice command – 'OK Glass, take a picture' – which I think is unlikely to help you capture any spontaneous and interesting moments. But there are a couple of other products, both released in the last year, that are more interesting.

Slide 24

Slide 24 text

Memoto is a wearable camera that clips onto your clothes and simply takes one photo every 30 seconds. When you plug it back into a computer, it both charges and uploads all its photos to the cloud.

Slide 25

Slide 25 text

Autographer is another, similar wearable camera, which is meant to be a bit smarter. It has a number of sensors in addition to the camera, monitoring things like changes in light level and temperature, and it only takes photos when it thinks something interesting is happening. For example, when it detects a sudden change in temperature, it might assume you’ve gone from outdoors to indoors, so it will attempt to take a photo. Another thing that’s likely to change in the future is that the images you create won’t need to come from your own perspective.

Slide 26

Slide 26 text

Reuters put up a lot of these last summer for the Olympics. They’re remotely controlled cameras on robotic mounts. The "photographer", if you still want to use that term, sits at a laptop, watching a live view from the camera, and can move it and trigger it at any time. In fact, one photographer can watch multiple cameras at once. If you don’t quite have the budget of Reuters, maybe you can try something simpler.

Slide 27

Slide 27 text

This is a prototype of a throwable ball camera – it’s actually 36 cameras arranged across the surface of a foam sphere. You just throw it up in the air, and when it reaches its highest point, just before it starts falling back down, it triggers all the cameras at the same time and creates a panoramic photo – a bit like this:

Slide 28

Slide 28 text

This is just a fraction of what you can see – you can pan the panorama around and look at all the other sides, or even up into the sky if you wish. So it’s a bit like "shoot first, frame later". And if a photographer doesn’t need to be there to actually press the shutter, how long until we no longer need a photographer at all – the cameras shoot automatically, and you just scour through the results later and pick the best shots? I decided to give that a try. One of my other passions apart from photography is cycling, and if you combine the two, I also like taking nice photos of cyclists. Outside our office in London there’s a rather busy road, with many cyclists passing every day, and some of them have all sorts of weird and interesting bikes. Now, I could of course just sit outside our office with a camera and take photos, but I do have a day job to do. So I did this instead:

Slide 29

Slide 29 text

I got a webcam pointing out of the window, and put together a small image recognition algorithm that detects when there’s a bicycle in the frame and takes a photo. It’s not that hard to do – all you need to find is two circles that aren’t too large or too small, and aren’t too close together or too far apart. And there you go: you have a bicycle, and you can take a photo of it. Leave this running for a day, and you end up with hundreds of photos – and that’s the problem. When we manage to create a mountain of visual information, what do we do with it next? Did we stop storing our photos in a physical shoebox, only to end up with a digital one?
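If you're curious what that looks like in practice, here's a minimal sketch of the two-circles heuristic, assuming Python with OpenCV; the thresholds, camera index and filenames are illustrative guesses rather than my actual setup:

```python
import time

import cv2
import numpy as np

cap = cv2.VideoCapture(0)  # the webcam pointing out of the window

while True:
    ok, frame = cap.read()
    if not ok:
        break

    # Grayscale + blur makes the circle detector much less noisy.
    gray = cv2.medianBlur(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY), 5)

    # Hough transform: find circles that are neither too large nor too small.
    circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1.2, minDist=60,
                               param1=100, param2=60,
                               minRadius=30, maxRadius=120)
    if circles is None:
        continue

    wheels = np.round(circles[0]).astype(int)
    for i in range(len(wheels)):
        for j in range(i + 1, len(wheels)):
            (x1, y1, r1), (x2, y2, r2) = wheels[i], wheels[j]
            gap = np.hypot(x2 - x1, y2 - y1)
            # Two similar-sized circles, neither too close together nor
            # too far apart: call it a bicycle and take the photo.
            if abs(r1 - r2) < 20 and 2 * r1 < gap < 6 * r1:
                cv2.imwrite(f"bike_{time.time():.0f}.jpg", frame)
```

Run something like this for a day and the directory fills up with bike photos – hence the mountain.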

Slide 30

Slide 30 text

To find out how to deal with all that, I want to first take a step back and talk about why images are interesting, and what sets them apart from other types of content that we’re perhaps more used to dealing with. But before we get into the differences, let’s start with a similarity: images, like any other piece of content, are created for many different reasons.

Slide 31

Slide 31 text

If you look at an example screenshot of a photo stream provided by Apple, you’d be forgiven for thinking that all people use their phone cameras for is taking photos of their friends while on holiday or strolling around. The reality is always a bit different.

Slide 32

Slide 32 text

Here’s a random screenshot from one point in my photo stream. Back in 2005, a group of researchers from Microsoft Research started studying early adopters of camera phones, trying to classify their photos and understand what drives people to take photos.

Slide 33

Slide 33 text

2 dimensions … Now, even though people might have a very specific intention when they create an image, that intention isn’t always easy to discern. Images are much more generic – they’re very much open to interpretation by whoever looks at them.

Slide 34

Slide 34 text

Here’s a random photo I got off Flickr the other day. Taken out of context, it could mean anything. Who knows what the person who photographed it wanted to say? Perhaps there’s something interesting about all these people who have gathered in the park. Perhaps it’s about the guy on the left playing badminton. Maybe the photographer just wanted to show that the weather was good. Or maybe they shot it to send to a friend, to plan where in this park they were going to meet up. It’s actually a photo of Central Park in New York, taken on Memorial Day. But even with this extra context, it’s still difficult to narrow down why the image was created.

Slide 35

Slide 35 text

This image could support either of these two statements. And this is what we mean when we say “a picture is worth a thousand words” – it’s just that it’s not always easy to know which of those thousand words are the most important ones. Since images are so open-ended when taken out of context, we can often create our own context to lead to a specific interpretation. Take this image, for example …

Slide 36

Slide 36 text

It’s just a photo of some colourful Crocs sandals, right? Now let’s put something else next to it.

Slide 37

Slide 37 text

We’ve managed to create a rather humorous picture mocking Apple’s new iPhone colours and cases. In fact, that’s how lolcats and much of the visual humour on the internet are created – by juxtaposing images with text or with other images, in combinations that create humorous connotations. Another interesting attribute of images is that they’re believable. You’ve probably heard this phrase …

Slide 38

Slide 38 text

When are you more likely to believe me – if I just say to you: “I’ve had a camera since I was very young”, or if I show you this photo?

Slide 39

Slide 39 text

Just to survive in our day-to-day lives, our brains have to process a lot of visual information, and they don’t usually have time to examine it in detail or question it – they assume it’s true. For all you know, it might have been somebody else in this photo – I doubt any of you made a serious effort to compare my face and check it was really me in the picture. Because images are so believable, there’s also high value in faking them.

Slide 40

Slide 40 text

This seems to be a favourite technique of totalitarian regimes. Stalin was known to routinely have photos altered to remove people he’d fallen out with. More recently, the Iranians wanted to make a missile test appear more impressive, so they simply used the Clone Stamp tool in Photoshop to add an extra missile. So there are lots of interesting things about images, but there are also some issues. Because the web started as a hypertext project, images have always been a bit of a second-class citizen.

Slide 41

Slide 41 text

One of the very foundations of the web, the hyperlink, doesn’t always work well with images. What people usually expect when they click or tap on an image is an enlarged version of that image.

Slide 42

Slide 42 text

In some cases – on Facebook, for example – it’s also possible to click on different people inside an image and go to their profiles. But it’s not always clear whether the image, or any region inside it, is clickable at all: there’s no blue underline, or any other obvious design convention, to delineate the boundaries. Anything too obvious would probably end up being intrusive and competing with the aesthetics of the image. Facebook has gotten around this by only showing these links in a special mode, or only when you’re hovering over an image, but it’s still an unresolved issue.

Slide 43

Slide 43 text

Another issue with images, compared to text, is scaling to different screen sizes and resolutions. Text has a linear structure: a series of words with convenient gaps in between. Whether you manage to fit 10 or 20 words into a column of text, people will be able to read and understand it. With images, important details that are crystal clear on a large screen can easily get lost when you scale down to a smaller one. James Chudley from cxPartners has written an interesting blog post about this, which I encourage you to read in full. He created these examples, which show how the problem can be tackled in some cases: by picking the most important detail in an image and zooming in, instead of scaling down.

Slide 44

Slide 44 text

So if images are so interesting, but also so multifaceted, how do we tame them? What can we do to design for a world where a lot of the information we share is visual? As information architects, part of our strategy has always been to gather as much metadata as possible about each piece of data – and to devise ways of searching and browsing using that metadata.

Slide 45

Slide 45 text

So where do we get all that metadata? One way, of course, is to get people to create it – for example, by allowing them to give titles and tags to their photos. In practice, this happens very rarely in a private context. Only professional photographers regularly sit and tag their photos, because they have an obvious benefit if their photos can be found and used. Most of us, even when we share a photo online, rarely bother to add much meaningful information. But fortunately, photos nowadays come with a lot of metadata embedded at the point where they’re produced – the camera itself.

Slide 46

Slide 46 text

All this data (which you might have heard of as EXIF tags) is usually embedded by default, and usually stays with the photo unless it’s removed by the user or by some badly made image-processing software. That’s not to say you should place absolute trust in any of this information: there’s no way to validate it, and it’s trivial to change – I could take a photo of you now and make it appear as if it was taken on the other side of the world. You might think this metadata is trivial and not much help in organising an image collection, but you can actually put it to very good use. The most obvious example is the Camera Roll in iOS 7.
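Reading those tags takes only a few lines of code. Here's a sketch assuming Python with the Pillow library; "photo.jpg" is a placeholder path:

```python
from PIL import Image, ExifTags

# Read whatever EXIF tags are embedded in a JPEG.
exif = Image.open("photo.jpg").getexif()

for tag_id, value in exif.items():
    # EXIF uses numeric tag IDs; map them to readable names.
    name = ExifTags.TAGS.get(tag_id, tag_id)
    print(f"{name}: {value}")
# Typical output: Make, Model, DateTime, Orientation, GPS references, ...
```

And, as the point above suggests, nothing stops the same few lines from writing a different date or location back in.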

Slide 47

Slide 47 text

Going from a linear structure to grouping a series of photos by location and time gives a very good approximation of the different things you were doing when you took them. It’s such an obvious thing once you’ve seen it, and it requires so little processing, that it makes you wonder why it wasn’t done earlier. Even using just the time in photos, you can find some pretty inspiring uses.
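Here's a rough sketch of that grouping idea in Python; the three-hour and five-kilometre thresholds are made-up illustrations, not whatever Apple actually uses:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from math import cos, hypot, radians

@dataclass
class Photo:
    taken: datetime
    lat: float
    lon: float

def rough_km(a: Photo, b: Photo) -> float:
    # Equirectangular approximation: good enough at city scale.
    dx = radians(b.lon - a.lon) * cos(radians(a.lat))
    dy = radians(b.lat - a.lat)
    return 6371 * hypot(dx, dy)

def group_into_moments(photos, max_gap=timedelta(hours=3), max_km=5.0):
    """Split a time-sorted photo list wherever time or place jumps."""
    moments, current = [], [photos[0]]
    for prev, photo in zip(photos, photos[1:]):
        if photo.taken - prev.taken > max_gap or rough_km(prev, photo) > max_km:
            moments.append(current)  # big jump: close the current "moment"
            current = []
        current.append(photo)
    moments.append(current)
    return moments
```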

Slide 48

Slide 48 text

This is a tool called “Photo Time Capsule”, built by an amazing photo blog called Photojojo. Once you subscribe and give them the link to your photostream on Flickr, every couple of weeks they’ll send you a selection of the photos you uploaded exactly one year ago. This summer I got one of these in my mailbox, reminding me that last summer I was in Copenhagen with my wife, and we were doing some late-night cycling. Reminiscing is an important reason behind creating such images, and reminiscing is all about time. -- Unfortunately, the straightforward metadata stops here: if you want to gather more information, you’ll need to process the image in some way.
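The selection itself is little more than a date filter. A sketch, reusing the Photo records from the grouping example above; the two-week window is my guess at the mechanics, not Photojojo's actual code:

```python
from datetime import date, timedelta

def time_capsule(photos, today=None, window_days=14):
    """Photos taken in the same fortnight, exactly one year back."""
    today = today or date.today()
    start = today.replace(year=today.year - 1)  # naive about 29 February
    end = start + timedelta(days=window_days)
    return [p for p in photos if start <= p.taken.date() < end]
```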

Slide 49

Slide 49 text

You could focus on purely visual characteristics – for example, extracting the colour (or colours) of an image, or whether it’s overall a dark or bright image, and so on. This is useful if you can think of a reason to search or filter images in this way, but in the end it doesn’t give you many hints about the meaning of the image. There are two other things that computers nowadays can extract from images pretty reliably: text, and faces.
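To make that concrete, here's a sketch of two such crude features – mean brightness and a dominant colour – assuming Python with Pillow:

```python
from PIL import Image

def visual_features(path):
    """Return (mean brightness 0-255, approximate dominant RGB colour)."""
    img = Image.open(path).convert("RGB").resize((64, 64))  # cheap thumbnail
    pixels = list(img.getdata())
    brightness = sum(sum(p) / 3 for p in pixels) / len(pixels)
    # Dominant colour: most common pixel after heavy quantisation.
    quantised = [(r // 64, g // 64, b // 64) for r, g, b in pixels]
    dominant = max(set(quantised), key=quantised.count)
    return brightness, tuple(c * 64 + 32 for c in dominant)
```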

Slide 50

Slide 50 text

Text recognition algorithms can scour through images and identify pieces of text within them. This is especially useful for images in the “functional” category I mentioned before – photos taken just because photographing something was quicker and easier than scanning it. This is why it’s one of the most popular features of Evernote, a piece of software that aids note-taking in any form.
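A minimal sketch of the idea, assuming Python with the pytesseract wrapper around the open-source Tesseract OCR engine ("note.jpg" is a placeholder; this is not Evernote's implementation):

```python
from PIL import Image
import pytesseract  # requires the Tesseract OCR engine to be installed

# Pull any legible text out of a snapshot, e.g. a photographed whiteboard.
text = pytesseract.image_to_string(Image.open("note.jpg"))
print(text)
```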

Slide 51

Slide 51 text

Face detection – simply recognising the presence of a face in an image – is so straightforward that it’s now possible even in the cheapest compact cameras out there. Face detection offers us an important cue about the meaning of a photo. If there’s only one face in an image, and it takes up a significant proportion of the frame, we might be able to assume that the image is someone’s portrait. If, on the other hand, we detect 20 faces in an image, it might be a photo of a crowd. Or a group photo.
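Here's roughly what that heuristic could look like, assuming Python with one of OpenCV's bundled Haar cascades; the area and count thresholds are illustrative guesses:

```python
import cv2

# The frontal-face cascade ships with the opencv-python package.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

img = cv2.imread("photo.jpg")  # placeholder path
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

if len(faces) == 1:
    x, y, w, h = faces[0]
    if w * h > 0.1 * (img.shape[0] * img.shape[1]):
        print("one big face: probably a portrait")
elif len(faces) >= 10:
    print("many faces: a crowd, or a group photo")
```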

Slide 52

Slide 52 text

Face recognition – working out whose face it is – is a bit more complex for computers. We’re probably still a few years away from a solution that could recognise hundreds of thousands or millions of people with any degree of reliability. But if you’re looking to recognise people out of a limited set – for example, which of my friends are in this photo – there’s commercially available software that works well enough, such as iPhoto on the Mac or Picasa from Google.
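As a sketch of the limited-set case, here's what it might look like with the open-source face_recognition library – my choice of tool for illustration, not something from the talk; the friend names and file names are placeholders:

```python
import face_recognition

# One reference photo per friend gives us a small set of known faces.
known = {
    name: face_recognition.face_encodings(
        face_recognition.load_image_file(f"{name}.jpg"))[0]
    for name in ["alice", "bob"]
}

# Which of these friends appear in a new group photo?
photo = face_recognition.load_image_file("group.jpg")
for encoding in face_recognition.face_encodings(photo):
    hits = face_recognition.compare_faces(list(known.values()), encoding)
    for name, hit in zip(known, hits):
        if hit:
            print(f"{name} is in this photo")
```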

Slide 53

Slide 53 text

One of the holy grails of image processing is being able to recognise all the objects in an image – for example, being able to take this image and recognise that it contains a MacBook, a table, and a cup of coffee. Again, even though there have been successful examples, you usually have to limit your search to specific objects under controlled circumstances. If, for example, you were looking for the Starbucks logo, you could detect it with a reasonable degree of confidence. But if it’s difficult to recognise arbitrary objects, another approach is to add man-made objects to the environment that can be easily recognised by computers. A good example is the QR code.
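Spotting such a marker is almost a one-liner by comparison. A sketch assuming Python with OpenCV 4's built-in QR detector ("scene.jpg" is a placeholder):

```python
import cv2

detector = cv2.QRCodeDetector()
# detectAndDecode returns the payload, the corner points, and the
# rectified code image; an empty payload means nothing was found.
data, points, _ = detector.detectAndDecode(cv2.imread("scene.jpg"))
if data:
    print("QR code says:", data)
```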

Slide 54

Slide 54 text

This is a product sketch presented a couple of years ago by BERG, a design studio from London. The little QR code up there is supposed to be generated by an e-paper display, and provides a unique representation of the time and location where the photo was shot, in a format that can be recognised by computers – hence the title “Clocks for Robots”. BERG envisaged that the metadata provided in this barcode could trigger something in your smartphone, for example launching an app when you take the photo. It could also trigger something when the photo is uploaded to a third-party service – for example, applying some tags to the photo. Now, when you mention all these possibilities around automatic photo capture and tagging, there’s one concern that consistently comes up: privacy.
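Generating such a code is equally trivial. A sketch assuming Python's qrcode library, with a made-up payload format standing in for whatever BERG had in mind:

```python
import qrcode

# A here-and-now payload: any photo of this code carries machine-readable
# context about when and where it was taken.
payload = "when=2013-09-28T14:00:00Z;where=51.5072,-0.1276"
qrcode.make(payload).save("clock_for_robots.png")
```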

Slide 55

Slide 55 text

So what about privacy? Isn’t the world going to be a worse place when we all walk around with cameras, able to take and share photos without anyone noticing? To start with, this isn’t a new concern. Let me show you what Google Glass looked like in the 1880s:

Slide 56

Slide 56 text

To be fair, small cameras of that era could only take a limited number of photos before you had to change the glass plate or the film inside them. But they still caused a stir ...

Slide 57

Slide 57 text

There were comic songs written, mocking such devices ...

Slide 58

Slide 58 text

And lots of upset people demanding that something be done. I haven’t seen any comedies about Google Glass yet, but I’ve definitely seen a lot of anger. A few weeks after Google Glass was released, a campaign group called "Stop the Cyborgs" was founded. Source: http://www.billjayonphotography.com/The%20Camera%20Fiend.pdf

Slide 59

Slide 59 text

They created signs like this, and according to their website they want to "encourage as many places as possible to become ‘Surveillance free zones’". This is a pointless request, and one that’s ultimately unenforceable, as image-capture devices become ever smaller and less visible. A couple of weeks ago, while I was finalising this talk, yet another wearable device with a camera emerged: the Samsung Galaxy Gear "smart watch".

Slide 60

Slide 60 text

Good luck trying to spot and ban people wearing this. I suppose you could introduce airport-style screening at the entrance of your venue, which of course ends up being more intrusive than the behaviour you’re trying to prevent. Or you could try to impose controls on the manufacture of photographic devices, where it’s impossible to draw a line. I don’t think “stop the cyborgs” signs will ever catch on – in fact, I think there’s another category of signs that’s going to become obsolete.

Slide 61

Slide 61 text

If there is actually a museum of obsolete signs out there, they should be prepared to add this sign to their collection. So if this ends up in the museum, what next? Is there anything we should restrict?

Slide 62

Slide 62 text

In fact, a lot of people out there are quite reasonable and don’t need a legal threat to comply – more of a nudge. I think we can provide this nudge by analysing what is being photographed and shared.

Slide 63

Slide 63 text

This is what Instagram (or any similar photo-sharing service) could look like if it wanted to make people think about privacy. It could even use face recognition to learn which of your friends don’t like having their photos posted for the whole world to see. Or it could use other metadata, like location – for example, to make sure that photos you take inside or near your house are only available to your friends, not the whole world. There is no blanket rule that applies to everyone, but at the same time we’ve seen that people won’t realistically sort through all their photos and apply privacy controls. If we give them a chance to do it more easily, it might just work. And finally, if we accept that people will continue carrying a camera with them wherever they go, then the camera becomes an opportunity.
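Here's a sketch of what such a location-based default could look like; the coordinates, the 500m radius, and the audience labels are all placeholders:

```python
from math import cos, hypot, radians

HOME_LAT, HOME_LON = 51.5072, -0.1276  # placeholder home location

def km_from_home(lat, lon):
    # Equirectangular approximation: fine at neighbourhood scale.
    dx = radians(lon - HOME_LON) * cos(radians(HOME_LAT))
    dy = radians(lat - HOME_LAT)
    return 6371 * hypot(dx, dy)

def default_audience(lat, lon):
    """Photos taken near home default to a restricted audience."""
    return "friends-only" if km_from_home(lat, lon) < 0.5 else "public"
```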

Slide 64

Slide 64 text

Some people, including some artists as we saw earlier, view this as an annoyance – from the perspective of a designer, I see it as an opportunity. We have a lot of input devices pointing at something. What’s the best use for them? Could we get people to take away something more than just a blurry video? Could we use them to show something interesting and enhance their experience? Could the whole concept of a live music event be different if everyone has a screen on them? It’s for us to try and find out. One thing is for sure:

Slide 65

Slide 65 text

No content

Slide 66

Slide 66 text

No content