Photo forensics from lighting shadows and reflections

In this era of generative AI, how do we determine whether an image is human-created or machine-made? For the Content Authenticity Initiative (CAI), Professor Hany Farid shares the techniques used for identifying real and synthetic images — including analyzing shadows, reflections, vanishing points, environmental lighting, and GAN-generated faces. A world-renowned expert in the field of misinformation, disinformation, and digital forensics as well as an advisor to the CAI, Professor Farid explores the limits of these techniques and their part in a larger ecosystem needed to regain trust in the visual record. 

This is the fifth article in a six-part series.

More from this series:

Part 1: From the darkroom to generative AI

Part 2: How realistic are AI-generated faces?

Part 3: Photo forensics for AI-generated faces

Part 4: Photo forensics from lighting environments


by Hany Farid

Where there is light, there are shadows. The relationship between an object, its shadow, and the illuminating light source(s) is geometrically simple, and yet it’s deceptively difficult to get just right in a manipulated or synthesized image. In the image below, the bottle's cast shadow is clearly incongruous with the shape of the bottle. Such obvious errors in a shadow are easy to spot, but more subtle differences can be harder to detect.

CGI model credit to Jeremy Birn, Lighting and Rendering in Maya.

Below are two images in which the bottle and its cast shadow are slightly different. (The rest of the scene is identical.) Can you tell which is consistent with the lighting in the rest of the scene?

The geometry of cast shadows is dictated by the 3D shape and location of an object and the illuminating light(s). This relationship is wonderfully simple: A point on an object, its corresponding shadow, and the light source responsible for the shadow all lie on a single line.

Because (in the absence of lens distortion) straight lines in the 3D scene are imaged to straight lines in the 2D image, this basic constraint holds in the image. Locate any point on a shadow and its corresponding point on the object, and draw a line through them. Repeat for as many clearly defined shadow and object points as possible, and for an authentic image, all the lines will intersect at one point — the location of the illuminating light.
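This constraint lends itself to a simple least-squares computation. The sketch below (names and points are illustrative, not from the article) stacks one line per hand-matched object/shadow pair and solves for the single point closest to all of them; the residual distances measure how consistent the shadows are:

```python
import numpy as np

def estimate_light_position(pairs):
    """Least-squares intersection of the lines through matched object/shadow
    point pairs. In an authentic image, these lines should all meet at the
    (image of the) illuminating light source.

    pairs: list of ((ox, oy), (sx, sy)) image coordinates, matched by hand.
    Returns the estimated intersection and the perpendicular distance from
    it to each constraint line.
    """
    A, b = [], []
    for (ox, oy), (sx, sy) in pairs:
        # Unit normal of the line through the object point and its shadow.
        nx, ny = -(sy - oy), sx - ox
        norm = np.hypot(nx, ny)
        A.append([nx / norm, ny / norm])
        b.append((nx * ox + ny * oy) / norm)
    A, b = np.asarray(A), np.asarray(b)
    point, *_ = np.linalg.lstsq(A, b, rcond=None)
    residuals = np.abs(A @ point - b)  # distance from the point to each line
    return point, residuals
```

For a consistent image the residuals are near zero; for a composited shadow, at least one line misses the common intersection by a visible margin.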

Below are the results of this simple geometric analysis applied to the two images above, clearly revealing the fake.

From shadows to reflections

As a student taking the subway to work each morning, I would stare out the window and watch the tunnel walls whiz by superimposed atop reflections of my fellow passengers. On occasion someone's reflection would catch my eye, and I would look back into the subway car to get a better view. But I would invariably not see them where I thought they would be given the position of their reflection on the window. As a budding image and vision scientist, I found this deeply bothersome: how could it be so hard to reason about something as simple as a reflection in a window?

Below are two images of the same basic scene. The reflections of the table and garbage can are slightly different; everything else is the same. Can you tell which is correct?

The geometry of reflections is fairly straightforward. Consider standing in front of a mirror and looking at your reflection. An imaginary straight line connects each point on your body with its reflection. These lines are perpendicular to the mirror’s surface and are parallel to one another. When photographed at an oblique angle, however, these imaginary lines will not remain parallel but will converge to a single point. This is the same reason why railroad tracks that are parallel in the world appear to converge in a photograph.

This geometry of reflections suggests a simple forensic technique for verifying the integrity of reflections. Locate any point on an object and its corresponding point on the reflection, and draw a line through them. Repeat for as many clearly defined object and reflection points as possible. For an authentic image, all the lines will intersect at a single point.
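The check can be phrased compactly in homogeneous coordinates. In this illustrative sketch (the function name and points are my own, and a finite vanishing point is assumed, i.e. the lines are not all parallel), each object/reflection pair defines a line, the common intersection is recovered from the null space of the stacked lines, and the worst perpendicular distance serves as a consistency score:

```python
import numpy as np

def reflection_vanishing_point(pairs):
    """Fit the single point where all object-to-reflection lines should meet,
    and report the worst perpendicular distance from that point to any line.
    A large distance flags an inconsistent (likely fake) reflection.

    pairs: list of ((ox, oy), (rx, ry)) matched object/reflection points.
    Assumes the lines are not all parallel (finite vanishing point).
    """
    lines = []
    for (ox, oy), (rx, ry) in pairs:
        l = np.cross([ox, oy, 1.0], [rx, ry, 1.0])  # homogeneous line through both points
        lines.append(l / np.hypot(l[0], l[1]))      # scale so l . (x, y, 1) is a signed distance
    L = np.asarray(lines)
    _, _, Vt = np.linalg.svd(L)                     # common intersection = null vector of L
    x = Vt[-1] / Vt[-1][2]                          # de-homogenize
    distances = np.abs(L @ x)
    return x[:2], float(distances.max())
```

On a consistent scene the score is essentially zero; nudging even one reflection point off its constraint line drives it up sharply.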

Below are the results of this simple geometric analysis, which clearly reveals the second reflection to be the fake.

Notice that this geometric analysis is exactly the same as that used to analyze shadows. The reason is that in both cases we are exploiting basic principles of perspective projection that dictate the projection of straight lines.

In practice, there are some limitations to a manual application of this geometric analysis. We must take care to select appropriately matched points on the shadow/reflection and the object. We can best achieve this when the object has a distinct shape, like the corner of a cube or the tip of a cone. Depending on the scene geometry, the constraint lines may be nearly parallel, making the computation of their intersection vulnerable to slight errors in selecting matched points. It may also be necessary to remove any lens distortion in the image, which causes straight lines to be imaged as curved lines that no longer intersect at a single point.
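The near-parallel failure mode can be quantified before trusting an intersection estimate. One minimal diagnostic (my own illustrative sketch, not part of the original analysis) is the condition number of the stacked line normals: when the constraint lines are nearly parallel, the condition number blows up, meaning tiny errors in the picked points move the estimated intersection a long way.

```python
import numpy as np

def constraint_condition(pairs):
    """Condition number of the stacked line constraints from matched
    object/shadow (or object/reflection) point pairs. A value near 1 means
    well-spread line directions; a large value means the lines are nearly
    parallel and the intersection estimate is unreliable.
    """
    A = []
    for (ox, oy), (sx, sy) in pairs:
        nx, ny = -(sy - oy), sx - ox   # normal of each constraint line
        n = np.hypot(nx, ny)
        A.append([nx / n, ny / n])
    return float(np.linalg.cond(np.asarray(A)))
```

A practical workflow would reject or down-weight an analysis whose condition number exceeds some threshold, rather than reporting a precise-looking but meaningless intersection.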

From Adobe Photoshop to generative AI

When we first developed these techniques, we were primarily concerned with images that were manipulated with Photoshop to, for example, digitally add a person or object and its shadow or reflection. Because our own visual system is not particularly sensitive to inconsistent shadows and reflections (see item 1 in the Further Reading section below) and because it is difficult to precisely match the 3D scene geometry within 2D photo editing software, this technique has proven quite effective at uncovering fakes.

Classic computer-generated imagery is produced by modeling 3D scene geometry, the surrounding illumination, and a virtual camera. As a result, rendered images accurately capture the geometry and physics of natural scenes, including shadows and reflections. In contrast, AI-generated imagery is produced by learning the statistical distribution of natural scenes from a large set of real images. Without an explicit 3D model of the world, it is natural to wonder how accurately synthesized content captures the geometry of shadows and reflections. 

Although we are still early in the age of generative AI, today's AI-generated images seem to struggle to produce perspectively correct shadows and reflections. Below is a typical example, generated using OpenAI's DALL-E 2, in which the shadows are inconsistent (yellow lines), the reflections are impossibly mismatched and missing, and the shadows in the reflection are oriented in exactly the wrong direction. (See item 2 in the Further Reading section below for a more detailed analysis.)

Even in the absence of the explicit 3D modeling of objects, surfaces, and lighting found in traditional CGI rendering, AI-generated images exhibit many of the properties of natural scenes. At the same time, cast shadows and reflections in mirrored surfaces are not fully consistent with the expected perspective geometry of natural scenes.

The trend in generative AI has been that larger synthesis engines yield more realistic images. As such, it may just be a matter of time before generative AI learns to create images with full-blown perspective consistency. Until that time, however, this geometric forensic analysis may prove useful.

Further reading:

1. S.J. Nightingale, K.A. Wade, H. Farid, and D.G. Watson. Can People Detect Errors in Shadows and Reflections? Attention, Perception, & Psychophysics, 81(8):2917-2943, 2019.

2. H. Farid. Perspective (In)consistency of Paint by Text. arXiv:2206.14617, 2022. 


Author bio: Professor Hany Farid is a world-renowned expert in the field of misinformation, disinformation, and digital forensics. He joined the Content Authenticity Initiative (CAI) as an advisor in June 2023. The CAI is a community of media and tech companies, non-profits, academics, and others working to promote adoption of the open industry standard for content authenticity and provenance.

Professor Farid teaches at the University of California, Berkeley, with a joint appointment in electrical engineering and computer sciences at the School of Information. He’s also a member of the Berkeley Artificial Intelligence Lab, Berkeley Institute for Data Science, Center for Innovation in Vision and Optics, Development Engineering Program, and Vision Science Program, and he’s a senior faculty advisor for the Center for Long-Term Cybersecurity. His research focuses on digital forensics, forensic science, misinformation, image analysis, and human perception.

He received his undergraduate degree in computer science and applied mathematics from the University of Rochester in 1989, his M.S. in computer science from SUNY Albany, and his Ph.D. in computer science from the University of Pennsylvania in 1997. Following a two-year post-doctoral fellowship in brain and cognitive sciences at MIT, he joined the faculty at Dartmouth College in 1999 where he remained until 2019.

Professor Farid is the recipient of an Alfred P. Sloan Fellowship and a John Simon Guggenheim Fellowship, and he’s a fellow of the National Academy of Inventors.
