Directory traversal (also known as file path traversal) is a web security vulnerability that allows an attacker to read arbitrary files on the server that is running an application.
Consider the following URL: randomwebsite111.com/loadImage?filename=cutekitty18.png
The loadImage URL takes a filename parameter and returns the contents of the specified file. The image files themselves are stored on disk in the location /var/www/images/. To return an image, the application appends the requested filename to this base directory and uses a filesystem API to read the contents of the file. In the above case, the application reads from the following file path: /var/www/images/cutekitty18.png
The application implements no defenses against directory traversal attacks, so an attacker can request the following URL to retrieve an arbitrary file from the server's filesystem:
randomwebsite111.com/loadImage?filename=../../../etc/passwd
This causes the application to read from the following file path:
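/var/www/images/../../../etc/passwd
The sequence ../ is valid within a file path and means to step up one level in the directory structure. The three consecutive ../ sequences step up from /var/www/images/ to the filesystem root, so the file that is actually read is /etc/passwd.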
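How the traversal sequences collapse can be reproduced with a minimal Java sketch. It assumes a Unix-like filesystem, and the class and method names (TraversalDemo, resolveNaively) are hypothetical. It only computes the path and never opens a file.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class TraversalDemo {
    // Base directory taken from the example above.
    static final Path BASE = Paths.get("/var/www/images");

    // Mimics the vulnerable behavior: join the base directory and the raw
    // filename, then collapse the ".." segments the way the OS would.
    static String resolveNaively(String filename) {
        return BASE.resolve(filename).normalize().toString();
    }

    public static void main(String[] args) {
        System.out.println(resolveNaively("cutekitty18.png"));
        System.out.println(resolveNaively("../../../etc/passwd"));
    }
}
```

On a Unix-like system the second call yields /etc/passwd, which lies outside the image directory.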
On Windows, both ../ and ..\ are valid directory traversal sequences, and an equivalent attack to retrieve a standard operating system file would be: randomwebsite111.com/loadImage?filename=..\..\..\windows\win.ini
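Many applications that place user input into file paths try to strip or block traversal sequences, and such defenses can often be bypassed. For instance, you might be able to use an absolute path from the filesystem root, such as filename=/etc/passwd, to directly reference a file without using any traversal sequences.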
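Whether an absolute path works depends on how the application joins the two parts. As one illustration, assuming a Unix-like system and a hypothetical helper named join, Java's Path.resolve returns its argument unchanged when that argument is already absolute:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class AbsolutePathDemo {
    // Hypothetical join step: base directory plus user-supplied filename.
    static String join(String filename) {
        Path base = Paths.get("/var/www/images");
        // resolve() ignores the base when the other path is absolute.
        return base.resolve(filename).toString();
    }

    public static void main(String[] args) {
        System.out.println(join("cutekitty18.png"));
        System.out.println(join("/etc/passwd"));
    }
}
```

Other join APIs behave differently. For example, the two-argument java.io.File constructor treats an absolute child as relative to the parent, so the outcome should be checked for the API actually in use.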
You might be able to use nested traversal sequences, such as ....// or ....\/, which collapse to a simple traversal sequence when a filter strips the inner sequence only once.
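The sketch below shows why nested sequences defeat a single-pass filter. The filter (stripOnce) is a hypothetical, deliberately flawed example and is not taken from any particular application.

```java
class NestedSequenceDemo {
    // Flawed filter: removes "../" and "..\" in one non-recursive pass.
    static String stripOnce(String input) {
        return input.replace("../", "").replace("..\\", "");
    }

    public static void main(String[] args) {
        // Removing the inner "../" leaves the outer characters, which
        // join up to form a fresh "../".
        System.out.println(stripOnce("....//....//etc/passwd"));
        System.out.println(stripOnce("....\\/"));
    }
}
```

Both calls leave working traversal sequences in the output, which is why rejecting suspicious input is generally safer than trying to clean it.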
You might be able to use various non-standard encodings:
URL:
- . = %2e
- / = %2f
- \ = %5c
16-bit:
- . = %u002e
- / = %u2215
- \ = %u2216
Double URL:
- . = %252e
- / = %252f
- \ = %255c
UTF-8:
- . = %c0%2e, %e0%40%ae, %c0ae
- / = %c0%af, %e0%80%af, %c0%2f
- \ = %c0%5c, %c0%80%5c
For example: ..%c0%af or ..%252f
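The double URL encoding case can be illustrated with a short Java sketch. It assumes Java 10 or later (for the URLDecoder.decode overload that takes a Charset) and a hypothetical filter, passesNaiveFilter, that inspects the value after only one round of decoding while a later component decodes it again.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

class DoubleDecodeDemo {
    static String decodeOnce(String s) {
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    static String decodeTwice(String s) {
        return decodeOnce(decodeOnce(s));
    }

    // Hypothetical filter: looks for "../" after a single decode only.
    static boolean passesNaiveFilter(String raw) {
        return !decodeOnce(raw).contains("../");
    }

    public static void main(String[] args) {
        String raw = "..%252f..%252fetc%252fpasswd";
        System.out.println(decodeOnce(raw));        // what the filter sees
        System.out.println(passesNaiveFilter(raw)); // filter verdict
        System.out.println(decodeTwice(raw));       // what a second decode yields
    }
}
```

After one decode the value still contains %2f rather than a slash, so the filter accepts it. The second decode restores the traversal sequences.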
If an application requires the user-supplied filename to end with an expected file extension, such as .png, then it might be possible to use a null byte to effectively terminate the file path before the required extension.
For example: filename=../../../etc/passwd%00.png
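The following Java sketch shows why a suffix check alone does not help here. The helper names are hypothetical, and it assumes Java 10 or later. The decoded value still ends with .png, yet it contains a NUL character that a lower-level API treating NUL as a string terminator would stop at.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

class NullByteDemo {
    static String decode(String raw) {
        return URLDecoder.decode(raw, StandardCharsets.UTF_8);
    }

    // Hypothetical extension check of the kind described above.
    static boolean hasPngExtension(String filename) {
        return filename.endsWith(".png");
    }

    // Simple countermeasure: refuse any filename containing NUL.
    static boolean containsNul(String filename) {
        return filename.indexOf('\0') >= 0;
    }

    public static void main(String[] args) {
        String name = decode("../../../etc/passwd%00.png");
        System.out.println(hasPngExtension(name));
        System.out.println(containsNul(name));
    }
}
```

Many current runtimes reject paths that contain NUL, so this technique mainly affects older platforms or code that hands the string to native APIs. An explicit check like containsNul is still a cheap safeguard.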
The most effective way to prevent file path traversal vulnerabilities is to avoid passing user-supplied input to filesystem APIs altogether. Many application functions that do this can be rewritten to deliver the same behavior in a safer way.
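One possible way to avoid passing user input to the filesystem is to have the client send an opaque identifier that the server maps to a fixed path. The sketch below assumes Java 9 or later, and the identifiers, paths, and class name are invented for illustration.

```java
import java.util.Map;

class ImageCatalog {
    // Hypothetical server-side table: the client only ever sends the key.
    private static final Map<String, String> FILES_BY_ID = Map.of(
            "kitty18", "/var/www/images/cutekitty18.png",
            "puppy07", "/var/www/images/happypuppy07.png");

    // Returns the stored path, or null when the identifier is unknown.
    static String pathFor(String imageId) {
        if (imageId == null) {
            return null; // immutable maps throw on null keys
        }
        return FILES_BY_ID.get(imageId);
    }

    public static void main(String[] args) {
        System.out.println(pathFor("kitty18"));
        System.out.println(pathFor("../../../etc/passwd"));
    }
}
```

Because the user-supplied value is only used as a lookup key, a traversal string matches nothing and never reaches a filesystem API.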
If it is considered unavoidable to pass user-supplied input to filesystem APIs, then two layers of defense should be used together to prevent attacks:
- The application should validate the user input before processing it. Ideally, the validation should compare against a whitelist of permitted values. If that isn't possible for the required functionality, then the validation should verify that the input contains only permitted content, such as purely alphanumeric characters.
- After validating the supplied input, the application should append the input to the base directory and use a platform filesystem API to canonicalize the path. It should verify that the canonicalized path starts with the expected base directory.
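In Java, for example (assuming BASE_DIRECTORY is itself a canonical path with no trailing separator):
File file = new File(BASE_DIRECTORY, userInput);
// Include the separator in the comparison so that a sibling directory such
// as /var/www/images-old does not pass the prefix check.
if (file.getCanonicalPath().startsWith(BASE_DIRECTORY + File.separator)) {
    // process file
}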
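The two layers can be combined as in the following sketch. The allowlist pattern (alphanumeric name plus a .png extension) and all names are assumptions made for illustration. It uses lexical normalization so it can run without touching the disk, which means it does not resolve symbolic links the way getCanonicalPath or Path.toRealPath would.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.regex.Pattern;

class SafeImageResolver {
    // Layer 1 (assumed policy): alphanumeric name with a .png extension.
    private static final Pattern ALLOWED = Pattern.compile("[A-Za-z0-9]+\\.png");

    static boolean isValidName(String name) {
        return name != null && ALLOWED.matcher(name).matches();
    }

    // Layer 2: the normalized result must stay under the base directory.
    // Path.startsWith compares whole path components, not raw characters.
    static boolean isInsideBase(String baseDir, String name) {
        Path base = Paths.get(baseDir).toAbsolutePath().normalize();
        Path candidate = base.resolve(name).normalize();
        return candidate.startsWith(base);
    }

    static boolean isAllowed(String baseDir, String name) {
        return isValidName(name) && isInsideBase(baseDir, name);
    }

    public static void main(String[] args) {
        String base = "/var/www/images";
        System.out.println(isAllowed(base, "cutekitty18.png"));
        System.out.println(isAllowed(base, "../../../etc/passwd"));
    }
}
```

Comparing path components rather than string prefixes means that a sibling directory whose name merely begins with the base directory's name is not accepted.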