CWE-172

Name

Encoding Error

Status

Draft

Published

2006-07-19 00:00 +00:00

Modified

2025-12-11 00:00 +00:00

Name: Encoding Error

The product does not properly encode or decode the data, resulting in unexpected values.

General Information

Modes Of Introduction

Implementation

Applicable Platforms

Language

Class: Not Language-Specific (Undetermined)

Common Consequences

Scope	Impact	Likelihood
Integrity	Unexpected State

Observed Examples

References	Description
CVE-2004-1315	Forum software improperly URL decodes the highlight parameter when extracting text to highlight, which allows remote attackers to execute arbitrary PHP code by double-encoding the highlight value so that special characters are inserted into the result.
CVE-2004-1939	XSS protection mechanism attempts to remove "/" that could be used to close tags, but it can be bypassed using double-encoded slashes (%252F).
CVE-2001-0709	Server allows a remote attacker to obtain source code of ASP files via a URL encoded with Unicode.
CVE-2005-2256	Hex-encoded path traversal variants - "%2e%2e", "%2e%2e%2f", "%5c%2e%2e"
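
The double-encoding entries above (CVE-2004-1315, CVE-2004-1939) share one pattern: a check runs after one round of URL decoding, and a later step decodes the value again. A minimal Python sketch of that pattern follows. The names `naive_filter` and `vulnerable_handler` are hypothetical and written for illustration only; they are not code from any of the affected products.

```python
from urllib.parse import unquote


def naive_filter(value: str) -> bool:
    """Hypothetical check: accept input only if it contains no slash."""
    return "/" not in value


def vulnerable_handler(raw: str) -> str:
    """Decode, check, then decode a second time (the flaw)."""
    once = unquote(raw)            # "%252F" becomes "%2F"
    if not naive_filter(once):
        raise ValueError("rejected")
    return unquote(once)           # "%2F" becomes "/"


# A single-encoded slash is caught by the filter,
# but a double-encoded one passes and is decoded afterwards.
payload = "..%252F..%252Fetc"
result = vulnerable_handler(payload)
```

In this sketch the filter itself is not wrong about what it sees; the weakness is that the value it inspected is not the value that is finally used.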

Potential Mitigations

Phases : Implementation
While it is risky to use dynamically-generated query strings, code, or commands that mix control and data together, sometimes it may be unavoidable. Properly quote arguments and escape any special characters within those arguments. The most conservative approach is to escape or filter all characters that do not pass an extremely strict allowlist (such as everything that is not alphanumeric or white space). If some special characters are still needed, such as white space, wrap each argument in quotes after the escaping/filtering step. Be careful of argument injection (CWE-88).
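
As one possible illustration of the allowlist-then-quote advice above, the sketch below uses Python's standard `shlex.quote` for POSIX shell arguments. The helper name `safe_argument` and the specific allowlist pattern are assumptions chosen for this example; the appropriate escaping routine depends on the target interpreter (shell, SQL, HTML, and so on).

```python
import re
import shlex

# Assumed allowlist: ASCII letters, digits, and spaces only.
ALLOWED = re.compile(r"[A-Za-z0-9 ]*")


def safe_argument(value: str) -> str:
    """Filter against a strict allowlist, then quote for a POSIX shell."""
    if not ALLOWED.fullmatch(value):
        raise ValueError("argument contains characters outside the allowlist")
    # Quoting after filtering keeps embedded spaces inside one argument.
    return shlex.quote(value)
```

Because the assumed allowlist excludes `-`, a value such as `--output=/etc/passwd` is rejected before quoting, which also narrows the argument injection surface mentioned above (CWE-88).
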
Phases : Implementation
Inputs should be decoded and canonicalized to the application's current internal representation before being validated (CWE-180). Make sure that the application does not decode the same input twice (CWE-174). Such errors could be used to bypass allowlist validation schemes by introducing dangerous inputs after they have been checked.
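
A minimal sketch of the decode-once, canonicalize, then validate ordering, assuming URL-encoded, path-like input. The function name `canonicalize_then_validate`, the `BASE_DIR` value, and the choice to reject input that still changes under a second decoding pass are illustrative assumptions, not requirements stated by CWE-172.

```python
import posixpath
from urllib.parse import unquote

BASE_DIR = "/srv/files"  # hypothetical root directory for this example


def canonicalize_then_validate(raw: str) -> str:
    """Decode exactly once, canonicalize, and only then validate."""
    decoded = unquote(raw)
    # Assumption: if a second pass would change the value, the input was
    # encoded more than once; reject it instead of decoding again (CWE-174).
    if unquote(decoded) != decoded:
        raise ValueError("input is encoded more than once")
    if "\x00" in decoded:
        raise ValueError("embedded NUL byte")
    # Canonicalize to the internal representation before checking (CWE-180).
    full = posixpath.normpath(posixpath.join(BASE_DIR, decoded))
    if not full.startswith(BASE_DIR + "/"):
        raise ValueError("path escapes the base directory")
    return full
```

The validation runs on the same canonical string that is returned to the caller, so nothing is decoded or rewritten after the check.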

Vulnerability Mapping Notes

Justification : This CWE entry is a Class and might have Base-level children that would be more appropriate
Comment : Examine children of this entry to see if there is a better fit

Related Attack Patterns

CAPEC-ID	Attack Pattern Name	Description
CAPEC-120	Double Encoding	The adversary utilizes a repeating of the encoding process for a set of characters (that is, character encoding a character encoding of a character) to obfuscate the payload of a particular request. This may allow the adversary to bypass filters that attempt to detect illegal characters or strings, such as those that might be used in traversal or injection attacks. Filters may be able to catch illegal encoded strings, but may not catch doubly encoded strings. For example, a dot (.), often used in path traversal attacks and therefore often blocked by filters, could be URL encoded as %2E. However, many filters recognize this encoding and would still block the request. In a double encoding, the % in the above URL encoding would be encoded again as %25, resulting in %252E, which some filters might not catch but which could still be interpreted as a dot (.) by interpreters on the target.
CAPEC-267	Leverage Alternate Encoding	An adversary leverages the possibility to encode potentially harmful input or content used by applications such that the applications are ineffective at validating this encoding standard.
CAPEC-3	Using Leading 'Ghost' Character Sequences to Bypass Input Filters	Some APIs will strip certain leading characters from a string of parameters. An adversary can intentionally introduce leading "ghost" characters (extra characters that don't affect the validity of the request at the API layer) that enable the input to pass the filters and therefore process the adversary's input. This occurs when the targeted API will accept input data in several syntactic forms and interpret it in the equivalent semantic way, while the filter does not take into account the full spectrum of the syntactic forms acceptable to the targeted API.
CAPEC-52	Embedding NULL Bytes	An adversary embeds one or more null bytes in input to the target software. This attack relies on the usage of a null-valued byte as a string terminator in many environments. The goal is for certain components of the target software to stop processing the input when they encounter the null byte(s).
CAPEC-53	Postfix, Null Terminate, and Backslash	If a string is passed through a filter of some kind, then a terminal NULL may not be valid. Using an alternate representation of NULL allows an adversary to embed the NULL mid-string while postfixing the proper data so that the filter is avoided. One example is a filter that looks for a trailing slash character. If a string insertion is possible, but the slash must exist, an alternate encoding of NULL in mid-string may be used.
CAPEC-64	Using Slashes and URL Encoding Combined to Bypass Validation Logic	This attack targets the encoding of the URL combined with the encoding of the slash characters. An attacker can take advantage of the multiple ways of encoding a URL and abuse the interpretation of the URL. A URL may contain special characters that need special syntax handling in order to be interpreted. Special characters are represented using a percentage character followed by two digits representing the octet code of the original character (%HEX-CODE). For instance, the US-ASCII space character would be represented with %20. This is often referred to as escaped encoding or percent-encoding. Since the server decodes the URL from the requests, it may restrict access to some URL paths by validating and filtering out the URL requests it receives. An attacker will try to craft a URL with a sequence of special characters which, once interpreted by the server, will be equivalent to a forbidden URL. It can be difficult to protect against this attack since the URL can contain other encoding formats such as UTF-8 encoding, Unicode encoding, etc.
CAPEC-71	Using Unicode Encoding to Bypass Validation Logic	An attacker may provide a Unicode string to a system component that is not Unicode aware and use that to circumvent the filter or cause the classifying mechanism to fail to properly understand the request. That may allow the attacker to slip malicious data past the content filter and/or possibly cause the application to route the request incorrectly.
CAPEC-72	URL Encoding	This attack targets the encoding of the URL. An adversary can take advantage of the multiple ways of encoding a URL and abuse the interpretation of the URL.
CAPEC-78	Using Escaped Slashes in Alternate Encoding	This attack targets the use of the backslash in alternate encoding. An adversary can provide a backslash as a leading character and cause a parser to believe that the next character is special. This is called an escape. By using that trick, the adversary tries to exploit alternate ways to encode the same character, which leads to filter problems and opens avenues to attack.
CAPEC-80	Using UTF-8 Encoding to Bypass Validation Logic	This attack is a specific variation on leveraging alternate encodings to bypass validation logic. This attack leverages the possibility to encode potentially harmful input in UTF-8 and submit it to applications not expecting or effective at validating this encoding standard, making input filtering difficult. UTF-8 (8-bit UCS/Unicode Transformation Format) is a variable-length character encoding for Unicode. Legal UTF-8 characters are one to four bytes long. However, early versions of the UTF-8 specification got some entries wrong (in some cases they permitted overlong characters). UTF-8 encoders are supposed to use the "shortest possible" encoding, but naive decoders may accept encodings that are longer than necessary. According to RFC 3629, a particularly subtle form of this attack can be carried out against a parser which performs security-critical validity checks against the UTF-8 encoded form of its input, but interprets certain illegal octet sequences as characters.
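
To illustrate the overlong-sequence issue described for CAPEC-80: the byte pair `0xC0 0xAF` is a two-byte overlong form of `/` (U+002F), which RFC 3629 forbids. A byte-level filter looking for `0x2F` does not see it, while a lenient decoder may still produce a slash. The sketch below hand-rolls such a lenient decoder for one- and two-byte sequences only (`lenient_decode_2byte` is a hypothetical helper written for this example) and contrasts it with Python's built-in strict UTF-8 codec, which rejects the sequence.

```python
def lenient_decode_2byte(data: bytes) -> str:
    """Naive decoder: accepts 2-byte sequences without a shortest-form check."""
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:
            out.append(chr(b))
            i += 1
        elif 0xC0 <= b <= 0xDF and i + 1 < len(data):
            # Missing check: code points below U+0080 must not use two bytes.
            out.append(chr(((b & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:
            raise ValueError("unsupported sequence in this sketch")
    return "".join(out)


def strict_decode(data: bytes) -> str:
    """Python's UTF-8 codec raises UnicodeDecodeError on overlong forms."""
    return data.decode("utf-8")


overlong_slash = b"..\xc0\xaf.."
assert b"/" not in overlong_slash               # byte-level filter sees no slash
leaked = lenient_decode_2byte(overlong_slash)   # a slash appears after decoding
```

Decoding with a strict codec before any security check, as `strict_decode` does, is one way to keep the validated form and the interpreted form in agreement.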

Notes

Partially overlaps path traversal and equivalence weaknesses.
This is more like a category than a weakness.
Many other types of encodings should be listed in this category.

Submission

Name	Organization	Date	Date release	Version
PLOVER		2006-07-19 +00:00	2006-07-19 +00:00	Draft 3

Modifications

Name	Organization	Date	Comment
Eric Dalci	Cigital	2008-07-01 +00:00	updated Potential_Mitigations, Time_of_Introduction
CWE Content Team	MITRE	2008-09-08 +00:00	updated Maintenance_Notes, Relationships, Relationship_Notes, Taxonomy_Mappings
CWE Content Team	MITRE	2009-07-27 +00:00	updated Potential_Mitigations
CWE Content Team	MITRE	2010-12-13 +00:00	updated Description
CWE Content Team	MITRE	2011-03-29 +00:00	updated Potential_Mitigations
CWE Content Team	MITRE	2011-06-01 +00:00	updated Common_Consequences, Description
CWE Content Team	MITRE	2011-06-27 +00:00	updated Common_Consequences
CWE Content Team	MITRE	2012-05-11 +00:00	updated Related_Attack_Patterns, Relationships
CWE Content Team	MITRE	2012-10-30 +00:00	updated Potential_Mitigations
CWE Content Team	MITRE	2013-02-21 +00:00	updated Potential_Mitigations
CWE Content Team	MITRE	2014-07-30 +00:00	updated Relationships
CWE Content Team	MITRE	2015-12-07 +00:00	updated Relationships
CWE Content Team	MITRE	2017-11-08 +00:00	updated Applicable_Platforms
CWE Content Team	MITRE	2019-01-03 +00:00	updated Related_Attack_Patterns
CWE Content Team	MITRE	2019-06-20 +00:00	updated Relationships
CWE Content Team	MITRE	2020-02-24 +00:00	updated Potential_Mitigations, Relationships
CWE Content Team	MITRE	2020-06-25 +00:00	updated Potential_Mitigations
CWE Content Team	MITRE	2023-01-31 +00:00	updated Description
CWE Content Team	MITRE	2023-04-27 +00:00	updated Relationships
CWE Content Team	MITRE	2023-06-29 +00:00	updated Mapping_Notes
CWE Content Team	MITRE	2023-10-26 +00:00	updated Observed_Examples
CWE Content Team	MITRE	2025-12-11 +00:00	updated Weakness_Ordinalities
