CWE-172 Detail

CWE-172

Encoding Error
Draft
2006-07-19 00:00 +00:00
2023-10-26 00:00 +00:00

Alerte pour un CWE

Stay informed of any changes for a specific CWE.
Alert management

Encoding Error

The product does not properly encode or decode the data, resulting in unexpected values.

Informations

Modes Of Introduction

Implementation

Applicable Platforms

Language

Class: Not Language-Specific (Undetermined)

Common Consequences

Scope Impact Likelihood
IntegrityUnexpected State

Observed Examples

Reference Description
CVE-2004-1315Forum software improperly URL decodes the highlight parameter when extracting text to highlight, which allows remote attackers to execute arbitrary PHP code by double-encoding the highlight value so that special characters are inserted into the result.
CVE-2004-1939XSS protection mechanism attempts to remove "/" that could be used to close tags, but it can be bypassed using double encoded slashes (%252F)
CVE-2001-0709Server allows a remote attacker to obtain source code of ASP files via a URL encoded with Unicode.
CVE-2005-2256Hex-encoded path traversal variants - "%2e%2e", "%2e%2e%2f", "%5c%2e%2e"

Potential Mitigations

Phases : Implementation

Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does.

When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules. As an example of business rule logic, "boat" may be syntactically valid because it only contains alphanumeric characters, but it is not valid if the input is only expected to contain colors such as "red" or "blue."

Do not rely exclusively on looking for malicious or malformed inputs. This is likely to miss at least one undesirable input, especially if the code's environment changes. This can give attackers enough room to bypass the intended validation. However, denylists can be useful for detecting potential attacks or determining which inputs are so malformed that they should be rejected outright.


Phases : Implementation
While it is risky to use dynamically-generated query strings, code, or commands that mix control and data together, sometimes it may be unavoidable. Properly quote arguments and escape any special characters within those arguments. The most conservative approach is to escape or filter all characters that do not pass an extremely strict allowlist (such as everything that is not alphanumeric or white space). If some special characters are still needed, such as white space, wrap each argument in quotes after the escaping/filtering step. Be careful of argument injection (CWE-88).
Phases : Implementation
Inputs should be decoded and canonicalized to the application's current internal representation before being validated (CWE-180). Make sure that the application does not decode the same input twice (CWE-174). Such errors could be used to bypass allowlist validation schemes by introducing dangerous inputs after they have been checked.

Vulnerability Mapping Notes

Rationale : This CWE entry is a Class and might have Base-level children that would be more appropriate
Comments : Examine children of this entry to see if there is a better fit

Related Attack Patterns

CAPEC-ID Attack Pattern Name
CAPEC-120 Double Encoding
The adversary utilizes a repeating of the encoding process for a set of characters (that is, character encoding a character encoding of a character) to obfuscate the payload of a particular request. This may allow the adversary to bypass filters that attempt to detect illegal characters or strings, such as those that might be used in traversal or injection attacks. Filters may be able to catch illegal encoded strings, but may not catch doubly encoded strings. For example, a dot (.), often used in path traversal attacks and therefore often blocked by filters, could be URL encoded as %2E. However, many filters recognize this encoding and would still block the request. In a double encoding, the % in the above URL encoding would be encoded again as %25, resulting in %252E which some filters might not catch, but which could still be interpreted as a dot (.) by interpreters on the target.
CAPEC-267 Leverage Alternate Encoding
An adversary leverages the possibility to encode potentially harmful input or content used by applications such that the applications are ineffective at validating this encoding standard.
CAPEC-3 Using Leading 'Ghost' Character Sequences to Bypass Input Filters
Some APIs will strip certain leading characters from a string of parameters. An adversary can intentionally introduce leading "ghost" characters (extra characters that don't affect the validity of the request at the API layer) that enable the input to pass the filters and therefore process the adversary's input. This occurs when the targeted API will accept input data in several syntactic forms and interpret it in the equivalent semantic way, while the filter does not take into account the full spectrum of the syntactic forms acceptable to the targeted API.
CAPEC-52 Embedding NULL Bytes
An adversary embeds one or more null bytes in input to the target software. This attack relies on the usage of a null-valued byte as a string terminator in many environments. The goal is for certain components of the target software to stop processing the input when it encounters the null byte(s).
CAPEC-53 Postfix, Null Terminate, and Backslash
If a string is passed through a filter of some kind, then a terminal NULL may not be valid. Using alternate representation of NULL allows an adversary to embed the NULL mid-string while postfixing the proper data so that the filter is avoided. One example is a filter that looks for a trailing slash character. If a string insertion is possible, but the slash must exist, an alternate encoding of NULL in mid-string may be used.
CAPEC-64 Using Slashes and URL Encoding Combined to Bypass Validation Logic
This attack targets the encoding of the URL combined with the encoding of the slash characters. An attacker can take advantage of the multiple ways of encoding a URL and abuse the interpretation of the URL. A URL may contain special character that need special syntax handling in order to be interpreted. Special characters are represented using a percentage character followed by two digits representing the octet code of the original character (%HEX-CODE). For instance US-ASCII space character would be represented with %20. This is often referred as escaped ending or percent-encoding. Since the server decodes the URL from the requests, it may restrict the access to some URL paths by validating and filtering out the URL requests it received. An attacker will try to craft an URL with a sequence of special characters which once interpreted by the server will be equivalent to a forbidden URL. It can be difficult to protect against this attack since the URL can contain other format of encoding such as UTF-8 encoding, Unicode-encoding, etc.
CAPEC-71 Using Unicode Encoding to Bypass Validation Logic
An attacker may provide a Unicode string to a system component that is not Unicode aware and use that to circumvent the filter or cause the classifying mechanism to fail to properly understanding the request. That may allow the attacker to slip malicious data past the content filter and/or possibly cause the application to route the request incorrectly.
CAPEC-72 URL Encoding
This attack targets the encoding of the URL. An adversary can take advantage of the multiple way of encoding an URL and abuse the interpretation of the URL.
CAPEC-78 Using Escaped Slashes in Alternate Encoding
This attack targets the use of the backslash in alternate encoding. An adversary can provide a backslash as a leading character and causes a parser to believe that the next character is special. This is called an escape. By using that trick, the adversary tries to exploit alternate ways to encode the same character which leads to filter problems and opens avenues to attack.
CAPEC-80 Using UTF-8 Encoding to Bypass Validation Logic
This attack is a specific variation on leveraging alternate encodings to bypass validation logic. This attack leverages the possibility to encode potentially harmful input in UTF-8 and submit it to applications not expecting or effective at validating this encoding standard making input filtering difficult. UTF-8 (8-bit UCS/Unicode Transformation Format) is a variable-length character encoding for Unicode. Legal UTF-8 characters are one to four bytes long. However, early version of the UTF-8 specification got some entries wrong (in some cases it permitted overlong characters). UTF-8 encoders are supposed to use the "shortest possible" encoding, but naive decoders may accept encodings that are longer than necessary. According to the RFC 3629, a particularly subtle form of this attack can be carried out against a parser which performs security-critical validity checks against the UTF-8 encoded form of its input, but interprets certain illegal octet sequences as characters.

Notes

Partially overlaps path traversal and equivalence weaknesses.
This is more like a category than a weakness.
Many other types of encodings should be listed in this category.

Submission

Name Organization Date Date Release Version
PLOVER 2006-07-19 +00:00 2006-07-19 +00:00 Draft 3

Modifications

Name Organization Date Comment
Eric Dalci Cigital 2008-07-01 +00:00 updated Potential_Mitigations, Time_of_Introduction
CWE Content Team MITRE 2008-09-08 +00:00 updated Maintenance_Notes, Relationships, Relationship_Notes, Taxonomy_Mappings
CWE Content Team MITRE 2009-07-27 +00:00 updated Potential_Mitigations
CWE Content Team MITRE 2010-12-13 +00:00 updated Description
CWE Content Team MITRE 2011-03-29 +00:00 updated Potential_Mitigations
CWE Content Team MITRE 2011-06-01 +00:00 updated Common_Consequences, Description
CWE Content Team MITRE 2011-06-27 +00:00 updated Common_Consequences
CWE Content Team MITRE 2012-05-11 +00:00 updated Related_Attack_Patterns, Relationships
CWE Content Team MITRE 2012-10-30 +00:00 updated Potential_Mitigations
CWE Content Team MITRE 2013-02-21 +00:00 updated Potential_Mitigations
CWE Content Team MITRE 2014-07-30 +00:00 updated Relationships
CWE Content Team MITRE 2015-12-07 +00:00 updated Relationships
CWE Content Team MITRE 2017-11-08 +00:00 updated Applicable_Platforms
CWE Content Team MITRE 2019-01-03 +00:00 updated Related_Attack_Patterns
CWE Content Team MITRE 2019-06-20 +00:00 updated Relationships
CWE Content Team MITRE 2020-02-24 +00:00 updated Potential_Mitigations, Relationships
CWE Content Team MITRE 2020-06-25 +00:00 updated Potential_Mitigations
CWE Content Team MITRE 2023-01-31 +00:00 updated Description
CWE Content Team MITRE 2023-04-27 +00:00 updated Relationships
CWE Content Team MITRE 2023-06-29 +00:00 updated Mapping_Notes
CWE Content Team MITRE 2023-10-26 +00:00 updated Observed_Examples