CWE-20 Detail

CWE-20

Improper Input Validation
HIGH
Stable
2006-07-19 00:00 +00:00
2024-11-19 00:00 +00:00

Alerte pour un CWE

Stay informed of any changes for a specific CWE.
Alert management

Improper Input Validation

The product receives input or data, but it does not validate or incorrectly validates that the input has the properties that are required to process the data safely and correctly.

Extended Description

Input validation is a frequently-used technique for checking potentially dangerous inputs in order to ensure that the inputs are safe for processing within the code, or when communicating with other components. When software does not validate input properly, an attacker is able to craft the input in a form that is not expected by the rest of the application. This will lead to parts of the system receiving unintended input, which may result in altered control flow, arbitrary control of a resource, or arbitrary code execution.

Input validation is not the only technique for processing input, however. Other techniques attempt to transform potentially-dangerous input into something safe, such as filtering (CWE-790) - which attempts to remove dangerous inputs - or encoding/escaping (CWE-116), which attempts to ensure that the input is not misinterpreted when it is included in output to another component. Other techniques exist as well (see CWE-138 for more examples.)

Input validation can be applied to:

  • raw data - strings, numbers, parameters, file contents, etc.
  • metadata - information about the raw data, such as headers or size

Data can be simple or structured. Structured data can be composed of many nested layers, composed of combinations of metadata and raw data, with other simple or structured data.

Many properties of raw data or metadata may need to be validated upon entry into the code, such as:

  • specified quantities such as size, length, frequency, price, rate, number of operations, time, etc.
  • implied or derived quantities, such as the actual size of a file instead of a specified size
  • indexes, offsets, or positions into more complex data structures
  • symbolic keys or other elements into hash tables, associative arrays, etc.
  • well-formedness, i.e. syntactic correctness - compliance with expected syntax
  • lexical token correctness - compliance with rules for what is treated as a token
  • specified or derived type - the actual type of the input (or what the input appears to be)
  • consistency - between individual data elements, between raw data and metadata, between references, etc.
  • conformance to domain-specific rules, e.g. business logic
  • equivalence - ensuring that equivalent inputs are treated the same
  • authenticity, ownership, or other attestations about the input, e.g. a cryptographic signature to prove the source of the data

Implied or derived properties of data must often be calculated or inferred by the code itself. Errors in deriving properties may be considered a contributing factor to improper input validation.

Note that "input validation" has very different meanings to different people, or within different classification schemes. Caution must be used when referencing this CWE entry or mapping to it. For example, some weaknesses might involve inadvertently giving control to an attacker over an input when they should not be able to provide an input at all, but sometimes this is referred to as input validation.

Finally, it is important to emphasize that the distinctions between input validation and output escaping are often blurred, and developers must be careful to understand the difference, including how input validation is not always sufficient to prevent vulnerabilities, especially when less stringent data types must be supported, such as free-form text. Consider a SQL injection scenario in which a person's last name is inserted into a query. The name "O'Reilly" would likely pass the validation step since it is a common last name in the English language. However, this valid name cannot be directly inserted into the database because it contains the "'" apostrophe character, which would need to be escaped or otherwise transformed. In this case, removing the apostrophe might reduce the risk of SQL injection, but it would produce incorrect behavior because the wrong name would be recorded.

Informations

Modes Of Introduction

Architecture and Design
Implementation :

REALIZATION: This weakness is caused during implementation of an architectural security tactic.

If a programmer believes that an attacker cannot modify certain inputs, then the programmer might not perform any input validation at all. For example, in web applications, many programmers believe that cookies and hidden form fields can not be modified from a web browser (CWE-472), although they can be altered using a proxy or a custom program. In a client-server architecture, the programmer might assume that client-side security checks cannot be bypassed, even when a custom client could be written that skips those checks (CWE-602).


Applicable Platforms

Language

Class: Not Language-Specific (Often)

Common Consequences

Scope Impact Likelihood
AvailabilityDoS: Crash, Exit, or Restart, DoS: Resource Consumption (CPU), DoS: Resource Consumption (Memory)

Note: An attacker could provide unexpected values and cause a program crash or excessive consumption of resources, such as memory and CPU.
ConfidentialityRead Memory, Read Files or Directories

Note: An attacker could read confidential data if they are able to control resource references.
Integrity
Confidentiality
Availability
Modify Memory, Execute Unauthorized Code or Commands

Note: An attacker could use malicious input to modify data or possibly alter control flow in unexpected ways, including arbitrary command execution.

Observed Examples

Reference Description
CVE-2024-37032Large language model (LLM) management tool does not validate the format of a digest value (CWE-1287) from a private, untrusted model registry, enabling relative path traversal (CWE-23), a.k.a. Probllama
CVE-2022-45918Chain: a learning management tool debugger uses external input to locate previous session logs (CWE-73) and does not properly validate the given path (CWE-20), allowing for filesystem path traversal using "../" sequences (CWE-24)
CVE-2021-30860Chain: improper input validation (CWE-20) leads to integer overflow (CWE-190) in mobile OS, as exploited in the wild per CISA KEV.
CVE-2021-30663Chain: improper input validation (CWE-20) leads to integer overflow (CWE-190) in mobile OS, as exploited in the wild per CISA KEV.
CVE-2021-22205Chain: backslash followed by a newline can bypass a validation step (CWE-20), leading to eval injection (CWE-95), as exploited in the wild per CISA KEV.
CVE-2021-21220Chain: insufficient input validation (CWE-20) in browser allows heap corruption (CWE-787), as exploited in the wild per CISA KEV.
CVE-2020-9054Chain: improper input validation (CWE-20) in username parameter, leading to OS command injection (CWE-78), as exploited in the wild per CISA KEV.
CVE-2020-3452Chain: security product has improper input validation (CWE-20) leading to directory traversal (CWE-22), as exploited in the wild per CISA KEV.
CVE-2020-3161Improper input validation of HTTP requests in IP phone, as exploited in the wild per CISA KEV.
CVE-2020-3580Chain: improper input validation (CWE-20) in firewall product leads to XSS (CWE-79), as exploited in the wild per CISA KEV.
CVE-2021-37147Chain: caching proxy server has improper input validation (CWE-20) of headers, allowing HTTP response smuggling (CWE-444) using an "LF line ending"
CVE-2008-5305Eval injection in Perl program using an ID that should only contain hyphens and numbers.
CVE-2008-2223SQL injection through an ID that was supposed to be numeric.
CVE-2008-3477lack of input validation in spreadsheet program leads to buffer overflows, integer overflows, array index errors, and memory corruption.
CVE-2008-3843insufficient validation enables XSS
CVE-2008-3174driver in security product allows code execution due to insufficient validation
CVE-2007-3409infinite loop from DNS packet with a label that points to itself
CVE-2006-6870infinite loop from DNS packet with a label that points to itself
CVE-2008-1303missing parameter leads to crash
CVE-2007-5893HTTP request with missing protocol version number leads to crash
CVE-2006-6658request with missing parameters leads to information exposure
CVE-2008-4114system crash with offset value that is inconsistent with packet size
CVE-2006-3790size field that is inconsistent with packet size leads to buffer over-read
CVE-2008-2309product uses a denylist to identify potentially dangerous content, allowing attacker to bypass a warning
CVE-2008-3494security bypass via an extra header
CVE-2008-3571empty packet triggers reboot
CVE-2006-5525incomplete denylist allows SQL injection
CVE-2008-1284NUL byte in theme name causes directory traversal impact to be worse
CVE-2008-0600kernel does not validate an incoming pointer before dereferencing it
CVE-2008-1738anti-virus product has insufficient input validation of hooked SSDT functions, allowing code execution
CVE-2008-1737anti-virus product allows DoS via zero-length field
CVE-2008-3464driver does not validate input from userland to the kernel
CVE-2008-2252kernel does not validate parameters sent in from userland, allowing code execution
CVE-2008-2374lack of validation of string length fields allows memory consumption or buffer over-read
CVE-2008-1440lack of validation of length field leads to infinite loop
CVE-2008-1625lack of validation of input to an IOCTL allows code execution
CVE-2008-3177zero-length attachment causes crash
CVE-2007-2442zero-length input causes free of uninitialized pointer
CVE-2008-5563crash via a malformed frame structure
CVE-2008-5285infinite loop from a long SMTP request
CVE-2008-3812router crashes with a malformed packet
CVE-2008-3680packet with invalid version number leads to NULL pointer dereference
CVE-2008-3660crash via multiple "." characters in file extension

Potential Mitigations

Phases : Architecture and Design
Consider using language-theoretic security (LangSec) techniques that characterize inputs using a formal language and build "recognizers" for that language. This effectively requires parsing to be a distinct layer that effectively enforces a boundary between raw input and internal data representations, instead of allowing parser code to be scattered throughout the program, where it could be subject to errors or inconsistencies that create weaknesses. [REF-1109] [REF-1110] [REF-1111]
Phases : Architecture and Design
Use an input validation framework such as Struts or the OWASP ESAPI Validation API. Note that using a framework does not automatically address all input validation problems; be mindful of weaknesses that could arise from misusing the framework itself (CWE-1173).
Phases : Architecture and Design // Implementation
Understand all the potential areas where untrusted inputs can enter your software: parameters or arguments, cookies, anything read from the network, environment variables, reverse DNS lookups, query results, request headers, URL components, e-mail, files, filenames, databases, and any external systems that provide data to the application. Remember that such inputs may be obtained indirectly through API calls.
Phases : Implementation

Assume all input is malicious. Use an "accept known good" input validation strategy, i.e., use a list of acceptable inputs that strictly conform to specifications. Reject any input that does not strictly conform to specifications, or transform it into something that does.

When performing input validation, consider all potentially relevant properties, including length, type of input, the full range of acceptable values, missing or extra inputs, syntax, consistency across related fields, and conformance to business rules. As an example of business rule logic, "boat" may be syntactically valid because it only contains alphanumeric characters, but it is not valid if the input is only expected to contain colors such as "red" or "blue."

Do not rely exclusively on looking for malicious or malformed inputs. This is likely to miss at least one undesirable input, especially if the code's environment changes. This can give attackers enough room to bypass the intended validation. However, denylists can be useful for detecting potential attacks or determining which inputs are so malformed that they should be rejected outright.


Phases : Architecture and Design

For any security checks that are performed on the client side, ensure that these checks are duplicated on the server side, in order to avoid CWE-602. Attackers can bypass the client-side checks by modifying values after the checks have been performed, or by changing the client to remove the client-side checks entirely. Then, these modified values would be submitted to the server.

Even though client-side checks provide minimal benefits with respect to server-side security, they are still useful. First, they can support intrusion detection. If the server receives input that should have been rejected by the client, then it may be an indication of an attack. Second, client-side error-checking can provide helpful feedback to the user about the expectations for valid input. Third, there may be a reduction in server-side processing time for accidental input errors, although this is typically a small savings.


Phases : Implementation
When your application combines data from multiple sources, perform the validation after the sources have been combined. The individual data elements may pass the validation step but violate the intended restrictions after they have been combined.
Phases : Implementation
Be especially careful to validate all input when invoking code that crosses language boundaries, such as from an interpreted language to native code. This could create an unexpected interaction between the language boundaries. Ensure that you are not violating any of the expectations of the language with which you are interfacing. For example, even though Java may not be susceptible to buffer overflows, providing a large argument in a call to native code might trigger an overflow.
Phases : Implementation
Directly convert your input type into the expected data type, such as using a conversion function that translates a string into a number. After converting to the expected data type, ensure that the input's values fall within the expected range of allowable values and that multi-field consistencies are maintained.
Phases : Implementation

Inputs should be decoded and canonicalized to the application's current internal representation before being validated (CWE-180, CWE-181). Make sure that your application does not inadvertently decode the same input twice (CWE-174). Such errors could be used to bypass allowlist schemes by introducing dangerous inputs after they have been checked. Use libraries such as the OWASP ESAPI Canonicalization control.

Consider performing repeated canonicalization until your input does not change any more. This will avoid double-decoding and similar scenarios, but it might inadvertently modify inputs that are allowed to contain properly-encoded dangerous content.


Phases : Implementation
When exchanging data between components, ensure that both components are using the same character encoding. Ensure that the proper encoding is applied at each interface. Explicitly set the encoding you are using whenever the protocol allows you to do so.

Detection Methods

Automated Static Analysis

Some instances of improper input validation can be detected using automated static analysis.

A static analysis tool might allow the user to specify which application-specific methods or functions perform input validation; the tool might also have built-in knowledge of validation frameworks such as Struts. The tool may then suppress or de-prioritize any associated warnings. This allows the analyst to focus on areas of the software in which input validation does not appear to be present.

Except in the cases described in the previous paragraph, automated static analysis might not be able to recognize when proper input validation is being performed, leading to false positives - i.e., warnings that do not have any security consequences or require any code changes.


Manual Static Analysis

When custom input validation is required, such as when enforcing business rules, manual analysis is necessary to ensure that the validation is properly implemented.

Fuzzing

Fuzzing techniques can be useful for detecting input validation errors. When unexpected inputs are provided to the software, the software should not crash or otherwise become unstable, and it should generate application-controlled error messages. If exceptions or interpreter-generated error messages occur, this indicates that the input was not detected and handled within the application logic itself.

Automated Static Analysis - Binary or Bytecode

According to SOAR, the following detection techniques may be useful:

Cost effective for partial coverage:
  • Bytecode Weakness Analysis - including disassembler + source code weakness analysis
  • Binary Weakness Analysis - including disassembler + source code weakness analysis

Effectiveness : SOAR Partial

Manual Static Analysis - Binary or Bytecode

According to SOAR, the following detection techniques may be useful:

Cost effective for partial coverage:
  • Binary / Bytecode disassembler - then use manual analysis for vulnerabilities & anomalies

Effectiveness : SOAR Partial

Dynamic Analysis with Automated Results Interpretation

According to SOAR, the following detection techniques may be useful:

Highly cost effective:
  • Web Application Scanner
  • Web Services Scanner
  • Database Scanners

Effectiveness : High

Dynamic Analysis with Manual Results Interpretation

According to SOAR, the following detection techniques may be useful:

Highly cost effective:
  • Fuzz Tester
  • Framework-based Fuzzer
Cost effective for partial coverage:
  • Host Application Interface Scanner
  • Monitored Virtual Environment - run potentially malicious code in sandbox / wrapper / virtual machine, see if it does anything suspicious

Effectiveness : High

Manual Static Analysis - Source Code

According to SOAR, the following detection techniques may be useful:

Highly cost effective:
  • Focused Manual Spotcheck - Focused manual analysis of source
  • Manual Source Code Review (not inspections)

Effectiveness : High

Automated Static Analysis - Source Code

According to SOAR, the following detection techniques may be useful:

Highly cost effective:
  • Source code Weakness Analyzer
  • Context-configured Source Code Weakness Analyzer

Effectiveness : High

Architecture or Design Review

According to SOAR, the following detection techniques may be useful:

Highly cost effective:
  • Inspection (IEEE 1028 standard) (can apply to requirements, design, source code, etc.)
  • Formal Methods / Correct-By-Construction
Cost effective for partial coverage:
  • Attack Modeling

Effectiveness : High

Vulnerability Mapping Notes

Rationale : CWE-20 is commonly misused in low-information vulnerability reports when lower-level CWEs could be used instead, or when more details about the vulnerability are available [REF-1287]. It is not useful for trend analysis. It is also a level-1 Class (i.e., a child of a Pillar).
Comments : Consider lower-level children such as Improper Use of Validation Framework (CWE-1173) or improper validation involving specific types or properties of input such as Specified Quantity (CWE-1284); Specified Index, Position, or Offset (CWE-1285); Syntactic Correctness (CWE-1286); Specified Type (CWE-1287); Consistency within Input (CWE-1288); or Unsafe Equivalence (CWE-1289).

Related Attack Patterns

CAPEC-ID Attack Pattern Name
CAPEC-10 Buffer Overflow via Environment Variables
This attack pattern involves causing a buffer overflow through manipulation of environment variables. Once the adversary finds that they can modify an environment variable, they may try to overflow associated buffers. This attack leverages implicit trust often placed in environment variables.
CAPEC-101 Server Side Include (SSI) Injection
An attacker can use Server Side Include (SSI) Injection to send code to a web application that then gets executed by the web server. Doing so enables the attacker to achieve similar results to Cross Site Scripting, viz., arbitrary code execution and information disclosure, albeit on a more limited scale, since the SSI directives are nowhere near as powerful as a full-fledged scripting language. Nonetheless, the attacker can conveniently gain access to sensitive files, such as password files, and execute shell commands.
CAPEC-104 Cross Zone Scripting
An attacker is able to cause a victim to load content into their web-browser that bypasses security zone controls and gain access to increased privileges to execute scripting code or other web objects such as unsigned ActiveX controls or applets. This is a privilege elevation attack targeted at zone-based web-browser security.
CAPEC-108 Command Line Execution through SQL Injection
An attacker uses standard SQL injection methods to inject data into the command line for execution. This could be done directly through misuse of directives such as MSSQL_xp_cmdshell or indirectly through injection of data into the database that would be interpreted as shell commands. Sometime later, an unscrupulous backend application (or could be part of the functionality of the same application) fetches the injected data stored in the database and uses this data as command line arguments without performing proper validation. The malicious data escapes that data plane by spawning new commands to be executed on the host.
CAPEC-109 Object Relational Mapping Injection
An attacker leverages a weakness present in the database access layer code generated with an Object Relational Mapping (ORM) tool or a weakness in the way that a developer used a persistence framework to inject their own SQL commands to be executed against the underlying database. The attack here is similar to plain SQL injection, except that the application does not use JDBC to directly talk to the database, but instead it uses a data access layer generated by an ORM tool or framework (e.g. Hibernate). While most of the time code generated by an ORM tool contains safe access methods that are immune to SQL injection, sometimes either due to some weakness in the generated code or due to the fact that the developer failed to use the generated access methods properly, SQL injection is still possible.
CAPEC-110 SQL Injection through SOAP Parameter Tampering
An attacker modifies the parameters of the SOAP message that is sent from the service consumer to the service provider to initiate a SQL injection attack. On the service provider side, the SOAP message is parsed and parameters are not properly validated before being used to access a database in a way that does not use parameter binding, thus enabling the attacker to control the structure of the executed SQL query. This pattern describes a SQL injection attack with the delivery mechanism being a SOAP message.
CAPEC-120 Double Encoding
The adversary utilizes a repeating of the encoding process for a set of characters (that is, character encoding a character encoding of a character) to obfuscate the payload of a particular request. This may allow the adversary to bypass filters that attempt to detect illegal characters or strings, such as those that might be used in traversal or injection attacks. Filters may be able to catch illegal encoded strings, but may not catch doubly encoded strings. For example, a dot (.), often used in path traversal attacks and therefore often blocked by filters, could be URL encoded as %2E. However, many filters recognize this encoding and would still block the request. In a double encoding, the % in the above URL encoding would be encoded again as %25, resulting in %252E which some filters might not catch, but which could still be interpreted as a dot (.) by interpreters on the target.
CAPEC-13 Subverting Environment Variable Values
The adversary directly or indirectly modifies environment variables used by or controlling the target software. The adversary's goal is to cause the target software to deviate from its expected operation in a manner that benefits the adversary.
CAPEC-135 Format String Injection
An adversary includes formatting characters in a string input field on the target application. Most applications assume that users will provide static text and may respond unpredictably to the presence of formatting character. For example, in certain functions of the C programming languages such as printf, the formatting character %s will print the contents of a memory location expecting this location to identify a string and the formatting character %n prints the number of DWORD written in the memory. An adversary can use this to read or write to memory locations or files, or simply to manipulate the value of the resulting text in unexpected ways. Reading or writing memory may result in program crashes and writing memory could result in the execution of arbitrary code if the adversary can write to the program stack.
CAPEC-136 LDAP Injection
An attacker manipulates or crafts an LDAP query for the purpose of undermining the security of the target. Some applications use user input to create LDAP queries that are processed by an LDAP server. For example, a user might provide their username during authentication and the username might be inserted in an LDAP query during the authentication process. An attacker could use this input to inject additional commands into an LDAP query that could disclose sensitive information. For example, entering a * in the aforementioned query might return information about all users on the system. This attack is very similar to an SQL injection attack in that it manipulates a query to gather additional information or coerce a particular return value.
CAPEC-14 Client-side Injection-induced Buffer Overflow
This type of attack exploits a buffer overflow vulnerability in targeted client software through injection of malicious content from a custom-built hostile service. This hostile service is created to deliver the correct content to the client software. For example, if the client-side application is a browser, the service will host a webpage that the browser loads.
CAPEC-153 Input Data Manipulation
An attacker exploits a weakness in input validation by controlling the format, structure, and composition of data to an input-processing interface. By supplying input of a non-standard or unexpected form an attacker can adversely impact the security of the target.
CAPEC-182 Flash Injection
An attacker tricks a victim to execute malicious flash content that executes commands or makes flash calls specified by the attacker. One example of this attack is cross-site flashing, an attacker controlled parameter to a reference call loads from content specified by the attacker.
CAPEC-209 XSS Using MIME Type Mismatch
An adversary creates a file with scripting content but where the specified MIME type of the file is such that scripting is not expected. The adversary tricks the victim into accessing a URL that responds with the script file. Some browsers will detect that the specified MIME type of the file does not match the actual type of its content and will automatically switch to using an interpreter for the real content type. If the browser does not invoke script filters before doing this, the adversary's script may run on the target unsanitized, possibly revealing the victim's cookies or executing arbitrary script in their browser.
CAPEC-22 Exploiting Trust in Client
An attack of this type exploits vulnerabilities in client/server communication channel authentication and data integrity. It leverages the implicit trust a server places in the client, or more importantly, that which the server believes is the client. An attacker executes this type of attack by communicating directly with the server where the server believes it is communicating only with a valid client. There are numerous variations of this type of attack.
CAPEC-23 File Content Injection
An adversary poisons files with a malicious payload (targeting the file systems accessible by the target software), which may be passed through by standard channels such as via email, and standard web content like PDF and multimedia files. The adversary exploits known vulnerabilities or handling routines in the target processes, in order to exploit the host's trust in executing remote content, including binary files.
CAPEC-230 Serialized Data with Nested Payloads
Applications often need to transform data in and out of a data format (e.g., XML and YAML) by using a parser. It may be possible for an adversary to inject data that may have an adverse effect on the parser when it is being processed. Many data format languages allow the definition of macro-like structures that can be used to simplify the creation of complex structures. By nesting these structures, causing the data to be repeatedly substituted, an adversary can cause the parser to consume more resources while processing, causing excessive memory consumption and CPU utilization.
CAPEC-231 Oversized Serialized Data Payloads
An adversary injects oversized serialized data payloads into a parser during data processing to produce adverse effects upon the parser such as exhausting system resources and arbitrary code execution.
CAPEC-24 Filter Failure through Buffer Overflow
In this attack, the idea is to cause an active filter to fail by causing an oversized transaction. An attacker may try to feed overly long input strings to the program in an attempt to overwhelm the filter (by causing a buffer overflow) and hoping that the filter does not fail securely (i.e. the user input is let into the system unfiltered).
CAPEC-250 XML Injection
An attacker utilizes crafted XML user-controllable input to probe, attack, and inject data into the XML database, using techniques similar to SQL injection. The user-controllable input can allow for unauthorized viewing of data, bypassing authentication or the front-end application for direct XML database access, and possibly altering database information.
CAPEC-261 Fuzzing for garnering other adjacent user/sensitive data
An adversary who is authorized to send queries to a target sends variants of expected queries in the hope that these modified queries might return information (directly or indirectly through error logs) beyond what the expected set of queries should provide.
CAPEC-267 Leverage Alternate Encoding
An adversary leverages the possibility to encode potentially harmful input or content used by applications such that the applications are ineffective at validating this encoding standard.
CAPEC-28 Fuzzing
In this attack pattern, the adversary leverages fuzzing to try to identify weaknesses in the system. Fuzzing is a software security and functionality testing method that feeds randomly constructed input to the system and looks for an indication that a failure in response to that input has occurred. Fuzzing treats the system as a black box and is totally free from any preconceptions or assumptions about the system. Fuzzing can help an attacker discover certain assumptions made about user input in the system. Fuzzing gives an attacker a quick way of potentially uncovering some of these assumptions despite not necessarily knowing anything about the internals of the system. These assumptions can then be turned against the system by specially crafting user input that may allow an attacker to achieve their goals.
CAPEC-3 Using Leading 'Ghost' Character Sequences to Bypass Input Filters
Some APIs will strip certain leading characters from a string of parameters. An adversary can intentionally introduce leading "ghost" characters (extra characters that don't affect the validity of the request at the API layer) that enable the input to pass the filters and therefore process the adversary's input. This occurs when the targeted API will accept input data in several syntactic forms and interpret it in the equivalent semantic way, while the filter does not take into account the full spectrum of the syntactic forms acceptable to the targeted API.
CAPEC-31 Accessing/Intercepting/Modifying HTTP Cookies
This attack relies on the use of HTTP Cookies to store credentials, state information and other critical data on client systems. There are several different forms of this attack. The first form of this attack involves accessing HTTP Cookies to mine for potentially sensitive data contained therein. The second form involves intercepting this data as it is transmitted from client to server. This intercepted information is then used by the adversary to impersonate the remote user/session. The third form is when the cookie's content is modified by the adversary before it is sent back to the server. Here the adversary seeks to convince the target server to operate on this falsified information.
CAPEC-42 MIME Conversion
An attacker exploits a weakness in the MIME conversion routine to cause a buffer overflow and gain control over the mail server machine. The MIME system is designed to allow various different information formats to be interpreted and sent via e-mail. Attack points exist when data are converted to MIME compatible format and back.
CAPEC-43 Exploiting Multiple Input Interpretation Layers
An attacker supplies the target software with input data that contains sequences of special characters designed to bypass input validation logic. This exploit relies on the target making multiples passes over the input data and processing a "layer" of special characters with each pass. In this manner, the attacker can disguise input that would otherwise be rejected as invalid by concealing it with layers of special/escape characters that are stripped off by subsequent processing steps. The goal is to first discover cases where the input validation layer executes before one or more parsing layers. That is, user input may go through the following logic in an application: --> --> . In such cases, the attacker will need to provide input that will pass through the input validator, but after passing through parser2, will be converted into something that the input validator was supposed to stop.
CAPEC-45 Buffer Overflow via Symbolic Links
This type of attack leverages the use of symbolic links to cause buffer overflows. An adversary can try to create or manipulate a symbolic link file such that its contents result in out of bounds data. When the target software processes the symbolic link file, it could potentially overflow internal buffers with insufficient bounds checking.
CAPEC-46 Overflow Variables and Tags
This type of attack leverages the use of tags or variables from a formatted configuration data to cause buffer overflow. The adversary crafts a malicious HTML page or configuration file that includes oversized strings, thus causing an overflow.
CAPEC-47 Buffer Overflow via Parameter Expansion
In this attack, the target software is given input that the adversary knows will be modified and expanded in size during processing. This attack relies on the target software failing to anticipate that the expanded data may exceed some internal limit, thereby creating a buffer overflow.
CAPEC-473 Signature Spoof
An attacker generates a message or datablock that causes the recipient to believe that the message or datablock was generated and cryptographically signed by an authoritative or reputable source, misleading a victim or victim operating system into performing malicious actions.
CAPEC-52 Embedding NULL Bytes
An adversary embeds one or more null bytes in input to the target software. This attack relies on the usage of a null-valued byte as a string terminator in many environments. The goal is for certain components of the target software to stop processing the input when it encounters the null byte(s).
CAPEC-53 Postfix, Null Terminate, and Backslash
If a string is passed through a filter of some kind, then a terminal NULL may not be valid. Using alternate representation of NULL allows an adversary to embed the NULL mid-string while postfixing the proper data so that the filter is avoided. One example is a filter that looks for a trailing slash character. If a string insertion is possible, but the slash must exist, an alternate encoding of NULL in mid-string may be used.
CAPEC-588 DOM-Based XSS
This type of attack is a form of Cross-Site Scripting (XSS) where a malicious script is inserted into the client-side HTML being parsed by a web browser. Content served by a vulnerable web application includes script code used to manipulate the Document Object Model (DOM). This script code either does not properly validate input, or does not perform proper output encoding, thus creating an opportunity for an adversary to inject a malicious script launch a XSS attack. A key distinction between other XSS attacks and DOM-based attacks is that in other XSS attacks, the malicious script runs when the vulnerable web page is initially loaded, while a DOM-based attack executes sometime after the page loads. Another distinction of DOM-based attacks is that in some cases, the malicious script is never sent to the vulnerable web server at all. An attack like this is guaranteed to bypass any server-side filtering attempts to protect users.
CAPEC-63 Cross-Site Scripting (XSS)
An adversary embeds malicious scripts in content that will be served to web browsers. The goal of the attack is for the target software, the client-side browser, to execute the script with the users' privilege level. An attack of this type exploits a programs' vulnerabilities that are brought on by allowing remote hosts to execute code and scripts. Web browsers, for example, have some simple security controls in place, but if a remote attacker is allowed to execute scripts (through injecting them in to user-generated content like bulletin boards) then these controls may be bypassed. Further, these attacks are very difficult for an end user to detect.
CAPEC-64 Using Slashes and URL Encoding Combined to Bypass Validation Logic
This attack targets the encoding of the URL combined with the encoding of the slash characters. An attacker can take advantage of the multiple ways of encoding a URL and abuse the interpretation of the URL. A URL may contain special character that need special syntax handling in order to be interpreted. Special characters are represented using a percentage character followed by two digits representing the octet code of the original character (%HEX-CODE). For instance US-ASCII space character would be represented with %20. This is often referred as escaped ending or percent-encoding. Since the server decodes the URL from the requests, it may restrict the access to some URL paths by validating and filtering out the URL requests it received. An attacker will try to craft an URL with a sequence of special characters which once interpreted by the server will be equivalent to a forbidden URL. It can be difficult to protect against this attack since the URL can contain other format of encoding such as UTF-8 encoding, Unicode-encoding, etc.
CAPEC-664 Server Side Request Forgery

An adversary exploits improper input validation by submitting maliciously crafted input to a target application running on a server, with the goal of forcing the server to make a request either to itself, to web services running in the server’s internal network, or to external third parties. If successful, the adversary’s request will be made with the server’s privilege level, bypassing its authentication controls. This ultimately allows the adversary to access sensitive data, execute commands on the server’s network, and make external requests with the stolen identity of the server. Server Side Request Forgery attacks differ from Cross Site Request Forgery attacks in that they target the server itself, whereas CSRF attacks exploit an insecure user authentication mechanism to perform unauthorized actions on the user's behalf.

CAPEC-67 String Format Overflow in syslog()
This attack targets applications and software that uses the syslog() function insecurely. If an application does not explicitely use a format string parameter in a call to syslog(), user input can be placed in the format string parameter leading to a format string injection attack. Adversaries can then inject malicious format string commands into the function call leading to a buffer overflow. There are many reported software vulnerabilities with the root cause being a misuse of the syslog() function.
CAPEC-7 Blind SQL Injection
Blind SQL Injection results from an insufficient mitigation for SQL Injection. Although suppressing database error messages are considered best practice, the suppression alone is not sufficient to prevent SQL Injection. Blind SQL Injection is a form of SQL Injection that overcomes the lack of error messages. Without the error messages that facilitate SQL Injection, the adversary constructs input strings that probe the target through simple Boolean SQL expressions. The adversary can determine if the syntax and structure of the injection was successful based on whether the query was executed or not. Applied iteratively, the adversary determines how and where the target is vulnerable to SQL Injection.
CAPEC-71 Using Unicode Encoding to Bypass Validation Logic
An attacker may provide a Unicode string to a system component that is not Unicode aware and use that to circumvent the filter or cause the classifying mechanism to fail to properly understanding the request. That may allow the attacker to slip malicious data past the content filter and/or possibly cause the application to route the request incorrectly.
CAPEC-72 URL Encoding
This attack targets the encoding of the URL. An adversary can take advantage of the multiple way of encoding an URL and abuse the interpretation of the URL.
CAPEC-73 User-Controlled Filename
An attack of this type involves an adversary inserting malicious characters (such as a XSS redirection) into a filename, directly or indirectly that is then used by the target software to generate HTML text or other potentially executable content. Many websites rely on user-generated content and dynamically build resources like files, filenames, and URL links directly from user supplied data. In this attack pattern, the attacker uploads code that can execute in the client browser and/or redirect the client browser to a site that the attacker owns. All XSS attack payload variants can be used to pass and exploit these vulnerabilities.
CAPEC-78 Using Escaped Slashes in Alternate Encoding
This attack targets the use of the backslash in alternate encoding. An adversary can provide a backslash as a leading character and causes a parser to believe that the next character is special. This is called an escape. By using that trick, the adversary tries to exploit alternate ways to encode the same character which leads to filter problems and opens avenues to attack.
CAPEC-79 Using Slashes in Alternate Encoding
This attack targets the encoding of the Slash characters. An adversary would try to exploit common filtering problems related to the use of the slashes characters to gain access to resources on the target host. Directory-driven systems, such as file systems and databases, typically use the slash character to indicate traversal between directories or other container components. For murky historical reasons, PCs (and, as a result, Microsoft OSs) choose to use a backslash, whereas the UNIX world typically makes use of the forward slash. The schizophrenic result is that many MS-based systems are required to understand both forms of the slash. This gives the adversary many opportunities to discover and abuse a number of common filtering problems. The goal of this pattern is to discover server software that only applies filters to one version, but not the other.
CAPEC-8 Buffer Overflow in an API Call
This attack targets libraries or shared code modules which are vulnerable to buffer overflow attacks. An adversary who has knowledge of known vulnerable libraries or shared code can easily target software that makes use of these libraries. All clients that make use of the code library thus become vulnerable by association. This has a very broad effect on security across a system, usually affecting more than one software process.
CAPEC-80 Using UTF-8 Encoding to Bypass Validation Logic
This attack is a specific variation on leveraging alternate encodings to bypass validation logic. This attack leverages the possibility to encode potentially harmful input in UTF-8 and submit it to applications not expecting or effective at validating this encoding standard making input filtering difficult. UTF-8 (8-bit UCS/Unicode Transformation Format) is a variable-length character encoding for Unicode. Legal UTF-8 characters are one to four bytes long. However, early version of the UTF-8 specification got some entries wrong (in some cases it permitted overlong characters). UTF-8 encoders are supposed to use the "shortest possible" encoding, but naive decoders may accept encodings that are longer than necessary. According to the RFC 3629, a particularly subtle form of this attack can be carried out against a parser which performs security-critical validity checks against the UTF-8 encoded form of its input, but interprets certain illegal octet sequences as characters.
CAPEC-81 Web Server Logs Tampering
Web Logs Tampering attacks involve an attacker injecting, deleting or otherwise tampering with the contents of web logs typically for the purposes of masking other malicious behavior. Additionally, writing malicious data to log files may target jobs, filters, reports, and other agents that process the logs in an asynchronous attack pattern. This pattern of attack is similar to "Log Injection-Tampering-Forging" except that in this case, the attack is targeting the logs of the web server and not the application.
CAPEC-83 XPath Injection
An attacker can craft special user-controllable input consisting of XPath expressions to inject the XML database and bypass authentication or glean information that they normally would not be able to. XPath Injection enables an attacker to talk directly to the XML database, thus bypassing the application completely. XPath Injection results from the failure of an application to properly sanitize input used as part of dynamic XPath expressions used to query an XML database.
CAPEC-85 AJAX Footprinting
This attack utilizes the frequent client-server roundtrips in Ajax conversation to scan a system. While Ajax does not open up new vulnerabilities per se, it does optimize them from an attacker point of view. A common first step for an attacker is to footprint the target environment to understand what attacks will work. Since footprinting relies on enumeration, the conversational pattern of rapid, multiple requests and responses that are typical in Ajax applications enable an attacker to look for many vulnerabilities, well-known ports, network locations and so on. The knowledge gained through Ajax fingerprinting can be used to support other attacks, such as XSS.
CAPEC-88 OS Command Injection
In this type of an attack, an adversary injects operating system commands into existing application functions. An application that uses untrusted input to build command strings is vulnerable. An adversary can leverage OS command injection in an application to elevate privileges, execute arbitrary commands and compromise the underlying operating system.
CAPEC-9 Buffer Overflow in Local Command-Line Utilities
This attack targets command-line utilities available in a number of shells. An adversary can leverage a vulnerability found in a command-line utility to escalate privilege to root.

Notes

CWE-116 and CWE-20 have a close association because, depending on the nature of the structured message, proper input validation can indirectly prevent special characters from changing the meaning of a structured message. For example, by validating that a numeric ID field should only contain the 0-9 characters, the programmer effectively prevents injection attacks.


As of 2020, this entry is used more often than preferred, and it is a source of frequent confusion. It is being actively modified for CWE 4.1 and subsequent versions.
Concepts such as validation, data transformation, and neutralization are being refined, so relationships between CWE-20 and other entries such as CWE-707 may change in future versions, along with an update to the Vulnerability Theory document.
Input validation - whether missing or incorrect - is such an essential and widespread part of secure development that it is implicit in many different weaknesses. Traditionally, problems such as buffer overflows and XSS have been classified as input validation problems by many security professionals. However, input validation is not necessarily the only protection mechanism available for avoiding such problems, and in some cases it is not even sufficient. The CWE team has begun capturing these subtleties in chains within the Research Concepts view (CWE-1000), but more work is needed.

The "input validation" term is extremely common, but it is used in many different ways. In some cases its usage can obscure the real underlying weakness or otherwise hide chaining and composite relationships.

Some people use "input validation" as a general term that covers many different neutralization techniques for ensuring that input is appropriate, such as filtering, canonicalization, and escaping. Others use the term in a more narrow context to simply mean "checking if an input conforms to expectations without changing it." CWE uses this more narrow interpretation.


References

REF-6

Seven Pernicious Kingdoms: A Taxonomy of Software Security Errors
Katrina Tsipenyuk, Brian Chess, Gary McGraw.
https://samate.nist.gov/SSATTM_Content/papers/Seven%20Pernicious%20Kingdoms%20-%20Taxonomy%20of%20Sw%20Security%20Errors%20-%20Tsipenyuk%20-%20Chess%20-%20McGraw.pdf

REF-166

Input Validation with ESAPI - Very Important
Jim Manico.
https://manicode.blogspot.com/2008/08/input-validation-with-esapi.html

REF-45

OWASP Enterprise Security API (ESAPI) Project
OWASP.
http://www.owasp.org/index.php/ESAPI

REF-168

Hacking Exposed Web Applications, Second Edition
Joel Scambray, Mike Shema, Caleb Sima.

REF-48

Input validation or output filtering, which is better?
Jeremiah Grossman.
https://blog.jeremiahgrossman.com/2007/01/input-validation-or-output-filtering.html

REF-170

The importance of input validation
Kevin Beaver.
http://searchsoftwarequality.techtarget.com/tip/0,289483,sid92_gci1214373,00.html

REF-7

Writing Secure Code
Michael Howard, David LeBlanc.
https://www.microsoftpressstore.com/store/writing-secure-code-9780735617223

REF-1109

LANGSEC: Language-theoretic Security
http://langsec.org/

REF-1110

LangSec: Recognition, Validation, and Compositional Correctness for Real World Security
http://langsec.org/bof-handout.pdf

REF-1111

Curing the Vulnerable Parser: Design Patterns for Secure Input Handling
Sergey Bratus, Lars Hermerschmidt, Sven M. Hallberg, Michael E. Locasto, Falcon D. Momot, Meredith L. Patterson, Anna Shubina.
https://www.usenix.org/system/files/login/articles/login_spring17_08_bratus.pdf

REF-1287

Supplemental Details - 2022 CWE Top 25
MITRE.
https://cwe.mitre.org/top25/archive/2022/2022_cwe_top25_supplemental.html#problematicMappingDetails

Submission

Name Organization Date Date Release Version
7 Pernicious Kingdoms 2006-07-19 +00:00 2006-07-19 +00:00 Draft 3

Modifications

Name Organization Date Comment
Eric Dalci Cigital 2008-07-01 +00:00 updated Potential_Mitigations, Time_of_Introduction
Veracode 2008-08-15 +00:00 Suggested OWASP Top Ten 2004 mapping
CWE Content Team MITRE 2008-09-08 +00:00 updated Relationships, Other_Notes, Taxonomy_Mappings
CWE Content Team MITRE 2008-11-24 +00:00 updated Relationships, Taxonomy_Mappings
CWE Content Team MITRE 2009-01-12 +00:00 updated Applicable_Platforms, Common_Consequences, Demonstrative_Examples, Description, Likelihood_of_Exploit, Name, Observed_Examples, Other_Notes, Potential_Mitigations, References, Relationship_Notes, Relationships
CWE Content Team MITRE 2009-03-10 +00:00 updated Description, Potential_Mitigations
CWE Content Team MITRE 2009-05-27 +00:00 updated Related_Attack_Patterns
CWE Content Team MITRE 2009-07-27 +00:00 updated Relationships
CWE Content Team MITRE 2009-10-29 +00:00 updated Common_Consequences, Demonstrative_Examples, Maintenance_Notes, Modes_of_Introduction, Observed_Examples, Relationships, Research_Gaps, Terminology_Notes
CWE Content Team MITRE 2009-12-28 +00:00 updated Applicable_Platforms, Demonstrative_Examples, Detection_Factors
CWE Content Team MITRE 2010-02-16 +00:00 updated Detection_Factors, Potential_Mitigations, References, Taxonomy_Mappings
CWE Content Team MITRE 2010-04-05 +00:00 updated Related_Attack_Patterns
CWE Content Team MITRE 2010-06-21 +00:00 updated Potential_Mitigations, Research_Gaps, Terminology_Notes
CWE Content Team MITRE 2010-09-27 +00:00 updated Potential_Mitigations, Relationships
CWE Content Team MITRE 2010-12-13 +00:00 updated Demonstrative_Examples, Description
CWE Content Team MITRE 2011-03-29 +00:00 updated Observed_Examples
CWE Content Team MITRE 2011-06-01 +00:00 updated Applicable_Platforms, Common_Consequences, Relationship_Notes
CWE Content Team MITRE 2011-09-13 +00:00 updated Relationships, Taxonomy_Mappings
CWE Content Team MITRE 2012-05-11 +00:00 updated Demonstrative_Examples, References, Related_Attack_Patterns, Relationships
CWE Content Team MITRE 2012-10-30 +00:00 updated Potential_Mitigations
CWE Content Team MITRE 2013-02-21 +00:00 updated Relationships
CWE Content Team MITRE 2013-07-17 +00:00 updated Relationships
CWE Content Team MITRE 2014-02-18 +00:00 updated Demonstrative_Examples, Related_Attack_Patterns
CWE Content Team MITRE 2014-07-30 +00:00 updated Detection_Factors, Relationships, Taxonomy_Mappings
CWE Content Team MITRE 2015-12-07 +00:00 updated Relationships
CWE Content Team MITRE 2017-01-19 +00:00 updated Related_Attack_Patterns, Relationships
CWE Content Team MITRE 2017-05-03 +00:00 updated Related_Attack_Patterns, Relationships
CWE Content Team MITRE 2017-11-08 +00:00 updated Modes_of_Introduction, References, Relationships, Taxonomy_Mappings
CWE Content Team MITRE 2018-03-27 +00:00 updated References
CWE Content Team MITRE 2019-01-03 +00:00 updated Related_Attack_Patterns, Relationships
CWE Content Team MITRE 2019-06-20 +00:00 updated Related_Attack_Patterns, Relationships
CWE Content Team MITRE 2019-09-19 +00:00 updated Relationships
CWE Content Team MITRE 2020-02-24 +00:00 updated Potential_Mitigations, References, Related_Attack_Patterns, Relationships
CWE Content Team MITRE 2020-06-25 +00:00 updated Applicable_Platforms, Demonstrative_Examples, Description, Maintenance_Notes, Observed_Examples, Potential_Mitigations, References, Relationship_Notes, Relationships, Research_Gaps, Terminology_Notes
CWE Content Team MITRE 2020-08-20 +00:00 updated Potential_Mitigations, Related_Attack_Patterns, Relationships
CWE Content Team MITRE 2021-03-15 +00:00 updated Description, Potential_Mitigations
CWE Content Team MITRE 2021-07-20 +00:00 updated Related_Attack_Patterns, Relationships
CWE Content Team MITRE 2021-10-28 +00:00 updated Relationships
CWE Content Team MITRE 2022-04-28 +00:00 updated Relationships
CWE Content Team MITRE 2022-06-28 +00:00 updated Observed_Examples, Relationships
CWE Content Team MITRE 2022-10-13 +00:00 updated References, Relationships
CWE Content Team MITRE 2023-04-27 +00:00 updated References, Relationships
CWE Content Team MITRE 2023-06-29 +00:00 updated Mapping_Notes, Relationships
CWE Content Team MITRE 2023-10-26 +00:00 updated Observed_Examples
CWE Content Team MITRE 2024-07-16 +00:00 updated Observed_Examples
CWE Content Team MITRE 2024-11-19 +00:00 updated Relationships
Click on the button to the left (OFF), to authorize the inscription of cookie improving the functionalities of the site. Click on the button to the left (Accept all), to unauthorize the inscription of cookie improving the functionalities of the site.