Excessive Data Exposure: What It Is, How We Can Help

PUBLISHED ON August 12, 2022
LAST UPDATED October 26, 2023

2023 UPDATE: In the 2023 OWASP API Top 10 vulnerabilities list, Excessive Data Exposure and Mass Assignment are combined into a new entry, Broken Object Property Level Authorization. OWASP made this change to focus on the root cause of these two vulnerabilities: missing or improper authorization at the object property level. This can lead to unauthorized parties accessing or manipulating sensitive data.

Get details on Broken Object Property Level Authorization.

No. 3 on the 2019 OWASP API Top 10 vulnerabilities list is excessive data exposure (after broken object level authorization, or BOLA, and broken user authentication). OWASP says of this vulnerability, “Looking forward to generic implementations, developers tend to expose all object properties without considering their individual sensitivity, relying on clients to perform the data filtering before displaying it to the user.”

How Do Excessive Data Exposure Exploits Work? 

Attackers can probe for excessive data exposure in a number of ways. They can analyze legitimate response traffic, looking for exposed sensitive data, or, more commonly, they can look for human patterns – development team practices – that indicate ways to attack an API. 

OWASP gives this example: 

The mobile team uses the /api/articles/{articleId}/comments/{commentId} endpoint in the articles view to render comments metadata. Sniffing the mobile application traffic, an attacker finds out that other sensitive data related to comment’s author is also returned. The endpoint implementation uses a generic toJSON() method on the User model, which contains PII, to serialize the object. 
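
To make the OWASP scenario concrete, here is a minimal Python sketch of it. The User and Comment classes, their field names, and the sample values are hypothetical and chosen only for illustration; the to_json() method stands in for the generic serializer OWASP describes.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class User:
    # Hypothetical model: the last three fields are PII the comments view never needs.
    id: int
    display_name: str
    email: str
    home_address: str
    date_of_birth: str


@dataclass
class Comment:
    id: int
    body: str
    author: User

    def to_json(self) -> str:
        # Generic serializer: asdict() recurses into the nested User,
        # so every author property rides along, sensitive or not.
        return json.dumps(asdict(self))


author = User(7, "sam", "sam@example.com", "1 Main St", "1990-01-01")
response_body = Comment(42, "Nice article!", author).to_json()

# The mobile UI only renders these author fields...
needed_by_ui = {"id", "display_name"}
# ...but anyone sniffing the traffic sees everything else too.
leaked = set(json.loads(response_body)["author"]) - needed_by_ui
print(sorted(leaked))  # ['date_of_birth', 'email', 'home_address']
```

The client hides the extra fields from the screen, but they are still on the wire, which is all an attacker needs.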

How to Prevent Excessive Data Exposure 

When building APIs, developers commonly serialize all the data related to a particular API resource, irrespective of that data’s sensitivity. This practice may seem like a common-sense, time-saving design pattern, but it can result in an information leak, where sensitive data is exposed to unauthorized clients or bad actors. A more defensive practice is to clearly classify the data in a system and to define a separate data model for public interfaces such as APIs.
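
One way to sketch that defensive practice in Python is shown below. The UserRecord and PublicAuthor classes, the SENSITIVE_FIELDS set, and the sample values are all assumptions made for illustration, not a prescribed design.

```python
from dataclasses import dataclass, asdict, fields

# Hypothetical data classification: fields that must never leave the system.
SENSITIVE_FIELDS = {"email", "ssn"}


@dataclass
class UserRecord:
    # Internal model (hypothetical): never serialized directly.
    id: int
    display_name: str
    email: str
    ssn: str


@dataclass(frozen=True)
class PublicAuthor:
    # Public model: the only author shape the API is allowed to return.
    id: int
    display_name: str

    @classmethod
    def from_record(cls, record: UserRecord) -> "PublicAuthor":
        # Fields are copied explicitly, so a new column on UserRecord
        # stays private until someone deliberately adds it here.
        return cls(id=record.id, display_name=record.display_name)


# Guard: fail fast if a classified field ever appears in the public model.
public_field_names = {f.name for f in fields(PublicAuthor)}
assert not SENSITIVE_FIELDS & public_field_names

record = UserRecord(7, "sam", "sam@example.com", "000-00-0000")
payload = asdict(PublicAuthor.from_record(record))
print(payload)  # {'id': 7, 'display_name': 'sam'}
```

Because the public model is an explicit allow-list, exposing a new property becomes a deliberate code change rather than a side effect of a schema change.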

Bottom line for developers: Be conservative about what data you return in API responses. It might seem like a great idea to “future-proof” an API by making it usable for applications that the application owner did not originally envision. But a future fraught with data breaches isn’t on anyone’s bucket list. Instead, be conservative in resource representations and include only the data necessary for well-understood use cases. This conservative approach reduces implementation effort and presents less attack surface that bad actors can use to map, undermine, and exploit your API.

How ThreatX Can Help 

A common attack pattern we see is analogous to directory enumeration, only applied to an API’s namespace. If your developers name API calls for human readability, they might, for instance, implement something like: /api/getAddress/{userId}/

A clever attacker might already see a potential pattern here, and might try: 

…/api/getSSN/{userId}/

If the application stored this sensitive piece of PII (pro tip: it shouldn’t), an attacker might guess their way to an undocumented API call, and through that reconnaissance, they might gain access to PII.

This is where the ThreatX Platform shines. In the attack described above, the attacker had to try a few different patterns. Even if they guessed right, the probing itself left a visible signal of attack. We protect against that pattern of attack in two ways.

First, application owners who already have an OpenAPI Schema file for their API can upload that schema to the ThreatX Platform. ThreatX will then monitor the covered endpoints and report on any traffic coming through that doesn’t follow the patterns defined by the schema. 
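
As a rough illustration of the idea of checking traffic against a schema (not a description of how the ThreatX Platform implements it), the Python sketch below flags requests that match no documented path. SPEC_PATHS is a hypothetical, simplified stand-in for the paths section of an OpenAPI file, and template_to_regex and follows_schema are made-up helper names.

```python
import re

# Hypothetical, simplified excerpt of an OpenAPI "paths" section:
# path template -> allowed HTTP methods.
SPEC_PATHS = {
    "/api/getAddress/{userId}/": ["get"],
}


def template_to_regex(template: str) -> re.Pattern:
    # Replace each {param} with a single-segment wildcard; escape the rest.
    parts = re.split(r"(\{[^/{}]+\})", template)
    pattern = "".join(
        "[^/]+" if part.startswith("{") else re.escape(part) for part in parts
    )
    return re.compile(f"^{pattern}$")


COMPILED = {
    template_to_regex(template): {m.lower() for m in methods}
    for template, methods in SPEC_PATHS.items()
}


def follows_schema(method: str, path: str) -> bool:
    """Return True if the request matches a documented path and method."""
    return any(
        rx.match(path) is not None and method.lower() in methods
        for rx, methods in COMPILED.items()
    )


print(follows_schema("GET", "/api/getAddress/123/"))     # True
print(follows_schema("GET", "/api/getSSN/123/"))         # False: undocumented path
print(follows_schema("DELETE", "/api/getAddress/123/"))  # False: undocumented method
```

A guessed call such as the getSSN example falls outside the schema whether or not the endpoint exists, so it can be reported either way.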

Second, for application owners who do not yet have an OpenAPI Schema specification, ThreatX provides visibility into the API endpoints actually in use within your system, and gives you the tools to block excessive data exposure within those APIs.

How Our Approach Is Unique   

Real-Time Blocking   

Some API security solutions simply highlight potential API vulnerabilities, leaving security teams to investigate and recommend code changes. Other API solutions can identify an attacking IP, but require security teams to try to model the complex behavior in a third-party WAF (or try to block one IP at a time after the fact). ThreatX doesn’t just tell you about API vulnerabilities like excessive data exposure; we also block API attacks in real time. ThreatX proxies and scans all inbound API traffic, identifying and blocking attacks as they happen.

ThreatX can recognize attacker behavior indicative of an attempt to exploit excessive data exposure, then flag and watch that user. This real-time monitoring enables ThreatX to execute advanced threat engagement techniques, such as IP interrogation, fingerprinting, and tarpitting. When a series of user interactions crosses our default (or your customized) risk threshold, we block the attack.
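
The general idea of a cumulative risk threshold can be sketched in a few lines of Python. This is an illustrative toy, not ThreatX’s scoring model: the signal names, weights, threshold value, and the RiskTracker class are all assumptions.

```python
from collections import defaultdict

# Hypothetical weights and threshold, for illustration only.
SIGNAL_WEIGHTS = {"undocumented_path": 30, "id_enumeration": 25, "normal": 0}
RISK_THRESHOLD = 70


class RiskTracker:
    def __init__(self, threshold: int = RISK_THRESHOLD) -> None:
        self.threshold = threshold
        self.scores = defaultdict(int)  # client identifier -> accumulated risk

    def observe(self, client_id: str, signal: str) -> bool:
        """Add the signal's weight to the client's score; return True to block."""
        self.scores[client_id] += SIGNAL_WEIGHTS.get(signal, 0)
        return self.scores[client_id] >= self.threshold


tracker = RiskTracker()
signals = ["normal", "undocumented_path", "id_enumeration", "undocumented_path"]
decisions = [tracker.observe("client-a", s) for s in signals]
print(decisions)  # [False, False, False, True]
```

No single request triggers the block here; the decision comes from the series of interactions, which is the point of scoring behavior over time rather than matching one rule.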

Step One of N…   

One caution on the OWASP Top 10 list of API vulnerabilities: it’s a great way to categorize vulnerabilities, but it’s not necessarily an attacker playbook. Attackers don’t try one simple thing. They attack APIs by stringing together a series of attacks over time, often using federated and sophisticated botnets. Countering this approach requires the ability to correlate attack traffic across multiple IPs, the use of advanced bot protection, and the ability to detect identifiers and techniques that associate the traffic with a unique attacker.

Identifying Risk   

Attackers camouflage their attempts to exploit an API with excessive data exposure or other vulnerabilities by generating more suspicious or elevated application traffic. ThreatX detects and blocks potential threats based on behavior, but also identifies risky attributes being used in API traffic. ThreatX’s new API Dashboard details API endpoint usage and how it compares to expected behavior defined by an organization’s API schema specifications. With this visibility, customers can identify back doors and shut them against these sophisticated, multi-mode attacks that are becoming a common threat.   

Fewer False Positives   

As risk rises, ThreatX immediately blocks an attack, stopping the threat in its tracks. ThreatX’s blocking modes are designed to block malicious requests and deter suspicious entities from attacking your APIs, while allowing benign traffic and real users through. Legacy WAFs have always struggled with false positives because they make binary blocking decisions based on rule matches and signatures, but neither attackers nor legitimate users play by the rules. Sometimes a legitimate user who forgot their password looks like an attacker, and sometimes an attacker cycling through usernames and passwords looks like a legitimate user. ThreatX can tell the difference.

Learn more in A Security Practitioner’s Introduction to API Attack Protection. Or request a demo of the ThreatX solution.   


About the Author

Bret Settle

Bret has served in multiple executive roles for Corporate Express/Staples and BMC Software and has extensive knowledge of the software development and security products industries. He has been responsible for enterprise security in multiple roles and has been an innovator throughout his career, with a proven track record of building and developing high-performing organizations and dynamic cybersecurity teams.