Data Handling

Data handling is the standardized context for how we want SDKs to help users filter data.

Sensitive Data

SDKs should not include PII or other sensitive data in the payload by default. When building an SDK, we can come across an API that returns information useful for debugging a problem. If that API returns data considered PII, we guard it behind a flag called Send Default PII. This is an option in the SDK called send-default-pii and it is disabled by default. That means data that is inherently sensitive is not sent unless the user opts in.

Some examples of data guarded by this flag:

When attaching HTTP requests to events:
- Request Body: "raw" bodies (bodies which cannot be parsed as JSON or form data) are removed.
- HTTP Headers: known sensitive headers such as Authorization or Cookie are removed too.
- Note that if a user explicitly sets a request on the scope, nothing is stripped from that request. The above rules only apply to integrations that come with the SDK.
User-specific information (e.g. the current user ID according to the web framework in use) is not sent at all.
On desktop applications:
- The username logged in to the device is not included. This is often a person's name.
- The machine name is not included, for example "Bruno's laptop".
SDKs don't set user.ip to {{auto}}. Setting that value instructs the server to keep the connection's IP address.
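
As a rough sketch of the HTTP request rules above, an integration might filter a captured request like this before attaching it to an event. The names `scrubRequest` and `isParseableBody`, the shape of the request object, and the exact header list are assumptions made for illustration, not the API of any particular SDK.

```javascript
// Hypothetical list; a real SDK maintains its own set of sensitive headers.
const SENSITIVE_HEADERS = new Set(['authorization', 'cookie', 'set-cookie']);

// Treats a body as parseable when it is declared as form data or parses as JSON.
function isParseableBody(body, contentType) {
  const type = contentType.toLowerCase();
  if (type.includes('application/x-www-form-urlencoded') ||
      type.includes('multipart/form-data')) {
    return true;
  }
  try {
    JSON.parse(body);
    return true;
  } catch {
    return false;
  }
}

// Returns a copy of `request` with sensitive parts removed unless
// sendDefaultPii is enabled.
function scrubRequest(request, sendDefaultPii) {
  if (sendDefaultPii) {
    return { ...request };
  }
  const headers = {};
  let contentType = '';
  for (const [name, value] of Object.entries(request.headers ?? {})) {
    const lower = name.toLowerCase();
    if (lower === 'content-type') {
      contentType = String(value);
    }
    if (!SENSITIVE_HEADERS.has(lower)) {
      headers[name] = value;
    }
  }
  const scrubbed = { ...request, headers };
  // "Raw" bodies (neither JSON nor form data) are dropped entirely.
  if (typeof request.body === 'string' &&
      !isParseableBody(request.body, contentType)) {
    delete scrubbed.body;
  }
  return scrubbed;
}
```

A request that the user sets explicitly on the scope would bypass a function like this, since the rules only apply to integrations that ship with the SDK.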

Regarding IP addresses specifically, it's important to note that logging the IP address of incoming connections is standard practice for services on the Internet. This allows security tools and operations teams to understand abuse coming from a single IP, such as spam bots. It also lets developers see whether issues in their application are being triggered by a single malicious source.

The Sentry server is always aware of the connecting IP address and can use it for logging on some platforms, namely JavaScript and iOS/macOS/tvOS. All other platforms require the event to include user.ip={{auto}}, which happens if sendDefaultPii is set to true.
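
The rule above could be sketched as follows. This is only an illustration: `applyDefaultUserIp` is a hypothetical helper, the field is written as `user.ip` to match the text (the exact payload key is defined by the event protocol and may differ), and leaving an explicitly set IP address untouched is an assumption.

```javascript
// Adds user.ip = "{{auto}}" only when the user opted in via sendDefaultPii.
function applyDefaultUserIp(event, options) {
  if (!options.sendDefaultPii) {
    return event;
  }
  const user = event.user ?? {};
  // Assumption: an IP address set explicitly by the user is never overwritten.
  if (user.ip !== undefined) {
    return event;
  }
  return { ...event, user: { ...user, ip: '{{auto}}' } };
}
```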

Before sending events to Sentry, the SDKs should invoke callbacks. That allows users to remove any sensitive data client-side.

before-send and event-processors can be used to register a callback with custom logic to remove sensitive data.
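
A minimal sketch of such a callback is shown below. The event shape (`user.email`, `extra`), the key pattern, and the `[Filtered]` placeholder are assumptions for illustration. How the callback is registered depends on the SDK.

```javascript
// Hypothetical pattern for keys whose values should never leave the client.
const SENSITIVE_KEY = /password|secret|token/i;

// A before-send style callback: receives the event and returns the event
// that should be sent.
function beforeSend(event) {
  if (event.user) {
    delete event.user.email;
  }
  for (const key of Object.keys(event.extra ?? {})) {
    if (SENSITIVE_KEY.test(key)) {
      event.extra[key] = '[Filtered]';
    }
  }
  return event;
}
```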

Application State

App state can be critical to help developers reproduce bugs. For that reason, SDKs often collect app state and append it to events through auto instrumentation.

When attaching data that could potentially include sensitive data or PII, it's important to:

Add a note in the docs to notify developers.
Mark that part of the protocol on Relay as such. This allows data scrubbing to run on those fields.

Some examples of auto instrumentation that could attach sensitive data:

A SQL integration that includes the query. If a user doesn't use parameterized queries, and appends sensitive data to it, the SDK could include that in the event payload.
Desktop apps including window title.
A web framework routing instrumentation attaching the routes navigated to and from.

Structuring Data

For better data scrubbing on the server side, SDKs should save data in a structured way when possible. The starting point of this discussion was RFC-0038.

Spans

Structuring span data helps Relay know what kind of data it receives, which in turn helps with scrubbing sensitive data.

http spans containing URLs:
The description of spans with op set to http must follow the format HTTP_METHOD scheme://host/path (e.g. GET https://example.com/foo). If the URL's authority contains credentials (https://username:password@example.com), the credentials must be omitted completely. If a query string or fragment is present in the URL, both are set in the data attribute of the span.
span.setData({ 'http.query': url.getQuery(), 'http.fragment': url.getFragment() })
Additionally, all OpenTelemetry semantic conventions for http spans should be set in span.data if applicable: https://opentelemetry.io/docs/reference/specification/trace/semantic_conventions/http/
db spans containing database queries (sql, graphql, elasticsearch, mongodb, ...):
The description field should include the sanitized database command. All sensitive data should be removed and replaced with a placeholder.
Additionally, all OpenTelemetry semantic conventions for database spans should be set in span.data if applicable: https://opentelemetry.io/docs/reference/specification/trace/semantic_conventions/database/
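
To illustrate what a sanitized database command might look like, here is a deliberately naive sketch for SQL. The function name `sanitizeSql` and the `?` placeholder are assumptions, and a regex-based approach like this only handles simple string and numeric literals. A real SDK would more likely rely on the database driver's parameterized form or a proper tokenizer.

```javascript
// Replaces literal values in a SQL string with a placeholder.
function sanitizeSql(query) {
  return query
    // Single-quoted string literals, including '' escapes inside them.
    .replace(/'(?:[^']|'')*'/g, '?')
    // Numeric literals that are not part of an identifier or a $1-style parameter.
    .replace(/(?<![\w$.])\d+(?:\.\d+)?\b/g, '?');
}

const description = sanitizeSql(
  "SELECT * FROM users WHERE id = 42 AND name = 'O''Brien' AND score > 3.5"
);
// description === "SELECT * FROM users WHERE id = ? AND name = ? AND score > ?"
```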

Breadcrumbs

If the message in a breadcrumb contains a URL, it should be formatted the same way as in http spans (see above). The query and the fragment should also be set in the data attribute, as with http spans.
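
The formatting shared by http spans and breadcrumbs could be sketched with the standard `URL` class as follows. The helper names `describeHttpUrl` and `buildHttpBreadcrumb` are hypothetical, and storing the query and fragment without their leading `?` and `#` is an assumption.

```javascript
// Splits a URL into the "HTTP_METHOD scheme://host/path" description and
// the query/fragment values that belong in the data attribute.
function describeHttpUrl(method, rawUrl) {
  const url = new URL(rawUrl);
  const data = {};
  if (url.search) {
    data['http.query'] = url.search.slice(1);
  }
  if (url.hash) {
    data['http.fragment'] = url.hash.slice(1);
  }
  // url.host contains the hostname and port, never the username or password.
  const description =
    `${method.toUpperCase()} ${url.protocol}//${url.host}${url.pathname}`;
  return { description, data };
}

// Builds a breadcrumb whose message follows the same format as http spans.
function buildHttpBreadcrumb(method, rawUrl) {
  const { description, data } = describeHttpUrl(method, rawUrl);
  return { message: description, data };
}

const crumb = buildHttpBreadcrumb(
  'get',
  'https://username:password@example.com/foo?bar=1#top'
);
// crumb.message === "GET https://example.com/foo"
// crumb.data    === { 'http.query': 'bar=1', 'http.fragment': 'top' }
```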

Variable Size

Fields in the event payload that allow user-specified or dynamic values are restricted in size. This applies to most metadata fields, such as variables in a stack trace, as well as contexts, tags and extra data:

Mappings of values (such as HTTP data, extra data, etc) are limited to 50 item pairs.
Event IDs are limited to 36 characters and must be valid UUIDs.
Tag keys are limited to 32 characters.
Tag values are limited to 200 characters.
Culprits are limited to 200 characters.
Context objects are limited to 8kB.
Individual extra data items are limited to 16kB. Total extra data is limited to 256kB.
Messages are limited to 8192 characters.
HTTP data (the body) is limited to 8kB. Always trim HTTP data before attaching it to the event.
Stack traces are limited to 50 frames. If more are sent, data will be removed from the middle of the stack.
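
As an illustration of a few of the limits above, a client-side trimming step might look like the sketch below. The helper names, the event shape, and the choice to keep an equal number of frames from each end are assumptions. The sketch also counts JavaScript string length (UTF-16 code units), which only approximates a character count.

```javascript
// A subset of the limits listed above.
const LIMITS = { tagKey: 32, tagValue: 200, message: 8192, frames: 50 };

function truncate(value, max) {
  return value.length > max ? value.slice(0, max) : value;
}

// Keeps frames from both ends of the stack and drops the middle.
function trimFrames(frames, max = LIMITS.frames) {
  if (frames.length <= max) {
    return frames;
  }
  const head = Math.ceil(max / 2);
  const tail = max - head;
  return [...frames.slice(0, head), ...frames.slice(frames.length - tail)];
}

// Returns a copy of the event with the message and tags cut to size.
function applyLimits(event) {
  const limited = { ...event };
  if (typeof event.message === 'string') {
    limited.message = truncate(event.message, LIMITS.message);
  }
  if (event.tags) {
    limited.tags = {};
    for (const [key, value] of Object.entries(event.tags)) {
      limited.tags[truncate(key, LIMITS.tagKey)] =
        truncate(String(value), LIMITS.tagValue);
    }
  }
  return limited;
}
```

Note that truncating tag keys can make two long keys collide. A real implementation would need to decide how to handle that case.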

Additionally, size limits apply to all store requests for the total size of the request, event payload, and attachments. Sentry rejects all requests exceeding these limits. Please refer to the following resources for the exact size limits:
