FAQ
General
Are you a data broker? Strictly speaking, no. We do not aim to redistribute others’ proprietary datasets standalone. Any dataset that we make available as a listing on Snowflake Marketplace meets at least some of the following criteria:
- Is transformed to match our centralized schema
- Has normalized entities that match our index of that entity
- Is combined with other datasets
- Is extrapolated or modeled to be representative of the real world
I have a dataset I would like to monetize. Do you consult with companies on monetization? Yes, although it depends on the dataset and your business model.
If your dataset is valuable on its own and you intend to monetize it directly, you likely do not need us: you can simply list your dataset on the Snowflake Marketplace. If your dataset would be more valuable if combined with other data assets or requires significant transformation to make it commercially viable, then we might be the right fit for a partnership. Please Contact Us.
What terms apply? Most of our listings (both free and paid) follow our standard terms of service. Some paid products have custom terms, which are included in the specific product listing when you purchase via the Snowflake Marketplace.
Snowflake Datasets & Updates
How do you determine which public data products are free and which are paid?
We aim to make all public domain data, typically from government releases, free of charge for internal use.
We structure that data to make it compatible across products with a common geo_id. Often, government releases serve as good benchmarks for external, proprietary data (for example, a proprietary real-time measure of inflation should align to the monthly Bureau of Labor Statistics Inflation release).
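As a rough illustration of what a common geo_id makes possible, the sketch below joins rows from two products on that key. The table contents, column names other than geo_id, and the geo_id values themselves are hypothetical and chosen only for the example; the real identifiers in the datasets may look different.

```python
# Hypothetical rows from two products that share the geo_id key.
# All values below are invented for illustration only.
population_rows = [
    {"geo_id": "geo_a", "population": 100},
    {"geo_id": "geo_b", "population": 200},
]
unemployment_rows = [
    {"geo_id": "geo_a", "unemployment_rate": 5.0},
    {"geo_id": "geo_c", "unemployment_rate": 3.0},
]


def join_on_geo_id(left, right):
    """Inner-join two lists of row dicts on their shared geo_id."""
    right_by_id = {row["geo_id"]: row for row in right}
    return [
        {**row, **right_by_id[row["geo_id"]]}
        for row in left
        if row["geo_id"] in right_by_id
    ]


# Only geo_a appears in both inputs, so only that row survives the join.
joined = join_on_geo_id(population_rows, unemployment_rows)
```

In Snowflake itself the same idea would be an ordinary SQL join on geo_id between two mounted tables.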
For production use cases, the Snowflake Data Foundations paid product includes technical support, external derivative usage, point-in-time history, and backwards compatibility in addition to enterprise-only public datasets.
How often do datasets update and how do I receive updates? Each dataset (i.e., each listing on the Snowflake Marketplace) updates at a different frequency, largely driven by the release timing of the underlying data-generating process. For instance, public domain datasets from government agencies update as frequently as the government releases new data. You can find release schedules from underlying sources on a dataset's documentation page in the "Data Sources and Release Frequency" section.
All updates across Snowflake datasets are tracked in our changelog.
How do you measure data quality? How will I know if there are data quality issues? For public domain data, we intend to pass data through as accurately and quickly as possible, based on the government releases.
Our aim is to alert users to known issues. By default, the email used to mount a dataset in Snowflake Marketplace receives updates for known data issues.
If you see issues with data, please email us at snowflake-public-data@snowflake.com.
Queries on your dataset are slow; how do I fix this? We aim to optimize our datasets for the most common queries we expect users to run, typically by clustering tables on the fields we anticipate will be used most frequently. Our data products are intended for broad use cases, so we may not have optimized for your specific needs. See the Snowflake documentation here for more information.
You can also improve query performance by utilizing other optimization methods, specifically Query Acceleration Service (QAS) and Search Optimization. You will need to copy the data from the Snowflake shared schema (the schema created when you mounted the listing) into your own schema and enable these accelerations.
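The copy-then-enable steps above can be sketched as follows. This is a minimal sketch, not our recommended setup: the helper name `acceleration_statements`, the table and warehouse names, and the scale factor of 8 are all hypothetical. It only builds the SQL strings; you would run them in a worksheet or through whichever Snowflake client you already use.

```python
def acceleration_statements(shared_table, local_table, warehouse,
                            max_scale_factor=8):
    """Return the Snowflake SQL statements, in order, as plain strings.

    Identifiers are interpolated as-is, so pass only names you trust.
    """
    return [
        # Copy the shared (read-only) table into a schema you own.
        f"CREATE TABLE {local_table} AS SELECT * FROM {shared_table};",
        # Enable Search Optimization on the local copy.
        f"ALTER TABLE {local_table} ADD SEARCH OPTIMIZATION;",
        # Enable Query Acceleration Service on the warehouse running the queries.
        f"ALTER WAREHOUSE {warehouse} SET ENABLE_QUERY_ACCELERATION = TRUE "
        f"QUERY_ACCELERATION_MAX_SCALE_FACTOR = {max_scale_factor};",
    ]


# Hypothetical names: replace with your mounted listing, schema, and warehouse.
statements = acceleration_statements(
    shared_table="SHARED_DB.SHARED_SCHEMA.SOME_TABLE",
    local_table="MY_DB.MY_SCHEMA.SOME_TABLE",
    warehouse="MY_WH",
)
for statement in statements:
    print(statement)
```

Note that a local copy does not receive the automatic updates of the shared listing, so it needs to be refreshed on whatever schedule suits your use case.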
Finally, it is worth using Snowflake’s query profiler to understand bottlenecks. Typically, we can help if a dataset is experiencing a skewed join. If the majority of your query time is spent on table scans, you should increase the size of your warehouse.
If you have a query you are seeing poor performance on, please email us at snowflake-public-data@snowflake.com.
Can I re-distribute your data or use your data in my product? By default, our terms of service (the contract you agree to when mounting a listing) do not allow for data redistribution. Our data is intended for internal use. If you have a redistribution use case, please Contact Us.
How do you determine which public datasets will be added to the Snowflake Marketplace? We prioritize datasets based on customer feedback. Please contact us if there is a dataset you would like to have added to our pipeline.
I found a bug. Can you fix it? Most likely, yes. Please Contact Us. We would appreciate it if you could include details or a reproducible example of the bug.
Snowflake Marketplace
What is the Snowflake Marketplace? A marketplace to discover and access third-party data and services directly in the Snowflake Data Cloud. Data consumers securely access live, governed, shared datasets directly from their Snowflake account and receive automatic updates in real time.
What forms of payment are accepted for your products? The Snowflake Marketplace accepts payment through the following methods:
- Marketplace Capacity Drawdown (using your organization’s Capacity commitment with Snowflake)
- ACH payment
- Wire transfer
- SWIFT transfer
- Credit card
To find out more on paying for listings, see here.
Can I pay with Snowflake credits? Yes, your organization can use Marketplace Capacity Drawdown (see Snowflake documentation here). Depending on when your Snowflake agreement was executed, using this payment option may require an amendment to your organization’s service agreement.