Performance and cost in Snowflake are directly related because:
Compute = Warehouses = Billing
Storage is inexpensive, but compute costs scale with warehouse usage.
Understanding how performance choices influence cost (and vice versa) is essential both for the exam and real-world usage.
Warehouse sizing determines how fast queries run and how much you pay.
The goal is to choose the right warehouse size and behavior for your workload.
Warehouse sizes range from X-SMALL up to 6X-LARGE, with each step up doubling both compute resources and credit consumption.
When choosing the size, consider:
Query complexity
Heavy joins, large aggregations, and complex transformations benefit from larger sizes.
Query volume
More simultaneous work means more compute needed.
Concurrency
Many users working at the same time may require scale-out or multi-cluster.
Pros:
Faster execution for complex queries
Better performance for large datasets
Cons:
Credit consumption roughly doubles with each size increase
A larger warehouse does not fix everything; sometimes query tuning or clustering is more effective.
Instead of using one huge warehouse, use separate warehouses for different workloads.
Use different warehouses for:
ETL/ELT
Typically scheduled jobs or heavy transformations.
BI/Reporting
High concurrency, mostly read-heavy queries.
Data science / ad hoc
Spiky workloads, unpredictable query patterns.
Benefits of workload separation:
Prevents workload interference
Makes tuning easier (each warehouse isolated)
Allows separate auto-suspend rules
Improves concurrency
Reduces total cost by suspending non-active warehouses
Do not use one giant warehouse for everything.
It creates contention, wastes money, and limits flexibility.
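As a sketch, separate warehouses per workload might be created as follows (warehouse names, sizes, and suspend values are illustrative, not prescriptive):

```sql
-- Illustrative names and sizes; tune for your own workloads
CREATE WAREHOUSE IF NOT EXISTS etl_wh
  WAREHOUSE_SIZE = 'LARGE'    -- heavy scheduled transformations
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;

CREATE WAREHOUSE IF NOT EXISTS bi_wh
  WAREHOUSE_SIZE = 'SMALL'    -- read-heavy, high-concurrency reporting
  AUTO_SUSPEND = 120
  AUTO_RESUME = TRUE;

CREATE WAREHOUSE IF NOT EXISTS adhoc_wh
  WAREHOUSE_SIZE = 'MEDIUM'   -- spiky data science / ad hoc work
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;
```

Each warehouse can then be suspended, resumed, and tuned independently of the others.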
Auto-suspend turns off warehouses after N seconds of inactivity (common values: 60–300 seconds).
Benefits:
Avoids paying for idle compute
Billing stops as soon as the warehouse suspends (note that each resume incurs a minimum 60-second charge)
When a new query arrives:
The warehouse automatically resumes
Resume is near-instant
Queries do not fail because of suspension
As long as AUTO_RESUME is enabled, Snowflake automatically resumes a suspended warehouse when a query arrives, so suspended warehouses never block queries.
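Both behaviors are controlled per warehouse; a minimal sketch (the warehouse name is hypothetical):

```sql
-- Suspend after 60 seconds of inactivity; resume automatically on the next query
ALTER WAREHOUSE my_wh SET
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE;
```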
Snowflake warehouses can run into queueing when concurrency is high.
A single warehouse:
Has limited compute threads
Can only process a certain number of queries at once
Will queue queries when overloaded
This leads to slow dashboards or delayed job execution.
Multi-cluster warehouses solve concurrency problems by adding more clusters behind a single warehouse.
You configure:
Minimum clusters
Maximum clusters
A scaling policy:
STANDARD (starts extra clusters aggressively to minimize queueing)
ECONOMY (starts extra clusters conservatively to save credits, tolerating some queueing)
When Snowflake detects queueing:
It launches an additional cluster
Queries are distributed across clusters
When concurrency decreases, extra clusters automatically shut down
Typical multi-cluster use cases:
BI dashboards with many users
Heavy analyst workloads
Teams sharing the same logical warehouse
More clusters = higher cost.
Therefore, you should tune min/max cluster counts carefully.
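A multi-cluster configuration can be sketched like this (name, size, and cluster counts are illustrative):

```sql
-- Scales out from 1 to 4 clusters when Snowflake detects queueing
CREATE WAREHOUSE IF NOT EXISTS bi_wh
  WAREHOUSE_SIZE = 'MEDIUM'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 4
  SCALING_POLICY = 'STANDARD';
```

Keeping MIN_CLUSTER_COUNT at 1 means extra clusters (and their cost) appear only under real concurrency pressure.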
This section affects both performance and cost, because better pruning = fewer scanned partitions = lower compute usage.
Micro-partitions are:
Internal storage units holding 50–500 MB of uncompressed data (roughly 16 MB compressed)
Immutable
Automatically created by Snowflake
Each partition stores metadata such as:
Column min/max
Distinct value counts
Null counts
This metadata is used for pruning.
Pruning allows Snowflake to:
Skip scanning irrelevant partitions
Reduce compute needed
Improve query speed
Example:
If a query filters WHERE date = '2024-01-01', only micro-partitions overlapping that date are scanned.
Clustering determines how evenly and efficiently micro-partitions are organized.
Good clustering:
High pruning → fewer partitions scanned → faster queries → lower cost
Poor clustering:
Many partitions scanned → slower queries → higher cost
Data is naturally clustered by the order in which it is loaded; Snowflake does not explicitly sort it during ingestion.
For many tables, natural clustering is sufficient.
When natural clustering is insufficient, you define a clustering key on frequently filtered columns.
Typical keys:
date
customer_id
region
Snowflake will automatically recluster as data grows.
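Defining a clustering key is a single statement; a sketch (table and column names are hypothetical):

```sql
-- Cluster on the columns most often used in filters
ALTER TABLE sales CLUSTER BY (sale_date, region);
```

After this, Snowflake's background reclustering keeps the table organized around these columns as new data arrives.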
The Search Optimization Service is a special feature enabling much faster:
Point lookups
Highly selective queries
Contains / IN / LIKE queries
It is useful when normal clustering is not effective.
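Enabling it is table-level; a sketch with a hypothetical table and column:

```sql
-- Enable search optimization for the whole table
ALTER TABLE customers ADD SEARCH OPTIMIZATION;

-- Or target specific lookup patterns on chosen columns
ALTER TABLE customers ADD SEARCH OPTIMIZATION ON EQUALITY(customer_id);
```

Note that the service maintains its own search access paths, which adds storage and serverless compute cost.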
You must understand when to cluster and when to scale the warehouse:
Performance issue on single heavy queries → scale up
Too many scanned partitions → clustering
Too many queued queries → multi-cluster warehouse
This section focuses on SQL techniques and Snowflake tools that improve query performance.
Query Profile shows:
Amount of data scanned
Micro-partition pruning efficiency
Distribution of work
JOIN behavior
Bottlenecks
Learning to read Query Profile is essential for tuning.
Push filters as early as possible:
SELECT ...
FROM table
WHERE date > '2024-01-01';
This reduces the amount of data scanned.
Avoid SELECT *, especially in wide tables.
Operations such as ORDER BY, DISTINCT, UNION, and large joins force Snowflake to perform extra work:
Sorting
Deduplication
Large data movement
Only use them when necessary.
Correct data types:
Improve storage efficiency
Improve micro-partition metadata
Improve pruning
Reduce query cost
For loading:
Avoid extremely large files
Use many smaller files for parallelism
Splitting input into many appropriately sized files allows Snowflake to parallelize loading across multiple threads.
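A hedged sketch of a parallel bulk load (stage, table, and file pattern are hypothetical):

```sql
-- Load all matching files from a stage; many medium-sized files load in parallel
COPY INTO sales
  FROM @my_stage/sales/
  FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
  PATTERN = '.*sales_.*[.]csv[.]gz';
```

Each file becomes an independent unit of work, so evenly sized files keep all warehouse threads busy.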
Snowflake provides many ways to control and reduce cost.
Use:
Smallest warehouse that meets performance needs
Aggressive auto-suspend policies
Schedule workloads so warehouses run only when needed:
ETL at night
BI during the day
Bursty data science workloads on a dedicated warehouse
Each warehouse suspends independently, which reduces wasted compute.
Result cache reduces repeated query cost
Materialized views pre-compute expensive logic
Compress files
Use proper file sizes
Use COPY options effectively
Better ingestion = lower cost.
Important monitoring tools:
Account Usage views (SNOWFLAKE.ACCOUNT_USAGE)
Warehouse utilization metrics
Query history
These help identify:
Inefficient queries
Underutilized warehouses
Expensive transformations
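A sketch of a cost-monitoring query against Account Usage (requires access to the SNOWFLAKE.ACCOUNT_USAGE schema):

```sql
-- Credits consumed per warehouse over the last 7 days
SELECT warehouse_name,
       SUM(credits_used) AS credits
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -7, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY credits DESC;
```

Queries like this quickly surface oversized or always-on warehouses.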
Caching is a major contributor to Snowflake performance and cost efficiency. Each cache type operates at a different layer of Snowflake’s architecture and behaves differently.
The result cache stores the results of previously executed queries.
Key characteristics:
Stored in the Cloud Services layer
Shared across all warehouses in the same account
Returned instantly if the query text, role, session settings, and underlying data have not changed
Zero compute cost because no warehouse needs to run
This cache provides performance acceleration for repeated BI dashboards or recurring analytical queries.
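The result cache is on by default; when benchmarking raw query performance, it can be disabled for the current session:

```sql
-- Force queries to recompute instead of returning cached results
ALTER SESSION SET USE_CACHED_RESULT = FALSE;
```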
The metadata cache stores statistics and metadata about micro-partitions, including:
Minimum and maximum values
Null counts
Distinct values
Partition location
Other structural statistics
Metadata cache lives in the Cloud Services layer. It enables partition pruning, which reduces the number of micro-partitions scanned. Pruning greatly reduces compute cost.
The data cache lives on the local SSD of each warehouse cluster.
Key behaviors:
Warehouse-specific (not shared between warehouses)
Lost when the warehouse suspends
Helps accelerate subsequent scans of the same micro-partitions
Although less durable, the data cache improves performance for repeated scanning workloads such as iterative analytics.
Query Profile is Snowflake’s main tool for diagnosing performance and compute cost.
Important pruning metrics include:
Micro-partitions scanned
Micro-partitions pruned
Percentage of total partitions pruned
These metrics indicate how efficiently Snowflake avoids unnecessary reads.
High pruning percentage indicates that clustering and partition organization are efficient
Low pruning percentage often means:
Poor clustering
High data fragmentation
Queries filtering on unclustered columns
Need for different clustering keys or search optimization
Efficient pruning directly reduces cost because fewer micro-partitions are scanned.
Snowflake offers two scaling strategies that serve different purposes.
Scale-up means selecting a larger warehouse size, such as moving from MEDIUM to LARGE.
Effects:
More compute resources per cluster
Faster performance for single large queries
No improvement in concurrency
Higher cost per second of execution
Use scale-up when a small number of heavy queries require more computational power.
Scale-out means adding additional clusters within a multi-cluster warehouse.
Effects:
Increased concurrency
No improvement in single-query performance
More cost only when additional clusters are activated
Scale-out is useful for BI dashboards, team-wide workloads, and spiky concurrency scenarios.
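The two strategies map to two different warehouse parameters; a sketch (the warehouse name is hypothetical):

```sql
-- Scale up: more power per cluster (helps heavy single queries)
ALTER WAREHOUSE my_wh SET WAREHOUSE_SIZE = 'LARGE';

-- Scale out: more clusters (helps concurrency, not single-query speed)
ALTER WAREHOUSE my_wh SET MAX_CLUSTER_COUNT = 4;
```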
Materialized Views (MVs) are stored query results maintained incrementally.
Excellent for repeated expensive aggregations
Reduce compute by serving precomputed results
Provide significant performance gains for dashboards and repetitive workloads
Snowflake updates MVs incrementally whenever underlying micro-partitions change.
This maintenance process consumes compute resources.
Materialized Views introduce:
Additional storage for the materialized data
Additional compute cost for maintaining freshness
MVs are best for stable, well-defined workloads with predictable patterns.
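A sketch of a materialized view over a repeated aggregation (table and column names are hypothetical; note that Snowflake MVs support only a restricted class of single-table queries):

```sql
-- Precompute a repeated, expensive aggregation
CREATE MATERIALIZED VIEW daily_sales AS
SELECT sale_date,
       SUM(amount) AS total_amount
FROM sales
GROUP BY sale_date;
```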
Dynamic Tables support declarative data pipelines by automatically refreshing based on a defined freshness target.
Each Dynamic Table specifies how fresh its data must be.
Examples:
1 minute
5 minutes
1 hour
A lower target lag (fresher data) triggers more frequent refresh operations, increasing compute cost
A higher target lag reduces refresh frequency and compute cost
Dynamic Tables enable cost-tuned ELT pipelines where freshness and compute cost must be balanced carefully.
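The freshness target is declared as TARGET_LAG; a sketch (table, warehouse, and source names are hypothetical):

```sql
-- Refreshes automatically so results are never more than 5 minutes stale
CREATE DYNAMIC TABLE customer_totals
  TARGET_LAG = '5 minutes'
  WAREHOUSE = transform_wh
AS
SELECT customer_id,
       SUM(amount) AS total
FROM orders
GROUP BY customer_id;
```

Relaxing TARGET_LAG to '1 hour' would be the direct cost-saving lever for this pipeline.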
Query Acceleration Service supplements warehouse compute with additional temporary resources.
QAS is designed for:
Irregularly heavy operations
Highly complex queries
Queries that do not justify scaling the warehouse permanently
Automatically activated for qualifying queries when enabled
Billed separately from warehouse compute
Does not require administrators to change warehouse size
QAS is best for unpredictable workloads where sizing a warehouse to peak demand would be costly.
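QAS is enabled per warehouse; a sketch (the warehouse name and scale factor are illustrative):

```sql
-- Scale factor caps extra serverless compute relative to the warehouse size
ALTER WAREHOUSE my_wh SET
  ENABLE_QUERY_ACCELERATION = TRUE
  QUERY_ACCELERATION_MAX_SCALE_FACTOR = 8;
```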
Automatic Clustering continuously reorganizes micro-partitions to maintain efficient clustering.
Improves pruning efficiency
Reduces query scan cost
Maintains clustering depth as data grows or becomes disordered
Automatic Clustering consumes serverless compute under the hood; this cost is billed to the account separately from warehouse usage.
Recommended for:
Very large tables
Tables where queries frequently filter on specific clustering keys
Tables whose clustering depth significantly deteriorates over time
Admins must evaluate cost versus performance gains.
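Before paying for Automatic Clustering, clustering health can be inspected, and the service can be paused per table; a sketch with hypothetical names:

```sql
-- Report clustering depth and overlap for the chosen key
SELECT SYSTEM$CLUSTERING_INFORMATION('sales', '(sale_date)');

-- Pause or resume background reclustering for one table
ALTER TABLE sales SUSPEND RECLUSTER;
ALTER TABLE sales RESUME RECLUSTER;
```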
File size impacts parallelism and throughput of Snowflake’s bulk loading process.
For compressed files, aim for roughly 100–250 MB per file.
This size balances parallelism and efficiency.
Files smaller than roughly 10 MB:
Cause too many small tasks
Reduce load parallelism
Increase overhead
Files that are too large:
Decrease throughput
Increase load latency
Reduce the effectiveness of warehouse parallelism
Use many evenly sized files to maximize the number of parallel threads available in the warehouse.
Why might a query be slow even when using a large virtual warehouse?
Query slowness can result from inefficient SQL, poor pruning, lack of clustering, or data skew rather than insufficient compute.
Increasing warehouse size improves compute but does not fix inefficient query plans. Issues like scanning unnecessary micro-partitions or missing filters often dominate performance problems. A common mistake is scaling up compute without analyzing query execution plans.
Demand Score: 90
Exam Relevance Score: 95
What is auto-suspend and why is it important?
Auto-suspend automatically shuts down a virtual warehouse after a period of inactivity to reduce compute costs.
Since Snowflake charges per-second compute usage, leaving warehouses running wastes credits. Proper configuration minimizes idle cost. A common mistake is setting long suspend times or disabling it entirely.
Demand Score: 85
Exam Relevance Score: 92
What is auto-resume in Snowflake?
Auto-resume automatically starts a suspended warehouse when a query is submitted.
It ensures seamless user experience while still benefiting from cost savings. Without auto-resume, queries fail if the warehouse is suspended. A common mistake is disabling it, causing manual intervention.
Demand Score: 82
Exam Relevance Score: 90
How does micro-partition pruning improve performance?
Snowflake skips scanning irrelevant micro-partitions based on metadata, reducing data scanned and improving query speed.
Each micro-partition stores min/max metadata. Queries with filters can eliminate unnecessary partitions. A common mistake is not using selective filters, leading to full table scans.
Demand Score: 87
Exam Relevance Score: 94
When should multi-cluster warehouses be used?
They should be used when there is high concurrency with many simultaneous queries.
Multi-cluster warehouses scale out to handle multiple users. They do not improve single-query performance. A common mistake is enabling multi-cluster for performance tuning instead of concurrency.
Demand Score: 84
Exam Relevance Score: 92
What is the main factor affecting Snowflake cost?
Compute usage (virtual warehouse runtime) is the primary cost driver.
Storage is relatively inexpensive, while compute consumes credits. Efficient warehouse sizing and auto-suspend settings significantly impact cost. A common mistake is focusing only on storage optimization.
Demand Score: 88
Exam Relevance Score: 95