
Snowflake interview questions for freshers & experienced 2024


Most asked Snowflake questions

1. What is a Snowflake cloud data warehouse?

Snowflake is an analytic data warehouse implemented as a SaaS service. It is built on a new SQL database engine with a unique architecture built for the cloud. This cloud-based data warehouse solution was first available on AWS as software to load and analyze massive volumes of data. The most remarkable feature of Snowflake is its ability to spin up any number of virtual warehouses, which means the user can operate an unlimited number of independent workloads against the same data without any risk of contention.

2. Is Snowflake an ETL tool?

Snowflake can be used as an ETL tool. Loading data into Snowflake is a three-step process:

• Extract data from the source and create data files. The files can be in formats such as JSON, CSV, XML, and more.

• Load the files to a stage. Files can be staged in a Snowflake-managed internal stage or in an external location such as an Amazon S3 bucket or Microsoft Azure Blob Storage.

• Copy the staged data into a Snowflake table using the COPY INTO command (a sketch of this flow follows the list).
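A minimal sketch of this flow, assuming a hypothetical local CSV file and an orders table (the PUT command runs from a client such as SnowSQL):

  -- Upload the local file into the table's internal stage
  PUT file:///tmp/orders.csv @%orders;

  -- Copy the staged file into the table
  COPY INTO orders
    FROM @%orders
    FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);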

3. Explain Snowflake ETL?

The full form of ETL is Extract, Transform, and Load. ETL is the process we use to extract data from multiple sources and load it into a particular database or data warehouse. The sources can be third-party applications, databases, flat files, etc.

Snowflake ETL is an approach to applying the ETL process for loading the data into the Snowflake data warehouse or database. Snowflake ETL also includes extracting the data from the data sources, doing the necessary transformations, and loading the data into Snowflake.

4. How is data stored in Snowflake?

Snowflake stores data in multiple micro-partitions, which are internally optimized and compressed. The data is stored in a columnar format in Snowflake's cloud storage. The stored data objects are not directly accessible or visible to users; they can only be accessed by running SQL queries through Snowflake.



5. How is Snowflake distinct from AWS?

Snowflake offers storage and compute independently, and its storage is billed at rates comparable to raw cloud object storage. AWS addresses this separation with Redshift Spectrum, which allows querying data directly on S3, but it is not as seamless as Snowflake.

6. What type of database is Snowflake?

Snowflake is built entirely on SQL. It is a columnar-stored relational database that works well with Excel, Tableau, and many other tools. Snowflake includes its own query tool and supports multi-statement transactions, role-based security, and other capabilities expected of a SQL database.

7. Can AWS glue connect to Snowflake?

Definitely. AWS Glue provides a fully managed environment that connects easily with Snowflake as a data warehouse service. Together, the two solutions let you handle data ingestion and transformation with more ease and flexibility.


8. Explain Snowflake editions.

Snowflake offers multiple editions depending on your usage requirements.

• Standard edition - The introductory offering, providing full access to Snowflake's standard features.

• Enterprise edition - Offers all Standard edition features and services, plus additional features required by large-scale enterprises.

• Business-critical edition - Also called Enterprise for Sensitive Data (ESD), it offers higher levels of data protection for organizations with sensitive data.

• Virtual Private Snowflake (VPS) - Provides the highest level of security for organizations with strict requirements, such as those handling financial activities.

9. Define the Snowflake Cluster

In Snowflake, data partitioning is called clustering, and it is controlled by specifying cluster keys on a table. The process of managing how clustered data is maintained in a table is called re-clustering.
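As a hedged illustration, a cluster key can be declared at table creation or added later; the table and column names here are hypothetical:

  -- Define a cluster key when creating the table
  CREATE TABLE sales (sale_date DATE, region STRING, amount NUMBER)
    CLUSTER BY (sale_date, region);

  -- Or add one to an existing table; Snowflake re-clusters automatically in the background
  ALTER TABLE sales CLUSTER BY (sale_date, region);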

10. Explain Snowflake architecture

Snowflake is a cloud data warehouse that runs entirely on public cloud infrastructure and is a true SaaS offering. There is no software, hardware, ongoing maintenance, or tuning needed to work with Snowflake.

Three main layers make up the Snowflake architecture: database storage, query processing, and cloud services.

• Database storage - Loaded data is reorganized into Snowflake's internally optimized, compressed, columnar format.

• Query processing - Virtual warehouses process the queries in Snowflake.

• Cloud services - This layer coordinates and handles all activities across Snowflake, including authentication, metadata management, infrastructure management, access control, and query parsing and optimization.

11. What are the features of Snowflake? 

Unique features of the Snowflake data warehouse are listed below:

• Database and object cloning

• Support for XML

• External tables

• Hive metastore integration

• Supports geospatial data

• Security and data protection

• Data sharing

• Search optimization service

• Table streams on external tables and shared tables

• Result Caching

12. Why is Snowflake highly successful?

Snowflake is highly successful because of the following reasons:

• It supports a wide variety of technology areas like data integration, business intelligence, advanced analytics, security, and governance.

• It runs on cloud infrastructure and supports advanced design architectures, making it ideal for dynamic and fast-moving development.

• Snowflake supports built-in features like data cloning, data sharing, separation of compute and storage, and directly scalable compute.

• Snowflake eases data processing.

• Snowflake provides extendable computing power.

• Snowflake suits various applications like an ODS with staged data, a data lake alongside the data warehouse, raw marts, and data marts with cleansed and modelled data.

13. Tell me something about Snowflake AWS?

For managing today’s data analytics, companies rely on a data platform that offers rapid deployment, compelling performance, and on-demand scalability. Snowflake on the AWS platform serves as a SQL data warehouse, which makes modern data warehousing effective, manageable, and accessible to all data users. It enables the data-driven enterprise with secure data sharing, elasticity, and per-second pricing.



14. Describe Snowflake computing.

Snowflake cloud data warehouse platform provides instant, secure, and governed access to the entire data network and a core architecture to enable various types of data workloads, including a single platform for developing modern data applications.  



15. What is the schema in Snowflake?



Databases and schemas are used to organize data stored in Snowflake. A schema is a logical grouping of database objects such as tables and views. The benefits of using snowflake schemas are that they provide structured data and use less disk space.


16. What are the benefits of the Snowflake Schema?

• Because the dimensions are normalized, it uses less disk space.

• It provides better data quality.


17. Differentiate Star Schema and Snowflake Schema?


Star and snowflake schemas are similar; the difference lies in how the dimensions are modelled. In a snowflake schema, the dimensions are normalized into multiple related tables, whereas in a star schema, each logical dimension is denormalized into a single table.


18. What kind of SQL does Snowflake use?

Snowflake supports the most common standardized version of SQL, i.e., ANSI SQL, for powerful relational database querying.

19. What are the cloud platforms currently supported by Snowflake?

• Amazon Web Services (AWS)

• Google Cloud Platform (GCP)

• Microsoft Azure (Azure)

20. What ETL tools do you use with Snowflake?

Following are some of the best ETL tools for Snowflake:

• Matillion

• Blendo

• Hevo Data

• StreamSets

• Etleap

• Apache Airflow 

21. Explain zero-copy cloning in Snowflake?

In Snowflake, zero-copy cloning is a feature that lets us create a copy of tables, schemas, or databases without replicating the underlying data. To perform zero-copy cloning, we use the CLONE keyword. Through this, we can take live production data and carry out multiple actions against the clone (see the sketch below).
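A minimal sketch of zero-copy cloning at different object levels (all object names are hypothetical):

  CREATE TABLE orders_dev CLONE orders;
  CREATE SCHEMA analytics_dev CLONE analytics;
  CREATE DATABASE sales_dev CLONE sales_db;

The clone initially shares the original micro-partitions, so no additional storage is consumed until the clone's data starts to diverge.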

22. Explain “Stage” in the Snowflake?

In Snowflake, a stage is the intermediate area used for uploading files. Snowpipe detects files once they arrive in the staging area and automatically loads them into Snowflake.

Following are the stages supported by Snowflake (examples follow the list):

• Table Stage

• User Stage

• Internal Named Stage
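As a rough illustration, each stage type is referenced with its own syntax; my_stage and the orders table are hypothetical names:

  -- Internal named stage
  CREATE STAGE my_stage;

  LIST @~;          -- user stage
  LIST @%orders;    -- table stage for the orders table
  LIST @my_stage;   -- internal named stage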

23. Explain data compression in Snowflake?

All data loaded into Snowflake is compressed automatically. Snowflake uses modern data compression algorithms to compress and store the data, and customers are billed for the compressed data, not the raw data size.

24. How do we secure the data in the Snowflake?

Data security plays a prominent role in all enterprises. Snowflake adopts best-in-class security standards for encrypting and securing customer accounts and the data stored in it, and it provides industry-leading key management features at no extra cost.

25. Explain Snowflake Time Travel?

Snowflake Time Travel lets us access historical data at any point within a defined retention period, so we can see data that has been changed or deleted. With this tool we can carry out the following tasks (examples follow the list):

  • Restore data-related objects (tables, schemas, and databases) that may have been deleted accidentally.
  • Examine data usage and the changes made to data over a specific period.
  • Duplicate and back up data from key points in the past.
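A few hedged examples of Time Travel in SQL, assuming a hypothetical orders table and a retention period that covers the requested point in time:

  -- Query the table as it was one hour ago
  SELECT * FROM orders AT(OFFSET => -3600);

  -- Restore a table that was dropped by mistake
  UNDROP TABLE orders;

  -- Duplicate the table as of a specific timestamp
  CREATE TABLE orders_backup CLONE orders
    AT(TIMESTAMP => '2024-01-01 00:00:00'::TIMESTAMP_LTZ);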

26. What is the database storage layer?

Whenever we load data into Snowflake, it organizes the data into a compressed, columnar, optimized format. Snowflake manages how this data is stored, including compression, organization, statistics, file size, and other storage properties. The data objects stored in Snowflake are not directly accessible or visible; we can access them only by executing SQL queries through Snowflake.


27. Explain Fail-safe in Snowflake?

Fail-safe is a feature in Snowflake that helps ensure data protection and plays a vital role in Snowflake's data protection lifecycle. Fail-safe provides seven additional days of storage after the Time Travel retention period ends.

28. Explain Virtual warehouse?

In Snowflake, a virtual warehouse is one or more compute clusters that enable users to carry out operations such as queries, data loading, and other DML operations. Virtual warehouses provide users with the necessary resources, such as CPU, memory, and temporary storage, for performing various Snowflake operations.
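A minimal sketch of creating a virtual warehouse (the name and settings are hypothetical):

  CREATE WAREHOUSE transform_wh
    WAREHOUSE_SIZE = 'XSMALL'
    AUTO_SUSPEND = 60      -- suspend after 60 seconds of inactivity
    AUTO_RESUME = TRUE;    -- resume automatically when a query arrives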

29. Explain Data Shares

Snowflake data sharing allows organizations to share their data securely and immediately. Secure data sharing enables sharing of data between accounts through database tables, secure views, and other secure objects.

30. What are the various ways to access the Snowflake Cloud data warehouse?

We can access the Snowflake data warehouse through:

• ODBC Drivers

• JDBC Drivers

• Web User Interface

• Python Libraries

• SnowSQL Command-line Client

31. Explain Micro Partitions?



Snowflake comes with a robust and unique form of data partitioning known as micro-partitioning. Data in Snowflake tables is automatically converted into micro-partitions; in general, micro-partitioning is performed on all Snowflake tables.



32. Explain Columnar database?



A columnar database stores data in columns instead of rows, unlike conventional row-oriented databases. This simplifies analytical query processing and offers far better performance for analytics workloads, which is why columnar storage is central to modern business intelligence.



33. How to create a Snowflake task?



To create a Snowflake task, we use the CREATE TASK command. Creating a task requires:

The CREATE TASK privilege on the schema.

The USAGE privilege on the warehouse referenced in the task definition.

The privileges needed to run the SQL statement or stored procedure in the task definition. A sketch of a task definition follows.
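A hedged sketch of a task definition, with a hypothetical warehouse, schedule, and target table:

  CREATE TASK refresh_daily_sales
    WAREHOUSE = transform_wh
    SCHEDULE = 'USING CRON 0 2 * * * UTC'
  AS
    INSERT INTO daily_sales
    SELECT sale_date, SUM(amount) FROM sales GROUP BY sale_date;

  -- Tasks are created suspended; resume the task to start the schedule
  ALTER TASK refresh_daily_sales RESUME;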



34. How do we create temporary tables?



To create temporary tables, we have to use the following syntax:

CREATE TEMPORARY TABLE mytable (id NUMBER, creation_date DATE);



35. Where do we store data in Snowflake?



Snowflake automatically creates metadata for the files in internal or external stages. This metadata is stored in virtual columns and can be queried with a standard SELECT statement.
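For example, staged files can be queried together with their metadata columns; the stage and file format names here are hypothetical:

  SELECT METADATA$FILENAME, METADATA$FILE_ROW_NUMBER, t.$1, t.$2
  FROM @my_stage (FILE_FORMAT => 'my_csv_format') t;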





36. Does Snowflake use Indexes?



No, Snowflake does not use indexes. This is one of the aspects that makes Snowflake scale so well for queries.



37. How is Snowflake distinct from AWS?



Snowflake offers storage and compute independently, and its storage is billed at rates comparable to raw cloud object storage. AWS addresses this separation with Redshift Spectrum, which allows querying data directly on S3, but it is not as seamless as Snowflake.



38. How do we execute the Snowflake procedure?



Stored procedures allow us to create modular code containing complicated business logic by combining multiple SQL statements with procedural logic. Inside a Snowflake procedure, execution typically involves the following steps:

• Run a SQL statement

• Extract the query results

• Extract the result set metadata



39. Does Snowflake maintain stored procedures?



Yes, Snowflake supports stored procedures. A stored procedure is similar to a function: it is created once and used many times. It is created with the CREATE PROCEDURE command and executed with the CALL command. In Snowflake, stored procedures are written against the JavaScript API, which enables them to execute database operations such as SELECT, UPDATE, and CREATE (see the sketch below).
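A minimal sketch of a JavaScript stored procedure and its invocation; the procedure name, logic, and table are hypothetical:

  CREATE OR REPLACE PROCEDURE count_rows(table_name STRING)
  RETURNS FLOAT
  LANGUAGE JAVASCRIPT
  AS
  $$
    // Run a SQL statement, then read the result set
    var stmt = snowflake.createStatement({sqlText: "SELECT COUNT(*) FROM " + TABLE_NAME});
    var rs = stmt.execute();
    rs.next();
    return rs.getColumnValue(1);
  $$;

  CALL count_rows('orders');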



40. Is Snowflake OLTP or OLAP?



Snowflake is designed as an Online Analytical Processing (OLAP) database system. Depending on the usage, it can also handle some OLTP (Online Transaction Processing) workloads.



41. How is Snowflake distinct from Redshift?



Both Redshift and Snowflake provide on-demand pricing but differ in how features are packaged. Snowflake separates compute from storage in its pricing model, whereas Redshift bundles them together.



42. What is the use of the Cloud Services layer in Snowflake?



The services layer acts as the brain of Snowflake. It authenticates user sessions, applies security functions, offers management, performs optimization, and coordinates all transactions.



43. What is the use of the Compute layer in Snowflake?



In Snowflake, virtual warehouses, which are one or more clusters of compute resources, perform all the data handling tasks. While executing a query, a virtual warehouse retrieves the minimum data needed from the storage layer to satisfy the query request.



44. What is Unique about Snowflake Cloud Data Warehouse?



Snowflake is cloud native (built for the cloud), so it takes advantage of all the good things about the cloud and brings exciting new features like:

• Auto scaling

• Zero copy cloning

• Dedicated virtual warehouses

• Time travel

• Military grade encryption and security

• Robust data protection features

Snowflake is also beautifully crafted, with smart defaults:

• All the data is compressed by default

• All the data is encrypted

• It is columnar, which makes column-level analytical operations much faster

Not to mention the number of innovations in the product - e.g. the intelligent services layer, data shares, and tasks & streams. Snowflake also has simple and transparent pricing, which makes it much easier even for smaller businesses to afford a cloud data warehouse.



45. What is Snowflake Architecture ?



Snowflake is built on a patented, multi-cluster, shared data architecture created for the cloud. The architecture comprises storage, compute, and services layers that are logically integrated but scale independently of one another.



46. What does the Storage Layer do in Snowflake ?



The storage layer stores all the diverse data, tables, and query results in Snowflake. It is built on scalable cloud blob storage (the storage systems of AWS, GCP, or Azure). Because the storage layer is engineered to scale completely independently of compute resources, maximum scalability, elasticity, and performance capacity for data warehousing and analytics are assured.



47. What does the Compute Layer do in Snowflake ?



All data processing tasks within Snowflake are performed by virtual warehouses, which are one or more clusters of compute resources. When executing a query, virtual warehouses retrieve the minimum data required from the storage layer to fulfill the query request.



48. What does the Cloud Services Layer do in Snowflake ?



The services layer is the brain of Snowflake. The services layer for Snowflake authenticates user sessions, provides management, enforces security functions, performs query compilation and optimization, and coordinates all transactions



49. What is a Columnar database and what are its benefits ?



Columnar databases organize data at the column level instead of the conventional row level. Column-level operations are much faster and consume fewer resources than in a row-oriented relational database.



50. What is Snowflake Caching ?



Snowflake caches the results of every query you run. When a new query is submitted, it checks previously executed queries; if a matching query exists and its results are still cached, Snowflake uses the cached result set instead of executing the query. Cached query results are global and can be reused across users (see the sketch below).
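As an illustration, result caching can be toggled per session with a standard parameter, which is handy when benchmarking queries:

  -- Disable the query result cache for the current session
  ALTER SESSION SET USE_CACHED_RESULT = FALSE;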



51. What are the different types of caching in Snowflake ?



  1. Query Results Caching
  2. Virtual Warehouse Local Disk Caching
  3. Metadata Cache


52 Name the types of caches in Snowflake? 


  • Query Results Caching
  • Metadata Cache
  • Virtual Warehouse Local Disk Caching


53 What is Snowflake Time Travel?


Snowflake Time Travel enables you to access historical data at any given point within a defined retention period. Using this, you can see data that has been deleted or changed and perform the tasks below:

  • Restore data-related objects (schemas, tables, and databases) that might have been lost accidentally.
  • Examine data usage and changes made to data within a time period.
  • Back up and duplicate data from key points in the past.


54 What is Fail-safe in Snowflake?


Fail-safe is an advanced feature available in Snowflake to ensure data protection, and it plays an important role in Snowflake’s data protection lifecycle. Fail-safe offers 7 extra days of storage after the Time Travel retention period is over.



55 Why fail-safe instead of Backup?


To minimize risk, DBAs traditionally run full and incremental data backups at regular intervals. This occupies extra storage space, sometimes double or triple the original, and the data recovery process is costly, slow, and often requires business downtime.

Snowflake comes with a multi-datacenter, redundant architecture that minimizes the need for traditional data backups. The Fail-safe feature in Snowflake is an efficient and cost-effective alternative to traditional backups: it eliminates those risks and scales along with your data.



56 What is the Data retention period in Snowflake?


Data retention is one of the key components of Snowflake. The default data retention period for all Snowflake accounts is 1 day (24 hours), and it applies to every Snowflake account (an example of changing it follows).
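A hedged example of adjusting the retention period (the table name is hypothetical; longer retention periods require the Enterprise edition or above):

  -- Extend Time Travel retention for a single table
  ALTER TABLE orders SET DATA_RETENTION_TIME_IN_DAYS = 30;

  -- Or change the default for the whole account
  ALTER ACCOUNT SET DATA_RETENTION_TIME_IN_DAYS = 7;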



57 Explain data shares in Snowflake?




The data shares option in Snowflake allows users to share the data objects in a database in their account with other Snowflake accounts in a secure way. All database objects shared between Snowflake accounts are read-only; consumers cannot make any changes to them.

Following are the sharable database objects in Snowflake (a sharing sketch follows the list):

  • Tables
  • Secure views
  • External tables
  • Secure UDFs
  • Secure materialized views
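A rough sketch of sharing a table with another account; the share, database, and consumer account names are hypothetical:

  CREATE SHARE sales_share;
  GRANT USAGE ON DATABASE sales_db TO SHARE sales_share;
  GRANT USAGE ON SCHEMA sales_db.public TO SHARE sales_share;
  GRANT SELECT ON TABLE sales_db.public.orders TO SHARE sales_share;

  -- Make the share visible to a consumer account
  ALTER SHARE sales_share ADD ACCOUNTS = consumer_account;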


58 What are the data sharing types in Snowflake?


Following are the 3 types of data sharing types:

  • Sharing Data between functional units. 
  • Sharing data between management units.
  • Sharing data between geographically dispersed locations


59 What do you know about zero-copy cloning in Snowflake?


Zero-copy cloning is a Snowflake feature that allows you to create a copy of your schemas, tables, or databases without copying the actual data. To perform zero-copy cloning in Snowflake, you use the CLONE keyword. With this option, you can take real-time data from production and perform multiple actions on the clone.



60 Name the cloud platforms supported by Snowflake?


Following are the cloud providers supported by Snowflake:

  • Google Cloud Platform (GCP)
  • Amazon Web Services (AWS)
  • Microsoft Azure (Azure)



61 What are the different Snowflake editions?


Following are the various Snowflake editions available:

  • Standard Edition
  • Enterprise Edition
  • Business Critical Edition
  • Virtual Private Snowflake (VPS) Edition.


62 What are the different Connectors and Drivers available in Snowflake?


Below mentioned are the various connectors and drivers available in Snowflake:

  • Snowflake Connector for Python
  • Snowflake Connector for Kafka
  • Snowflake Connector for Spark
  • Go Snowflake Driver
  • Node.js Driver
  • JDBC Driver
  • .NET Driver
  • ODBC Driver
  • PHP PDO Driver for Snowflake


63 What is “Stage” in Snowflake?


A stage in Snowflake is an intermediate area used to upload files. Snowpipe identifies files as soon as they enter the staging area and automatically loads them into Snowflake.

Following are the three different stages supported by Snowflake:

  • User Stage
  • Table Stage
  • Internal Named Stage


64 What is Snowpipe in Snowflake?


Snowpipe is a continuous, cost-effective service used to load data into Snowflake. Snowpipe automatically loads data from files as soon as they are available in a stage. It simplifies loading by ingesting data in micro-batches and making it ready for analysis (see the sketch below).
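A minimal Snowpipe sketch, assuming a hypothetical external stage that has cloud event notifications configured:

  CREATE PIPE orders_pipe
    AUTO_INGEST = TRUE
  AS
    COPY INTO orders
    FROM @orders_ext_stage
    FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1);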



65 What are the benefits of using Snowpipe?


Following are the major advantages of using Snowpipe:

  • Real-time insights
  • Ease of use
  • Cost-effective
  • Flexibility
  • Zero Management


66 What is a virtual warehouse in Snowflake?


A Virtual warehouse in Snowflake is defined as one or more compute clusters supporting users to perform operations like data loading, queries, and many other DML operations. Virtual warehouses support users with the required resources such as CPU, temporary storage, memory, etc, to perform different Snowflake operations.



67 Explain the features of Snowflake?


Following are some of the notable features of Snowflake:

  • Database Storage
  • Cloud Services
  • Compute layer
  • Concurrency and Accessibility
  • Supports structured and unstructured data
  • Easy data sharing
  • High-speed performance
  • Availability and Security 


68 What are the programming languages supported by Snowflake?


Snowflake supports different programming languages like Go, Java, .NET, Python, C, Node.js, etc.



69 What are micro partitions in Snowflake?


Snowflake comes with a unique and powerful form of data partitioning called micro-partitioning. Data residing in Snowflake tables is automatically converted into micro-partitions; in general, micro-partitioning is performed on all Snowflake tables.



70 What is Clustering in Snowflake?


Clustering in Snowflake means co-locating related rows in the same micro-partitions based on one or more columns, which enhances query performance by allowing partitions to be pruned.



71 What is a Clustering key?


The clustering key in Snowflake is a subset of columns in a table that is used to co-locate the data within the table. It is best suited for large tables where the natural ordering of rows has degraded over time due to DML operations.



72 What is Amazon S3?


Amazon S3 is a storage service that offers high data availability and security. It provides a streamlined process for organizations of all sizes and industries to store their data. 



73 What is a Snowflake Schema?


A snowflake schema is a logical arrangement of tables in a multidimensional database, represented by a central fact table connected to multiple normalized dimension tables. The snowflake schema's primary goal is to normalize data.



76 What are the advantages of Snowflake Schema?


Following are the core advantages of Snowflake Schema:

  • Uses less disk space
  • Minimal data redundancy.
  • Eliminates data integration challenges
  • Less maintenance
  • Executes complex queries
  • Supports many-to-many relationships


77 What is Materialized view in Snowflake?


A materialized view in Snowflake is a pre-computed data set derived from a query specification. Because the data is pre-computed, querying a materialized view is much faster than running the underlying query against the view's base table (see the sketch below).

In simple words, materialized views are designed to enhance the query performance for common and repetitive query patterns. Materialized Views are primary database objects and speedup projection, expensive aggregation, and selection operations for queries that run on larger data sets.
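A minimal sketch of a materialized view over a hypothetical sales table (materialized views require the Enterprise edition or above):

  CREATE MATERIALIZED VIEW daily_totals AS
    SELECT sale_date, SUM(amount) AS total_amount
    FROM sales
    GROUP BY sale_date;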



78 What are the advantages of Materialized Views?


Following are the distinct advantages of Materialized Views:

  • Improves query performance
  • Snowflake automatically manages materialized Views.
  • Materialized views provide updated data.


79 What is the use of SQL in Snowflake?


SQL stands for Structured Query Language and is the standard language for working with relational data. SQL statements are grouped into DML (Data Manipulation Language), such as SELECT, INSERT, and UPDATE, and DDL (Data Definition Language), such as CREATE, ALTER, and DROP.

Snowflake is a data warehouse platform and supports the standard version of SQL. Using SQL in Snowflake, we can perform the typical data warehousing operations like create, insert, alter, update, delete, etc.



80 What are the ETL tools supported by Snowflake?


Following are the top ETL tools supported by Snowflake.

  • Matillion
  • Informatica
  • Tableau
  • Talend, etc.


81 Where the metadata gets stored in Snowflake?


In Snowflake, metadata for staged files is stored in virtual columns that can be easily queried using the SELECT statement and loaded into a table using the COPY INTO command (a sketch follows).
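For instance, the metadata columns can be loaded alongside the file contents during a COPY transformation; the target table and stage names are hypothetical:

  COPY INTO orders_with_meta
    FROM (SELECT METADATA$FILENAME, t.$1, t.$2 FROM @my_stage t)
    FILE_FORMAT = (TYPE = 'CSV');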



82 What is Auto-scaling in Snowflake?


Auto-scaling is an advanced feature in Snowflake that automatically starts and stops additional clusters, based on demand, to support the workload on a warehouse (see the sketch below).
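A hedged example of a multi-cluster warehouse that auto-scales between one and three clusters (an Enterprise edition feature; names and sizes are hypothetical):

  CREATE WAREHOUSE reporting_wh
    WAREHOUSE_SIZE = 'MEDIUM'
    MIN_CLUSTER_COUNT = 1
    MAX_CLUSTER_COUNT = 3
    SCALING_POLICY = 'STANDARD';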



83 What is the use of Stored Procedures in Snowflake?


A stored procedure is a group of database statements that can be written using SQL or JavaScript.



84 Which command is used to create a stored procedure?


In Snowflake “CREATE PROCEDURE” command is used to create a stored procedure.



85 What are the advantages of stored procedures in Snowflake?


Following are the benefits of using Stored procedures:

  • Supports procedural logic
  • Allows dynamic creation and execution of SQL statements
  • Helps in error handling
  • Allows the Stored procedure owner to delegate the power to users
  • Eliminates the need for multiple SQL statements to perform a task.



86 Snowflake Stored procedures are written in?


Snowflake Stored procedures are written in JavaScript.



87 What is Secure data sharing in Snowflake?


Secure data sharing helps users to share the selected objects in the database with other snowflake accounts.



88 Name a few Snowflake database objects that can be shared using Secure data sharing?


  • Tables
  • Secure views
  • External tables
  • Secure UDFs
  • Secure materialized views


89 What are the internal and external stages in Snowflake?


Following are the two stages supported by Snowflake to store data files and to load/unload data:

Internal Stage: Here the files are stored within the Snowflake account

External Stage: Here the files are stored in an external location, for example an AWS S3 bucket (see the sketch below).
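A rough sketch of creating an external stage over an S3 location, assuming a pre-configured storage integration (the bucket path and integration name are hypothetical):

  CREATE STAGE orders_ext_stage
    URL = 's3://my-bucket/orders/'
    STORAGE_INTEGRATION = my_s3_integration;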



90 What is “Continuous Data Protection” (CDP) in Snowflake?


Continuous Data Protection (CDP) is an essential feature offered by Snowflake to protect data stored in Snowflake from events like malicious attacks, human error, and software or hardware failures. CDP keeps your data accessible and recoverable at every stage of the data life cycle, even if it is lost accidentally.

 

1. Difference between copy and put command.

2. Steps involved in creation of stages.

3. Different type of stages.

4. Rejected/bad records- how to view them.

5. Considering a scenario where you need to load files as soon as it arrives, how would you implement it.

6. Loading semi structured data - through variant column.

7. How does Snowflake load only incremental data files, ignoring the ones already loaded?

8. How would you force to load all the files in a given folder/directory.

9. Can you query data on external storage without loading it? If yes, how?

10. Bulk vs Continuous loading.

1. Snowflake Architecture and its various components

2. Data loading and challenges faced

3. Data loading include copy command, how do you create stages, various stage - table/internal/external.

4. How do stages differ from each other?

5. How to create an external stage referencing an S3 bucket/ADLS

6. Scenarios like how do you handle duplicates as integrity constraints are not enforced in SF.

7. What are warehouses and how do you design it.

8. Performance optimization - Auto Scaling, Caching and clustering

9. Scenarios where you have used time travel feature.

10. UDF and how have you used it.

11. Restrictions in SF as compared to RDBMS database.

12. If you have worked on migration what are challenges faced and how did you overcome it.

13. Task, streams implementation scenario.

14. Structured/semi- structured files how have you loaded them.

15. Volume of data you have handled.

16. Architecture of your project.

17. Orchestration of SF pipelines - tools being used for it.

18. How do you handle transactions in Snowflake.

19. Snowpipe

20. Data Unloading.

21. Role based access management - important to have knowledge on this.

22. How do normal views and materialized views differ in Snowflake?



91 What is Snowflake?


Snowflake is a cloud data warehouse provided as software-as-a-service (SaaS). It is built on a unique architecture to handle multiple aspects of data and analytics. Snowflake sets itself apart from traditional data warehouse solutions with advanced capabilities like improved performance, simplicity, high concurrency, and cost-effectiveness.

Snowflake's shared data architecture physically separates compute and storage, which is not possible with traditional offerings. It streamlines the process for businesses to store and analyze massive volumes of data using cloud-based tools. Snowflake has transformed the data warehouse industry by making it possible to bring all data together into a centralized system.



92 What is unique about Snowflake Architecture?


Snowflake has come up with an advanced and unique architecture that is a combination of shared-disk and shared-nothing architectures. It uses a central data repository to store data consistently and makes it available to access from all compute nodes in the platform. Similar to shared-nothing architecture, Snowflake also executes queries by using MPP (massively parallel processing) compute clusters where every node in the cluster stores a certain amount of the whole data set locally.

This architecture combines the data management simplicity of a shared-disk architecture with the performance and scalability advantages of a shared-nothing architecture. Snowflake's unique architecture consists of three layers: database storage, query processing, and cloud services.



93 Explain the Database storage layer in Snowflake?


Whenever data is loaded into Snowflake, it is arranged into an optimized, compressed, columnar format and then stored in cloud storage.

Snowflake manages how the data gets stored, including data organization, compression, structure, file size, statistics, metadata, and many other aspects of data storage. The data objects stored in Snowflake are invisible and inaccessible to users; they can only be accessed by running SQL queries through Snowflake.



94 What is the Query Processing layer in Snowflake architecture?


All query execution is performed in this processing layer. Snowflake uses "virtual warehouses" to process queries. Each virtual warehouse is an MPP (massively parallel processing) compute cluster consisting of multiple nodes allocated by Snowflake from a cloud provider.

Each virtual warehouse in the query processing layer is independent and does not share its compute resources with other virtual warehouses, so a failure in one virtual warehouse has no impact on the others.



95 What is the Cloud Services layer in Snowflake architecture?


The cloud services layer consists of a set of services that coordinate tasks across the Snowflake platform. These services tie together and work in close coordination to process user requests, from login to query dispatch. This layer also runs on compute instances provisioned by Snowflake from the cloud provider.

Following are the various services managed under this layer:

  • Authentication
  • Metadata management
  • Infrastructure management
  • Access control
  • Optimization and query parsing