
How Do the 6 Snowflake Data Types Work?

Learn about the 6 different data types that you can use in Snowflake.


Craig Dennis

February 24, 2023

15 minutes

The 6 Snowflake Data Types

Numeric Data Types
  • NUMBER: stores numbers with up to 38 digits of precision. With the default settings, precision is 38 and scale is 0, so it stores whole numbers.
  • DECIMAL: synonymous with NUMBER.
  • NUMERIC: synonymous with NUMBER.
  • INT, INTEGER, BIGINT, SMALLINT: synonymous with NUMBER, but precision and scale can't be changed; they're fixed at the defaults (precision 38, scale 0).
  • FLOAT, FLOAT4, FLOAT8: Snowflake treats all three of these as 64-bit floating-point numbers and supports special values such as Not a Number (NaN), infinity (inf), and negative infinity (-inf).
  • DOUBLE, DOUBLE PRECISION, REAL: all three of these are synonymous with FLOAT.
  • Numeric Constants: Snowflake supports numeric constants, also referred to as literals. Their format looks like this:

    [+-][digits][.digits][e[+-]digits]
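For instance, a quick SELECT (a minimal sketch; the column aliases are just for illustration) shows a few valid numeric constants in that format:

```sql
-- Each literal below follows [+-][digits][.digits][e[+-]digits]
select
  15      as whole_number,  -- digits only
  -0.5    as signed,        -- sign plus a decimal part
  1.5e2   as scientific;    -- 1.5 x 10^2, i.e. 150
```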

    You can look at how to create different data types, and how they behave when you change the precision and scale, with the examples below.

    With the code below, you can create a table with customer info for an e-commerce company.

    Note: Throughout this article, the code examples may not represent what real-world customer data looks like. They're simplified for illustration.

    create or replace table customer_data (
      customer_id number,
      total_amount_spent number(10, 2),
      number_of_times_logged_in float,
      age numeric(3, 0),
      telephone_number number(5, 0)
    );
    

    In the table, you’re creating multiple numeric data types. You can see how you create a column by naming it and then specifying the data type. The numbers in parentheses are the precision and scale, respectively.

    You can put data into the table with the code below and see how it reacts to the data types we created.

    insert into customer_data
    values
      (1234.12, 10.434, 6, 10, 7888412345);
    

    The first thing you’ll notice if you run this is an error message. It’s caused by the data you’re trying to enter in the telephone_number field: the precision was set to only five digits, but you’re trying to enter a 10-digit number. Change it to a five-digit value such as 78884 and rerun the code.

    When you created the table, you set customer_id as a NUMBER, but the entered data had decimal places. Because a scale wasn’t specified (the default is 0), the decimals are dropped.

    A screenshot of a table showing the numeric snowflake data types

    You can also see that a number with three decimal places was entered into total_amount_spent, but the scale was set to two, so the value was reduced to two decimal places. Both number_of_times_logged_in and age were unchanged because they were valid values for their data types. And the altered telephone_number was entered without any issues.


    String and Binary Data Types

    String and Binary data types are among the most commonly used. They represent a sequence of characters, digits, or symbols, such as a name (“Sam Husselback”) or a phone number (“123-123-123”).

    The following are some of the different types of String and Binary data types.

    • VARCHAR: holds Unicode UTF-8 characters and is limited to a maximum length of 16MB. When declaring a VARCHAR, an optional parameter lets you set the maximum number of characters to store.
    • CHAR, CHARACTER: the same as VARCHAR, except the default maximum length is one character.
    • STRING: the same as VARCHAR.
    • TEXT: the same as VARCHAR.
    • BINARY: has a maximum length of 8MB and no notion of Unicode characters, so its length is measured in bytes.
    • VARBINARY: the same as BINARY.
    • String Constants: refer to fixed data values and are always enclosed between delimiter characters. Snowflake has two ways to delimit string values.
      • Single-Quoted String Constants: a string constant can be enclosed between single-quote delimiters, like 'I am a string'.
      • Dollar-Quoted String Constants: if a string constant contains single quotes, backslashes, or newlines, you can enclose it between pairs of dollar signs ($$) instead.
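A quick comparison of the two delimiter styles (a minimal sketch; both expressions produce the same string):

```sql
-- Inside single quotes, an embedded quote must be escaped by doubling it;
-- inside dollar quotes, it can appear as-is.
select
  'I''m a string'  as single_quoted,
  $$I'm a string$$ as dollar_quoted;
```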

    You can look at these data types by creating a new customer_data table with just String and Binary types with the code below.

    create or replace table customer_data (
      first_name VARCHAR(5),
      last_name STRING(7),
      gender CHAR
    );
    
    

    In this table, you’ve specified a maximum length for some of the columns. You can enter data with the code below and see how it responds.

    insert into customer_data 
    values 
      ('Alexander', 'Smith', 'female');
    

    The first thing you’ll notice is an error occurs. This is because the first_name column was set only to have five characters, and the entered data was too long. You can change Alexander to Alex instead to remove the error. If you run the code again, another error will show.

    You’ll get a message similar to the one above. This time, it’s because 'female' was entered into a CHAR column, and the CHAR data type only accepts one character. If you change 'female' to 'F' and rerun the code, everything works, and the data is inserted.

    A screenshot of a table showing the string snowflake data types

    Logical Data Types

    Snowflake supports a single logical data type, BOOLEAN, which can have a TRUE or FALSE value. BOOLEAN can also have an “unknown” value, displayed as NULL. The BOOLEAN data type also supports ternary (three-valued) logic.
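In ternary logic, NULL acts as “unknown” and propagates through logical operators. A minimal sketch:

```sql
-- TRUE AND NULL could be either TRUE or FALSE depending on the unknown,
-- so the result is NULL; TRUE OR NULL is TRUE regardless of the unknown.
select
  true and null as and_result,  -- NULL
  true or null  as or_result;   -- TRUE
```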

    You can create a table with the code below to look at Logical data types in action.

    create or replace table customer_data (
      logged_in BOOLEAN,
      has_downloaded_white_paper BOOLEAN,
      has_paid BOOLEAN,
      has_viewed_over_4_webpages BOOLEAN
    );
    

    You can use the code below to insert some data into the customer_data table.

    insert into customer_data
    values
      ('customer is logged in', 1, 'true', 'off');
    

    The first thing you’ll notice is an error message. This is because you’re trying to enter the string 'customer is logged in', which is not recognized as a BOOLEAN value. Change that to TRUE and see if you get any errors.

    You might expect more errors from the other entered strings, but Snowflake implicitly converts suitable string and numeric values to BOOLEAN.

    A screenshot of a table showing the logical snowflake data types

    The table shows that ‘1’ and ‘true’ have been converted to TRUE, and ‘off’ has been converted to FALSE. You can find more information on what can be converted to a BOOLEAN value in the Snowflake documentation.
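You can also convert values explicitly with TO_BOOLEAN (a minimal sketch; see the Snowflake documentation for the full list of accepted strings):

```sql
-- Snowflake accepts strings such as 'yes'/'no', 'on'/'off',
-- 't'/'f', and '1'/'0' as BOOLEAN values.
select
  to_boolean('yes') as b1,  -- TRUE
  to_boolean('off') as b2,  -- FALSE
  to_boolean('0')   as b3;  -- FALSE
```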

    Date and Time Data Types

    Dates and times help you track when an event happened. Snowflake accepts a range of date and time formats, such as “2023-01-01”, “01-Jan-2023”, or “2023-01-01 01:00:00”.

    Below are the date and time data types that Snowflake supports.

    • DATE: allows you to store just the date without any time elements and supports most of the common forms such as YYYY-MM-DD and DD-MON-YYYY.
    • DATETIME: is an alias for TIMESTAMP_NTZ.
    • TIME: allows you to store the time in the form of HH:MI:SS, and you have the option to set the precision, which is set to a default of 9. The TIME value must be between 00:00:00 and 23:59:59.999999999.
    • TIMESTAMP: is a user-specified alias associated with one of the TIMESTAMP_* variations. In operations where TIMESTAMP is used, the associated TIMESTAMP_* variation is used automatically; the TIMESTAMP data type itself is never stored in tables.
    • TIMESTAMP_LTZ, TIMESTAMP_NTZ, TIMESTAMP_TZ: Snowflake supports three different versions of TIMESTAMP:
      • TIMESTAMP_LTZ: stores UTC time with specified precision, and all operations performed in the current session’s time zone are controlled by the TIMEZONE session parameter.
      • TIMESTAMP_NTZ: stores “wallclock” time with specified precision with all operations performed without taking any time zone into account.
      • TIMESTAMP_TZ: internally stores UTC time with an associated time zone offset. The session time zone offset is used if a time zone is not provided.
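A sketch of how the three variations differ when you cast the same value (the session time zone set here is just an assumption for the example):

```sql
-- Assume the session time zone is America/Los_Angeles.
alter session set timezone = 'America/Los_Angeles';

select
  '2023-02-17 10:00:00'::timestamp_ntz       as wallclock,   -- no time zone attached
  '2023-02-17 10:00:00'::timestamp_ltz       as local_time,  -- stored as UTC, displayed in the session time zone
  '2023-02-17 10:00:00 +0000'::timestamp_tz  as with_offset; -- keeps the +0000 offset
```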

    You can create a table with the code below to look at Date and Time data types in action.

    create or replace table customer_data (
      last_logged_in DATE,
      time_of_purchase TIME,
      account_created DATETIME
    );
    

    You can use the code below to insert some data.

    insert into customer_data
    values
      ('2022', '12', '2021-01-01 01:00:00');
    

    Thankfully there are no errors, but you might find some things wrong with the data.

    A screenshot of a table showing the date and time snowflake data types.

    Strangely, the last_logged_in date is 1970-01-01. This is because an unrecognized date format was entered, so it reverted to the start of the Unix epoch, 1970-01-01.

    The data entered as the time_of_purchase was “12”, so it was converted to a TIME value on the assumption that 12 seconds was meant.

    At first glance, the account_created value appears to be wrong. A DATETIME was entered, but it’s only displaying a date. This is only a display issue: by default, only the date is shown, but you can change how the DATETIME is displayed by changing the date/time output format.
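In a worksheet, one way to change this is through the session’s output-format parameter (a minimal sketch; the format string is just an example):

```sql
-- DATETIME is an alias for TIMESTAMP_NTZ, so adjust that output format.
alter session set timestamp_ntz_output_format = 'YYYY-MM-DD HH24:MI:SS';

select account_created from customer_data;
```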

    A screenshot of how to change the format of the date and time data type.

    By doing so, you can see that the correct value is now displayed.

    A screenshot showing the correct format of the date and time snowflake data types.

    Semi-structured Data Types

    Snowflake supports semi-structured data types, so you can load and operate on data such as JSON, Parquet, and XML. The semi-structured data types include:

    • VARIANT: can store values of any other types, including OBJECT and ARRAY, and has a maximum length of 16 MB.
    • OBJECT: can store key-value pairs which in other programming languages are often called “dictionary,” “hash,” or “map.” The key is a non-empty string, and the value is a VARIANT type.
    • ARRAY: is similar to an array in other programming languages. It can contain 0 or more pieces of data, and each element is accessed by specifying its position in the array.

    The special Snowflake column data type VARIANT is very powerful because it allows you to store semi-structured data without having to pre-define a specific schema, yet you can later query the table as if it were structured data. Let’s go through an example to see how this works.

    First you need to create a table with the variant type:

    create table json_sales_data (v variant);
    

    You can now load JSON files into this table using Snowflake’s data loading steps, typically by batch loading via a stage. Manually, you can do the following:

    insert into json_sales_data
    select
      parse_json('{
        "store": {
          "location": "CA-SanFrancisco-013",
          "latitude": "37.773972",
          "longitude": "-122.431297"
        },
        "sale_obsTime": "2023-2-17T10:03:15.000Z",
        "total_amount": "321"
      }');
    

    Now when you run a select *, you get one column (V) where the object is shown as above, in its JSON structure.

    To make it easier to query, you can create a view over this table that puts structure onto the semi-structured data. Views don’t copy data; they only present it to the end user in a cleaner form.

    create or replace view json_sales_data_view as
    select
      v:store.location::string as store_location,
      v:store.latitude::float as store_lat,
      v:store.longitude::float as store_long,
      v:sale_obsTime::timestamp as sales_time,
      v:total_amount::int as sales_total_amount
    from
      json_sales_data;
    

    The path notation shown above (a colon for the first level, as in v:store, and dots below that) pulls out values at lower levels within the JSON object hierarchy and treats the fields as if they were relational table columns, without changing the underlying table.

    By querying the view you created instead of the original table, you will get structured table results with the columns store_location, store_lat, store_long, sales_time, and sales_total_amount:

    select 
      * 
    from 
      json_sales_data_view;
    

    You can even query this view as if it were structured and/or join it with other tables.
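The ARRAY type mentioned earlier can be unnested in a similar spirit with LATERAL FLATTEN. A minimal sketch (the order_items table and its values are hypothetical):

```sql
-- Hypothetical table with an ARRAY column of product IDs per order.
create or replace table order_items (order_id int, items array);

insert into order_items
select 1, array_construct('sku-1', 'sku-2', 'sku-3');

-- FLATTEN turns each array element into its own row.
select o.order_id, f.value::string as item
from order_items o, lateral flatten(input => o.items) f;
```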

    Geospatial Data Types

    Snowflake supports Geospatial data types in two formats.

    • GEOGRAPHY: models the Earth as a perfect sphere and follows the WGS 84 standard. This is where the points on the Earth are represented as degrees of longitude (from -180 degrees to +180 degrees) and latitude (-90 to +90).
    • GEOMETRY: represents features in the planar coordinate system (Euclidean, Cartesian). The coordinates are represented as pairs of real numbers (x, y) and are determined by the spatial reference system (SRS). Currently, only 2D coordinates are supported.

    Both of these data types support the following objects:

    • Point
    • MultiPoint
    • LineString
    • MultiLineString
    • Polygon
    • MultiPolygon
    • GeometryCollection
    • Feature
    • FeatureCollection
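As a quick sketch, you can construct a GEOGRAPHY point from well-known text (WKT) and measure distances. The first point reuses the San Francisco store coordinates from the earlier JSON example; the second is an assumed New York point for illustration:

```sql
-- WKT lists longitude first, then latitude.
select
  to_geography('POINT(-122.431297 37.773972)') as store_point,
  st_distance(
    to_geography('POINT(-122.431297 37.773972)'),
    to_geography('POINT(-73.985428 40.748817)')  -- assumed NYC point
  ) as distance_in_meters;  -- ST_DISTANCE on GEOGRAPHY returns meters
```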

    Unsupported Data Types in Snowflake

    There are a few data types that Snowflake doesn’t support. Large Objects (LOB) such as BLOB and CLOB aren’t supported. However, you can use BINARY instead of BLOB and VARCHAR instead of CLOB.

    Other data types that aren’t currently supported are ENUM and user-defined data types.

    Snowflake Programming languages

    Previously Snowflake only ran on SQL, but with Snowpark, you can now leverage other development languages like Python, Java, and Scala in Snowflake.

    Snowpark enhances the developer experience by letting you code in your preferred language and run that code directly on Snowflake. Using a language other than SQL means you have better tools to solve more complex problems, since these languages have extensive open-source packages and libraries you can use to tackle specific use cases.

    Using Snowpark also allows you to utilize the computing power of Snowflake. Rather than exporting the data to other environments, you can ship your code to the data, which can help with optimization.

    Snowpark also lets you write unit tests and leverage CI/CD pipelines that push to Snowflake, so you can build more dependable applications. It also enables software such as Dataiku, dbt, and Matillion to work better with Snowflake by pushing complex computations down to Snowflake.

    There are three main workflows that benefit most from Snowpark.

    • Data science and machine learning: Python is one of the most widely used programming languages for machine learning, making the Snowpark Python framework a perfect fit. The Snowpark DataFrame API lets you interact with Snowflake data, and Snowpark UDFs let you use Snowflake’s computing power to run batch training and inference.
    • Data-intensive applications: if you’re developing dynamic applications that run on data, Snowpark lets you run them on Snowflake. Snowpark can also be combined with Snowflake’s Native App and Secure Data Sharing capabilities, so you can process customer data in a secure manner.
    • Complex data transformations: SQL can inflate the complexity of data cleansing and ELT workloads. Snowpark Python brings in functional programming conveniences, such as external libraries and code refactoring, and runs on Snowflake compute, so there’s no need to ship data to an external environment.
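As a rough sketch of what the Snowpark DataFrame API looks like in Python (the connection values are placeholders, and this assumes the snowflake-snowpark-python package is installed and a Snowflake account is available):

```python
# A minimal Snowpark sketch: connect, filter a table, and show results.
# All connection values below are placeholders, not real credentials.
from snowflake.snowpark import Session
from snowflake.snowpark.functions import col

session = Session.builder.configs({
    "account": "<your_account>",
    "user": "<your_user>",
    "password": "<your_password>",
    "warehouse": "<your_warehouse>",
    "database": "<your_database>",
    "schema": "<your_schema>",
}).create()

# Build a lazily evaluated query against the customer_data table from
# the numeric example; the filter runs on Snowflake compute, not locally.
adults = session.table("customer_data").filter(col("age") >= 18)
adults.show()
```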

    Final Thoughts

    Knowing which data types Snowflake supports can help you capture and ingest your data in the right format while ensuring that you’re not compromising on data quality. A deep understanding of the different Snowflake data types will not only help you structure your data, but also help you tackle your analytics and activation use cases more effectively.
