Cloud data warehouse provider Snowflake Inc. today announced it’s adding support for the Python programming language favored by data scientists to its Data Cloud and tightening integration with the collaboration platform it picked up with the acquisition of Streamlit Inc. in March.
Snowpark for Python incorporates Python into the Snowpark development framework, which provides application programming interfaces for processing data in Snowflake without moving it to the platform where the application runs. In the five months since the public preview was announced, Snowpark for Python has amassed hundreds of customers, the company said. Python now joins Java, Scala and SQL as supported languages.
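The core idea behind that approach is that DataFrame-style operations describe a query that executes where the data lives, rather than pulling rows back to the client. A minimal pure-Python sketch of that lazy, push-down pattern (the class and method names here are invented for illustration and are not Snowpark's actual API):

```python
# Toy sketch of a lazy DataFrame that compiles chained operations into SQL
# to ship to the warehouse, instead of moving data to the client.
# All names are illustrative; this is not the Snowpark API.

class LazyFrame:
    def __init__(self, table, predicates=None, columns=None):
        self.table = table
        self.predicates = predicates or []   # accumulated filter clauses
        self.columns = columns               # projected columns, if any

    def filter(self, predicate):
        """Record a filter; nothing executes yet."""
        return LazyFrame(self.table, self.predicates + [predicate], self.columns)

    def select(self, *columns):
        """Record a projection; still no execution."""
        return LazyFrame(self.table, self.predicates, list(columns))

    def to_sql(self):
        """'Execution' here just renders the SQL the warehouse would run."""
        cols = ", ".join(self.columns) if self.columns else "*"
        sql = f"SELECT {cols} FROM {self.table}"
        if self.predicates:
            sql += " WHERE " + " AND ".join(self.predicates)
        return sql

query = LazyFrame("orders").filter("amount > 100").select("id", "amount")
print(query.to_sql())  # SELECT id, amount FROM orders WHERE amount > 100
```

The design choice illustrated is that each method returns a new description of the computation; only the final rendered query touches the data, which is what keeps the processing inside the warehouse.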
Streamlit integration, which had previously been in public preview, is now generally available, allowing developers to build data analytics and machine learning applications in Python on top of Snowflake. Developers can take advantage of the platform’s security and governance features and share their applications with others.
“You leverage Snowflake’s global performance engine to scalably deploy your apps across the organization,” said Adrien Treuille, the former Streamlit chief executive who now heads Streamlit operations within Snowflake. “You click a button and boom; you get a URL that lets you share your app with others outside.”
Snowflake is also releasing Snowpark-optimized warehouses, initially on the Amazon Web Services Inc. cloud, so Python developers can train large-scale machine learning models and run other memory-intensive operations directly in Snowflake. The company is also releasing a private preview of Python Worksheets, which can be used to develop applications, data pipelines and machine learning models inside the data warehouse.
In addition, Snowflake is adding support for Anaconda, a distribution of Python for scientific computing that is intended to simplify package management and deployment. Anaconda’s open-source Python libraries will be accessible to Snowflake users without requiring manual installation.
Snowflake is also enhancing data pipeline development by speeding data onboarding with Schema Inference, a new feature now in private preview. Native dynamic tables, likewise in private preview, automate incremental processing through declarative data pipeline development.
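Conceptually, a dynamic table lets a developer state a transformation once and have the engine keep the result current by processing only rows that arrived since the last refresh. A toy pure-Python sketch of that incremental-refresh idea (all names are invented for illustration; this is not Snowflake's dynamic-table syntax):

```python
# Toy illustration of declarative, incremental pipeline refresh.
# Names are invented for illustration; this is not the Snowflake API.

class DynamicTable:
    def __init__(self, source, transform):
        self.source = source          # the raw rows feeding the pipeline
        self.transform = transform    # the transformation, declared once
        self.result = []              # materialized output
        self._processed = 0           # high-water mark into the source

    def refresh(self):
        """Apply the transform only to rows added since the last refresh."""
        new_rows = self.source[self._processed:]
        self.result.extend(self.transform(r) for r in new_rows)
        self._processed = len(self.source)
        return len(new_rows)          # rows processed by this refresh

# Usage: the pipeline is defined once; each refresh is incremental.
events = [{"amount": 10}, {"amount": 25}]
totals = DynamicTable(events, lambda r: r["amount"] * 2)
totals.refresh()                      # processes the 2 initial rows
events.append({"amount": 5})          # new data arrives
totals.refresh()                      # processes only the 1 new row
```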
A new cross-cloud feature is intended to make pipeline failover transparent. “If someone is transforming data and a disaster occurs and they want to switch to a different region or cloud, we will be able to seamlessly support that, knowing that there’s no data duplicated,” said Christian Kleinerman, senior vice president of product.
Observability and alerts
Native observability and developer experience features are being enhanced with alerting, logging, event tracing, task graphs and history, all of which are available in either public or private preview. “You can send log lines you create from your custom code, inject trace events and control it through different tracing levels,” said Torsten Grabs, director of product management. “That becomes a key building block for allowing you to monitor the health of an application that you’re running on Snowflake.”
The new alerting feature lets developers define the conditions under which an alert fires and sends its notification via email. Integration with other messaging platforms is planned, Grabs said.
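The pattern described here — a stored condition that is evaluated against incoming data, with a notification action when it holds — can be sketched in a few lines of plain Python (the function names and the threshold below are invented for illustration, not Snowflake's alerting syntax):

```python
# Toy sketch of condition-based alerting; names are invented for illustration.

def make_alert(condition, notify):
    """Return a checker that fires `notify` only when `condition` holds."""
    def check(metrics):
        if condition(metrics):
            notify(metrics)
            return True
        return False
    return check

sent = []
alert = make_alert(
    condition=lambda m: m["failed_rows"] > 100,    # when the alert fires
    notify=lambda m: sent.append(f"email: {m['failed_rows']} failed rows"),
)

alert({"failed_rows": 7})      # below threshold: no notification sent
alert({"failed_rows": 250})    # fires and records an email-style message
```

Separating the condition from the notification action is what would let the same alert later target other messaging platforms, as the article notes is planned.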
Performance enhancements to the Snowflake platform announced today include a query acceleration service that speeds up outsized queries by provisioning additional resources, without requiring users to deploy more overall compute capacity. Query efficiency is improved through join elimination, and the integrated search engine has been upgraded. Through history views, customers can now more easily run a cost-benefit analysis to determine the magnitude and impact of data loads or modifications on tables.
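Join elimination, in general, means the optimizer skips a join entirely when the query references no columns from the joined table and constraints guarantee the join cannot change the row set. A toy version of that planner decision in Python (illustrative only; this is not Snowflake's optimizer logic):

```python
# Toy illustration of the join-elimination decision; not Snowflake's planner.

def can_eliminate_join(selected_cols, joined_table_cols, fk_guarantees_match):
    """A join is removable when the query uses none of the joined table's
    columns and a constraint guarantees exactly one match per row, so
    dropping the join cannot change the result."""
    uses_joined_cols = any(c in joined_table_cols for c in selected_cols)
    return fk_guarantees_match and not uses_joined_cols

# e.g. SELECT o.id, o.amount FROM orders o JOIN customers c ON o.cust_id = c.id
print(can_eliminate_join(
    selected_cols={"o.id", "o.amount"},
    joined_table_cols={"c.id", "c.name"},
    fk_guarantees_match=True,
))  # True: the customers join contributes no referenced columns
```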
Kleinerman said Snowflake is steadily improving the performance of its platform in a manner that is seamless to customers. “Performance improvements translate to better economics because we charge for compute time,” he said, “so if workflows are taking less time, we are charging less time, and price/performance for our customers continues to get better.”
Photo: Robert Hof/SiliconANGLE