After introducing a number of new services for machine learning and database analytics on Tuesday, Amazon Web Services Inc. doubled down today with a host of new tools aimed at providing what one top executive described as a “future-proof” data strategy.
In a keynote address led by Swami Sivasubramanian, vice president of database, analytics and machine learning at AWS, the cloud giant introduced enhancements for file ingestion, workload scaling and managing overall data quality. Sivasubramanian noted at the start of his presentation that the latest innovations were a continuation of Amazon’s lengthy history of data management.
“We have been in the data business long before AWS came into existence,” Sivasubramanian said. “We used data to anticipate our customers’ need for expanded storage which paved the way for AWS. We built our business on data.”
New analytics capabilities
To help other enterprises build on data, AWS focused this week on providing tools designed to address a number of enterprise pain points by making it faster and easier to manage and analyze data at petabyte scale. The company unveiled several new database and analytics capabilities on Wednesday to support this approach.
Amazon OpenSearch Serverless will run search and analytics workloads without requiring customers to configure or manage the underlying infrastructure. Workload support also got a boost with the introduction of Amazon DocumentDB Elastic Clusters, which scale document workloads to millions of writes per second and multiple petabytes of storage.
AWS also rolled out Amazon Athena for Apache Spark. This new feature will cut the time needed to start running interactive Spark analytics from minutes to less than a second, according to Sivasubramanian.
Building on its AWS Glue serverless data integration service, the company announced Glue Data Quality, which will reduce the time needed for data analysis by automatically measuring and monitoring data quality across pipelines. The Amazon Redshift cloud data warehouse will also now support high-availability configurations across multiple AWS Availability Zones. The goal is to deliver the reliability and availability that mission-critical analytics workloads require, balancing data security with the need for faster recovery.
“While these security solutions are critical, we also believe they should not slow you down,” Sivasubramanian said.
In addition to the Availability Zone enhancement for Redshift, AWS announced Centralized Access Controls for Redshift Data Sharing, which governs access using AWS Lake Formation, and a new ability to auto-copy files from S3 into Redshift.
Amazon SageMaker, the company’s cloud machine learning platform, was clearly a focus this week. AWS unveiled eight new SageMaker capabilities on Wednesday, including a Model Dashboard for tracking machine learning model performance, a Role Manager solution for defining access and permissions, and a streamlined data preparation capability in SageMaker Studio Notebooks.
The company also showcased expanded SageMaker support for geospatial data. SageMaker will now simplify the generation of geospatial machine learning predictions and speed up model development.
“Geospatial datasets are typically massive and unstructured, and the tools are really limited,” Sivasubramanian said. “We are making it easier for customers to unlock the value of geospatial data. These types of innovations demonstrate the impact that data can have on organizations and the world.”
On Wednesday, AWS announced general availability of Trusted Language Extensions for PostgreSQL, a new open-source development kit. The company already supported more than 85 PostgreSQL extensions in Amazon Aurora and Amazon RDS, and this latest release responds to customers’ interest in the flexibility to build and run their own extensions for PostgreSQL database instances.
Aurora also gained Amazon GuardDuty RDS Protection, which uses machine learning to identify threats to data stored in Aurora databases. This single-click functionality will be available to AWS customers at no additional cost during the preview period.
In his keynote, Sivasubramanian outlined a vision of an enterprise world in which data becomes the connective tissue threaded across organizations. According to the AWS executive, this will create a lasting culture of innovation built around data and the tools to maximize its value.
“It’s individuals who create these sparks of innovations, but it is the leaders who must create a data driven culture to help them get there,” Sivasubramanian said.