Personal Picks: Data Product News (April 16, 2025)

Apr 16, 2025 - 03:43

Note: This is an English translation of the original Japanese article.

Hello, this is Sagara.

As a Modern Data Stack consultant, I see a constant flow of new information across the Modern Data Stack ecosystem. In this article, I summarize the Modern Data Stack news that caught my attention over the past two weeks.

Disclaimer: This article doesn't cover all the latest information about the products mentioned. It only includes information that I personally found interesting.

Modern Data Stack in General

How AI will Disrupt BI As We Know It

Tristan, the CEO of dbt Labs, published an article analyzing how AI might fundamentally transform Business Intelligence (BI).

Traditional BI tools have three main functions: "modeling," "exploratory data analysis (EDA)," and "presentation." The article argues that with MCP (Model Context Protocol), AI can now access enterprise data and could take over the EDA function, which is a core capability of BI tools.

https://roundup.getdbt.com/p/how-ai-will-disrupt-bi-as-we-know

Data Extract/Load

Fivetran

Managed Data Lake Service now supports Google Cloud Storage

Fivetran's Managed Data Lake Service, which manages catalogs and updates tables using Open Table Formats (Iceberg, Delta Lake), now supports Google Cloud Storage.

For catalogs, the Fivetran Iceberg REST Catalog is used by default, but users can optionally use a BigQuery metastore that they manage themselves.

https://www.fivetran.com/blog/unlock-interoperability-with-fivetran-managed-data-lake-service-for-googles-cloud-storage

https://fivetran.com/docs/destinations/managed-data-lakes-service

Airbyte

Airbyte version 1.6 released

Airbyte has released its latest version, 1.6.

Key updates include:

  • Dashboard to visualize sync success and failure
  • Timeline events to record schema changes
  • Ability to easily copy connector configuration JSON objects from the GUI
  • Connector Builder support for Asynchronous Streams
  • Data loading across multiple clouds and regions (Self-Managed Enterprise only)

https://docs.airbyte.com/release_notes/v-1.6

https://github.com/airbytehq/airbyte/releases/tag/v1.6.0

Data Warehouse/Data Lakehouse

Snowflake

terraform-provider-snowflake to be GA on 4/23

The ROADMAP.md in the official repository has been updated, indicating that terraform-provider-snowflake will become GA on 4/23.

Additionally, v2.0.0 will be released soon. It includes some breaking changes, such as changed default values for resources and newly added sensitive flags.

https://github.com/snowflakedb/terraform-provider-snowflake/blob/main/ROADMAP.md#ga-release

Snowpipe AUTO_INGEST and directory table AUTO_REFRESH now available for named internal stages

In the upcoming v9.10 release, Snowpipe AUTO_INGEST and directory table AUTO_REFRESH will be available for named internal stages (as a preview feature).

Previously, a Snowpipe on a named internal stage could only be triggered through the API, so this feature will be particularly useful for workloads that rely on internal stages.

https://docs.snowflake.com/release-notes/2025/9_10#data-loading-unloading-updates

BigQuery

Google Cloud Next'25 announced many new features for the "Autonomous Data to AI" platform

Google Cloud Next'25 was held, and many new features were announced for the "Autonomous Data to AI" platform.

With "Autonomous" as a key concept, Google Cloud announced Data Agents, a new feature that autonomously develops data pipelines on Google Cloud:

  • Data Engineering Agent
  • Data Governance Agent
  • Data Science Agent
  • Conversational Agent for Business Users
    • This is already marked as GA and appears to align with the Conversational Agent described in this article.

Our company has written a blog post about the session that explained these new features, so please check it out as well.

https://dev.classmethod.jp/articles/20250410-next25-session-report/

Additionally, Google Cloud announced the Agent Development Kit and Agent2Agent (A2A) Protocol. For more information, please refer to the official recap article and our company blog.

https://cloud.google.com/blog/topics/google-cloud-next/google-cloud-next-2025-wrap-up?hl=en

https://dev.classmethod.jp/referencecat/google-cloud-next-25/

MotherDuck/DuckDB

Data pipeline development using MCP, DuckDB, and dbt

MotherDuck's official blog published an article about the benefits and examples of data pipeline development using MCP, DuckDB, and dbt.

The typical workflow with generative AI has been "create prompt → AI generates code → test with data." However, when AI generates inaccurate code, you need to modify the prompt, regenerate the code, and retest.

By using MCP, you can query data through an MCP server for DuckDB/MotherDuck, and with that data, AI can handle both code generation and testing, allowing for faster iterations of the "create prompt → AI generates code → test with data" workflow.
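
The loop described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code from the MotherDuck article: Python's built-in sqlite3 module stands in for DuckDB/MotherDuck, a fixed list of candidate queries stands in for what the AI would generate after each round of feedback, and the names `check_candidate_sql` and `iterate` are hypothetical.

```python
import sqlite3


def check_candidate_sql(conn, sql, expected_columns):
    """Run one candidate query and return (ok, feedback) for the next prompt."""
    try:
        cursor = conn.execute(sql)
    except sqlite3.Error as exc:
        # The database error is exactly the feedback an agent needs to retry.
        return False, f"query failed: {exc}"
    columns = [col[0] for col in cursor.description]
    if columns != expected_columns:
        return False, f"expected columns {expected_columns}, got {columns}"
    rows = cursor.fetchall()
    if not rows:
        return False, "query returned no rows"
    return True, f"ok: {len(rows)} rows"


def iterate(conn, candidates, expected_columns):
    """Try candidates in order and stop at the first one that passes."""
    history = []
    for sql in candidates:
        ok, feedback = check_candidate_sql(conn, sql, expected_columns)
        history.append((sql, feedback))
        if ok:
            return sql, history
    return None, history


# Sample data (assumption: a tiny in-memory table instead of a real warehouse).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "a", 10.0), (2, "a", 5.0), (3, "b", 7.5)],
)

# Hypothetical AI output: the first attempt guesses a column that does not exist.
candidates = [
    "SELECT customer, SUM(total) AS revenue FROM orders GROUP BY customer",
    "SELECT customer, SUM(amount) AS revenue FROM orders GROUP BY customer",
]
accepted, history = iterate(conn, candidates, ["customer", "revenue"])
for sql, feedback in history:
    print(feedback)
```

The point of the sketch is that the test step runs against actual data and returns concrete feedback, so the retry no longer depends on a human rewriting the prompt by hand.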

https://motherduck.com/blog/faster-data-pipelines-with-mcp-duckdb-ai/

Data Transform

dbt

Summary of new dbt Cloud features

An article summarizing new dbt Cloud features was published:

  • Model query history is now GA
  • Power BI and dbt Semantic Layer integration is in beta
  • dbt Copilot is generally available
  • Next-generation dbt engine incorporating the acquired SDF is in private beta
  • dbt supports DataFrames on BigQuery, usable in dbt Python models

https://www.getdbt.com/blog/whats-new-in-dbt-cloud-april-2025

dbt Cloud can now be hosted on Google Cloud and is available on the Marketplace

dbt Cloud can now be hosted in Google Cloud's US East region, and dbt Cloud is now available on the Google Cloud Marketplace.

https://www.getdbt.com/blog/dbt-cloud-google-cloud

https://www.getdbt.com/blog/dbt-labs-launches-on-google-cloud-and-google-cloud-marketplace

Business Intelligence

Looker

Latest Looker information revealed at Google Cloud Next'25

Google Cloud Next'25 included a Looker session, and a blog summarizing the latest information was published.

The blog covers three topics: Gemini in Looker is now available for all Looker instances regardless of hosting, Looker reports have been introduced (likely similar to what was previously called Studio in Looker), and LookML CI/CD capabilities have been updated following the acquisition of Spectacles.dev.

https://cloud.google.com/blog/products/data-analytics/looker-bi-platform-gets-ai-powered-data-exploration?hl=en

The session materials from Google Cloud Next'25 are available at the following links.

Personally, I found the Embedded Looker Reports and Embedded Looker Conversational Analytics on page 17 particularly interesting.

https://cloud.withgoogle.com/next/25/session-library?session=BRK1-024

https://content-cdn.sessionboard.com/content/RZwjyxNXR5qIQ46vfqt1_BRK1-024.pdf

Version 25.6 release notes published

Release notes for the latest Looker version, 25.6, have been published. This release consists mainly of minor changes such as driver updates.

https://cloud.google.com/looker/docs/release-notes#April_09_2025

Tableau

Tableau Conference 2025 is being held

Tableau Conference 2025 is being held from April 15 to April 17, 2025 (local time).

https://www.salesforce.com/tableau-conference/?utm_source=tableau&utm_medium=web-login-promo

A blog summarizing the new features announced at the conference has also been published.

The conference introduced the concept of "Agentic Analytics," which enables humans to collaborate with AI agents for data analysis and action execution through Tableau Next (formerly Tableau Einstein).

  • Tableau Next is a new BI platform that runs on the Salesforce platform
  • With Tableau Next, Data Cloud functionality allows access to data from data warehouses like Snowflake without copying
  • "Tableau Semantics," included in Tableau Next, allows defining data models with contextual information on the Salesforce platform
  • Tableau Next enables visualization through a UI similar to the traditional Tableau
  • Tableau Next offers many AI-powered features such as Data Pro for automating data transformation, Concierge for natural language interaction with data, and Inspector for data monitoring and alerts

https://www.tableau.com/ja-jp/blog/agentic-analytics-new-paradigm-for-business-intelligence

If you want to understand what Tableau Next is and its positioning, the following article provides detailed information.

https://www.tableau.com/ja-jp/blog/tableau-next-faq

Hex

Announced Embedded Analytics feature

Hex announced a new Embedded Analytics feature that allows embedding Hex "Apps" (a feature that lets you publish selected notebook cells together) into applications.

https://hex.tech/blog/introducing-embedded-analytics/

According to the documentation below, embedding notebooks or Explore is not supported, so please be aware of this limitation.

View/edit the notebook - no near term plans to support
Comments - no near term plans to support
Explore - no near plans to support
Snapshots - no near term plans to support

https://learn.hex.tech/docs/share-insights/embedding/signed-embedding

Lightdash

"Preview Projects" released, allowing testing on Lightdash with any branch

Lightdash released a new feature called "Preview Projects," which allows testing on Lightdash with any branch before merging to main.

This is useful for checking how dimensions and metrics will be visualized in branches before merging.

https://changelog.lightdash.com/

https://docs.lightdash.com/references/preview-projects#lightdash-app

Steep

Released new features including applying existing filter settings to filter value candidates and month-over-month/year-over-year comparisons

Steep released new features including Smarter filtering (applying existing filter settings to filter value candidates) and month-over-month/year-over-year comparisons.

https://steep.app/blog/17-powerful-upgrades

Data Catalog

Secoda

Summary article on new Secoda features announced at Data Leaders Forum

An article summarizing the new Secoda features announced at the Data Leaders Forum hosted by Secoda was published.

The article introduces four major features: Policies, Access requests, Custom roles, and Logs and version history.

Personally, I'm very interested in "Access requests"! Until now, features for managing access rights to various resources seemed to be available only in tools like DataZone and Immuta, so I'm very pleased that a data catalog in the MDS space has taken on this functionality.

Details of each feature are as follows (summary created with generative AI):

  • Policies

    • Feature to define and implement data governance standards in an automated, extensible way
    • Specify resources, compliance conditions, and actions when resources don't meet standards
    • Provides templates aligned with regulatory frameworks like GDPR, HIPAA, SOC 2
    • Features customizable severity levels, conditions for required documentation and owner assignment, and automated remediation steps
  • Access requests

    • Feature to request access permissions directly from within Secoda
    • Users request access from Secoda or Slack, and administrators manage and approve through a centralized interface
    • Enables resource selection, period and purpose specification, role assignment, and access scope and expiration settings
    • Prevents long-term permission sprawl with automatic expiration and tracks all requests, approvals, and changes in audit logs
  • Custom roles

    • Design organization-specific roles to precisely control who can do what and where
    • Define permissions in four areas: user management, resource access, feature access, and workspace settings
    • Addresses various use cases for data engineering teams, finance departments, consulting teams, analysts, etc.
    • Impersonation feature to preview what users in each role can see
  • Logs and version history

    • Enhanced logging capabilities to fully understand changes across the workspace
    • All automatic or manual updates are tracked at the resource level, filterable by user, date, action type, and affected assets
    • Version history for all resources, allowing metadata restoration to previous states
    • View activity logs per resource, restore to previous versions, compare versions, and track user activity
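
To make the "Policies" idea above more concrete, here is a minimal sketch of a policy defined as a target, a compliance condition, and an action. It is an assumption-laden illustration and not Secoda's API: the `Resource` and `Policy` classes, their fields, and the `evaluate` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Resource:
    # Hypothetical catalog entry; real catalogs track far more metadata.
    name: str
    kind: str
    owner: Optional[str] = None
    description: str = ""


@dataclass
class Policy:
    # Hypothetical policy: which resources it targets, what "compliant"
    # means, and what to do when a resource falls short.
    name: str
    severity: str
    applies_to: Callable[[Resource], bool]
    is_compliant: Callable[[Resource], bool]
    action: str


def evaluate(policy: Policy, resources: List[Resource]) -> List[str]:
    """Return one message per targeted resource that violates the policy."""
    return [
        f"[{policy.severity}] {r.name}: {policy.action}"
        for r in resources
        if policy.applies_to(r) and not policy.is_compliant(r)
    ]


# Example policy: every table needs an owner and a description.
documented_tables = Policy(
    name="tables-must-be-documented",
    severity="high",
    applies_to=lambda r: r.kind == "table",
    is_compliant=lambda r: r.owner is not None and r.description != "",
    action="notify the data team",
)

resources = [
    Resource("orders", "table", owner="sagara", description="Order facts"),
    Resource("customers", "table"),
    Resource("weekly_kpi", "dashboard"),
]
violations = evaluate(documented_tables, resources)
print(violations)
```

Here only `customers` is flagged: `orders` satisfies the condition, and `weekly_kpi` is outside the policy's target because it is not a table.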

https://www.secoda.co/blog/secoda-spring-25-keynote-launch

Data Mesh

Nextdata

Announced Nextdata OS

Nextdata, founded by Zhamak Dehghani, who conceptualized Data Mesh, announced "Nextdata OS." It appears to be an integrated platform for Data Products, but the details of the product are not yet clear.

https://www.globenewswire.com/news-release/2025/04/08/3057901/0/en/Founded-by-Zhamak-Dehghani-Nextdata-Launches-Nextdata-OS-to-Help-Enterprises-Automate-Data-Management-for-Agents-Analytics-and-Apps.html

Nextdata's official website has also been updated. There's a simple product introduction video available.

https://www.nextdata.com/

A launch event will be held on April 22 (local time).

https://www.eventbrite.com/e/nextdata-os-launch-event-introducing-autonomous-data-products-tickets-1316828271809?aff=Website