Teradata Basic Introduction (8)

Basic Introduction

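Learning Objectives

Gain an initial understanding of the characteristics of several approaches to query processing on Hadoop.

Key Concepts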

1. Raw MapReduce

  • Native MapReduce processing.
  • Direct commands to Hadoop and HDFS.
  • Programming and MapReduce skills required.
  • Batch processing focused.
  • Full flexibility to operate on any data in HDFS.
  • “Data Manipulation” more than “Query Processing”.

Example: Apache Hadoop
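
To make the "programming skills required" point concrete, here is a minimal sketch of the map, shuffle, and reduce steps written in plain Python. It runs in a single process and does not use Hadoop; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions made for illustration only.

```python
from collections import defaultdict

def map_phase(line):
    # Map: emit a (word, 1) pair for every word in one input line.
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    # Shuffle: group all emitted values by key, as the framework would
    # do between the map and reduce steps.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce: collapse the list of values for one key into a total.
    return key, sum(values)

def word_count(lines):
    mapped = [pair for line in lines for pair in map_phase(line)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

# Hypothetical sample input standing in for lines of a file in HDFS.
counts = word_count(["Hadoop stores data in HDFS", "HDFS stores blocks"])
```

Even a simple count needs explicit map and reduce logic, which is why this style leans toward data manipulation rather than ad hoc querying.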

2. Query Engine Using HDFS Files

  • SQL query engine on Hadoop cluster.
    • Standard data dictionary/metadata.
    • Standard data format within HDFS files.
    • Data types may be limited.
  • SQL language, but standards compatibility varies.
  • Query engine maturity varies.
  • Data “portable” and can be read by other systems/engines.

Example: Cloudera Impala
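
The role of a shared data dictionary over standard-format files can be sketched as follows. This is a toy single-process illustration, not how Impala or any other engine is implemented; the CATALOG structure, the "sales" table, and the in-memory string standing in for an HDFS file are all assumptions.

```python
import csv
import io

# Hypothetical data dictionary: table name -> column names, types, delimiter.
CATALOG = {
    "sales": {"columns": [("region", str), ("amount", int)], "delimiter": ","},
}

# Stand-in for the contents of a delimited text file stored in HDFS.
SALES_FILE = "east,100\nwest,250\neast,50\n"

def scan(table, raw_text):
    # Apply the catalog's schema to the raw file as it is read.
    meta = CATALOG[table]
    reader = csv.reader(io.StringIO(raw_text), delimiter=meta["delimiter"])
    for fields in reader:
        yield {name: cast(value)
               for (name, cast), value in zip(meta["columns"], fields)}

def total_by_region(rows):
    # Roughly what SELECT region, SUM(amount) ... GROUP BY region computes.
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + row["amount"]
    return totals

totals = total_by_region(scan("sales", SALES_FILE))
```

Because the schema lives in the catalog and the file stays in a standard format, any other engine that understands the same catalog entry could read the same data, which is the sense in which the data is "portable".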

3. RDBMS Orchestrating Queries With Remote Access to Hadoop/Hive

  • External RDBMS sends (part of) queries to engine on Hadoop.
    • Standard data dictionary/metadata within Hadoop cluster.
    • Standard data format within HDFS files.
    • Data types may be limited by engine on Hadoop and external RDBMS.
    • SQL query capabilities are a combination of the external RDBMS and the engine on Hadoop.
    • Combines data and analytics in two systems.
  • SQL language, standards compatibility generally high.
  • Query engine generally mature.
  • Data “portable” and can be read by other systems/engines.

Example: Teradata QueryGrid
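
The idea of sending part of a query to the engine on Hadoop and finishing it in the external RDBMS can be sketched as below. This is a simplified simulation and does not reflect the actual QueryGrid interface; remote_engine_query, customers_who_viewed, and both sample data sets are hypothetical.

```python
# Stand-in for click rows stored in the Hadoop cluster.
REMOTE_CLICKS = [
    {"customer_id": 1, "page": "home"},
    {"customer_id": 2, "page": "cart"},
    {"customer_id": 1, "page": "cart"},
]

# Stand-in for a customer table held in the external RDBMS.
LOCAL_CUSTOMERS = {1: "Alice", 2: "Bob"}

def remote_engine_query(page):
    # Hypothetical remote engine: the filter is evaluated on the Hadoop
    # side, so only matching rows travel back to the RDBMS.
    return [row for row in REMOTE_CLICKS if row["page"] == page]

def customers_who_viewed(page):
    # The RDBMS pushes the filter down, then joins the returned rows
    # with its own customer table.
    remote_rows = remote_engine_query(page)
    return sorted({LOCAL_CUSTOMERS[row["customer_id"]] for row in remote_rows})

result = customers_who_viewed("cart")
```

In this sketch the filter runs where the data lives and the join runs in the RDBMS, which illustrates how the combined capabilities of the two systems answer one query.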

Related Resources