DataStage Training:

Course Name: IBM DataStage

Course duration: 30 Hours

Classes: Classroom Training / Online Training

Data Warehouse Fundamentals:

Unit – 1: Basics of Data Warehouse

  • An introduction to Data Warehousing
  • Purpose of Data Warehouse
  • Characteristics of DWH
  • Data Warehouse Life cycle
  • Data Warehouse Architecture
  • Different Approaches of DWH (Kimball Vs Inmon)
  • Operational Data Store
  • OLTP Vs OLAP Databases
  • OLTP Vs Warehouse Applications
  • Data Marts
  • Data Marts Vs Data Warehouses
  • Fact Table Vs Dimension Table
  • Concepts of Schemas (Star schema & Snowflake schema)
  • Industry leading ETL and Reporting tools

Unit – 2: Data Modeling

  • Introduction to Data Modeling
  • Entity Relationship model (E-R model)
  • Data Modeling for Data Warehouse, Normalization process
  • Dimensions and fact tables
  • Star Schema and Snowflake Schemas

Unit – 3: ETL Design Process

  • Introduction to Extraction, Transformation & Loading
  • Types of ETL Tools
  • Key tools in the market

Unit – 4: Introduction to DataStage Versions 7.5x2, 8.1 & 8.5

  • DataStage introduction
  • IBM Information Server architecture
  • DataStage components
  • DataStage main functions
  • Client components: adding different servers to the workspace

Unit – 5: DataStage Administrator

  • DataStage project administration
  • Editing projects and adding projects
  • Deleting projects and cleaning up project files
  • Environment variables
  • Environment management
  • Auto purging
  • Runtime Column Propagation(RCP)
  • Add checkpoints for sequencer
  • NLS configuration
  • Generated OSH (Orchestrate engine)
  • System formats such as date and timestamp
  • Project protection – Version details

Unit – 6: DataStage Director

  • Introduction to DataStage Director
  • Validating DataStage jobs
  • Executing DataStage jobs
  • Job execution status
  • Monitoring a job
  • Job log view
  • Job scheduling
  • Creating Batches
  • Scheduling batches

Unit – 7: DataStage Designer

  • Introduction to DataStage Designer
  • Importance of Parallelism
  • Pipeline Parallelism
  • Partition Parallelism
  • Partitioning and collecting (in-depth coverage of partitioning and collecting techniques)
  • Symmetric Multiprocessing (SMP)
  • Massively Parallel Processing (MPP)
  • Introduction to Configuration file
  • Editing a Configuration file
  • Partition techniques
  • DataStage Repository Palette
  • Passive and Active stages
  • Job design overview
  • Designer work area
  • Annotations
  • Creating jobs
  • Importing flat file definitions
  • Managing the Metadata environment
  • Dataset management
  • Deletion of Dataset
  • Routines

Unit – 8: Working with Parallel Job Stages

Database Stages

  • Oracle
  • ODBC
  • Dynamic RDBMS

File Stages

  • Sequential file
  • Dataset
  • File set
  • Lookup file set

Processing Stages

  • Copy
  • Filter
  • Funnel
  • Sort
  • Remove Duplicates
  • Aggregator
  • Switch
  • Pivot stage
  • Lookup
  • Join
  • Merge
  • Difference between Lookup, Join and Merge
  • Change capture
  • External Filter
  • Surrogate key generator
  • Transformer

Real-time scenarios using different Processing Stages – implementing different logic using Transformer

Debug Stages

  • Head
  • Tail
  • Peek
  • Column generator
  • Row generator
  • Write Range Map Stage

Real Time Stages

  • XML input
  • XML output
  • Local and Shared containers
  • Routines creation
  • Extensive usage of job parameters, parameter sets and environment variables in jobs
  • Introduction to predefined environment variables
  • Creating user-defined environment variables and implementing them in parallel jobs

Unit – 9: Advanced Stages in Parallel Jobs (Version 8.1)

  • Explanation of Type 1 and Type 2 processes
  • Implementation of Type 1 and Type 2 logic using Change Capture stage and SCD stage
  • Range Lookup process
  • Surrogate key generator stage
  • FTP stage
  • Job performance analysis
  • Resource estimation
  • Performance tuning

Unit – 10: Job Sequencers

  • Arrange job activities in Sequencer
  • Triggers in Sequencer
  • Restartability
  • Recoverability
  • Notification activity
  • Terminator activity
  • Wait for file activity
  • Start Loop activity
  • Execute Command activity
  • Nested Condition activity
  • Exception handling activity
  • User Variable activity
  • End Loop activity
  • Adding Checkpoints
  • Jobs used in different real time scenarios
  • Explanation of Sequence Job stages through different Jobs

Unit – 11: IBM Information Server Administration Guide

  • IBM WebSphere DataStage administration
  • Opening the IBM Information Server Web console
  • Setting up a project in the console
  • Customizing the project dashboard
  • Setting up security
  • Creating users in the console
  • Assigning security roles to users and groups
  • Managing licenses
  • Managing active sessions
  • Managing logs
  • Managing schedules
  • Backing up and restoring IBM Information Server

Additional Features

  • Performance Tuning of Parallel Jobs
  • DataStage installation process and setup
  • Project Explanation