Channel: colmap – Oracle DBA – Tips and Techniques

Oracle Goldengate Tutorial 8 – Filtering and Mapping data

Oracle GoldenGate not only provides us a replication solution that is Oracle version independent as well as platform independent, but we can also use it to do data transformation and data manipulation between the source and the target.

So we can use GoldenGate when the source and target databases differ in table structure, and also as an ETL tool in a data warehouse type environment.

We will discuss below two examples to demonstrate this feature – column mapping and filtering of data.

In example 1, we will filter the records that are extracted on the source and applied on the target – only rows where the JOB column value is ‘MANAGER’ in the MYEMP table will be considered for extraction.

In example 2, we will deal with a case where the table structure is different between the source database and the target database and see how column mapping is performed in such cases.

Example 1

Initial load of all rows which match the filter from source to target. The target database MYEMP table will only be populated with rows from the source MYEMP table where the filter criteria of JOB='MANAGER' is met.

On Source

GGSCI (redhat346.localdomain) 4> add extract myload1, sourceistable
EXTRACT added.

GGSCI (redhat346.localdomain) 5> edit params myload1

EXTRACT myload1
USERID ggs_owner, PASSWORD ggs_owner
RMTHOST devu007, MGRPORT 7809
RMTTASK replicat, GROUP myload1
TABLE scott.myemp, FILTER (@STRFIND (job, "MANAGER") > 0);
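
Note that @STRFIND returns the position of the search string within the column value, or 0 when it is not found, so this filter keeps any row whose JOB value contains MANAGER rather than only exact matches. The Python sketch below models that behaviour for illustration only. The helper names strfind and passes_filter are hypothetical and are not part of GoldenGate, and the third sample row is invented to show a substring match.

```python
def strfind(value, search):
    """Model of @STRFIND: 1-based position of search in value, 0 if absent."""
    # str.find returns -1 when the substring is absent, so adding 1 yields 0.
    return value.find(search) + 1


def passes_filter(row):
    """Model of FILTER (@STRFIND (job, "MANAGER") > 0)."""
    return strfind(row["job"], "MANAGER") > 0


rows = [
    {"empno": 1234, "job": "MANAGER"},
    {"empno": 1235, "job": "SALESMAN"},
    {"empno": 1236, "job": "SENIOR MANAGER"},  # hypothetical row: substring match
]

# Only rows that pass the filter would be written to the trail.
kept = [r["empno"] for r in rows if passes_filter(r)]
print(kept)
```

If only exact matches should be replicated, an equality test (for example through the WHERE option of the TABLE parameter) may be a better fit than a substring search.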

On Target

GGSCI (devu007) 2> add replicat myload1, specialrun
REPLICAT added.

GGSCI (devu007) 3> edit params myload1

"/u01/oracle/software/goldengate/dirprm/myload1.prm" [New file]
REPLICAT myload1
USERID ggs_owner, PASSWORD ggs_owner
ASSUMETARGETDEFS
MAP scott.myemp, TARGET sh.myemp;

On Source – start the initial load extract

GGSCI (redhat346.localdomain) 6> start extract myload1

Sending START request to MANAGER …
EXTRACT MYLOAD1 starting

On SOURCE

SQL> select count(*) from myemp;

  COUNT(*)
----------
        14

SQL> select count(*) from myemp where job='MANAGER';

  COUNT(*)
----------
         9

On TARGET

SQL> select count(*) from myemp where job='MANAGER';

  COUNT(*)
----------
         9

Create an online change extract and replicat group using a Filter

GGSCI (redhat346.localdomain) 10> add extract myload2, tranlog, begin now
EXTRACT added.

GGSCI (redhat346.localdomain) 11> add rmttrail /u01/oracle/software/goldengate/dirdat/bb, extract myload2
RMTTRAIL added.

GGSCI (redhat346.localdomain) 11> edit params myload2

EXTRACT myload2
USERID ggs_owner, PASSWORD ggs_owner
RMTHOST 10.53.200.225, MGRPORT 7809
RMTTRAIL /u01/oracle/software/goldengate/dirdat/bb
TABLE scott.myemp, FILTER (@STRFIND (job, "MANAGER") > 0);

On Target

GGSCI (devu007) 2> add replicat myload2, exttrail /u01/oracle/software/goldengate/dirdat/bb
REPLICAT added.

GGSCI (devu007) 3> edit params myload2

"/u01/oracle/software/goldengate/dirprm/myload2.prm" [New file]
REPLICAT myload2
ASSUMETARGETDEFS
USERID ggs_owner, PASSWORD ggs_owner
MAP scott.myemp, TARGET sh.myemp;

On Source – start the online extract group

GGSCI (redhat346.localdomain) 13> start extract myload2

Sending START request to MANAGER …
EXTRACT MYLOAD2 starting

GGSCI (redhat346.localdomain) 14> info extract myload2

EXTRACT MYLOAD2 Last Started 2010-02-23 11:04 Status RUNNING
Checkpoint Lag 00:27:39 (updated 00:00:08 ago)
Log Read Checkpoint Oracle Redo Logs
2010-02-23 10:36:51 Seqno 214, RBA 103988

On Target

GGSCI (devu007) 4> start replicat myload2

Sending START request to MANAGER …
REPLICAT MYLOAD2 starting

GGSCI (devu007) 5> info replicat myload2

REPLICAT MYLOAD2 Last Started 2010-02-23 11:05 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:08 ago)
Log Read Checkpoint File /u01/oracle/software/goldengate/dirdat/bb000000
First Record RBA 989

On Source we now insert two rows into the MYEMP table – one which has the JOB value of ‘MANAGER’ and another which has the JOB value of ‘SALESMAN’.


On SOURCE

SQL> INSERT INTO MYEMP
  2  (empno,ename,job,sal)
  3  VALUES
  4  (1234,'GAVIN','MANAGER',10000);

1 row created.

SQL> commit;

Commit complete.

SQL> INSERT INTO MYEMP
  2  (empno,ename,job,sal)
  3  VALUES
  4  (1235,'BOB','SALESMAN',1000);

1 row created.

SQL> commit;

Commit complete.

SQL> select count(*) from myemp;

  COUNT(*)
----------
        16

SQL> select count(*) from myemp where job='MANAGER';

  COUNT(*)
----------
        10

On Target, we will see that even though two rows have been inserted into the source MYEMP table, on the target MYEMP table only one row is inserted because the filter has been applied which only includes the rows where the JOB value equals ‘MANAGER’.

SQL> select count(*) from myemp;

  COUNT(*)
----------
        10

Example 2 – source and target table differ in column structure

In the source MYEMP table we have a column named SAL whereas on the target, the same MYEMP table has the column defined as SALARY.

Create a definitions file on the source using the DEFGEN utility and then copy that definitions file to the target system.

GGSCI (redhat346.localdomain) > EDIT PARAMS defgen

DEFSFILE /u01/oracle/ggs/dirsql/myemp.sql
USERID ggs_owner, PASSWORD ggs_owner
TABLE scott.myemp;

[oracle@redhat346 ggs]$ ./defgen paramfile /u01/oracle/ggs/dirprm/defgen.prm

***********************************************************************
Oracle GoldenGate Table Definition Generator for Oracle
Version 10.4.0.19 Build 002
Linux, x64, 64bit (optimized), Oracle 11 on Sep 18 2009 00:09:13

Copyright (C) 1995, 2009, Oracle and/or its affiliates. All rights reserved.

Starting at 2010-02-23 11:22:17
***********************************************************************

Operating System Version:
Linux
Version #1 SMP Wed Dec 17 11:41:38 EST 2008, Release 2.6.18-128.el5
Node: redhat346.localdomain
Machine: x86_64
                       soft limit   hard limit
Address Space Size   :  unlimited    unlimited
Heap Size            :  unlimited    unlimited
File Size            :  unlimited    unlimited
CPU Time             :  unlimited    unlimited

Process id: 14175

***********************************************************************
** Running with the following parameters **
***********************************************************************
DEFSFILE /u01/oracle/ggs/dirsql/myemp.sql
USERID ggs_owner, PASSWORD *********
TABLE scott.myemp;
Retrieving definition for SCOTT.MYEMP

Definitions generated for 1 tables in /u01/oracle/ggs/dirsql/myemp.sql

If we try to run the replicat process on the target without copying the definitions file, we will see the error shown below, which arises because the columns in the source and target tables are different and GoldenGate is not able to resolve that.

2010-02-23 11:31:07 GGS WARNING 218 Aborted grouped transaction on 'SH.MYEMP', Database error 904 (ORA-00904: "SAL": invalid identifier).

2010-02-23 11:31:07 GGS WARNING 218 SQL error 904 mapping SCOTT.MYEMP to SH.MYEMP OCI Error ORA-00904: "SAL": invalid identifier (status = 904), SQL .

We then ftp the definitions file from the source to the target system – in this case to the dirsql directory located in the top level GoldenGate software installation directory.

We now edit the original replicat parameter file and replace the ASSUMETARGETDEFS parameter with SOURCEDEFS, which provides GoldenGate with the location of the definitions file.

The other parameter which is included is COLMAP, which tells GoldenGate how the columns are to be mapped. The USEDEFAULTS keyword maps all columns that have the same name in both tables automatically; the only column that differs is SAL on the source, which we map explicitly to the SALARY column on the target.

REPLICAT myload2
SOURCEDEFS /u01/oracle/software/goldengate/dirsql/myemp.sql
USERID ggs_owner, PASSWORD ggs_owner
MAP scott.myemp, TARGET sh.myemp,
COLMAP (usedefaults,
salary = sal);
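
To make the effect of this COLMAP clause concrete, here is a minimal Python sketch of the mapping logic as described above: explicit mappings are applied first, and any remaining target column is filled from the source column of the same name. The function apply_colmap is a hypothetical illustration, not GoldenGate code, and the sample row shows only the four columns used in the INSERT statements earlier (the real table may have more). Leaving an unmatched target column as None is an assumption of this sketch.

```python
def apply_colmap(source_row, target_columns, explicit):
    """Build a target row: explicit mappings first, then same-named columns."""
    target_row = {}
    for col in target_columns:
        if col in explicit:
            # Explicit rule, e.g. salary = sal
            target_row[col] = source_row[explicit[col]]
        elif col in source_row:
            # Default rule: same column name on both sides
            target_row[col] = source_row[col]
        else:
            # Assumption of this sketch: unmatched target columns stay empty
            target_row[col] = None
    return target_row


source_row = {"empno": 1234, "ename": "GAVIN", "job": "MANAGER", "sal": 10000}
target_columns = ["empno", "ename", "job", "salary"]

print(apply_colmap(source_row, target_columns, {"salary": "sal"}))
```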

We now start the original replicat process myload2, which had abended because of the column mismatch (now corrected via the parameter change), and we see that the process runs without any error.

GGSCI (devu007) 2> info replicat myload2

REPLICAT MYLOAD2 Last Started 2010-02-23 11:05 Status ABENDED
Checkpoint Lag 00:00:03 (updated 00:11:44 ago)
Log Read Checkpoint File /u01/oracle/software/goldengate/dirdat/bb000000
2010-02-23 11:31:03.999504 RBA 1225

GGSCI (devu007) 3> start replicat myload2

Sending START request to MANAGER …
REPLICAT MYLOAD2 starting

GGSCI (devu007) 4> info replicat myload2

REPLICAT MYLOAD2 Last Started 2010-02-23 11:43 Status RUNNING
Checkpoint Lag 00:00:00 (updated 00:00:03 ago)
Log Read Checkpoint File /u01/oracle/software/goldengate/dirdat/bb000000
2010-02-23 11:31:03.999504 RBA 1461

Coming Next! – Monitoring the GoldenGate environment …..


Using GoldenGate Tokens with the COLMAP clause

We can use the @TOKEN function to extract data which is stored in what is called the user token area of the GoldenGate trail file record header.

We can populate this token data from information stored in the header portion of trail records using the GGHEADER option of the @GETENV function, or from information about the GoldenGate environment obtained via the GGENVIRONMENT option of the @GETENV function. We can also populate the tokens with data obtained from database queries or functions.

To define a token, use the TOKENS option of the TABLE parameter in the Extract parameter file as shown in the example below.

We can then use this information in the tokens to populate columns in target tables by using the @TOKEN column conversion function in the COLMAP clause in a Replicat parameter file.

In the example below, the source table has two columns (SAL and COMM). The target table has these two columns plus some additional columns, which we will populate using tokens and the @DATENOW function, which returns the current timestamp.

Source table

SQL> desc mytest
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 SAL                                                NUMBER(10)
 COMM                                               NUMBER(10)

Target table


SQL> desc mytest
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 SAL                                                NUMBER(10)
 COMM                                               NUMBER(10)
 HOSTNAME                                           VARCHAR2(20)
 OSUSER                                             VARCHAR2(10)
 DBNAME                                             VARCHAR2(10)
 TRAN_DATE                                          DATE

The column HOSTNAME is populated by the token TK_HOST, which obtains the hostname via the GGENVIRONMENT option of the @GETENV function. The OSUSER column is populated in the same way by the token TK_OSUSER. Similarly, the database name is obtained via the DBENVIRONMENT option of the @GETENV function using the token TK_DBNAME.

Finally we populate the date column TRAN_DATE using the @DATENOW function.

These are the contents of the Extract parameter file.

EXTRACT myext
USERID idit_prd, PASSWORD idit_prd
RMTHOST idb02, MGRPORT 7809
RMTTRAIL ./dirdat/yy
TABLE idit_prd.mytest, TOKENS ( TK_HOST = @GETENV("GGENVIRONMENT" , "HOSTNAME"), TK_OSUSER = @GETENV ("GGENVIRONMENT" , "OSUSERNAME"), TK_DBNAME = @GETENV("DBENVIRONMENT" , "DBNAME" ));

These are the contents of the Replicat parameter file

REPLICAT myrep
ASSUMETARGETDEFS
USERID idit_prd,PASSWORD idit_prd
MAP idit_prd.mytest, TARGET idit_prd.mytest,
COLMAP (USEDEFAULTS,
hostname = @token ("tk_host"),
osuser= @token ("tk_osuser"),
dbname= @token ("tk_dbname"),
tran_date = @DATENOW());
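
The division of work between the two parameter files can be sketched in Python as follows: Extract attaches the tokens to each captured record on the source, and Replicat later reads them back and adds the timestamp at apply time on the target. This is only a model under stated assumptions. The functions extract_record, token and replicat_row are hypothetical names, the environment values are passed in as a plain dictionary instead of being read from a real GoldenGate environment, and treating token names as case-insensitive is an assumption based on the parameter files above (TK_HOST is defined in Extract and read as "tk_host" in Replicat).

```python
from datetime import datetime


def extract_record(row, env):
    """Model of Extract with TOKENS: attach user tokens to the captured row."""
    tokens = {
        "TK_HOST": env["HOSTNAME"],
        "TK_OSUSER": env["OSUSERNAME"],
        "TK_DBNAME": env["DBNAME"],
    }
    return {"columns": dict(row), "tokens": tokens}


def token(record, name):
    """Model of @TOKEN: read a token value from the record's token area."""
    # Assumption: token names are matched case-insensitively.
    return record["tokens"][name.upper()]


def replicat_row(record, now=None):
    """Model of the Replicat COLMAP: defaults plus token-derived columns."""
    row = dict(record["columns"])  # USEDEFAULTS: sal and comm pass through
    row["hostname"] = token(record, "tk_host")
    row["osuser"] = token(record, "tk_osuser")
    row["dbname"] = token(record, "tk_dbname")
    # Evaluated when the row is applied on the target, not when it was captured.
    row["tran_date"] = now or datetime.now()
    return row


env = {"HOSTNAME": "db01", "OSUSERNAME": "oracle", "DBNAME": "GGDB1"}
record = extract_record({"sal": 1000, "comm": 5000}, env)
print(replicat_row(record))
```

One consequence worth noting: because @DATENOW appears in the Replicat parameter file, TRAN_DATE reflects the time the row was applied on the target rather than the time of the original transaction on the source.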

Let us now test this.

On the source database we insert a record which populates the two columns SAL and COMM.

SQL> insert into mytest
  2  values
  3   (1000,5000);

1 row created.

SQL> commit;

On the target database, the replicated row has the additional columns populated:

SQL> select * from mytest;

       SAL       COMM HOSTNAME             OSUSER     DBNAME     TRAN_DATE
---------- ---------- -------------------- ---------- ---------- ---------
      1000       5000 db01                 oracle     GGDB1      24-MAR-11

Customizing GoldenGate processing using SQLEXEC and GETVAL

Let us see how we can use the SQLEXEC parameter of GoldenGate to execute either a SQL query or a stored procedure, and then use the @GETVAL function to populate a column in the target table which is not present in the source table.

Using a simple example to illustrate this, let us suppose we have two tables – a lookup table called COUNTRY_CODE which has the country_name and country_id columns, and another table called CUSTOMERS which only has the country_id column.

We would like to customize the GoldenGate processing and also display the country_name along with the country_id in the CUSTOMERS table itself on the target database.

Let us look at two ways of doing this – one using a SQL query and the other case where we use a stored procedure and pass a parameter to the stored procedure.

Case 1 – using SQL Query

Here we will use a SQL statement to obtain the value for the column COUNTRY_NAME in the CUSTOMERS table on the target database.

This is our Extract parameter file:

EXTRACT gavinext
USERID idit_prd, PASSWORD idit_prd
RMTHOST indb02, MGRPORT 7809
RMTTRAIL ./dirdat/xx
TABLE idit_prd.customers;

This is the Replicat parameter file:

REPLICAT gavinrep
SETENV (NLS_LANG="AMERICAN_AMERICA.WE8ISO8859P1")
SETENV (ORACLE_SID=GGDB2)
ASSUMETARGETDEFS
USERID idit_prd,PASSWORD idit_prd
MAP idit_prd.customers, TARGET idit_prd.customers, &
SQLEXEC (ID lookup, &
QUERY "select country_name cname from country_code where country_id =:v_country_id",&
PARAMS (v_country_id = country_id)),&
COLMAP (USEDEFAULTS, country_name = @GETVAL (lookup.cname) );
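
The per-row lookup that this SQLEXEC clause performs can be approximated with the short Python sketch below, which uses an in-memory SQLite database in place of Oracle. It runs the same query text with the bind variable :v_country_id supplied from the incoming row's country_id, then copies the cname result into the country_name column. The function names, the customer_id column and the sample country rows are all hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory stand-in for the target database lookup table (sample data is invented).
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table country_code (country_id integer primary key, country_name text)"
)
conn.executemany(
    "insert into country_code values (?, ?)",
    [(1, "Australia"), (2, "India")],
)


def lookup_country_name(conn, country_id):
    """Model of SQLEXEC (ID lookup, QUERY ..., PARAMS (v_country_id = country_id))."""
    cur = conn.execute(
        "select country_name cname from country_code where country_id = :v_country_id",
        {"v_country_id": country_id},
    )
    result = cur.fetchone()
    return result[0] if result else None  # None when no lookup row exists


def map_customer(conn, source_row):
    """Model of COLMAP (USEDEFAULTS, country_name = @GETVAL (lookup.cname))."""
    target_row = dict(source_row)  # USEDEFAULTS: same-named columns pass through
    target_row["country_name"] = lookup_country_name(conn, source_row["country_id"])
    return target_row


print(map_customer(conn, {"customer_id": 10, "country_id": 2}))
```

In this sketch a missing lookup row simply yields None. How GoldenGate itself behaves when the query returns no row depends on its configuration and is not modelled here.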

Case 2 – Using a database stored procedure

We have a procedure called GET_COUNTRY which accepts the COUNTRY_ID value as a parameter and returns the COUNTRY_NAME as an OUT parameter.

This is the source code of the database procedure, GET_COUNTRY:

create or replace procedure get_country
(v_country_id IN number, v_country_name OUT varchar2 )
is
begin
select country_name into v_country_name from country_code where country_id= v_country_id;
end;
/

We call this procedure from GoldenGate using the SQLEXEC parameter in the Replicat parameter file. By passing the COUNTRY_ID column value to the parameter v_country_id and reading the OUT parameter with the @GETVAL function, the COUNTRY_NAME column is populated in the target database.

REPLICAT gavinrep
SETENV (NLS_LANG="AMERICAN_AMERICA.WE8ISO8859P1")
SETENV (ORACLE_SID=GGDB2)
ASSUMETARGETDEFS
USERID idit_prd,PASSWORD idit_prd
MAP idit_prd.customers, TARGET idit_prd.customers, &
SQLEXEC (SPNAME GET_COUNTRY, &
PARAMS (v_country_id = country_id)),&
COLMAP (USEDEFAULTS, country_name = @getval (GET_COUNTRY.V_COUNTRY_NAME) );
