Before we begin to code transformations, let us spend some time understanding the features/views available in the ECL IDE, the tool used to write ECL code:
- Builder - Use the builder to edit your ECL code, then build it and submit it for execution.
- Submit/Compile - Compiles an ECL code file and submits it as a job for execution on the cluster.
- Output Results - Executed ECL code results can be viewed here.
- Syntax Errors - Check if your ECL code is free of syntax errors using the compile option (F7). The Syntax Errors view displays design time syntax errors.
- Runtime Errors - The error log view displays the errors that occur when ECL code is executed on the cluster.
- Workunits - Displays all the ECL jobs that have been executed on a cluster, grouped by day, month and year.
- Repository - This is synonymous with projects in other IDEs. It shows the location of files on local storage. For me, it can be found on the hard disk at "C:\Users\Public\Documents\HPCC Systems". It can be configured to point elsewhere by changing the IDE preferences.
- Workspace - A logical work environment that can be used to organize your work in the IDE.
- Datasets - Lists the available data sets on the cluster. It is convenient to select a data set and copy its label for use in your code.
Now back to coding transformations. For the transformation example, we are going to work with the OriginalPerson dataset from Part I and transform the data to create a new TransformedPerson dataset, which is a copy of the OriginalPerson dataset with the First, Middle and Last names converted to upper case.
Open a new builder window (CTRL+N) and type in the following code:
IMPORT Std;

// Declare the format of the source and destination record
Layout_People := RECORD
  STRING15 FirstName;
  STRING25 LastName;
  STRING15 MiddleName;
  STRING5  Zip;
  STRING42 Street;
  STRING20 City;
  STRING2  State;
END;

// Declare reference to source file
File_OriginalPerson :=
  DATASET('~tutorial::AC::OriginalPerson', Layout_People, THOR);

// Write the Transform code
Layout_People toUpperPlease(Layout_People pInput) := TRANSFORM
  SELF.FirstName  := Std.Str.ToUpperCase(pInput.FirstName);
  SELF.LastName   := Std.Str.ToUpperCase(pInput.LastName);
  SELF.MiddleName := Std.Str.ToUpperCase(pInput.MiddleName);
  SELF.Zip        := pInput.Zip;
  SELF.Street     := pInput.Street;
  SELF.City       := pInput.City;
  SELF.State      := pInput.State;
END;

// Apply the transformation
TransformedPersonDataset :=
  PROJECT(File_OriginalPerson, toUpperPlease(LEFT));

// Output it as a new Dataset
OUTPUT(TransformedPersonDataset, , '~tutorial::AC::TransformedPerson',
       OVERWRITE);
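The important step is the call to the PROJECT function. In this particular case it means:
"Transform the dataset File_OriginalPerson into TransformedPersonDataset by applying the transformation toUpperPlease to each record of File_OriginalPerson."
LEFT is not a join here. It is a keyword that refers to the current record of the dataset passed as the first argument to PROJECT, in this case File_OriginalPerson. ECL uses the same keyword, together with RIGHT, in JOIN to refer to the record from the left-hand dataset.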
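To see PROJECT and LEFT in isolation, here is a minimal sketch that does not depend on any file on the cluster. The layout Layout_Name, the dataset SampleNames and its two rows are made up for this illustration and are not part of the tutorial data. It also uses an inline TRANSFORM and SELF := LEFT, which copies every field that has not been assigned explicitly.

```ecl
IMPORT Std;

// Hypothetical three-field layout, used only for this illustration
Layout_Name := RECORD
  STRING15 FirstName;
  STRING25 LastName;
  STRING5  Zip;
END;

// Inline dataset with made-up rows, so no file on the cluster is needed
SampleNames := DATASET([{'ada',  'lovelace', '00001'},
                        {'alan', 'turing',   '00002'}], Layout_Name);

// Inline TRANSFORM: LEFT is the current record of SampleNames
UpperNames := PROJECT(SampleNames,
                      TRANSFORM(Layout_Name,
                                SELF.FirstName := Std.Str.ToUpperCase(LEFT.FirstName);
                                SELF.LastName  := Std.Str.ToUpperCase(LEFT.LastName);
                                SELF := LEFT)); // copies the remaining field (Zip) unchanged

OUTPUT(UpperNames);
```

The named toUpperPlease transform in the main example does the same job. Assigning each field explicitly, as it does, is more verbose but makes it obvious which fields are changed and which are copied.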
Compile and Submit the code. View the results in the Output Results view.
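If you would like to check the new file from code as well, one option is to read it back in a separate builder window once the first job has completed. This is only a sketch: it assumes the job above finished successfully and wrote '~tutorial::AC::TransformedPerson', and the names File_TransformedPerson, TransformedSample and TransformedCount are my own.

```ecl
// Same record layout as the one used to write the file
Layout_People := RECORD
  STRING15 FirstName;
  STRING25 LastName;
  STRING15 MiddleName;
  STRING5  Zip;
  STRING42 Street;
  STRING20 City;
  STRING2  State;
END;

// Reference to the file written by the transformation job
File_TransformedPerson :=
  DATASET('~tutorial::AC::TransformedPerson', Layout_People, THOR);

// Show the first 10 records and the total record count as named results
OUTPUT(CHOOSEN(File_TransformedPerson, 10), NAMED('TransformedSample'));
OUTPUT(COUNT(File_TransformedPerson), NAMED('TransformedCount'));
```

Since PROJECT produces one output record per input record, the count should match the number of records in the OriginalPerson file.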
A few lines of code were enough to transform every record in the dataset. ECL lets you solve complex data manipulation problems using simple and concise code, and this is only the tip of the iceberg. Read the ECL Programmer's Guide and the ECL Language Reference to discover how much more ECL can do.