Data. Everywhere. All well-behaved in their own environment.
But who actually lets them talk to each other? You do. With data integration. Become a data expert and add value with ETL and your new knowledge!
Talend Open Studio is an open and flexible data integration solution. You build your processes with a graphical editor, and more than 600 components provide flexibility.
Each section contains a practical example, and you will receive the complete course material at the beginning of the course. This way you can not only watch each section but also compare it with your own solution. Many practical scenarios are included as well, so you will be well equipped for practice!
What are the most important topics you can expect?
- install on different operating systems (Windows, Linux, Mac)
- understand and use important data types
- read from and write to databases
- process different file formats, such as Excel, XML, JSON, delimited, positional
- create and use metadata
- build schemas
- use helpful keyboard shortcuts
- retrieve data from web services / REST
- connect to Google Drive and fetch data
- use iterations and loops
- convert data flows into iterations
- build and understand job hierarchies
- apply all major transformations: map, join, normalize, pivot, and aggregate data
- create and extract XML and JSON
- use regular expressions
- orchestrate components in processes
- check and improve data quality
- use fuzzy matching and interval matching
- use variables for different environments
- perform schema validation
- handle reject data separately
- find and fix errors quickly
- write meaningful logs
- include and react to warnings and aborts
- build job hierarchies and pass data between different levels
- implement and test your own assumptions
- configure your project for logging, versioning, and context loading
- learn best practices and establish your own
- document items and have the documentation generated
What are you waiting for? See you in the course!
Course Overview
You will get a quick overview of the course structure.
See your first 10 minutes of ETL in Talend Open Studio.
Know the key points to get the most out of the course.
Data Integration and ETL
You will see why data integration is important.
Test your knowledge on some basics of data integration
Setup For The Course
You will get an overview of this section.
We review the steps of installing Talend Open Studio on Windows, Mac and Linux.
Change the interface language of Talend Open Studio to English, because translations are incomplete and the whole course project is in English.
Import the complete project for this course. This way you can compare any setting of any job of any video throughout the complete course.
With the archive attached to this lecture you obtain all files for the course.
We take a look at the structure of the course project.
"Hello World" Example
You will get an overview of this section.
Let's create a very easy "hello world" example to get you going.
The User Interface (UI)
You will get an overview of this section.
Get a brief overview of the UI structure in Talend Open Studio.
Get to know some useful shortcuts and functionalities to start working effectively with Talend.
Some components need external libraries which can easily be installed.
Get some tips and tricks on how to resolve errors more easily in Talend.
See how to restore a tab that was (accidentally) closed.
Your First ETL Job
You will get an overview of this section.
We will build a more complete job together. It contains data creation, transformation and output.
Process Files
You will get an overview of this section.
Read many types of files easily in Talend.
We will see how to write an Excel file. You can create many other formats as well.
List folder content by using search patterns. This way you can easily process several files with the same structure.
We will read, write and iterate over files. This will be extended by using variables and checking file existence before doing the actual work.
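The idea of listing a folder by search pattern and checking for existence before doing the work can be illustrated outside Talend. The following is a minimal sketch in plain Python, not Talend code; the function names and the file names such as `orders_2024.csv` are hypothetical and chosen only for illustration.

```python
import fnmatch
import os
import tempfile


def list_matching_files(folder, pattern):
    """Return sorted paths of regular files in `folder` whose names match `pattern`."""
    if not os.path.isdir(folder):
        # Check existence first so a missing folder yields an empty list, not an error.
        return []
    return [
        os.path.join(folder, name)
        for name in sorted(os.listdir(folder))
        if fnmatch.fnmatch(name, pattern)
        and os.path.isfile(os.path.join(folder, name))
    ]


def demo():
    """Create a throwaway folder with sample files and list the matching ones."""
    with tempfile.TemporaryDirectory() as folder:
        # Hypothetical file names: two share a structure, one does not match.
        for name in ("orders_2023.csv", "orders_2024.csv", "readme.txt"):
            with open(os.path.join(folder, name), "w", encoding="utf-8") as handle:
                handle.write("id;amount\n1;10\n")
        matches = list_matching_files(folder, "orders_*.csv")
        return [os.path.basename(path) for path in matches]


if __name__ == "__main__":
    print(demo())
```

Each matching path could then be processed in a loop, which mirrors the "iterate over files" step described in this section.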
Properties In Talend
You will get an overview of this section.
You learn the differences between "BuiltIn" and "Repository" information.
Learn what a "schema" is in Talend and how to use it effectively.
You will get an overview of some important data types.
You will get an overview of the different row and trigger connection types and their usage in Talend.
Learn the most important keyboard shortcut and how to use it properly.
Test your knowledge about Talend Open Studio properties
Processing Databases
You will get an overview of this section.
Let's see which database you'll use for the following examples.
You will learn how to connect to a database using Talend.
You will write some data to a table in a database.
You will read some data from a table in a database.
You will list some database objects of a certain type.
This scenario shows you how to make a lookup and output the result into a database.
Process Cloud, REST, JSON, XML, etc.
You will get an overview of this section.
You will learn how to read a JSON file with and without loops.
You will learn how to read XML files with loops.
You will learn how to access Google Drive folders using Talend.
You will learn how to access Google Cloud storage using Talend.
Learn how to use a REST service in Talend.
You will learn how to use Talend to download files from the web.
You will learn how to read RSS feeds in Talend.
Do you know which component is for what?
Variables
You will get an overview of this section.
Learn an easy shortcut to create context variables in Talend.
Create variables and environments for variables (= "contexts") in your job.
You will learn what data types variables can have.
Learn how to store your variables in context groups in a central place.
You will learn to create context groups for metadata items very easily.
Talend allows you to load variables from an external source. Learn how to do that.
With an easy setting you can load variables implicitly. Learn how!
Learn what global variables are and how to use them.
Test your knowledge about variables!
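To make the idea of contexts concrete: a job has default variable values, and an external source can override them for a given environment. The sketch below shows that principle in plain Python, not in Talend itself. The `key=value` file layout, the variable names, and the host `db.prod.example` are assumptions made for illustration only.

```python
def parse_context(text):
    """Parse 'key=value' lines into a dict, skipping blank lines and '#' comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def load_context(defaults, overrides_text):
    """Return the defaults with values from the external source applied on top."""
    merged = dict(defaults)
    merged.update(parse_context(overrides_text))
    return merged


# Hypothetical default values, as they might be defined for a development context.
DEFAULTS = {"db_host": "localhost", "db_port": "5432", "input_dir": "/tmp/in"}

# Hypothetical content of an external file for a production context.
PROD_FILE = """
# production overrides
db_host = db.prod.example
input_dir = /data/in
"""

context = load_context(DEFAULTS, PROD_FILE)
print(context["db_host"], context["db_port"], context["input_dir"])
```

Variables that the external source does not mention keep their default values, so the same job can run unchanged in every environment.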
Transformations
You will get an overview of this section.
You will see how to filter columns and rows in Talend.
Learn how to sort data of type date, numeric and alphanumeric.
You will learn how to create sums, averages and other aggregate values with Talend.
Learn some important data type conversions and some common traps.
Learn how to split data row-wise or data-wise in Talend
You will learn how to normalize and de-normalize values in Talend.
You will learn how to use tJoin in Talend for simple joins.
You will see how to install the Sakila sample database. We will use this data for the next couple of examples.
Learn how to join data over more than 2 levels using tMap.
Learn how to filter data in tMap.
Learn how to transform data in a tMap.
We will do the PreJob and PostJob setup for this scenario.
You will learn how to use the tExtractJSONFields component to parse JSON data.
You will learn how to use the tExtractXMLField component to parse XML data.
You will learn how to use the tExtractPositionalFields component to parse positional data.
You will learn how to use the tExtractDelimitedFields component to parse delimited data.
You will learn how to use the tExtractRegexFields component to parse data using regular expressions.
We will make a short summary of the field extraction components scenario.
Learn how to create JSON and XML fields using the corresponding components.
Learn two ways to generate sample data in Talend.
Learn how to pivot data in Talend.
Test your knowledge by answering this transformations quiz!
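As an illustration of regex-based field extraction, one of the techniques covered in this section, here is a minimal sketch in plain Python. It is not Talend code. The log line layout (date, level, message) and the names `LINE_PATTERN` and `extract_fields` are assumptions made for this example.

```python
import re

# Assumed input layout: "<YYYY-MM-DD> <LEVEL> <message>".
LINE_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})\s+(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)


def extract_fields(lines):
    """Split lines into parsed rows and rejects, based on whether the pattern matches."""
    matched, rejected = [], []
    for line in lines:
        result = LINE_PATTERN.match(line)
        if result:
            # Each named group becomes one output column.
            matched.append(result.groupdict())
        else:
            # Lines that do not fit the pattern are kept separately.
            rejected.append(line)
    return matched, rejected


rows, rejects = extract_fields(["2024-01-15 ERROR disk full", "garbage"])
print(rows)
print(rejects)
```

Keeping the non-matching lines in a separate list follows the same idea as handling reject data separately instead of silently dropping it.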
Data Quality
You will get an overview of this section.
Learn how to separate uniques from duplicates based on your definition.
Learn how to match data against interval references.
Learn how to substitute data in different ways easily.
Learn how to check your data against schemas and separate inconsistent values.
Learn how to easily create a redundancy key for regular duplicate checks.
Learn how to use fuzzy algorithms to match data against reference data.
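Fuzzy matching against reference data is commonly based on an edit distance. The sketch below uses the Levenshtein distance in plain Python to show the principle. It is an illustration under that assumption, not the implementation used by Talend, and the city names and the `max_distance` threshold are hypothetical.

```python
def levenshtein(a, b):
    """Return the minimum number of single-character edits that turn a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def best_match(value, references, max_distance):
    """Return (reference, distance) for the closest reference within max_distance.

    Returns (None, None) if no reference is close enough. Comparison is
    case-insensitive; on ties the first reference in the list wins.
    """
    best, best_distance = None, max_distance + 1
    for reference in references:
        distance = levenshtein(value.lower(), reference.lower())
        if distance < best_distance:
            best, best_distance = reference, distance
    if best is None:
        return None, None
    return best, best_distance


CITIES = ["Berlin", "Bern", "Bremen"]  # hypothetical reference data
print(best_match("Berlim", CITIES, 2))
print(best_match("Tokyo", CITIES, 2))
```

The threshold decides how tolerant the match is: a low value avoids false positives, while a higher value catches more typos at the risk of matching unrelated values.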
File Management
You will get an overview of this section.
You will learn how to do basic file operations with Talend, such as create, copy and delete.
You will learn how to compare file contents in Talend.
You will learn how to get file properties in Talend.
You will learn how to create lists of files for directories with Talend.
You will learn how to compress files with Talend.
You will learn how to create and use temporary files with Talend.
Job Orchestration
You will get an overview of this section.
You will learn how to use PreJob and PostJob to execute steps at the very beginning or the very end of your process.
You will learn how to use tMessageBox.
You will learn how to copy data flows and unite several data flows into one.
You will learn how to convert a data flow into an iteration and the other way around.
You will learn how to use and combine different types of loops in Talend.
You will learn how to make your job wait for events and measure the time taken by a processing step.
You will learn how to interact with the operating system in Talend.
You will learn how to build and show job hierarchies in Talend.
You will learn how to use subjob triggers in Talend jobs.
You will learn how to use component triggers in Talend jobs.
You will learn how to use IF triggers in Talend jobs.
Test your knowledge of orchestration in Talend!
Logging
You will get an overview of this section.
Learn how to run your jobs in debug mode and see the data as it is being transformed.
You will learn how to check your assumptions in jobs with tAssert.
You will learn how to measure data volumes using tFlowMeter.
You will learn how to log errors and warnings using tDie and tWarn.
You will learn how to log job executions in general as well as individual components.
You will learn how to configure your jobs for comprehensive logging.
You will learn how to configure your jobs better for comprehensive logging.
You will learn how to configure your projects for comprehensive logging.
Test your logging knowledge!
Best Practices & Documentation
You will get an overview of this section.
Learn the most important best practices for most Talend projects.
Learn how to create documentation in Talend, with subjob title, component view tab, note component, doc in repo, business models, create and export job doc, job properties, component arrangement and many more.
Deployment
You will get an overview of this section.
You will learn how to run a Talend job standalone - without the studio.
You will learn how to change the parameters for the job execution.
Project Handling
You will get an overview of this section.
You will learn how to create, delete and switch projects.
You will learn how to export and import Talend objects.
Find items using the repository filter and learn to search for components.
You will learn how to access and edit project settings.
Use Cases
You will get an overview of this section.
Transform a text file to Excel.
Read a multi-schema XML and split its data into separate streams.
Write data from several steps of your process to the same file by using "append".
Read credentials for your Talend job from HashiCorp Vault.
Course Summary
This is the closing video of the course. Please don't forget to send me your questions, suggestions and other feedback.
Extra Material
Get a PDF summarizing the most important topics from this course.
Test yourself with this small test.
Test yourself with this big test.
This can be the basis for your job designs.