Top Azure Data Factory Interview Questions (2024) | CodeUsingJava
















Most frequently asked Azure Data Factory Interview Questions


  1. What is Azure Data Factory?
  2. What are the steps involved in the ETL process?
  3. What is Cloud Computing?
  4. What are the advantages of Cloud Computing?
  5. Is there a limit on the number of integration runtimes?
  6. What are the components of Azure Data Factory?
  7. How to monitor and manage Azure Data Factory pipelines?
  8. How to remove datasets in bulk?
  9. What is Azure Table Storage?
  10. What are Azure Storage Types?
  11. Difference between mapping and wrangling data flows?


What is Azure Data Factory?

Azure Data Factory is a cloud-based data integration service that allows us to create data-driven workflows for orchestrating and automating data movement and data transformation. With it we can create and schedule data-driven workflows (called pipelines) that ingest data from disparate data stores.
It can then process and transform the data by using compute services such as Azure HDInsight Hadoop, Spark, Azure Data Lake Analytics, and Azure Machine Learning.



What are the steps involved in the ETL process?

An ETL process in Azure Data Factory generally involves four steps:
  • Connect & Collect - Moves the data from on-premises and cloud source data stores to a centralized location.
  • Transform - Processes and transforms the collected data by using compute services such as HDInsight Hadoop, Spark, etc.
  • Publish - Loads the transformed data into Azure Data Warehouse, Azure SQL Database, Azure Cosmos DB, etc.
  • Monitor - Supports pipeline monitoring via Azure Monitor, API, PowerShell, Log Analytics, and health panels on the Azure portal.
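
The four steps above can be sketched in plain Python. This is only an illustrative in-memory model, not Azure Data Factory code: the function names (connect_and_collect, transform, publish, monitor), the sample records, and the dictionary used as a sink are all assumptions made for the example.

```python
from typing import Dict, List

Record = Dict[str, object]


def connect_and_collect(sources: List[List[Record]]) -> List[Record]:
    """Gather records from several (here: in-memory) source stores."""
    collected: List[Record] = []
    for source in sources:
        collected.extend(source)
    return collected


def transform(records: List[Record]) -> List[Record]:
    """Drop rows without an id and normalise the name field."""
    cleaned: List[Record] = []
    for record in records:
        if record.get("id") is None:
            continue
        name = str(record.get("name", "")).strip().title()
        cleaned.append({"id": record["id"], "name": name})
    return cleaned


def publish(records: List[Record], sink: Dict[object, Record]) -> int:
    """Load transformed records into the sink, keyed by id; return rows written."""
    for record in records:
        sink[record["id"]] = record
    return len(records)


def monitor(stage_counts: Dict[str, int]) -> str:
    """Summarise how many records each stage handled."""
    return ", ".join(f"{stage}={count}" for stage, count in stage_counts.items())


def run_pipeline(sources: List[List[Record]], sink: Dict[object, Record]) -> str:
    """Run the four stages in order and return the monitoring summary."""
    collected = connect_and_collect(sources)
    transformed = transform(collected)
    written = publish(transformed, sink)
    return monitor({
        "collected": len(collected),
        "transformed": len(transformed),
        "published": written,
    })


# Hypothetical sample data standing in for an on-premises and a cloud source.
on_premises = [{"id": 1, "name": " alice "}, {"id": None, "name": "orphan"}]
cloud = [{"id": 2, "name": "BOB"}]
sink: Dict[object, Record] = {}
summary = run_pipeline([on_premises, cloud], sink)
print(summary)
```

In a real data factory each stage would be an activity in a pipeline, and the monitoring data would come from the service rather than from counters kept by hand.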

What is Cloud Computing?

Cloud computing allows businesses and individuals to consume computing resources such as virtual machines, databases, processing, memory, services, and storage on demand, and to pay as they go, in some cases per call or per event. It is the culmination of numerous attempts at large-scale computing with seamless access.
Cloud computing is scalable and reliable because capacity is not fixed: processing power and other resources can be increased as the number of users grows.

What are the advantages of Cloud Computing?

  • Scalability
  • Agility
  • High availability
  • Low latency
  • Moving from CapEx to OpEx
  • Fault tolerance


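Is there a limit on the number of integration runtimes?

There is no hard limit on the number of integration runtime instances in a data factory. There is, however, a limit on the number of VM cores that the integration runtime can use per subscription for SSIS package execution. An Azure subscription can have one or more Azure Data Factory instances.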

What are the components of Azure Data Factory?

  • Pipeline
  • Activity
  • Mapping Data Flow
  • Dataset
  • Linked Service
  • Trigger
  • Control flow
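
How several of these components relate to one another can be sketched with a few Python dataclasses. This is a simplified, hypothetical model for illustration only; the class and field names are assumptions and do not mirror the Azure SDK, and mapping data flows and control flow are left out to keep it short.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LinkedService:
    """Connection information for an external data store or compute resource."""
    name: str
    kind: str


@dataclass
class Dataset:
    """A named view of data that lives behind a linked service."""
    name: str
    linked_service: LinkedService


@dataclass
class Activity:
    """A single processing step that reads input datasets and writes output datasets."""
    name: str
    inputs: List[Dataset] = field(default_factory=list)
    outputs: List[Dataset] = field(default_factory=list)


@dataclass
class Pipeline:
    """A logical grouping of activities that together perform a task."""
    name: str
    activities: List[Activity] = field(default_factory=list)

    def linked_services(self) -> List[str]:
        """Names of all linked services the pipeline touches, in first-seen order."""
        seen: List[str] = []
        for activity in self.activities:
            for dataset in activity.inputs + activity.outputs:
                if dataset.linked_service.name not in seen:
                    seen.append(dataset.linked_service.name)
        return seen


@dataclass
class Trigger:
    """Determines when a pipeline run is started (here just a schedule label)."""
    name: str
    pipeline: Pipeline
    schedule: str


# Hypothetical example: copy raw sales data from blob storage into a SQL database.
blob = LinkedService("BlobStore", "AzureBlobStorage")
sql = LinkedService("SqlDb", "AzureSqlDatabase")
raw = Dataset("RawSales", blob)
curated = Dataset("CuratedSales", sql)
copy_sales = Activity("CopySales", inputs=[raw], outputs=[curated])
pipeline = Pipeline("DailySales", [copy_sales])
trigger = Trigger("Nightly", pipeline, "every day at 02:00")
print(pipeline.linked_services())
```

The point of the sketch is the direction of the references: a trigger starts a pipeline, a pipeline groups activities, activities consume and produce datasets, and each dataset points at a linked service.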

How to monitor and manage Azure Data Factory pipelines?

Here are the steps to follow:
  1. Click Monitor & Manage on the Data Factory tab.
  2. Click Resource Explorer.
  3. The pipelines, datasets, and linked services are shown in a tree format.


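How to remove datasets in bulk?

We can remove all the datasets in a data factory by using the following PowerShell snippet, replacing the placeholders with our own resource group and data factory names:
Get-AzureRmDataFactory -ResourceGroupName <resourceGroupName> -Name <dataFactoryName> | Get-AzureRmDataFactoryDataset | Remove-AzureRmDataFactoryDataset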


What is Azure Table Storage?

Azure Table Storage is a service that stores non-relational structured data in the cloud and provides a key/attribute store with a schemaless design. It is fast and cost-effective for many applications.
Table storage can hold flexible datasets such as user data for a web application, device information, or any other type of metadata that a service requires. We can store any number of entities in a table, up to the capacity limit of the storage account.
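
The schemaless key/attribute idea can be illustrated with a small in-memory stand-in. This is not the Azure SDK: the InMemoryTable class and its methods are hypothetical. The only detail taken from the real service is that each entity is identified by its PartitionKey and RowKey.

```python
from typing import Dict, List, Tuple

# An entity is a free-form bag of properties; only PartitionKey and RowKey are required.
Entity = Dict[str, object]


class InMemoryTable:
    """Toy stand-in for a table: entities are keyed by (PartitionKey, RowKey)."""

    def __init__(self) -> None:
        self._entities: Dict[Tuple[str, str], Entity] = {}

    def upsert(self, entity: Entity) -> None:
        """Insert the entity, or replace an existing one with the same keys."""
        key = (str(entity["PartitionKey"]), str(entity["RowKey"]))
        self._entities[key] = dict(entity)

    def get(self, partition_key: str, row_key: str) -> Entity:
        """Look up a single entity by its two keys."""
        return self._entities[(partition_key, row_key)]

    def query_partition(self, partition_key: str) -> List[Entity]:
        """Return every entity stored under one partition key."""
        return [e for (pk, _), e in self._entities.items() if pk == partition_key]


table = InMemoryTable()
# Entities in the same table do not have to share a set of properties.
table.upsert({"PartitionKey": "users", "RowKey": "u1", "email": "a@example.com"})
table.upsert({"PartitionKey": "users", "RowKey": "u2", "nickname": "bee"})
table.upsert({"PartitionKey": "devices", "RowKey": "d1", "model": "X200", "firmware": 3})

print(len(table.query_partition("users")))
print(table.get("devices", "d1"))
```

Because no schema is enforced, the user entities and the device entity can sit side by side with different properties, which is what makes this kind of store convenient for metadata whose shape changes over time.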

What are Azure Storage Types?

  • Blobs
  • Tables
  • Files
  • Queues


Difference between mapping and wrangling data flows?

Mapping
  • Provides a way to transform data at scale without coding.
  • Is well suited to mapping and transforming data with both known and unknown schemas in the sinks and sources.
Wrangling
  • Allows us to do agile data preparation using the Power Query Online mashup editor at scale via Spark execution.
  • Is suited to less formal and model-based analytics scenarios.