Top Apache Airflow Interview Questions (2023) | CodeUsingJava

Most Frequently Asked Apache Airflow Interview Questions


  1. What is Apache Airflow?
  2. What are the Principles of Apache Airflow?
  3. What are the components used by Airflow?
  4. What are the Types of Executor in Apache Airflow?
  5. What is XComs?
  6. What are Jinja Templates?
  7. How can we use Airflow XComs in Jinja templates?
  8. What is DAG in Airflow?
  9. What are the alternatives to Airflow?
  10. How do we Instantiate a DAG?
  11. How do we Import Modules in Airflow?
  12. How can we delete a DAG?

What is Apache Airflow?

Apache Airflow is an open-source workflow management platform that handles workflow orchestration with the help of DAGs (Directed Acyclic Graphs). It is written in Python, and workflows are created through Python scripts. Airflow is designed around the principle of configuration as code.
Apache Airflow began at Airbnb as a way to manage the company's increasingly complex workflows. It is widely used as an Extract, Transform, Load (ETL) data-pipeline orchestration tool.



What are the Principles of Apache Airflow?


Apache Airflow is built on four principles:
  • Scalable - Airflow has a modular architecture and uses a message queue to orchestrate an arbitrary number of workers.
  • Dynamic - pipelines are defined in Python, which allows dynamic pipeline generation.
  • Extensible - we can easily define our own operators and extend libraries to fit our environment.
  • Elegant - pipelines are lean and explicit; parameterization is built in using the Jinja templating engine.


What are the components used by Airflow?

Components used by Airflow are:
  • Web Server - used for tracking the status of our jobs and for reading logs from a remote file store.
  • Scheduler - used for scheduling our jobs; it is a multithreaded Python process that uses the DAG objects to decide which tasks need to be run, when, and where.
  • Executor - used for actually getting the tasks done.
  • Metadata Database - used for storing the Airflow states (DAG runs, task instances, and so on).

What are the Types of Executor in Apache Airflow?

There are 4 types of Executor in Apache Airflow:


  • Local Executor - runs multiple tasks at one time on a single machine.
  • Sequential Executor - runs only one task at a time.
  • Celery Executor - runs distributed asynchronous Python tasks across a cluster of Celery workers.
  • Kubernetes Executor - runs each task in an individual Kubernetes pod.
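
The executor type is selected in the [core] section of airflow.cfg. A minimal sketch (the value shown is just one of the four options above; adjust for your deployment):

```ini
[core]
# Which executor Airflow should use to run tasks
executor = LocalExecutor
```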

What is XComs?

XComs (short for "cross-communications") are a mechanism that lets tasks talk to each other; by default, tasks are entirely isolated and may be running on different machines. An XCom is identified by a key, together with the task_id and dag_id it came from.
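
A simplified, self-contained sketch of the idea (plain Python, no Airflow required; the function names mirror Airflow's xcom_push/xcom_pull but are our own stand-ins, and the storage is a plain dict rather than the metadata database):

```python
# XCom values are looked up by (dag_id, task_id, key), mirroring how
# Airflow identifies an XCom entry.
xcom_store = {}

def xcom_push(dag_id, task_id, key, value):
    """Store a small value under the identifying triple."""
    xcom_store[(dag_id, task_id, key)] = value

def xcom_pull(dag_id, task_id, key):
    """Fetch a previously pushed value, or None if absent."""
    return xcom_store.get((dag_id, task_id, key))

# One task publishes a table name; a downstream task reads it.
xcom_push("my_dag", "extract", "Table_Name", "orders")
print(xcom_pull("my_dag", "extract", "Table_Name"))  # -> orders
```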

What are Jinja Templates?

Jinja templating provides pipeline authors with a set of built-in parameters and macros that are rendered at runtime. A Jinja template normally contains variables and/or expressions, which are replaced with concrete values when the template is rendered.

How can we use Airflow XComs in Jinja templates?

We can pull Airflow XComs inside Jinja templates with an expression like the following (for example, in a templated SQL query):
SELECT * FROM {{ task_instance.xcom_pull(task_ids='foo', key='Table_Name') }}


What is DAG in Airflow?

A Directed Acyclic Graph (DAG) is a collection of tasks organized to reflect their relationships and dependencies. It is used for maintaining the relations between tasks and for ensuring they are executed in the expected order at the scheduled time.
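
The dependency-ordering idea can be illustrated with a small, self-contained sketch (plain Python, no Airflow required; the task names are made up). Each task lists its upstream dependencies, and we resolve an execution order that respects them, much as a scheduler conceptually does:

```python
# Resolve a valid execution order for an acyclic dependency graph.
# deps maps each task to the tasks it depends on (assumed acyclic).
def execution_order(deps):
    order, done = [], set()

    def visit(task):
        if task in done:
            return
        for upstream in deps.get(task, []):
            visit(upstream)  # schedule dependencies first
        done.add(task)
        order.append(task)

    for task in deps:
        visit(task)
    return order

# extract -> transform -> load, plus an audit task off extract
deps = {
    "extract": [],
    "transform": ["extract"],
    "load": ["transform"],
    "audit": ["extract"],
}
print(execution_order(deps))  # extract always precedes the rest
```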

What are the alternatives to Airflow?

Some alternatives to Airflow are as follows:
  • Luigi
  • Apache NiFi
  • Jenkins
  • AWS Step Functions
  • Pachyderm
  • Kubeflow
  • Argo
  • Kafka


How do we Instantiate a DAG?

We can instantiate a DAG as follows (default_args is a dict we define ourselves, e.g. with an owner and a start_date):
from datetime import timedelta
from airflow import DAG

dag = DAG(
    'tutorial', default_args=default_args, schedule_interval=timedelta(days=10))


How do we Import Modules in Airflow?

An Apache Airflow pipeline is just a Python script that defines an Airflow DAG object. The imports at the top of the script typically look like this:
# The DAG object; we need this to instantiate a DAG
from airflow import DAG
# Operators; we need these to operate!
from airflow.operators.bash_operator import BashOperator


How can we delete a DAG?

We can delete a DAG as follows:
From the CLI:
airflow delete_dag my_dag_id
From the REST API (with the webserver running locally; the endpoint shown is the Airflow 1.x experimental API):
curl -X "DELETE" http://localhost:8080/api/experimental/dags/my_dag_id