# Data Pipelines

Data pipelines are essential to the NextBee Loyalty Program as they facilitate the movement and integration of data from different systems and applications. By leveraging data pipelines, businesses can ensure that their loyalty program operates efficiently and effectively.

Data pipelines enable businesses to integrate data from multiple sources, such as CRMs, ERPs, and social media platforms. This integrated data can then be used to gain insights into customer behavior, preferences, and purchase history. With this information, businesses can personalize their loyalty program and reward offerings to better engage customers and drive loyalty.

In addition, data pipelines allow businesses to perform data transformations, cleansing, and enrichment to ensure that the data is of high quality and accuracy. This is critical for businesses to make informed decisions and take effective actions based on the data.

Data pipelines also enable businesses to schedule data integration workflows, ensuring that data is integrated in a timely and efficient manner. By automating data integration tasks, businesses can save time and resources while ensuring the accuracy and reliability of the data.

Overall, data pipelines are a critical component of the NextBee Loyalty Program, enabling businesses to integrate, transform, and manage their data in a user-friendly and efficient manner. By leveraging data pipelines, businesses can gain valuable insights into customer behavior and preferences, personalize their loyalty program offerings, and drive engagement and loyalty.

# List of Data Pipelines

The List of Data Pipelines provides an overview of all the data pipelines configured in the NextBee Loyalty Program Dashboard. The list has several columns, including ID, Name, Data Link, #Runs, Status, Action, and Logs.

The ID column provides a unique identifier for each data pipeline, while the Name column indicates the name of the data pipeline. The Data Link column shows the source of the data pipeline, such as the CRM or ERP system.

The #Runs column shows the number of times the data pipeline has been run, while the Status column indicates the current status of the data pipeline, such as running, stopped, or completed.

The Action column contains a button that takes the user to the Data Pipeline Details view, where they can view more detailed information about the data pipeline, including its configuration, data mapping, transformations, and scheduling.

The Logs icon, when clicked, displays the process logs of the data pipeline run. This can be useful for debugging and troubleshooting any issues with the data pipeline.

The List of Data Pipelines also includes a button in the top right corner for creating a new data pipeline. This button takes the user to the Add New Pipeline form, where they can configure a new data pipeline.

Finally, the list includes pagination and filtering features, allowing users to narrow the list of data pipelines by selected criteria and navigate through multiple pages of the list.

# Create New Data Pipeline

Creating a new data pipeline is an important step in configuring and customizing the NextBee Loyalty Program Dashboard. It involves several steps, including setting the objective of the data pipeline, setting up the data connectors, mapping the data, filtering and transforming the data, branching the data, and scheduling the data pipeline to run at specific times.

Each step in creating a data pipeline requires careful consideration and planning, as it can have a significant impact on the success and effectiveness of the pipeline. The objective of the pipeline should be clearly defined, and the data connectors should be carefully selected to ensure they can integrate with the desired data sources.

Mapping the data requires knowledge of the structure and format of the data, and the filtering and transformation steps can involve complex logic and programming. Branching the data can involve creating separate paths for different data streams, while scheduling the pipeline requires knowledge of when and how often the pipeline should run.

# Set Objective

The Set Objective step of creating a data pipeline involves selecting the main objective of the pipeline from a list of available options. This step captures the name of the data pipeline and allows the user to select one of several options, including Post Activity, Export Points Data, Invite Members, Referral Activity, Update Object, Product Purchase, and Create Business Team Member.

Selecting one of these options using radio buttons and clicking the Create button will navigate the user to the next step, Set Connector. This step is important because it sets the foundation for the rest of the data pipeline and determines the type of data sources and connectors that will be used in the pipeline. Choosing the right objective for the pipeline is critical to ensuring its success and requires careful consideration and planning.

# Set Connector

The Set Connector step is where the user attaches the data source connectors to the data pipeline. The form presents a list of available connectors to choose from, including NextBee, SFTP, and Salesforce, among others. If the user is not familiar with data connectors or how to configure them, a link to the relevant documentation is provided in the Business Configuration / Connectors chapter.

Depending on the type of data connector selected, additional fields may be displayed below the form. These fields are used to configure the connector, such as Data Objects, File/Folder Paths, a Header Names list, and Selection Criteria. Once the connector is properly configured, the data pipeline can start to extract data from the connected source.

It is important to note that selecting the right data connectors and configuring them correctly is essential to the success of the data pipeline. The connectors ensure that data flows seamlessly from the source to the data pipeline, where it can be transformed, filtered, and mapped to the user's requirements. Therefore, careful attention should be given to selecting the right connectors and configuring them accurately.

Data connectors are used to connect different systems or applications to the NextBee Loyalty Program Dashboard. These connectors can be configured to pull data from a variety of sources, such as CRMs, ERPs, or social media platforms.

# SFTP Connector

The Set Connector step of the data pipeline creation process allows users to choose and configure a connector to attach to the pipeline. If the user selects the SFTP connector, the screen will display fields for Header Names, File Name, and Batch Limit. Additionally, there is an option for defining Custom Columns.

The Header Names field is used to specify the header names that will be used in the exported CSV file. This is helpful when users want to import the data into other systems that have specific column naming requirements.

The File Name field allows users to specify the name of the file that will be generated by the pipeline.

The Batch Limit field allows users to specify the maximum number of records that should be included in each file generated by the pipeline. This is useful for managing large data sets and ensuring that each file is a manageable size.

The Custom Columns option allows users to specify additional columns that should be included in the exported CSV file. This feature is useful when users need to export data with specific column requirements that are not covered by the default fields.
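
To make these fields concrete, the following minimal Python sketch shows how Header Names, File Name, Batch Limit, and a constant Custom Column could shape the files an export produces. The field values and the chunking behaviour are illustrative assumptions, not NextBee internals.

```python
import csv

headers = ["member_id", "email", "points"]     # Header Names
file_name = "loyalty_export"                   # File Name
batch_limit = 50                               # Batch Limit
custom_columns = {"program": "gold-tier"}      # Custom Column (constant value)

def write_batches(records, headers, file_name, batch_limit, custom_columns):
    all_headers = headers + list(custom_columns)
    # Split the records into files of at most batch_limit rows each.
    for batch_no, start in enumerate(range(0, len(records), batch_limit), start=1):
        batch = records[start:start + batch_limit]
        with open(f"{file_name}_{batch_no}.csv", "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=all_headers)
            writer.writeheader()
            for row in batch:
                writer.writerow({**row, **custom_columns})

sample = [{"member_id": i, "email": f"user{i}@example.com", "points": i * 10}
          for i in range(120)]
write_batches(sample, headers, file_name, batch_limit, custom_columns)  # -> 3 files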

# SalesForce Connector

When the user selects the SalesForce connector option, the Set Connector screen will appear as shown in the picture.

The first step is to select an object or write a query. If you only require data from a single object, simply select the object name, such as Account or Contact. However, if you need to join multiple objects with complex logic to fetch data, you can type the required query in the provided text box.
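
For illustration, here is the kind of SOQL you might paste into the query box when a single object is not enough. The object, fields, and filter are hypothetical examples, not values the connector requires; the query is shown as a Python string only for consistency with the other examples in this guide.

```python
# Hypothetical SOQL joining Contact to its parent Account via relationship fields.
soql = """
SELECT Id, Email, FirstName, LastName, Account.Name, Account.Industry
FROM Contact
WHERE Account.Industry = 'Retail' AND LastModifiedDate = LAST_N_DAYS:7
"""
print(soql)
```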

Once you select the object, the system will automatically fetch the columns from the object and list them with checkboxes. You can select the required fields from the list.

After selecting the fields, you can also select a Trigger Event from the following three options:

  • New Record
  • Updated Record
  • Created/Updated a specific column

If you select the third option, you will also need to select the column from the given dropdown of columns.

Finally, there is a field for Batch Limit where you can specify the number of records processed at a time. The default value is 50.

With these steps, you can set up the SalesForce connector on the Set Connector screen and begin ingesting data.

# NextBee Connector

If you select the NextBee connector option, the Set Connector screen will appear as shown in the picture.

The first step is to provide the CSV Headers. This will ensure that the system knows what type of data is being ingested.

Next, the form provides fields to select one or more criteria to filter the input data. This allows you to select only the data that you need for processing.

You can also specify a custom Batch Limit or let the system default to 50. This controls how many records are processed at a time.

In addition, you can add custom columns based on some constants or custom-made functions. This is useful when you need to add calculated columns or data from external sources to the input data.

With these steps, you can set up the NextBee connector on the Set Connector screen and begin processing the data from your uploaded CSV files.
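
As a rough sketch of these options (not NextBee's implementation), the Python below reads an uploaded CSV using its declared headers, keeps only rows matching a selection criterion, adds one constant and one computed custom column, and yields batches of at most the batch limit. All column names are illustrative.

```python
import csv
from datetime import date

def load_batches(path, batch_limit=50):
    with open(path, newline="") as f:
        reader = csv.DictReader(f)            # the declared CSV Headers name each column
        batch = []
        for row in reader:
            if row.get("status") != "active":              # selection criterion on input rows
                continue
            row["source"] = "csv-upload"                    # custom column from a constant
            row["import_date"] = date.today().isoformat()   # custom column from a function
            batch.append(row)
            if len(batch) == batch_limit:                   # Batch Limit controls chunk size
                yield batch
                batch = []
        if batch:
            yield batch

# Example usage with a hypothetical file:
# for batch in load_batches("members.csv"):
#     handle(batch)
```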

# AWS S3 Connector

If you select the AWS S3 connector option, the Set Connector screen will appear as shown in the picture.

The first step is to provide the CSV Headers. This will ensure that the system knows what type of data is being ingested or exported.

Next, you need to provide the S3 file key of the CSV file. This key is used by the system to locate and access the file in the bucket.

You can also specify a custom Batch Limit or let the system default to 50. This controls how many records are processed at a time.

In addition, you can add custom columns based on some constants or custom-made functions. This is useful when you need to add calculated columns or data from external sources to the input or output data.

With these steps, you can set up the AWS S3 connector on the Set Connector screen and begin ingesting or exporting data from your S3 bucket.
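
The sketch below, which assumes the boto3 library and placeholder bucket, key, and column names, shows what the S3 file key refers to: the object key that locates the CSV inside a bucket. It illustrates the idea only and is not how the connector is implemented.

```python
import csv
import io
import boto3

# The key "exports/2024/members.csv" is the S3 file key; bucket and columns are placeholders.
s3 = boto3.client("s3")
obj = s3.get_object(Bucket="my-loyalty-data", Key="exports/2024/members.csv")
body = obj["Body"].read().decode("utf-8")

reader = csv.DictReader(io.StringIO(body))   # the CSV Headers drive the parsing
for row in reader:
    print(row["member_id"], row["points"])
```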

# Transformation

After setting up the connector, the form navigates to the Transformation screen. Here, the user can add any number of transformations to the ingested data. The transformations include the following options:

# JSON Parse

This transformation allows you to parse JSON strings into structured data. This is useful when dealing with APIs that return data in JSON format.

# Changing Date Format

This transformation allows you to convert dates from one format to another. This is useful when dealing with data from different systems that use different date formats.

# String Replace with Another

This transformation allows you to replace specific strings in the data with other strings. This is useful when dealing with data that has typos or inconsistencies.

# Convert Case

This transformation allows you to change the case of the data to upper case, lower case, or title case. This is useful when dealing with data that has inconsistent cases.

# Truncate Last Few Characters

This transformation allows you to truncate the last few characters from the data. This is useful when dealing with data that has a specific pattern at the end.

# Trim White Space

This transformation allows you to remove any leading or trailing white space from the data. This is useful when dealing with data that has extra space.

# Apply Ceiling of a Decimal

This transformation allows you to round up decimal values to the nearest whole number.

# Apply Floor to a Decimal

This transformation allows you to round down decimal values to the nearest whole number.

# Custom Written Function by NextBee Team

This transformation allows you to apply custom transformations to the data. The NextBee team can help you write these functions to meet your specific needs.

You can add any number of these transformations to the data pipeline to transform the ingested data as needed.
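
For reference, the snippets below are rough Python equivalents of the built-in transformations listed above; in the dashboard they are applied through the UI rather than through user code, and the sample values are invented.

```python
import json
import math
from datetime import datetime

record = {"payload": '{"sku": "A-100", "qty": 3}', "order_date": "03/31/2024",
          "city": "  new york  ", "coupon": "SAVE10-OLD", "amount": "19.20"}

parsed = json.loads(record["payload"])                                   # JSON Parse
iso_date = (datetime.strptime(record["order_date"], "%m/%d/%Y")
            .strftime("%Y-%m-%d"))                                       # Changing Date Format
fixed = record["coupon"].replace("-OLD", "")                             # String Replace with Another
title_city = record["city"].strip().title()                              # Trim White Space + Convert Case
sku_prefix = parsed["sku"][:-4]                                          # Truncate Last Few Characters
ceiling = math.ceil(float(record["amount"]))                             # Apply Ceiling of a Decimal
floor = math.floor(float(record["amount"]))                              # Apply Floor to a Decimal

print(parsed, iso_date, fixed, title_city, sku_prefix, ceiling, floor)
```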

# Map Data

The Data Mapping step allows businesses to map data from one system to another. This step is critical for ensuring that the data is properly integrated and can be used effectively by the business.

This screen lists all the columns from the ingested data on the left, with a selection box to the right of each column for choosing the mapping column from the NextBee data types. This allows you to map the data from the source system to the destination system.

For example, if you are ingesting data from a SalesForce object, you can map the columns to corresponding columns in another system, such as a database or an email marketing tool. This ensures that the data is properly integrated and can be used effectively by the business.

You can select the mapping column from the NextBee data types, such as text, date, number, or boolean. This allows you to map the data to the correct data type in the destination system.

With the Data Mapping step, you can ensure that the data is properly mapped and integrated between systems so that the business can make the most of its data.
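
Conceptually, a saved mapping amounts to pairing each source column with a NextBee field and a target data type, as in this illustrative sketch; the field names and types are assumptions for the example, not the dashboard's actual schema.

```python
from datetime import datetime

# source column -> (NextBee field, conversion to the target data type)
column_mapping = {
    "Email":           ("member_email", str),
    "CreatedDate":     ("signup_date", lambda v: datetime.fromisoformat(v).date()),
    "Total_Points__c": ("points_balance", int),
}

def map_record(source_row):
    mapped = {}
    for source_col, (target_col, cast) in column_mapping.items():
        mapped[target_col] = cast(source_row[source_col])
    return mapped

print(map_record({"Email": "a@b.com", "CreatedDate": "2024-05-01", "Total_Points__c": "120"}))
```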

After saving the mappings, the form navigates to the Filter Data section. Here, you can apply filters to the data to extract the specific data you need for processing.

# Filter Data

The Filter Data section allows you to select the column and apply filters based on conditions such as equal to, not equal to, greater than, less than, and so on. This allows you to extract the specific data you need for processing.

For example, if you are processing data for a specific date range, you can apply filters to the date column to extract only the data within that range. This ensures that you are processing only the data you need and can help to speed up the processing time.

You can apply multiple filters to the data and also use logical operators such as AND and OR to combine them. This allows you to create complex filters that can extract the specific data you need for processing.

With the Filter Data section, you can ensure that you are processing only the data you need, which can help to improve the accuracy and efficiency of your data processing pipeline.
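
The sketch below illustrates what a combined filter amounts to: a date-range condition ANDed with an OR of two other conditions. The column names and values are invented for the example.

```python
from datetime import date

def matches(row):
    in_range = date(2024, 1, 1) <= row["order_date"] <= date(2024, 3, 31)  # greater than / less than
    vip_or_big = row["tier"] == "VIP" or row["amount"] > 500               # OR condition
    return in_range and vip_or_big                                         # AND combines the groups

rows = [
    {"order_date": date(2024, 2, 10), "tier": "VIP", "amount": 120},
    {"order_date": date(2024, 6, 2),  "tier": "VIP", "amount": 900},
]
print([r for r in rows if matches(r)])   # keeps only the first row
```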

# Branch Data

The Branch Data step allows you to define criteria that branch the logic of your data processing pipeline. This allows you to create separate activities for each branch and customize the processing logic for each branch.

You can define the criteria for branching the logic, such as a specific value in a column or a combination of values in multiple columns. This allows you to create multiple branches based on different criteria.

For each branch, you can create a separate activity and define the processing logic for that activity. This allows you to customize the processing logic for each branch based on the specific needs of that branch.

You can also define whether a new member can be created if they do not already exist in the system. This allows you to create new members as needed and ensure that all relevant data is processed.

Additionally, you can create custom transaction IDs based on a given pattern, aligning them with your business process. This is important for tracking and auditing purposes and ensures that the data is easily traceable and properly integrated into the system.

Finally, you can add branch-level transformations to customize the processing logic for each branch. This allows you to transform the data as needed for each branch and ensure that the data is properly integrated into the system.

With the Branch Data step, you can customize the processing logic for each branch of your data processing pipeline and ensure that the data is properly integrated into the system.
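
The following sketch illustrates the branching idea: route each record to a branch by a criterion, assign it a transaction ID built from a pattern, and apply a branch-level transformation. The branch names, ID pattern, and points logic are illustrative assumptions, not the dashboard's behavior.

```python
import uuid

def branch_for(row):
    # Branching criterion: a specific value in one column decides the branch.
    return "purchase" if row.get("event") == "order_placed" else "referral"

def transaction_id(row, pattern="LOYAL-{event}-{suffix}"):
    # Custom transaction ID built from a hypothetical pattern.
    return pattern.format(event=row["event"].upper(), suffix=uuid.uuid4().hex[:8])

def process(row):
    branch = branch_for(row)
    row["transaction_id"] = transaction_id(row)
    if branch == "purchase":
        row["points"] = int(float(row["amount"]) * 10)   # branch-level transformation
    else:
        row["points"] = 100                              # flat referral award
    return branch, row

print(process({"event": "order_placed", "amount": "42.50"}))
```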

# Scheduling

The Scheduling step allows you to schedule the execution of your data processing pipeline based on specific triggers. This allows you to ensure that the pipeline runs at the appropriate times to process the data effectively.

The scheduling step has the following options:

# Trigger Once

This option allows you to trigger the pipeline only once at a specific time.

# Trigger Every Hour

This option allows you to trigger the pipeline every hour.

# Trigger Every Day At

This option allows you to trigger the pipeline every day at a specific time. If you select this option, you will be prompted to select the hours at which the pipeline should be triggered.

# Trigger Every Week At

This option allows you to trigger the pipeline every week on specific days and times. If you select this option, you will be prompted to select the days of the week on which the pipeline should be triggered, and then the hours at which the pipeline should be triggered.

# Trigger Every Month At

This option allows you to trigger the pipeline every month on specific days and times. If you select this option, you will be prompted to select the days of the month on which the pipeline should be triggered, and then the hours at which the pipeline should be triggered.

Depending on the trigger you select, you will be prompted to provide the corresponding inputs: the hours for Trigger Every Day At, the days of the week and hours for Trigger Every Week At, and the days of the month and hours for Trigger Every Month At.

With the Scheduling step, you can ensure that your data processing pipeline runs at the appropriate times to process the data effectively.
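
Cron expressions are not part of the dashboard, but they are a compact way to see what each trigger option means; the times and days below are arbitrary examples.

```python
# Each dashboard trigger option shown next to an equivalent cron expression.
schedule_examples = {
    "Trigger Every Hour":                          "0 * * * *",    # minute 0 of every hour
    "Trigger Every Day At 02:00 and 14:00":        "0 2,14 * * *",
    "Trigger Every Week At Mon/Thu 06:00":         "0 6 * * 1,4",
    "Trigger Every Month At 1st and 15th, 03:00":  "0 3 1,15 * *",
}
for option, cron in schedule_examples.items():
    print(f"{option:45s} -> {cron}")
```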

# Details of Data Pipeline

The Data Pipeline Details View is a comprehensive view that shows all the settings made to a data pipeline. It contains the following sections:

# Pipeline Details

This section shows the basic details of the pipeline, such as the pipeline name, description, and status. It also shows the creation date and last updated date of the pipeline.

# Connector

This section shows the connector used to ingest or export the data. It shows the object or query used to select the data, the fields selected, and the trigger event used to initiate the pipeline.

# Transformation

This section shows the transformations applied to the data. It shows the type of transformation, such as JSON parse or date format change, and any custom functions applied to the data.

# Map Data

This section shows the data mapping settings used to map the data from the source system to the destination system. It shows the source columns and the mapped destination columns, along with the data type mapping.

# Filter Data

This section shows the filters applied to the data to extract the specific data needed for processing. It shows the column used for filtering and the filter conditions applied.

# Branch Data

The Branch Data section displays the settings for branching the logic of the data processing pipeline and creating separate activities for each branch.

# Schedule

The Schedule section displays the settings for scheduling the data processing pipeline to run at specific times.

In the top right corner of the Details View, there is an Edit button that allows you to modify the settings of the data pipeline. When clicked, an edit form will appear, allowing you to make any necessary changes to the data pipeline. Please note that this Edit feature is only available if the Data Pipeline is in draft mode. Once published, the data pipeline cannot be edited.

# Data Pipeline Logs

The Process Logs View of the Data Pipeline displays information about the processing of the data pipeline. It shows the start time, end time, and status of the data processing.

The status of the data processing can be one of several options, such as Success, Error, or Info. Additionally, beside the status, there is a pull-down icon that, when clicked, displays a detailed view of the process log.

The detailed view of the process log includes the Dropbox ID, total records, processed records, failed records, filtered records, skipped records, and invalid records. It also displays each step log with a timestamp, operation details, result of the operation, and status.
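
Purely as an illustration of the fields described above, a single log entry might be summarized like this; the exact field names and format in the dashboard may differ.

```python
# Hypothetical shape of one process log entry, assembled from the fields listed above.
log_entry = {
    "start_time": "2024-05-01T02:00:04Z",
    "end_time":   "2024-05-01T02:03:41Z",
    "status":     "Success",
    "dropbox_id": "dbx_1042",
    "counts": {"total": 500, "processed": 480, "failed": 5,
               "filtered": 10, "skipped": 3, "invalid": 2},
    "steps": [
        {"timestamp": "2024-05-01T02:00:05Z", "operation": "connector: fetch batch",
         "result": "50 records fetched", "status": "Info"},
        {"timestamp": "2024-05-01T02:01:12Z", "operation": "transformation: date format",
         "result": "applied to 50 records", "status": "Success"},
    ],
}
print(log_entry["status"], log_entry["counts"]["processed"])
```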

With the Process Logs View, you can review detailed information about each run of the data pipeline, identify any errors or issues that occurred during processing, and take corrective action as needed. This information can also be used to optimize the data pipeline for future runs.